What does it mean for your site to be connected to WordPress.com in order to use Jetpack? What is being copied over?
Well, more of the site database than you'd think is copied over to create a clone of your site on WordPress.com servers. If you use the search module or the related posts module, the queries run against that full sync copy of your site, and in the case of the search module the data is then sent on to Elasticsearch.
For a better example, if you have the WooCommerce plugin active on your site and the site is connected to Jetpack, then orders, products, coupons and customers are pushed over in the initial full sync.
Jetpack does explain what data is sent over in the full sync. If you want to know what other data is included, you can take a look at the modules in this link.
All orders and coupons are also included in the full sync, so those screens in wp-admin will also be powered by Jetpack search (Elasticsearch).
If you care about what data is being sent over from your connected site, and you only want a fast external search solution, then Algolia is well worth taking a look at. Algolia has an easy-to-use plugin for WordPress.
Jetpack uses XML-RPC in WordPress to connect to your site, and XML-RPC is a common attack vector for WordPress sites. If you are using the site accelerator module (formerly called Photon), there is still no easy way to purge the CDN cache. Any other CDN that works with WordPress will give you an easy way to purge the CDN cache.
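If you want to see whether XML-RPC is reachable on a site, a rough check is to call the standard `system.listMethods` method on the default `xmlrpc.php` endpoint. The following is only a minimal sketch in Python: the function names are made up for illustration, `https://example.com` is a placeholder, and it assumes the endpoint lives at the default path. On a site connected to Jetpack you would expect the endpoint to answer, since Jetpack relies on it.

```python
import xmlrpc.client
from urllib.parse import urljoin
from xml.parsers.expat import ExpatError


def xmlrpc_endpoint(site_url: str) -> str:
    """Return the default WordPress XML-RPC URL for a site (assumed path)."""
    return urljoin(site_url.rstrip("/") + "/", "xmlrpc.php")


def build_probe() -> bytes:
    """Build the XML body of a read-only system.listMethods call."""
    body = xmlrpc.client.dumps((), methodname="system.listMethods")
    return body.encode("utf-8")


def xmlrpc_is_exposed(site_url: str) -> bool:
    """Return True if the endpoint answers an XML-RPC call with a method list."""
    proxy = xmlrpc.client.ServerProxy(xmlrpc_endpoint(site_url))
    try:
        methods = proxy.system.listMethods()
    except (xmlrpc.client.Error, OSError, ExpatError):
        # Fault, HTTP error, network failure, or a non-XML response:
        # treat all of these as "not reachable over XML-RPC".
        return False
    return isinstance(methods, list) and len(methods) > 0
```

Calling `xmlrpc_is_exposed("https://example.com")` makes one network request and returns `True` or `False`. Only run it against sites you own or are allowed to test.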
If you care about data being sent from your site, then alternatives to Jetpack are worth exploring.