Threads, Meta's new microblogging platform, is updating its terms to focus on data collection from "Third Party Users".
This shouldn't come as a huge surprise. Meta is moving forward with its plans for Threads and the Fediverse, and the adjusted terms reflect an impending new reality for Fediverse users.
You aren't making the point you think you're making. Sure, at somewhere between 8 and 11 million accounts, the Fediverse is a small pond. Meta is a gigantic whale. Ingesting the entire graph of everyone on the network would be relatively trivial for them, storage-wise.
Yes, but do you analyse this information to sell it to advertisers? Will you start posting sponsored content based on this information? And will the money you collect benefit the community you live in, or will it buy you another politician?
Altering the language of a service policy (or writing a new one) is usually a good indication that something is indeed about to change at a larger level.
What's to stop them from scraping the Fediverse without federating? If they really want the data, they could very well find a way. At least they're spelling it out here and have announced an attempt at proper federation.
The article discusses this a bit. One of the other platforms is considering an enhancement to require request signatures on non-ActivityPub APIs. That is, Meta can make unsigned requests, where the server doesn't know who they're from, and get minimal (or no) data back. Or Meta can make signed requests, and instance owners get to decide what data (if any) they're okay with sharing with Meta, based on Meta's privacy policies. Beyond APIs, you're talking about web scraping, which is something the industry has been handling for decades.
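To make the signed-versus-unsigned idea concrete, here is a rough sketch of the policy decision only. It is not any platform's actual implementation: the names `SHARING_POLICY`, `MINIMAL_FIELDS`, and `visible_fields` are made up for illustration, the domains are placeholders, and the signature check itself is assumed to have happened earlier, with its result passed in as `verified_domain`.

```python
from typing import Optional

# Hypothetical per-domain policy chosen by the instance owner.
SHARING_POLICY = {
    "meta.example": {"display_name"},
    "friendly.example": {"display_name", "bio", "posts"},
}

# What an unsigned (anonymous) caller may see. This could also be empty.
MINIMAL_FIELDS = {"display_name"}


def visible_fields(profile: dict, verified_domain: Optional[str]) -> dict:
    """Return only the profile fields the caller is allowed to see.

    verified_domain is None for unsigned requests; otherwise it is the
    domain whose signature was verified upstream (assumed, not shown).
    """
    if verified_domain is None:
        allowed = MINIMAL_FIELDS
    else:
        # Signed callers without an explicit policy fall back to the minimum.
        allowed = SHARING_POLICY.get(verified_domain, MINIMAL_FIELDS)
    return {key: value for key, value in profile.items() if key in allowed}
```

The point of the design is that a signed request doesn't automatically get more data. It only tells the server who is asking, so the owner's per-domain choice can apply.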
I might end up using a personal instance as well. But in that case I'll probably end up with an instance whitelist, rather than defederating from disliked ones.
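For anyone unsure about the difference, a whitelist flips the default: everything is refused unless listed, whereas defederation accepts everything unless listed. A minimal sketch, with a hypothetical `may_federate` helper and placeholder domains (real server software configures this differently):

```python
def may_federate(domain: str, mode: str, listed: set) -> bool:
    """Decide whether to federate with a remote domain.

    mode "allowlist": only listed domains are accepted.
    mode "blocklist": every domain is accepted except listed ones.
    """
    domain = domain.lower()
    if mode == "allowlist":
        return domain in listed
    if mode == "blocklist":
        return domain not in listed
    raise ValueError(f"unknown mode: {mode}")


# A brand-new, never-seen instance is refused under an allowlist
# but accepted under a blocklist.
trusted = {"friends.example"}
blocked = {"disliked.example"}
print(may_federate("new.example", "allowlist", trusted))  # False
print(may_federate("new.example", "blocklist", blocked))  # True
```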