They can not legally just download and use what people share
Can’t they? The internet is full of scrapers and reposters, and I don’t think I’ve ever seen a company like that going down for it. I’m not sure what the point would be (generating a dataset? performing sentiment analysis on certain topics? improving their tag detection system with off-platform data?), but they can, as long as they follow the relevant privacy laws (“anonymise before processing” in the EEA, the UK and California, “practically no restrictions” everywhere else).
They would only violate copyright if they redistributed the content. Downloading and processing the content offline wouldn’t really be breaking any laws, unless structured PII is involved.