this post was submitted on 08 Aug 2025
433 points (99.5% liked)

Fediverse


A community dedicated to fediverse news and discussion.

Fediverse is a portmanteau of "federation" and "universe".


Dropsitenews published a list of websites Facebook uses to train its AI. Multiple Lemmy instances are on the list, as noticed by user BlueAEther.

Hexbear is on there too. Also Facebook is very interested in people uploading their massive dongs to lemmynsfw.

Full article here.

Link to the full leaked list download: Meta leaked list pdf

mesamunefire@piefed.social 32 points 3 weeks ago* (last edited 3 weeks ago) (1 child)

Yeah, I've seen the argument in blog posts that since they are not search engines, they don't need to respect robots.txt. It's really stupid.

"No no guys you don't understand, robots.txt actually means just search engines, it totally doesn't imply all automated systems!!!"
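For anyone curious how that plays out in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot names (`ExampleSearchBot`, `ExampleAIBot`) and the `example.social` URL are all made up for illustration and are not taken from any real instance or crawler. The point is only that a `User-agent: *` rule matches every crawler name, not just search engines; whether a given crawler bothers to check is still up to its operator.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary instance (an assumption for this
# example): the wildcard group applies to any crawler without its own group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.social/post/1"  # hypothetical URL
    for bot in ("ExampleSearchBot", "ExampleAIBot"):  # hypothetical bot names
        print(bot, "allowed:", is_allowed(ROBOTS_TXT, bot, url))
```

Both bots get the same answer because nothing in the wildcard rule distinguishes a search indexer from an AI scraper.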