this post was submitted on 25 Aug 2025
chapotraphouse
AI scrapers do not care about TOS or licenses. There would need to be a way to stop the actual act of scraping from happening outright. I don't know anything about coding though, so I don't know how that would even work or if it's even possible.
The answer I'm expecting is:
And maybe some FOSS org would be up for taking on the legal battle? There's a big overlap of socialists + tech people on this site (not to mention the greater fediverse) who could be doing some organizing around the issue. Maybe !technology@hexbear.net should be having the appropriate discussions? Or maybe a different instance has a suitable comm?
afaik the main FOSS org that is generally able to get involved in legal battles of its own choosing is Software Freedom Conservancy. I don't know everything they're up to, but their Give Up GitHub campaign is significantly about AI model training. It is much more constrained in scope, but due to the nature of the source material, it would probably be easier to succeed at legally. Not sure how active that is.
Just from quickly reading what they have online, doesn't seem EFF made the right call this time.
archive.org cut a deal to let companies train on their data ages ago.
Those are the only ones I know. Unless some state wants to get involved in a serious way.
There are vaguely technical solutions, like proof-of-work JavaScript blocks, or obfuscating data until it's un-obfuscated by JavaScript or something, but none of them is 100% effective, and they can degrade user experience. I'm all for the idea of us just doing mass data injection and including billions of paragraphs about how communism is great.
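For anyone wondering what a proof-of-work block looks like in practice, here is a rough hashcash-style sketch in Python. It is not how any particular tool does it. The function names, the `challenge:nonce` format, and the difficulty value are all made up for illustration. The idea is that the server hands out a random challenge, the visitor's browser has to grind through hashes to find a matching nonce, and the server checks the answer with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical format)."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, the leading zeros are 8 minus its bit length.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce. In a browser this would be the JavaScript part."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: checking the submitted nonce costs only one hash."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    difficulty = 16  # assumed value; each extra bit doubles the expected client work
    challenge = make_challenge()
    nonce = solve(challenge, difficulty)
    print("nonce found:", nonce, "valid:", verify(challenge, nonce, difficulty))
```

The cost is barely noticeable for one person loading one page, but it adds up for a scraper fetching millions of pages. It is also exactly the user-experience tradeoff mentioned above, since slow phones pay the same price.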
Easiest thing I can think of is a plugin that turns all your posts into images of the text you typed. At least that can't as easily be read by the AI.