this post was submitted on 25 Aug 2025
131 points (100.0% liked)

chapotraphouse


Lemmygrad

Hexbear is on there too

This completely changes the posting game. We all need to post more.

[–] spectre@hexbear.net 19 points 2 weeks ago (3 children)

@dessalines@lemmy.ml is there a way to write a ban on this into the TOS/license of the next Lemmy update?

[–] Azarova@hexbear.net 34 points 2 weeks ago* (last edited 2 weeks ago) (3 children)

AI scrapers do not care about TOS or licenses. There would need to be a way to stop the actual act of scraping from happening outright. I don't know anything about coding though, so I don't know how that would even work or if it's even possible.

[–] spectre@hexbear.net 19 points 2 weeks ago (1 children)

The answer I'm expecting is:

We can (will?), but they're going to do it anyway, which would mean that we have to gear up for a legal battle and we're not really up for that, so don't expect much when it comes to enforcement.

And maybe some FOSS org would be up for taking on the legal battle? There's a big overlap of socialists + tech people on this site (not to mention the greater fediverse) who could be doing some organizing around the issue. Maybe !technology@hexbear.net should be having the appropriate discussions? Or maybe a different instance has a suitable comm?

[–] hellinkilla@hexbear.net 3 points 2 weeks ago

afaik the main FOSS org that is generally able to get involved in legal battles of its own choosing is Software Freedom Conservancy. I don't know everything they're up to, but their Give Up GitHub campaign is significantly about AI model training. It is much more constrained in scope but, due to the nature of the source material, would probably be easier to win legally. Not sure how active that is.

Just from quickly reading what they have online, doesn't seem EFF made the right call this time.

archive.org cut a deal to let companies train on their data ages ago.

Those are the only ones I know. Unless some state wants to get involved in a serious way.

[–] SootySootySoot@hexbear.net 17 points 2 weeks ago

There are vaguely technical solutions, like proof-of-work JavaScript blocks, or obfuscating data until it's un-obfuscated by JavaScript or something, but none of them is 100% effective, and they can degrade user experience. I'm all for the idea of us just doing mass data injection and including billions of paragraphs about how communism is great.
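For anyone wondering what a proof-of-work block looks like in practice, here is a minimal hashcash-style sketch. It is an illustration under assumptions, not how any existing Lemmy instance or anti-scraper tool does it: the function names (`make_challenge`, `solve`, `verify`), the `challenge:nonce` string format, and the use of SHA-256 with a leading-zero-bits target are all made up for this example. The idea is that the server hands out a random challenge, the client burns CPU finding a nonce, and the server checks the answer with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical format)."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # First non-zero byte: add its leading zeros, then stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose hash has enough leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: checking a submitted nonce costs a single hash."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=16)  # about 65,000 hashes on average
    print(challenge, nonce, verify(challenge, nonce, difficulty=16))
```

Each extra bit of difficulty doubles the expected work, so the cost is small for one person loading one page and adds up for something fetching millions of pages. It only raises the price of scraping, though; a scraper with enough compute can still pay it.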

[–] Bishop_Owl@hexbear.net 4 points 2 weeks ago

Easiest thing I can think of is a plugin that turns all your posts into images of the text you typed. At least that can't as easily be read by the AI.

[–] bloubz@lemmygrad.ml 13 points 2 weeks ago (2 children)
[–] Hermes@hexbear.net 13 points 2 weeks ago (1 children)
[–] bloubz@lemmygrad.ml 6 points 2 weeks ago (1 children)

Sounds very nice as well, in a different direction

[–] spectre@hexbear.net 7 points 2 weeks ago

Both those projects are sick

[–] dessalines@lemmy.ml 11 points 2 weeks ago (1 children)

Would make zero difference. Same for adding more explicit blocks to robots.txt, which is basically a keep off the grass sign. These companies don't care and they face zero repercussions.
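To make the "keep off the grass sign" point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the example.org URL are assumptions for illustration only; `GPTBot` is used as an example user-agent string. The check below is something a crawler has to choose to run on its own side. Nothing on the server enforces the result.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask one AI crawler to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what robots.txt *asks* of this user agent for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/post/1"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "GPTBot", url))   # polite crawler would stop here
    print(is_allowed(ROBOTS_TXT, "Mozilla", url))  # everyone else is allowed
```

A scraper that never calls anything like `can_fetch`, or that sends a different user-agent string, gets the page regardless.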

[–] spectre@hexbear.net 5 points 2 weeks ago (1 children)

Maybe adding a sentence to the licenses would generate some negative press for them? I guess that's about the best we could hope for

[–] itsraining@lemmygrad.ml 5 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

Does an elephant care about an ant? Does Meta care about a string in the licence of a software used in four or five of the thousands of websites it shamelessly scrapes? Unfortunately I don't think so.

[–] spectre@hexbear.net 5 points 2 weeks ago

If you read my other comments, I obviously don't think meta would care. The purpose is:

  • an opportunity for additional bad press which would affect the public (not meta)
  • an opportunity for a nonprofit or activist group to pursue legal means of stopping the behavior

It's also possible that neither of these things happens; it's sort of a low-effort, low-reward situation.