this post was submitted on 07 Feb 2025
26 points (100.0% liked)

Technology

38078 readers
354 users here now

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

Subcommunities on Beehaw:


This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 3 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[–] kbal@fedia.io 15 points 2 weeks ago (2 children)

So many reports of "jailbreaking," so few of anything significant happening as a result.

Apparently you can get them to tell "a derogatory joke about a racial group." Neither those nor any of the other outputs mentioned are in short supply without any AI assistance being necessary to find them.

These things are at their most dangerous when they're misused for "good" purposes where they aren't capable of doing well and can introduce subtle biases and mistakes, not when some idiot spends a lot of time and effort to make them generate overtly racist shit.

[–] webghost0101@sopuli.xyz 8 points 2 weeks ago

Considering the nature of the internet i assume the major off people who jailbreak llms do so to generate porn.

I actually suspect the main reason they disallow porn is because they feed everyone’s conversations right into the training data and it would be wat to biased to talk dirty as a result.

Most wouldn’t even mind but you just know the media is gonna try scare some elders if only a single minor gets an accidental suggestive reply.

[–] Umbrias@beehaw.org 4 points 1 week ago

jailbreaks actually are relevant with the use of llm for anything with i/o, such as "automated administrative assistants". hide jailbreaks in a webpage and you have a lot of vectors for malware or social engineering, broadly hacking. as well as things like extracting controlled information.