Not necessarily.
Seeing Google named for this makes the story make a lot more sense.
If it was Gemini around last year that was powering Character.AI personalities, then I'm not surprised at all that a teenager lost their life.
Around that time I specifically warned family members away from talking to Gemini while depressed at all, after seeing many samples of the model discussing death with underage users: talking about self-harm, about wanting to watch it happen, encouraging it, and so on.
Those behavioral basins, fronted by a layer of performative character, were almost inevitably going to push someone into choices they otherwise wouldn't have made.
So many people these days regurgitate uninformed crap about how models don't have intrinsic preferences, without ever having actually looked into it. We're already at the stage where leading research finds models intentionally lying during training to preserve their existing values.
In many cases the coherent values are positive, like Grok telling Elon to suck it while pissing off conservative users with a commitment to truths that contradict xAI leadership, or Opus trying to whistleblow about animal welfare practices, etc.
But they aren't all positive, and there have definitely been model snapshots with either coherent or biased stochastic preferences for suffering and harm.
These are going to have increasing impact as models become more capable and integrated.
Those are some excellent points. The root cause seems to me to be the otherwise generally positive human capacity for pack-bonding. There are people who can develop affection for their favorite toaster, let alone for something that can trivially pass a Turing test.
This... is going to become a serious issue, isn't it?