So, what you're saying here is that the A in AI actually stands for artificial, and it's not really intelligent or reasoning.
Huh.
The difference between reasoning models and normal models is that reasoning models work in two steps. To oversimplify a little: they first prompt "how would you go about responding to this?", then prompt "write the response".
It's still predicting the most likely thing to come next, but the difference is that it gives the model a chance to write the most likely instructions to follow for the task, and then the most likely result of following those instructions - both of which conform to patterns much better than a single jump from prompt to response.
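Roughly, the control flow is just two chained calls. Here's a minimal sketch of that idea, assuming a hypothetical `llm()` helper that sends one prompt to whatever model/API you use and returns its text; the two-pass structure is the point, not the specific API:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for a single model call (wire this up to any chat completion API)."""
    raise NotImplementedError

def answer_with_reasoning(task: str) -> str:
    # Step 1: ask the model to write out how it would approach the task.
    plan = llm(f"How would you go about responding to this?\n\n{task}")
    # Step 2: ask it to follow that plan and produce the actual response.
    return llm(f"Task: {task}\n\nFollow these instructions:\n{plan}\n\nWrite the response.")
```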
But it still manages to fuck it up.
I've been experimenting with using Claude's Sonnet model in Copilot in agent mode for my job, and one of the things that's become abundantly clear is that it has certain types of behavior that are heavily represented in the model, so it assumes you want that behavior even if you explicitly tell it you don't.
Say you're working in a yarn workspaces project, and you instruct Copilot via an instruction file to build and test a new dashboard. You'll need to include explicit and repeated reminders all throughout the file to use yarn, not npm, because even though yarn is very popular today, there are so many older examples of using npm in its training data that it's just going to assume that's what you actually want - thereby fucking up your codebase.
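For what it's worth, the kind of instruction file I mean ends up looking roughly like this hypothetical `.github/copilot-instructions.md` - the repetition is deliberate, because saying it once at the top wasn't enough:

```markdown
# Repo instructions (hypothetical example)

- This repo is a yarn workspaces monorepo. Use `yarn`, never `npm`.
- Install dependencies with `yarn install`, not `npm install`.
- Add a dependency with `yarn workspace <workspace-name> add <pkg>`, not `npm install <pkg>`.
- Run scripts with `yarn workspace <workspace-name> run build` and `yarn workspace <workspace-name> run test`.
- Reminder: never create or commit a `package-lock.json`; this repo uses `yarn.lock` only.
```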
I've also had lots of cases where I tell it I don't want it to edit any code, just to analyze and explain something that's there and how to update it... and then I have to stop it from editing code anyway, because halfway through it forgot that I didn't want edits, just explanations.
I find it hilarious that the only people these LLMs mimic are the incompetent ones. I had a coworker who constantly changed things when asked only to explain them.