this post was submitted on 22 Jan 2025
13 points (100.0% liked)
Technology
1147 readers
A tech news sub for communists
founded 2 years ago
Tangential, but I understand how these reasoning-based systems are supposed to work. I saw some sample output from R1, and it looks like it generates a thought process for answering the prompt before actually answering. I can see how that would make the answer more logically sound, but the thinking portion was two to three times longer than the actual answer.
I assume OpenAI, Anthropic, etc. are doing something similar. As a concept I see nothing wrong with it, but since these services charge per token, wouldn't this process balloon the number of billed tokens? It would make querying them much more expensive.
It is more expensive, although DeepSeek's model is cheap enough that this matters much less. Additionally, these "reasoning models" aren't necessarily better for every task, so for many things a normal, cheaper model may still be preferable.
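To make the cost point concrete, here's a back-of-envelope sketch. Providers typically bill the hidden thinking tokens as output tokens, so a model that thinks roughly three times as long as it answers pays for all of those extra tokens. The prices and token counts below are made-up placeholders for illustration, not any provider's real rates:

```python
# Back-of-envelope cost comparison: a "reasoning" model emits hidden
# thinking tokens (billed as output) on top of the visible answer.
# All prices and token counts here are hypothetical placeholders.

def query_cost(prompt_tokens, answer_tokens, thinking_tokens,
               price_in_per_m, price_out_per_m):
    """Dollar cost of one query; prices are per million tokens."""
    output_tokens = answer_tokens + thinking_tokens
    return (prompt_tokens * price_in_per_m +
            output_tokens * price_out_per_m) / 1_000_000

# Same prompt and answer length; the reasoning model "thinks"
# for about 3x the length of its visible answer.
plain = query_cost(500, 400, 0, price_in_per_m=1.0, price_out_per_m=4.0)
reasoning = query_cost(500, 400, 1200, price_in_per_m=1.0, price_out_per_m=4.0)

print(f"plain:     ${plain:.4f}")      # $0.0021
print(f"reasoning: ${reasoning:.4f}")  # $0.0069
```

Under these assumed numbers the reasoning query costs over three times as much, which is why a cheaper per-token rate (as with DeepSeek) can offset the longer outputs.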