this post was submitted on 18 Sep 2025
151 points (100.0% liked)
Data is Beautiful
LLMs can’t deal with highly inflected languages at the moment. In English or Chinese a tokenizer can assign tokens to entire words without having to account for much word morphology (which is also why models fail at counting letters in words), but that falls apart quickly in Polish or Russian, where each lemma has many inflected surface forms. The way models like ChatGPT work now is that they do their “reasoning” in English first and translate back into the query language at the end.
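A minimal sketch of why morphology matters for a word-level vocabulary: a single Polish noun lemma produces far more distinct surface forms than its English counterpart (the word lists below are the standard declension of "kot"/"cat"; the dict names are just illustrative).

```python
# Sketch: a fusional language multiplies surface forms per lemma,
# so a word-level vocabulary grows much faster than in English.
english_forms = {"cat": ["cat", "cats"]}  # singular, plural
polish_forms = {
    # declension of "kot" across 7 cases x 2 numbers (some forms coincide)
    "kot": ["kot", "kota", "kotu", "kotem", "kocie",
            "koty", "kotów", "kotom", "kotami", "kotach"],
}

en_vocab = sum(len(v) for v in english_forms.values())
pl_vocab = sum(len(v) for v in polish_forms.values())
print(en_vocab, pl_vocab)  # 2 vs 10 entries for one lemma
```

This is why subword tokenizers split inflected words into pieces instead of keeping one token per word, which in turn means a single "word" in Polish or Russian is often several tokens.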