Oh well, LLMs are notoriously bad at chess. Other tasks as well. Maths, especially counting... But I'd say this is because they're not very intelligent, and not trained to be chess computers. So a clever elementary-school kid will outperform them at keeping their chess moves legal, or at counting letters in words. But that doesn't directly lead to the conclusion that they're not world models. That'd need some qualification, quantification, and a mathematical proof. This is just opinion.
And I mean, what is a world model? LLMs mostly perceive the world through words; they can't see the way I do. (Hence the name... this sort of AI was originally designed to model language.) They can't smell or hear properly either. And they can't interact with the world and (directly) learn from that feedback. And I bet the entire world isn't available as internet text for their datasets. So they won't have a world model similar to mine...

But mine isn't complete either. I can only see and hear within a narrow bandwidth. There are things I can't sense at all. I haven't been to a lot of places... I'm bad at chess as well... And there's generally a lot I don't know or can't do. So I'm not sure what a world model is even supposed to be, or how complete we want it to be. Do I have one?