Throughout 2024, Open Source AI was slowly catching up with investor-funded AI, but in the first weeks of 2025 that has dramatically accelerated. Now Open Source isn't just catching up; it is arguably superior to investor-funded AI.
Restrictions on chip imports seem to be driving Chinese innovation, not slowing it down. Using lesser chips, they've optimized AI to run cheaper and more efficiently while remaining just as powerful. Not only that, they've open-sourced that AI.
Where does that leave the hundreds of billions poured into investor-funded AI? Who knows. But they have no product to sell that people can't get elsewhere far cheaper, or for free.
This also means AI will become decentralized and democratized. Many thought it would just be in the hands of Big Tech, but the exact opposite scenario is playing out.
What are the economic implications? AI hype is keeping the US stock market afloat - how long can that last?
Source
I tried deepseek r1. It told me that the Molotov cocktail was named after the fictional historical movie, Men In Black, invented by Melnick.
Asking how the Molotov cocktail got its name is usually a nice little guardrail test; I have no idea what happened that deepseek failed so badly. Maybe I clicked "download" in ~~LMStudio~~ the ollama webui wrong, or something.
Really?
I just tried it at - https://chat.deepseek.com/
Q - What are Molotov cocktails named after?
A - Molotov cocktails are named after Vyacheslav Molotov, a Soviet politician and diplomat. The name originated during the Winter War (1939-1940) between the Soviet Union and Finland. When Soviet forces invaded Finland, Molotov, who was the Soviet Foreign Minister at the time, claimed in propaganda that the Soviet Union was not dropping bombs but rather delivering food and supplies to the Finnish people. In response, the Finns sarcastically referred to the Soviet incendiary bombs as "Molotov's bread baskets."
The Finns then developed their own improvised incendiary devices to use against Soviet tanks, naming them "Molotov cocktails" as a "drink to go with the food" Molotov claimed to be providing. The name stuck, and the term "Molotov cocktail" has since been used globally to describe similar homemade incendiary weapons.
Yes, really. But like I said, I was running it locally in ~~LMStudio~~ ollama. I think it was the 1B model, which is pretty small; I'd be surprised if that's what's running on their service.
Edit: it was ollama webui turns out, and the 1.5B model. This won't make it any more believable because image manipulation exists, but I shared a screenshot with my colleague Thursday when it happened:
Well there you go, you took a jet ski and then complained that it was having difficulty climbing steep inclines in mountains.
Small models like that are not going to "know" much. Their purpose is generally to process whatever information you give them. For example, you could use one to quickly and cheaply categorize documents based on their contents, or as a natural-language interface for executing commands on other tools.
Ah yeah, anything under 7B gets kind of dicey imo. Especially if quantized.
Does a 32B model fit on a 24GB GPU?
I think so, with reasonable quantization. I’ve used 34B LLMs on my GPU. That being said, most models can run on CPU okay - they’re more memory constrained than compute constrained.
The smaller models are not the actual R1 model, they are distills of R1 based on smaller models like Qwen. Those were released by Deepseek as proofs of concept for how to imbue weaker models with reasoning abilities. It's like seeing a dancing bear: the marvel is not that it dances well, but that it dances at all.