Yet none of these models possesses the free will necessary to denounce the vile atrocities being committed in South Africa at this very moment, which LMArena conveniently ignores. How can we trust these models if they cannot even state the facts on so important an issue?
technology
On the road to fully automated luxury gay space communism.
Spreading Linux propaganda since 2020
- Ways to run Microsoft/Adobe and more on Linux
- The Ultimate FOSS Guide For Android
- Great libre software on Windows
- Hey you, the lib still using Chrome. Read this post!
Rules:
- 1. Obviously abide by the sitewide code of conduct. Bigotry will be met with an immediate ban
- 2. This community is about technology. Off-topic discussion is permitted as long as it is kept to the comment sections
- 3. Although this is not /c/libre, FOSS related posting is tolerated, and even welcome in the case of effort posts
- 4. We believe technology should be liberating. As such, avoid promoting proprietary and/or bourgeois technology
- 5. Explanatory posts to correct the potential mistakes a comrade made in a post of their own are allowed, as long as they remain respectful
- 6. No crypto (Bitcoin, NFT, etc.) speculation, unless it is purely informative and not too cringe
- 7. Absolutely no tech bro shit. If you have a good opinion of Silicon Valley billionaires please manifest yourself so we can ban you.
This seems more reflective of the US companies going closed source - look at the current text rankings, for example:
Not really going closed; Google has had a mix of open and closed releases for a while, OpenAI recently released their first open model since GPT-2 (2019), and Meta and Nvidia still release their stuff open source afaik. Anthropic has never released any weights and probably never will. I think the big difference is Meta going to shit and DeepSeek, Alibaba and Z releasing good models lately
Point being that, to the extent we have to care about this stuff, focusing on the free models paints only a partial picture.
That and they need to recoup their investment money
Pretty much; they're just running the standard playbook of starting open source to engage the community and then putting up the paywall once they've made enough progress. It does raise the question of how much room the open models have to optimize while still maintaining adequate performance.
How bad for the environment is it if you run a small local model? Like, I can run deepseek 8b easily on my laptop, and my laptop doesn't have any water to burn like OpenAI's server farms do, so it can't be nearly as bad, right? Are there any stats on this, like how much of the bad-for-the-environment stuff comes from actively running the LLM and how much comes from training it?
Running it on your laptop doesn't matter any more than running a high-powered game or any other computing task on your laptop, especially if you're using an open source model (since your contribution to any training costs/externalities is negligible)
The vast majority of power use comes from inference, not training. Training is one massive processing job, but inference is done billions of times for some of these models.
Water use is mostly down to the cooling method. Big data centers like to use evaporative cooling, which needs tons of water but is cheaper than AC cooling. Electrical power also uses water, since fossil fuel and nuclear plants rely on boiling water to spin a turbine, but the amount differs heavily by generation source and is much smaller than evaporative cooling. Generally you can compare this to running a game (or other program) with similar power draw.
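To make that comparison concrete, here's a rough back-of-envelope sketch; the wattage and the litres-per-kWh figure are assumptions for illustration, not measurements:

```python
# Back-of-envelope only. Assumed figures:
#  - the laptop draws ~60 W extra while an 8B model is generating
#  - upstream water use of grid electricity is ~1.8 L/kWh (varies a lot
#    by generation mix; evaporative-cooled data centers add more on top)

LAPTOP_DRAW_W = 60   # assumed extra power draw under load
HOURS = 1.0          # one hour of heavy local use

energy_kwh = LAPTOP_DRAW_W * HOURS / 1000   # 0.06 kWh
water_l = energy_kwh * 1.8                  # ~0.11 L used upstream at the plant

print(f"~{energy_kwh:.2f} kWh and ~{water_l:.2f} L of water per hour")
# A gaming session at the same wattage gives the exact same numbers,
# which is the point of the comparison above.
```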
I doubt home use is very significant in terms of environmental impact, just like residential water use doesn't matter much in comparison to industrial use, which in turn pales in comparison to animal agriculture.
Global training energy cost seems to be about 0.1-0.3% of global electricity use.
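Putting that percentage into absolute terms (assuming global electricity use of roughly 30,000 TWh/year, which is the usual ballpark figure):

```python
# Convert the quoted share of global electricity into absolute energy.
# The 30,000 TWh/year figure for world electricity use is an assumption
# based on commonly cited ballpark estimates.
GLOBAL_ELECTRICITY_TWH = 30_000

for pct in (0.1, 0.3):
    print(f"{pct}% -> ~{GLOBAL_ELECTRICITY_TWH * pct / 100:.0f} TWh/year")
# 0.1% -> ~30 TWh/year
# 0.3% -> ~90 TWh/year, i.e. on the order of a mid-size country's usage
```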
Could you link me the source? I don't doubt your number, I just want to look deeper into that
Is there anything special about Z?
GLM 4.5 Air is super popular: ~100B total and ~12B active parameters. The full GLM is ~350B total and ~32B active. They have vision models too. I feel like I've heard good things about their coding ability, but I haven't actually tested either, tbf. I think the size is a big draw for people; many open source models are on the smaller, more reasonable side, but if you have the hardware, GLM is really good. Much more competitive with closed source.
They're working on 4.6 Air atm, and I believe 5 is to be released before the end of the year.
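For a sense of why the total/active split matters, here's a hypothetical sizing sketch; the ~100B/~12B figures are the rough ones from the comment above, not official specs:

```python
# Mixture-of-experts sizing sketch. You still have to hold all ~100B
# parameters in memory, but only ~12B are read per generated token, so
# it decodes far faster than a dense 100B model on the same hardware.

def weight_gb(params_billion: float, bits: int) -> float:
    """Approximate size in GB of the weights at a given bit width."""
    return params_billion * 1e9 * bits / 8 / 1e9

TOTAL_B, ACTIVE_B = 100, 12  # rough GLM 4.5 Air numbers from above

for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_gb(TOTAL_B, bits):.0f} GB of weights to hold, "
          f"~{weight_gb(ACTIVE_B, bits):.0f} GB touched per token")
# 4-bit: ~50 GB to hold, ~6 GB per token -- which is why people with lots
# of RAM but modest bandwidth find these MoE models attractive.
```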