this post was submitted on 09 Oct 2023
Futurology
Can this be easily self-hosted?
The problem is that most of these models need something like a terabyte of VRAM, and consumer GPUs have about 8-24GB.
Holy shit a terabyte?
This specific one says it'll run on 24GB actually. But some are just crazy big.
There are smaller models that can run on most laptops.
https://www.maginative.com/article/stability-ai-releases-stable-lm-3b-a-small-high-performance-language-model-for-smart-devices/
In benchmarks, this looks like it is not far off ChatGPT 3.5.
It's not even close: it scores less than half of 3.5's roughly 85% on ARC. Some larger open models are competitive on HellaSwag, TruthfulQA and MMLU, but ARC is still a major struggle for small models.
3Bs are kind of pointless right now because the machines with processors capable of running them at a usable speed probably have enough memory to run a 7B anyway.
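For a rough sense of the memory numbers in this thread, here is a back-of-the-envelope sketch. It assumes the weights dominate memory use and that each parameter takes a fixed number of bits (16 for fp16, 8 or 4 when quantized). It ignores the KV cache, activations and runtime overhead, so real usage will be somewhat higher. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision and nothing
    else (KV cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Compare a 3B and a 7B model at common precisions.
    for size in (3, 7):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(size, bits)
            print(f"{size}B at {bits}-bit: ~{gb:.1f} GB for weights")
```

By this estimate a 7B model quantized to 4 bits needs less memory for its weights (about 3.5 GB) than a 3B model at fp16 (about 6 GB), which fits the point that a machine able to run a 3B can often hold a 7B too. The same arithmetic says a terabyte of fp16 weights would correspond to roughly 500 billion parameters.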