[–] kippinitreal@lemmy.world 8 points 3 weeks ago (3 children)

Genuine question: how energy-intensive is it to run a model compared to training it? I always thought that once a model is trained, it's (comparatively) trivial to query.

[–] oktoberpaard@feddit.nl 7 points 3 weeks ago (1 children)

A 100-word email generated by an AI chatbot using GPT-4 requires 0.14 kilowatt-hours (kWh) of electricity, equal to powering 14 LED light bulbs for 1 hour.

Source: https://www.washingtonpost.com/technology/2024/09/18/energy-ai-use-electricity-water-data-centers/
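
For scale, that comparison is just power = energy ÷ time. A quick sanity check (the ~10 W per LED bulb is my assumption of a typical bulb, not a figure from the article):

```python
# Sanity check of the Washington Post comparison.
email_energy_kwh = 0.14  # reported energy for one 100-word GPT-4 email
led_bulb_watts = 10      # assumption: a typical LED bulb draws ~10 W
hours = 1

bulbs = email_energy_kwh * 1000 / (led_bulb_watts * hours)
print(f"{bulbs:.0f} LED bulbs for {hours} hour")  # -> 14 LED bulbs for 1 hour
```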

[–] hitmyspot@aussie.zone 0 points 3 weeks ago

How much energy does it take for the PC to be on while the user types out that email manually?

I assume we'll get to a point where the energy required starts to drop as computing power increases with Moore's law. In the meantime, though, it's awful for the environment.

I don't doubt that, rather than reducing energy use, they'll instead run more complex models requiring more power for these tasks for the foreseeable future. Eventually, though, it will be diminishing returns on power, and efficiency will be more profitable.

[–] DavidGarcia@feddit.nl 6 points 3 weeks ago (2 children)

For the small ones running on GPUs, a couple hundred watts while generating. For the large ones, somewhere between 10 and 100 times that.

With specialty hardware, maybe 10x less.
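
To turn watts into per-query numbers: energy is just power × time. A rough sketch (the wattages and generation times below are illustrative assumptions, not measurements):

```python
# Rough per-query energy from power draw and generation time.
# All numbers are illustrative assumptions.
def energy_kwh(watts: float, seconds: float) -> float:
    return watts * seconds / 3600 / 1000

small = energy_kwh(watts=300, seconds=5)         # small model, single GPU
large = energy_kwh(watts=300 * 30, seconds=10)   # large model, ~10-100x the power

print(f"small model: {small * 1000:.2f} Wh per query")
print(f"large model: {large * 1000:.1f} Wh per query")
```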

[–] pennomi@lemmy.world 3 points 3 weeks ago (2 children)

A lot of the smaller LLMs don’t require a GPU at all - they run just fine on a normal consumer CPU.
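
For example, something like this runs a small quantized model entirely on the CPU (a minimal sketch assuming the llama-cpp-python bindings and a locally downloaded GGUF model; the file path is hypothetical):

```python
from llama_cpp import Llama

# n_gpu_layers=0 keeps all layers on the CPU; n_threads ~ physical cores.
llm = Llama(
    model_path="./models/tinyllama-1.1b-q4_k_m.gguf",  # hypothetical local file
    n_gpu_layers=0,
    n_threads=8,
)

out = llm("Write a 100-word email declining a meeting.", max_tokens=150)
print(out["choices"][0]["text"])
```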

[–] copygirl@lemmy.blahaj.zone 3 points 3 weeks ago (1 children)

Wouldn't running on a CPU (while possible) make it less energy efficient, though?

[–] pennomi@lemmy.world 3 points 3 weeks ago

It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing the GPU memory it can be both slower and less efficient.
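
The memory-bound point has a handy rule of thumb: for single-stream generation, each token requires reading roughly all of the model weights from memory once, so tokens/s is capped near bandwidth ÷ model size. A back-of-the-envelope sketch (the hardware numbers are illustrative assumptions):

```python
# Rule-of-thumb token-rate ceiling for memory-bound generation:
# every generated token reads ~all model weights once.
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

model_gb = 4.0  # e.g. roughly a 7B model with 4-bit quantization

print(f"GPU (~1000 GB/s): {max_tokens_per_sec(1000, model_gb):.0f} tok/s ceiling")
print(f"CPU (~80 GB/s DDR5): {max_tokens_per_sec(80, model_gb):.0f} tok/s ceiling")
```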

[–] DavidGarcia@feddit.nl 1 points 3 weeks ago

Yeah, but 10x slower, at speeds that just don't work for many use cases. When you compare energy consumption per token, though, there isn't much difference.
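
That per-token comparison is easy to sketch: joules per token is just power ÷ throughput (the wattages and token rates below are illustrative assumptions, not measurements):

```python
# Energy per token = power draw / generation speed.
# Illustrative numbers, not measurements.
def joules_per_token(watts: float, tokens_per_sec: float) -> float:
    return watts / tokens_per_sec

gpu = joules_per_token(watts=250, tokens_per_sec=50)  # discrete GPU
cpu = joules_per_token(watts=65, tokens_per_sec=8)    # consumer CPU, ~10x slower

print(f"GPU: {gpu:.1f} J/token, CPU: {cpu:.1f} J/token")  # same ballpark
```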

[–] kippinitreal@lemmy.world 2 points 3 weeks ago

Good god. Thanks for the info.

[–] 4am@lemm.ee 2 points 3 weeks ago

Still requires thirsty datacenters that use megawatts of power to keep them online and fast for thousands of concurrent users.