this post was submitted on 26 Jan 2024

We Asked A.I. to Create the Joker. It Generated a Copyrighted Image.::Artists and researchers are exposing copyrighted material hidden within A.I. tools, raising fresh legal questions.

[–] AFaithfulNihilist@lemmy.world 7 points 10 months ago* (last edited 10 months ago) (1 children)

ChatGPT is over 500 gigs of training data plus over 300 gigs of RAM, and Sam Altman has been quite adamant that another order of magnitude of storage capacity is needed to advance the tech.

I'm not convinced that these are compressed much at all. I would bet this image in its entirety is actually stored in there someplace, albeit in an exploded format.

[–] jao@lemy.lol -1 points 10 months ago (1 children)

I purchased a 128 GB flash drive for around $12-15 (I forgot the exact price) last year, and on Amazon there are 10 TB hard drives for $100. So the actual storage doesn't seem to be an issue.

RAM is expensive: 128 GB of RAM on Amazon is $500.

But then again, I am talking about the consumer-grade stuff. It might be different for the people who are making AIs, as they might be using the industrial (or whatever it's called) grade stuff.
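
For a rough sense of scale, the prices quoted above can be turned into cost per gigabyte. This is only a back-of-the-envelope sketch: the $13.50 flash price is an assumed midpoint of the quoted $12-15 range, 10 TB is taken as 10,000 GB, and the variable names are made up for illustration.

```python
# Prices as quoted in the comment above. The flash drive price is the
# midpoint of the quoted $12-15 range, which is an assumption.
quotes = {
    "flash drive": (13.50, 128),     # (price in USD, capacity in GB)
    "hard drive": (100.00, 10_000),  # 10 TB taken as 10,000 GB
    "RAM": (500.00, 128),
}


def usd_per_gb(price_usd, capacity_gb):
    """Return the cost of one gigabyte in US dollars."""
    return price_usd / capacity_gb


cost = {name: usd_per_gb(price, gb) for name, (price, gb) in quotes.items()}

for name, value in cost.items():
    print(f"{name}: ${value:.3f} per GB")

# How many times more expensive RAM is than hard drive space, per GB.
ram_vs_disk = cost["RAM"] / cost["hard drive"]
print(f"RAM vs. hard drive, per GB: roughly {ram_vs_disk:.1f}x")
```

At these consumer prices RAM comes out a few hundred times more expensive per gigabyte than hard drive space, which is the gap the comment is pointing at.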

[–] AFaithfulNihilist@lemmy.world 5 points 10 months ago

It depends on what kind of RAM you're getting.

You could get a Dell R720 with two processors and 128 gigs of RAM for $500 right now on eBay, but it's going to be several generations old.

I'm not saying that the model is taking up astronomical amounts of space, but it doesn't have to store movies or even high-resolution images. It is also not expected to know every reference, just the most popular ones.

I have a 120 TB storage server in the basement, so the footprint of this learning model is not particularly massive by comparison, but it does contain this specific whole Joker image. It's not something that could have been generated without the original to draw from.

In order to build a bigger model they would need not just more storage but a new way of having more and faster RAM connected to lower-latency storage. LLMs are the kind of software that is hard to subdivide and distribute across purpose-built arrays of hardware.