this post was submitted on 13 Dec 2023

submitted 11 months ago by L4s@lemmy.world to c/technology@lemmy.world

Mind-reading AI can translate brainwaves into written text: Using only a sensor-filled helmet combined with artificial intelligence, a team of scientists has announced they can turn a person’s thou...::A system that records the brain's electrical activity through the scalp can turn thoughts into words with help from a large language model – but the results are far from perfect

[–] Not_mikey@lemmy.world 1 points 11 months ago (1 children)

This is not how LLMs work. They are not a database, nor do they have access to one. They are a trained neural net with a set of weights on matrices that we don't fully understand. We do know that it can't possibly have all the information from its training set, since the training sets (measured in TB or PB) are orders of magnitude bigger than the models (measured in GB). The LLM itself is just what it learned from reading all the training data, just like how you don't memorize every passage in a book you read, just core concepts, relationships and lessons. So if I ask you "who was Gatsby's love interest?" you don't remember the line and page of the text that says he loves Daisy; your brain just has a strong connection of neurons between Gatsby, Daisy, love, longing, etc. that produces the response "Daisy". The same is true in an LLM: it doesn't have the whole of The Great Gatsby in its model, but it too would have a strong connection somewhere between Gatsby, Daisy, love, etc. to answer the question.

What you're thinking of are older chatbots like Siri or Google Assistant, which do have a set of preset responses mixed in with some information from a structured database.

[–] knightly@pawb.social 1 points 11 months ago* (last edited 11 months ago) (1 children)

This is not how LLMs work, they are not a database nor do they have access to one.

Please do explain how you think they make LLMs without a database of training examples to build a statistical model from.

The LLM itself is just what it learned from reading all the training data,

I.e. "a model that encodes a database".

They are a trained neural net with a set of weights on matrices that we don't fully understand.

I.e., "we applied a very lossy compression algorithm to this database".

We do know that it can't possibly have all the information from its training set, since the training sets (measured in TB or PB) are orders of magnitude bigger than the models (measured in GB).

Check out the demoscene sometime; you'll be surprised how much complexity can be generated from a very small set of instructions. I've seen entire first-person shooter video games less than 100 KB in size that algorithmically generate hundreds of megabytes of texture data at runtime. The idea that a mere 1,000x lossy compression of text would be impossible is laughable, especially when lossless text compression using neural network techniques achieved a 250x compression ratio years ago.
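To make the "small set of instructions" point concrete, here is a minimal Python sketch of procedural generation. It is only an illustration: the function name `procedural_texture` and the sine-wave formula are assumptions made up for this example, not taken from any actual demoscene production, and it says nothing about how an LLM stores information.

```python
import math

def procedural_texture(width: int, height: int) -> bytes:
    """Generate a grayscale texture from a formula instead of stored pixels."""
    pixels = bytearray()
    for y in range(height):
        for x in range(width):
            # Three overlapping sine waves give a plasma-like pattern.
            v = math.sin(x / 16.0) + math.sin(y / 8.0) + math.sin((x + y) / 32.0)
            # v lies in [-3, 3]; map it to a byte value in [0, 255].
            pixels.append(int((v + 3.0) / 6.0 * 255))
    return bytes(pixels)

texture = procedural_texture(512, 512)
print(len(texture))  # 262144 bytes of pixel data from a few hundred bytes of source
```

The output is hundreds of times larger than the code that produced it, which is the general trick the tiny demos rely on: store the recipe, not the result.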

[–] Not_mikey@lemmy.world 1 points 11 months ago (1 children)

If LLMs were just lossy encodings of their database, they wouldn't be able to answer any questions outside of their training set. They can though, and quite well, as shown by the fact that you can give one completely made-up information that it can't possibly have "seen" and it will go along with it and give plausible answers. That is where its intelligence lies and what separates it from older chatbots like Siri that cannot infer and are bound by the database they pull from.

How do you explain the hallucinations if the LLM is just a complex lookup engine? You can't look up something you've never seen.

[–] knightly@pawb.social 1 points 11 months ago (1 children)

If LLMs were just lossy encodings of their database, they wouldn't be able to answer any questions outside of their training set.

Of course they could, in the same way that hitting the autocomplete key can finish a half-completed sentence you've never written before.

The fact that models can produce useful outputs from novel inputs is the whole reason why we build models. Your argument is functionally equivalent to the claim that wind tunnels are intelligent because they can characterise the aerodynamics of both old and new kinds of planes.
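The autocomplete comparison can be sketched with a toy bigram model. This is a hedged illustration only: the three-sentence corpus and the `autocomplete` helper are invented for the example, and real phone keyboards and LLMs are far more elaborate. The point it shows is narrow: a purely statistical model built from a dataset can complete a sentence that never appears in that dataset.

```python
from collections import Counter, defaultdict

# Hypothetical training corpus, invented for this example.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "a cat chased a dog",
]

# Count which word follows which across the whole corpus.
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def autocomplete(prompt: str, max_words: int = 4) -> str:
    """Repeatedly append the most frequent follower of the last word."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in the corpus
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

completion = autocomplete("a dog")
print(completion)            # a dog sat on the mat
print(completion in corpus)  # False
```

The completed sentence is stitched together from word-pair statistics, so it is new as a whole even though every adjacent pair in it was seen during "training".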

How do you explain the hallucinations if the LLM is just a complex lookup engine? You can't look up something you've never seen.

For the same reason that a random number generator is capable of producing never-before-seen strings of digits. LLM inference engines have a property called "temperature" that governs how much randomness is injected into their responses.
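For readers who haven't met the term, a minimal sketch of the usual softmax-with-temperature calculation follows. The helper name `softmax_with_temperature` and the three logit values are assumptions chosen for illustration; no particular inference engine's API is being quoted here.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.0]
for t in (0.2, 1.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature nearly all of the probability lands on the top-scoring token, while at a high temperature the three candidates end up much closer together, so a random draw is more likely to pick one of the less probable ones.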

[–] Not_mikey@lemmy.world 1 points 11 months ago (1 children)

Autocomplete is not a lossy encoding of a database either. It's a product of a dataset, just like you are a product of your experiences, but it is not wholly representative of that dataset.

A wind tunnel is not intelligent because it doesn't answer questions or process knowledge or data; it just creates data. A wind tunnel will not answer the question "is this aerodynamic?", but you can observe a wind tunnel and use your intelligence to process that and answer the question.

Temperature and randomness don't explain hallucinations; those are a product of inference. If you turned the temperature down to 0 and asked it the question "what happened in the great Christmas fire of 1934?", it would give its best guess of what happened, even though that question is not in its dataset and it can't look up the answer. The temperature would just mean that between runs it would consistently give the same story, the one that is most statistically probable, as opposed to another one that may be less probable but was pushed up due to randomness. Hallucinations are a product of inference, of taking something at face value and then trying to explain it. People will do this too: if you tell someone a lie confidently and then ask them about it, they will use their intelligence to rationalize a story about what happened.
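The temperature-0 claim can be shown with a toy next-token picker. Everything below is hypothetical: the token names and probabilities are invented, and `pick_greedy` and `pick_sampled` are illustrative helpers rather than functions from any real library. The sketch only shows that removing randomness changes which guess comes out, not whether a guess is produced.

```python
import random

# Hypothetical next-token probabilities after a prompt about a made-up event.
next_token_probs = {"warehouse": 0.5, "church": 0.3, "mill": 0.2}

def pick_greedy(probs):
    """Temperature 0: always take the single most probable token."""
    return max(probs, key=probs.get)

def pick_sampled(probs, rng):
    """Temperature above 0: draw a token in proportion to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

rng = random.Random()
greedy_runs = {pick_greedy(next_token_probs) for _ in range(100)}
sampled_runs = {pick_sampled(next_token_probs, rng) for _ in range(100)}
print(greedy_runs)   # the same single token on every run
print(sampled_runs)  # typically several different tokens across runs
```

Either way the picker returns something for a question it has no stored answer to; greedy selection just returns the same something every time.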

[–] knightly@pawb.social 1 points 11 months ago (1 children)

Autocomplete is not a lossy encoding of a database either. It's a product of a dataset, just like you are a product of your experiences, but it is not wholly representative of that dataset.

If LLMs don't encode their training data, then why are they proving susceptible to data exfiltration techniques where they output the content of their training dataset verbatim? https://m.youtube.com/watch?v=L_1plTXF-FE

[–] Not_mikey@lemmy.world 0 points 11 months ago (1 children)

I'm not saying it doesn't encode some of its training data; I'm saying it's not just encoding its training data. It probably does "memorize" a bunch of trivial facts from its training data and regurgitate them when asked. I'm saying that's not all they are and that's not what makes them intelligent; their ability to also answer questions outside their training data is.

[–] knightly@pawb.social 1 points 11 months ago (1 children)

But they don't "answer questions", they just respond to prompts. You can't use them to learn anything without checking their responses against authoritative sources you should have used in the first place.

There's no intelligence there, just a plagiarism laundromat and some rules for formatting text like a 7th grader.

[–] Not_mikey@lemmy.world 0 points 11 months ago (1 children)

It can answer questions as well as any person. Just because you may need to check with another source doesn't mean it didn't answer the question; it just means you can't fully trust it. If I ask someone who the fourth U.S. president was and they say Jefferson, they still answered the question, they just answered it wrong. You also don't always have to check with another source if the answer sounds right, the same as when you ask a person a question. If that person answered Madison and I faintly recalled it and thought it sounded right, I would probably not check their answer and would take it as fact.

For example, I asked ChatGPT for a chocolate chip cookie recipe once. I make cookies pretty often, so I would know if the recipe seemed off, but the one it provided seemed good. I followed it and made some pretty good cookies. It answered the question correctly, as shown by the cookies. You could argue it plagiarized, but while the ingredients and steps were pretty close to some I found later, none were a perfect match, which is about as good as you can get with recipes, which tend to converge on the same thing. The only real difference between most of them is the dumb story they give at the beginning, which thankfully ChatGPT doesn't do.

The 7th grader and plagiarism comments make me think you haven't played with them much or really tested them. I have had it write contracts, one of which I had reviewed by a lawyer who only had some small comments, as well as other letters and documents I needed for my mortgage and buying a home. All of these were looked over by professionals and none of them realized it was a bot. None of them were plagiarized either, because the parameters I gave it and the output it created were way too unique to be in its training set.

[–] knightly@pawb.social 1 points 11 months ago (1 children)

It can answer questions as well as any person.

The 7th grader and plagiarism comment make me think you haven't played with them much or really tested them.

Of course I have, my employer has me shoehorning ChatGPT into everything, and I agree with what the research says: Children can answer questions better than LLMs can.

https://techxplore.com/news/2023-12-artificial-intelligence-excel-imitation.html

Stochastic plagiarism is still plagiarism.

[–] Not_mikey@lemmy.world 0 points 11 months ago

That study is like giving a written test to an illiterate adult, seeing them do worse than a child and saying they aren't intelligent or innovative. Like I said earlier, intelligence is multi-faceted, and ChatGPT excels at rhetorical, conversational and other types of written intelligence. It does not, as that study shows, do well in spatial manipulation, but that doesn't mean it's not intelligent. If you gave that same test to a paralyzed blind person with little to no concept of spatial reality, they'd probably do just as badly. If you asked them to compose a short story or an essay, they might be good at it because that's where their capabilities lie. That short story could still be innovative in its composition and characters, and could be way better than anything a child wrote.

You have to measure different types of intelligence with different tests. If you asked ChatGPT and a set of adults and children to write a short story about a wholly new subject, ChatGPT would beat most of the children and probably some of the adults.

And if that short story is about a new subject matter completely out of its training set, what or who is it plagiarizing from? You could say it's taking common tropes, themes and story elements from other stories, but that's fundamentally what a lot of writing and culture is. If that's plagiarism, then you should be more worried about the Marvel franchise, as it's a plagiarism machine that has way more cultural impact.