this post was submitted on 28 Jun 2024

926 points (98.6% liked)

I wish I was as bold as these authors. (discuss.tchncs.de)

submitted 1 year ago by jupyter_rain@discuss.tchncs.de to c/science_memes@mander.xyz

[–] Tar_alcaran@sh.itjust.works 33 points 1 year ago (3 children)

LLMs work differently, statistically predicting the next token (roughly equivalent to a word) based on all those that came before it, and parameters finetuned during training.

Which is what a parrot does.
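For anyone following along, "statistically predicting the next token" can be illustrated with a toy bigram counter. This is only a sketch under heavy assumptions: a real LLM conditions on the whole preceding context through a trained neural network, not on a lookup table of the previous word, and the names `train_bigram`, `next_token_probs`, and the tiny corpus are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each token, how often each other token follows it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability distribution."""
    followers = counts[prev]
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}

# Hypothetical toy corpus; whitespace splitting stands in for a real tokenizer.
corpus = "two plus two is four and two plus three is five".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "two")
print(probs)
```

In this corpus "two" is followed by "plus" twice and by "is" once, so the toy model gives "plus" the higher probability purely from counts.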

[–] naevaTheRat@lemmy.dbzer0.com 22 points 1 year ago (1 children)

Yeah this is the exact criticism. They recombine language pieces without really doing language. The end result looks like language, but it lacks any of the important characteristics of language such as meaning and intention.

If I say "Two plus two is four" I am communicating my belief about mathematics.

If an llm emits "two plus two is four" it is outputting a stochastically selected series of tokens linked by probabilities derived from training data. If the statement is true or false then that is accidental.

Hence, stochastic parrot.

[+] ignotum@lemmy.world -17 points 1 year ago (2 children)

If I train an LLM to do math, for the training data I generate a+b=c statements, never showing it the same one twice.

It would be pointless for it to "memorize" every single question and answer it gets since it would never see that question again. The only way it would be able to generate correct answers would be if it gained a concept of what numbers are, and how the add operation operates on them to create a new number.
Rather than memorizing and parroting it would have to actually understand it in order to generate responses.

It's called generalization; it's why large amounts of data are required (if you show the same data again and again, then memorizing becomes a viable strategy).
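A rough sketch of the data setup described here, assuming a hypothetical helper `make_addition_examples` and an arbitrary 900/100 split: every a+b=c statement is generated at most once, and a held-out slice is kept aside so that any model trained on the rest is scored only on sums it has never seen. It does not train anything; it only shows the "never the same one twice" part.

```python
import random

def make_addition_examples(n, max_operand=999, seed=0):
    """Generate n distinct 'a+b=c' strings, never repeating an (a, b) pair."""
    if n > (max_operand + 1) ** 2:
        raise ValueError("not enough distinct operand pairs for n examples")
    rng = random.Random(seed)
    seen = set()
    examples = []
    while len(examples) < n:
        a, b = rng.randint(0, max_operand), rng.randint(0, max_operand)
        if (a, b) in seen:
            continue  # skip repeats so no statement is shown twice
        seen.add((a, b))
        examples.append(f"{a}+{b}={a + b}")
    return examples

examples = make_addition_examples(1000)
# Assumed split: train on the first 900, evaluate on 100 unseen statements.
train, held_out = examples[:900], examples[900:]
```

Whether a model trained this way actually generalizes is exactly what accuracy on `held_out` would measure; memorizing `train` cannot help there.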

If I say "Two plus two is four" I am communicating my belief about mathematics.

Seems like a pointless distinction, you were told it so you believe it to be the case? Why can't we say the LLM outputs what it believes is the correct answer? You're both just making some statement based on your prior experiences which may or may not be true

[–] naevaTheRat@lemmy.dbzer0.com 17 points 1 year ago (2 children)

You're arguing against a position I didn't put forward. Also

Seems like a pointless distinction, you were told it so you believe it to be the case? Why can't we say the LLM outputs what it believes is the correct answer? You're both just making some statement based on your prior experiences which may or may not be true

This is what excessive reduction does to a mfer. That is just such a hysterically absurd take.

[–] artichokecustard@lemmy.world 8 points 1 year ago (1 children)

but, the LLM has faith!

[–] naevaTheRat@lemmy.dbzer0.com 10 points 1 year ago (1 children)

I'm a curmudgeonly physics nerd, it's very strange being on the side of a debate going "No now come on, that's way too reductive"

[–] yuri@pawb.social 2 points 1 year ago

That just means you’re better equipped when it comes up lmao

[+] ignotum@lemmy.world -6 points 1 year ago (2 children)

The AI builds some kind of model of the world in order to better understand the input and improve the output prediction.

You have some mental model of how maths works, which you have built up through school and other experiences in your life.

You are both given a maths problem, and you both give an answer based on your understanding of mathematics.

[–] naevaTheRat@lemmy.dbzer0.com 7 points 1 year ago* (last edited 1 year ago)

The algorithm assigns weights to nodes in a neural network. These weights are derived by statistical association of tokens in the training data after they have been cleaned.

That is so enormously far from how we think humans learn (you don't teach a kid to understand theory of mind by plopping them in front of the Gutenberg project and saying good luck, and yet they learn to explain theory of mind problems all the same) that it is just comically farcical to assume something similar is happening underneath.

It is very interesting that llms are able to appear to be conversational but claiming they have some sort of mind with an understanding of maths is as ridiculous as suggesting a chess bot understands the Pauli exclusion principle because it never moves two pieces into the same physical space.

[–] yuri@pawb.social 7 points 1 year ago (1 children)

You’ve been speaking with your chest this whole time and now that we’re into the nitty gritty you really just said “The ai does... something!” It’s so general a description that by your measure automated thermostats are engaging in human reasoning when they make it a little bit cooler on a hot day.

You might’ve been oversimplifying on purpose. I just can’t help but think you have no idea how LLMs work outside of this inherently flawed comparison to human thought.

[–] Hackworth@lemmy.world 1 points 1 year ago (1 children)

Not OP, but speaking from a fairly deep layman understanding of how LLMs work - all anyone really knows is that capabilities of fundamentally higher orders (like deception, which requires theory of mind) emerged by simply training larger networks. Since we don't have a great understanding of how our own intelligence emerges from our wetware, we're only guessing.

[–] yuri@pawb.social 4 points 1 year ago (1 children)

Something that looks like higher order reasoning emerged from training larger networks. At the end of the day it’s still just spicy autocomplete. Theoretically you could give it a large enough dataset to “predict” almost anything with really high accuracy, but all it’s doing is pattern recognition. One could argue that that’s all humans do, but that’s getting more into philosophy and skipping a lot of nuance.

I’m not like, trying to argue with you by the way. Just having a fun time with this line of thought ^^

[–] Hackworth@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

What makes the "spicy autocomplete" perspective incomplete is also what makes LLMs work. The "Attention is All You Need" paper that introduced attention transformers describes a type of self-awareness necessary to predict the next word. In the process of writing the next word of an essay, it navigates a 22,000-dimensional semantic space. And the similarity to the way humans experience language is more than philosophical - the advancements in LLMs have sparked a bunch of new research in neurology.
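For reference, the mechanism that paper names is self-attention, computed as scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V. Below is a minimal pure-Python sketch of that formula for tiny inputs. The function names and the two-dimensional example vectors are assumptions for illustration; real models use learned projections, many heads, and far larger dimensions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Hypothetical toy inputs: one query that matches the first key more closely.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
out = attention(queries, keys, values)
```

Here the query lines up with the first key, so the output leans toward the first value vector. That weighting of context by relevance is all the sketch shows; it makes no claim either way about awareness.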

[–] kogasa@programming.dev 9 points 1 year ago (1 children)

If you fine tune a LLM on math equations, odds are it won't actually learn how to reliably solve novel problems. Just the same as it won't become a subject matter expert on any topic, but it's a lot harder to write simple math that "looks, but is not, correct" than it is to waffle vaguely about a topic. The idea of a LLM creating a robust model of the semantics of the text it's trained on is, at face value, plausible; it just doesn't seem to actually happen in practice.

[–] ignotum@lemmy.world -3 points 1 year ago (1 children)

Prompt:

What is 183649+72961?

ChatGPT:

The sum of 183649 and 72961 is 256610.

It's trained to generate what is most plausible, but with math, the only plausible response is the correct answer (assuming it has been trained on data where that has been the case)
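The quoted sum does check out. As a sketch of what "how the add operation operates" means at the digit level, here is schoolbook addition with carries over decimal strings. The function name `add_digit_strings` is made up for this example, and nothing here says how ChatGPT arrived at its answer.

```python
def add_digit_strings(a: str, b: str) -> str:
    """Add two non-negative decimal strings digit by digit, carrying as needed."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with zeros
    carry = 0
    digits = []
    # Walk from the least significant digit to the most significant one.
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

result = add_digit_strings("183649", "72961")
print(result)  # 256610
```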

[–] kogasa@programming.dev 4 points 1 year ago (1 children)

ChatGPT uses auxiliary models to perform certain tasks like basic math and programming. Your explanation about plausibility is simply wrong.

[–] ignotum@lemmy.world -2 points 1 year ago (1 children)

It has access to a python interpreter and can use that to do math, but it shows you that this is happening, and it did not when i asked it.

I asked it to do another operation, this time specifying i wanted it to use an external tool, and it did

You have access to a dictionary, that doesn't prove you're incapable of spelling simple words on your own, like goddamn people what's with the hate boners for ai around here

[–] kogasa@programming.dev 4 points 1 year ago

It has access to a python interpreter and can use that to do math, but it shows you that this is happening, and it did not when i asked it.

That's not what I meant.

You have access to a dictionary, that doesn’t prove you’re incapable of spelling simple words on your own, like goddamn people what’s with the hate boners for ai around here

??? You just don't understand the difference between a LLM and a chat application using many different tools.

[–] doubtingtammy@lemmy.ml 4 points 1 year ago

This is parrot libel