I’ve been saying this for about a year since seeing the Othello GPT research, but it’s nice to see more minds changing as the research builds up.
Edit: Because people aren’t actually reading and just commenting based on the headline, a relevant part of the article:
New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.
This theoretical approach, which provides a mathematically provable argument for how and why an LLM can develop so many abilities, has convinced experts like Hinton, and others. And when Arora and his team tested some of its predictions, they found that these models behaved almost exactly as expected. From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.
“[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”
The definition of understanding they use is very shallow compared to how most would define it. Failing to complete a task consistently when the numbers are changed, even when they don’t affect the answer, shows a lack of real understanding to most. Asking a model the sheet-drying question, for example, will give different results depending on what numbers you use. Better models are better at generalizing but are still far from demonstrating what most consider to be real understanding.
Is there a difference between being a “stochastic parrot” and understanding text? No matter what you call it, an LLM will always produce the same output with the same input if it is at the same state.
An LLM will never say “I don’t know” unless it’s been trained to say “I don’t know”; it doesn’t have the concept of understanding. And so I lean toward calling it a “stochastic parrot”, although I think there are some interesting philosophical exercises you could do on whether humans are much different and whether understanding is just an illusion.
No matter what you call it, an LLM will always produce the same output with the same input if it is at the same state.
How do you know a human wouldn’t do the same? We lack the ability to perform the experiment.
An LLM will never say “I don’t know” unless it’s been trained to say “I don’t know”
Also a very human behaviour, in my experience.
No matter what you call it, an LLM will always produce the same output with the same input if it is at the same state.
You might want to look up the definition of ‘stochastic.’
They’re not wrong. Randomness in computing is usually what we call “pseudo-random”: it is deterministic provided that you start from the same state or “seed”.
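To make that concrete, here is a minimal Python sketch. The vocabulary and weights are made up for illustration and are not taken from any real model; the only point is that a seeded pseudo-random generator reproduces the same "random" samples every time it starts from the same state.

```python
import random

# Hypothetical toy vocabulary and next-token weights (illustrative only,
# not taken from any real LLM).
VOCAB = ["yes", "no", "maybe", "I don't know"]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def sample_tokens(seed, n=5):
    """Draw n tokens using a PRNG initialised from the given seed."""
    rng = random.Random(seed)  # separate generator with its own internal state
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)


first = sample_tokens(seed=42)
second = sample_tokens(seed=42)  # same seed, so same starting state

# The sampling is "stochastic", yet both runs produce the same tokens.
print(first == second)  # True
```

So "stochastic" and "same output for the same input and state" are not in conflict here: the sampling is random with respect to the seed, and deterministic once the seed is fixed.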
Stupid, LLMs do not create new relationships to words that don’t exist.
This is all just fluff to make them seem more like AGI, which they never will be.
Why would that be required for understanding? Presumably during training it would have made connections between the words it saw. Now that training has stopped, it hasn’t just lost those connections. Sure, it can’t make new connections, but why is that important for using the connections it already has?
If we can’t really define “understand” in a way that meaningfully captures the concept of consciousness (also undefined), we definitely can’t say a chat bot “understands.”
They can parse a chat and come up with a likely response that humans find applicable.
Here’s an example of what they were looking at:
Human question: Please give me a couple lines of text that illustrate all of the following language understanding skills: Anaphora resolution, simple logical reasoning, simple understanding of physics, and understanding of sentiment.
Model response: Sure! Here’s a couple of lines that illustrate the mentioned language understanding skills:
"As the sun set, casting a warm glow over the horizon, John watched in awe. He couldn’t help but feel a sense of tranquility wash over him, as if all his worries had melted away with the fading daylight."
Explanation:
• Anaphora resolution: The pronoun "he" in the second sentence refers back to the antecedent "John" in the first sentence, resolving the anaphoric reference.
• Simple logical reasoning: The logical reasoning here is based on the cause and effect relationship between the setting sun and John’s feeling of tranquility. The implied logical connection suggests that the calming effect of the sunset is responsible for John’s serene state of mind.
• Simple understanding of physics: The mention of the sun setting and casting a warm glow over the horizon reflects a basic understanding of physics, as it relates to the natural phenomena of the Earth’s rotation and the resulting visual effect.
• Understanding of sentiment: The phrase "a sense of tranquility" and the description of John’s worries melting away indicate a positive sentiment associated with the experience. The mention of John feeling awe and a sense of tranquility further emphasizes a positive emotional state.
Human question: Okay. Can you change the above example to introduce the element of "hip-hop".
Model response: Certainly! Here’s an updated example that incorporates the element of "hip-hop":
"As the sun set, casting a warm glow over the city skyline, the rhythmic beats of hip-hop music filled the air. John stood mesmerized, his body instinctively moving to the infectious grooves. He couldn’t help but feel a surge of energy and joy, as if the music had unlocked a vibrant world within him."
Explanation:
• Anaphora resolution: The pronoun "he" in the second sentence still refers back to the antecedent "John" in the first sentence, resolving the anaphoric reference.
• Simple logical reasoning: The logical connection here lies in the cause and effect relationship between the hip-hop music and John’s feeling of energy and joy. The music’s rhythmic beats and infectious grooves are responsible for his instinctive movement and the surge of positive emotions he experiences.
• Simple understanding of physics: Although the example doesn’t directly mention physics, the phrase "As the sun set, casting a warm glow over the city skyline" still implies the natural phenomena of the sun’s descent and the resulting illumination of the environment.
• Understanding of sentiment: The introduction of hip-hop music adds a vibrant and energetic sentiment to the example. John’s feeling of energy, joy, and being mesmerized by the music reflects a positive sentiment associated with the hip-hop genre.
Edit: Downvotes for citing the appendix of the paper the article was about? Ok, Lemmy
You’re being downvoted because you provide no tangible evidence for your opinion that human consciousness can be reduced to a graph that can be modelled by a neural network.
Additionally, you don’t seem to respond to any of the replies you receive in good faith, and you reach for anecdotal evidence wherever possible.
I also personally don’t like the appeal to authority permeating your posts. Just because someone who wants to secure more funding for their research has put out a blog post, it doesn’t make it true in any scientific sense.
human consciousness
Wtf are you talking about? The article is about whether or not models can understand text. Not about whether they embody consciousness.
Just because someone who wants to secure more funding for their research has put out a blog post, it doesn’t make it true in any scientific sense.
Again, wtf are you going on about? Hinton was the only appeal to authority I made in comments here and I only referred to him quitting his job to whistleblow. And it’s not like he needs any attention to justify research if he wanted to.
Understanding, as most people know it, implies some kind of consciousness or sentience, as others have alluded to here.
It’s the whole point of your post.
You are reading made up strawmen into the topic.
The article defines the scope of the discussion straight up:
The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.
The question is whether or not LLMs have a grasp of the training material such that they can produce new and novel concepts outside what was in the training data itself.
Not whether the LLM is sentient or conscious - both characterizations I’d strongly dispute.
Wikipedia has a useful distillation of the definition of understanding relevant to the above:
process related to an abstract or physical object, such as a person, situation, or message whereby one is able to use concepts to model that object
No I’m not.
You’re nearly there… The word “understanding” is the core premise of what the article claims to have found. If not for that, then the “research” doesn’t really amount to much.
As has been mentioned, this then becomes a semantic/philosophical debate about what “understanding” actually means and a short Wikipedia or dictionary definition does not capture that discussion.
Ah, I see. AKA “Tell me you didn’t read the article and just read the headline without telling me.”