If we were talking about humans trying to predict next word, that would be true.
There is no reason to suppose that an LLM is doing anything other than deep pattern prediction pursuant to, and no better than needed for, next word prediction.
There is plenty of reason. This article is just one example of many. People bring it up because LLMs routinely do things we would call reasoning if we saw them in other humans. Brushing it off as 'deep pattern prediction' is genuinely meaningless. Nobody who uses that phrase in that way can actually explain what they are talking about in a way that can be falsified. It's just vibes - an unfalsifiable conversation-stopper, not a real explanation. You can replace "pattern matching" with "magic" and the argument is identical, because the phrase isn't actually doing any work.
A - A force is required to lift a ball
B - I see Human-N lifting a ball
C - Obviously, Human-N cannot produce forces
D - Forces are not required to lift a ball
Well sir, why are you so sure Human-N cannot produce forces? How is she lifting the ball? Well, of course, Human-N is just using s̶t̶a̶t̶i̶s̶t̶i̶c̶s̶ magic.
The first, obvious, point is that LLMs are trained to auto-regressively predict human training samples (i.e. essentially to copy them, without overfitting), so OF COURSE they are going to sound like the training set - intelligent, reasoning, understanding, etc. The mistake is to anthropomorphize the model because it sounds human, and to attribute the understanding to the model itself rather than recognizing it as a reflection of the mental abilities of the humans who wrote the training data.
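To make the objective concrete, here's a toy sketch of what "auto-regressively predict the training samples" means. A bigram count model stands in for the transformer (obviously a huge simplification), but the quantity being minimized - average next-token negative log-likelihood over human-written text - has the same shape:

```python
import math
from collections import defaultdict

# Toy stand-in for pre-training: the only objective is to assign high
# probability to the next token of human-written text.
corpus = "the cat sat on the mat".split()

# "Train": count how often each word follows each word.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_probs(prev):
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

def loss(tokens):
    # Average negative log-likelihood of each next token - the quantity
    # gradient descent pushes down during pre-training.
    nll = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        nll += -math.log(next_token_probs(prev).get(nxt, 1e-9))
    return nll / (len(tokens) - 1)

print(next_token_probs("the"))  # {'cat': 0.5, 'mat': 0.5}
print(round(loss(corpus), 3))   # 0.277
```

Nothing in this objective mentions "sounding intelligent" - that property comes along for the ride because the text being predicted was written by intelligent humans.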
The second point is perhaps a bit more subtle, and is about the nature of understanding and the differences between what an LLM is predicting and what the human cortex - also a prediction machine - is predicting...
When humans predict, what we're predicting is something external to ourselves - the real world. We observe, over time we see regularities, and from this we predict that we'll continue to see those regularities. Our predictions include our own actions as an input - how will the external world react to our actions - and therefore we learn how to act.
Understanding something means being able to predict how it will behave, both left alone, and in interaction with other objects/agents, including ourselves. Being able to predict what something will do if you poke it is essentially what it means to understand it.
What an LLM is predicting is not the external world and how it reacts to the LLM's actions, since it is auto-regressively trained - it is only predicting a continuation of its own output (actions) based on its own immediately preceding output (actions)! The LLM therefore itself understands nothing, since it has no grounding for what it is "talking about", or for how the external world behaves in reaction to its own actions.
The LLM's appearance of "understanding" comes solely from the fact that it is mimicking the training data, which was generated by humans who do have agency in the world and understanding of it. But the LLM has no visibility into the generative process of the human mind - only into the artifacts (words) it produces - so the LLM is doomed to operate in a world of words, where all it might be considered to "understand" is its own auto-regressive generative process.
You’re restating two claims that sound intuitive but don’t actually hold up when examined:
1. “LLMs just mimic the training set, so sounding like they understand doesn’t imply understanding.”
This is the magic argument reskinned. Transformers aren’t copying strings, they’re constructing latent representations that capture relationships, abstractions, and causal structure because doing so reduces loss. We know this not by philosophy, but because mechanistic interpretability has repeatedly uncovered internal circuits representing world states, physics, game dynamics, logic operators, and agent modeling. “It’s just next-token prediction” does not prevent any of that from occurring. When an LLM performs multi-step reasoning, corrects its own mistakes, or solves novel problems not seen in training, calling the behavior “mimicry” explains nothing. It’s essentially saying “the model can do it, but not for the reasons we’d accept,” without specifying what evidence would ever convince you otherwise. Imaginary distinction.
2. “Humans predict the world, but LLMs only predict text, so humans understand but LLMs don’t.”
This is a distinction without the force you think it has. Humans also learn from sensory streams over which they have no privileged insight into the generative process. Humans do not know the “real world”; they learn patterns in their sensory data. The fact that the data stream for LLMs consists of text rather than photons doesn’t negate the emergence of internal models. An internal model of how text-described worlds behave is still a model of the world.
If your standard for “understanding” is “being able to successfully predict consequences within some domain,” then LLMs meet that standard, just in the domains they were trained on, and today's state of the art is trained on more than just text.
You conclude that “therefore the LLM understands nothing.” But that’s an all-or-nothing claim that doesn’t follow from your premises. A lack of sensorimotor grounding limits what kinds of understanding the system can acquire; it does not eliminate all possible forms of understanding.
Wouldn't the birds that have the ability to navigate from the earth's magnetic field soon say humans have no understanding of electromagnetism ? They get trained on sensorimotor data humans will never be able to train on. If you think humans have access to the "real world" then think again. They have a tiny, extremely filtered slice of it.
Saying “it understands nothing because autoregression” is just another unfalsifiable claim dressed as an explanation.
> This is the magic argument reskinned. Transformers aren’t copying strings, they’re constructing latent representations that capture relationships, abstractions, and causal structure because doing so reduces loss.
Sure (to the second part), but the latent representations aren't the same as a human's. The human's world - the one they have experience with, and therefore representations of - is the real world. The LLM's world - the one it has experience with, and therefore representations of - is the world of words.
Of course an LLM isn't literally copying - it has learnt a sequence of layer-wise next-token predictions/generations (copying of partial embeddings to next token via induction heads etc), with each layer having learnt what patterns in the layer below it needs to attend to, to minimize prediction error at that layer. You can characterize these patterns (latent representations) in various ways, but at the end of the day they are derived from the world of words it is trained on, and are only going to be as good/abstract as next token error minimization allows. These patterns/latent representations (the "world model" of the LLM if you like) are going to be language-based (incl language-based generalizations), not the same as the unseen world model of the humans who generated that language, whose world model describes something completely different - predictions of sensory inputs and causal responses.
So, yes, there is plenty of depth and nuance to the internal representations of an LLM, but no logical reason to think that the "world model" of an LLM is similar to the "world model" of a human since they live in different worlds, and any "understanding" the LLM itself can be considered as having is going to be based on its own world model.
> Saying “it understands nothing because autoregression” is just another unfalsifiable claim dressed as an explanation.
I disagree. It comes down to how you define understanding. A human understands (correctly predicts) how the real world behaves, and the effect its own actions will have on the real world. This is what the human is predicting.
What an LLM is predicting is effectively "what will I say next" after "the cat sat on the". The human might see a cat and, based on circumstances and experience of cats, predict that the cat will sit on the mat. This is because the human understands cats. The LLM may predict the next word as "mat", but this does not reflect any understanding of cats - it is just a statistical word prediction based on the word sequences it was trained on, notwithstanding that this prediction is based on the LLM's world-of-words model.
>So, yes, there is plenty of depth and nuance to the internal representations of an LLM, but no logical reason to think that the "world model" of an LLM is similar to the "world model" of a human since they live in different worlds, and any "understanding" the LLM itself can be considered as having is going to be based on it's own world model.
So LLMs and Humans are different and have different sensory inputs. So what ? This is all animals. You think dolphins and orcas are not intelligent and don't understand things ?
>What an LLM is predicting is effectively "what will I say next" after "the cat sat on the". The human might see a cat and based on circumstances and experience of cats predict that the cat will sit on the mat.
Genuinely don't understand how you can actually believe this. A human who predicts "mat" does so because of the popular phrase. That's it. There is no reason to predict it over the numerous things cats regularly sit on, often much more than mats (if you even have one). It's not because of any super special understanding of cats. You are doing the same thing the LLM is doing here.
Orca and human brains are similar, in the sense that we have a common ancestor if you look back far enough, but they are still very different and focus on entirely different slices of reality and input than humans ever will. It's not something you can brush off if you really believe in input supremacy so much.
From the orca's perspective, many of the things we say we understand are similarly '2nd hand hearsay'.
To follow your hypothetical, if an Orca were to be exposed to human language, discussing human terrestrial affairs, and were able to at least learn some of the patterns, and maybe predict them, then it should indeed be considered not to have any understanding of what that stream of words meant - I wouldn't even elevate it to '2nd hand hearsay'.
Still, the Orca, unlike an LLM, does at least have a brain, does live in and interact with the real world, and could probably be said to "understand" things in its own watery habitat as well as we do.
Regarding "input supremacy" :
It's not the LLM's "world of words" that really sets it apart from animals/humans, since there are also multi-modal LLMs with audio and visual inputs more similar to a human's sensory inputs. The real difference is what they are doing with those inputs. The LLM is just a passive observer, whose training consisted of learning patterns in its inputs. A human/animal is an active agent, interacting with the world, and thereby causing changes in the input data it is then consuming. The human/animal is learning how to DO things, and gaining understanding of how the world reacts. The LLM is learning how to COPY things.
There are of course many other differences between LLMs/Transformers and animal brains, but even if we were to eliminate all these differences the active vs passive one would still be critical.
If you ask a human to complete the phrase "the cat sat on the", they will probably answer "mat". This is memorization, not understanding. The LLM can do this too.
If you just input "the cat sat on the" to an LLM, it will also likely just answer "mat" since this is what LLMs do - they are next-word input continuers.
If you said "the sat sat on the" to a human, they would probably respond "huh?" or "who the hell knows!", since the human understands that cats are fickle creatures and that partial sentences are not the conversational norm.
If you ask an LLM to explain its understanding of cats, it will happily reply, but the output will not be its own understanding of cats - it will be parroting some human opinion(s) it got from the training set. It has no first hand understanding, only 2nd hand hearsay.
>If you said "the sat sat on the" to a human, they would probably respond "huh?" or "who the hell knows!", since the human understands that cats are fickle creatures and that partial sentences are not the conversational norm.
I'm not sure what you're getting at here ? You think LLMs don't similarly answer 'What are you trying to say?'. Sometimes I wonder if the people who propose these gotcha questions ever bother to actually test them on said LLMs.
>If you ask an LLM to explain its understanding of cats, it will happily reply, but the output will not be its own understanding of cats - it will be parroting some human opinion(s) it got from the training set. It has no first hand understanding, only 2nd hand hearsay.
Again, you're not making the distinction you think you are. Understanding from '2nd hand hearsay' is still understanding. The vast majority of what humans learn in school is such.
> Sometimes I wonder if the people who propose these gotcha questions ever bother to actually test them on said LLMs
Since you asked, yes, Claude responds "mat", then asks if I want it to "continue the story".
Of course if you know anything about LLMs you should realize that they are just input continuers, and any conversational skills come from post training. To an LLM a question is just an input whose human-preferred (as well as statistically most likely) continuation is a corresponding answer.
I'm not sure why you regard this as a "gotcha" question. If you're expressing opinions on LLMs, then table stakes should be to have a basic understanding of LLMs - what they are internally, how they work, and how they are trained, etc. If you find a description of LLMs as input-continuers in the least bit contentious then I'm sorry to say you completely fail to understand them - this is literally what they are trained to do. The only thing they are trained to do.
>Of course if you know anything about LLMs you should realize that they are just input continuers, and any conversational skills comes from post training.
No, they don't. Post-training makes things easier, more accessible and consistent but conversation skills are in pre-trained LLMs just fine. Append a small transcript to the start of the prompt and you would have the same effect.
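This is easy to illustrate: a base model is just continuing a string, so if the string is shaped like a conversation, the statistically likely continuation is an answer-shaped reply. A minimal sketch of what such a transcript prefix looks like (the transcript content and helper name are invented for illustration, not any model's real chat format):

```python
# Invented example transcript prepended to the prompt. A base model asked
# to continue the resulting string tends to produce an answer-shaped
# reply, because that's the likely continuation of the pattern - no chat
# fine-tuning required.
few_shot = [
    ("What is the capital of France?", "Paris."),
    ("And of Italy?", "Rome."),
]

def as_chat_prompt(history, user_msg):
    """Shape the input so 'a conversational reply' is the most likely
    continuation for a pure next-token continuer."""
    lines = []
    for q, a in history:
        lines.append(f"User: {q}")
        lines.append(f"Assistant: {a}")
    lines.append(f"User: {user_msg}")
    lines.append("Assistant:")  # the model continues from here
    return "\n".join(lines)

print(as_chat_prompt(few_shot, "the cat sat on the"))
```

Post-training bakes this behavior in so you don't have to supply the transcript yourself, but the capability is already latent in the base model.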
>I'm not sure why you regard this as a "gotcha" question. If you're expressing opinions on LLMs, then table stakes should be to have a basic understanding of LLMs - what they are internally, how they work, and how they are trained, etc.
You proposed a distinction and explained a situation which would make that distinction falsifiable. And I simply told you LLMs don't respond the way you claim they would. Even when models respond mat (Now I think your original point had a typo?), it is clearly not due to a lack of understanding of what normal sentences are like.
>If you find a description of LLMs as input-continuers in the least bit contentious then I'm sorry to say you completely fail to understand them - this is literally what they are trained to do. The only thing they are trained to do.
They are predictors. If the training data is solely text then the output will be more text, but that need not be the case. Words can go in while Images or actions or audio may come out. In that sense, humans are also 'input continuers'.
>Yeah - you might want to check what you actually typed there.
That's what you typed in your comment. Go check. I just figured it was intentional since surprise is the first thing you expect humans to show in response to it.
>Not sure what you're trying to prove by doing it yourself though. Have you heard of random sampling? Never mind ...
I guess you fancy yourself a genius who knows all about LLMs now, but sampling wouldn't matter here. Your whole point was that it happens because of a fundamental limitation on the part of LLMs that makes them unable to do it. Even one contrary response, never mind multiple, would be enough. After all, some humans would simply say 'mat'.
Anyway, it doesn't really matter. Completing 'mat' doesn't have anything to do with a lack of understanding. It's just the default 'assumption' that it's a completion that is being sought.
If you want to say human and LLM intelligence are both 'deep pattern prediction' then sure, but mostly and certainly in the case I was replying to, people often just use it as a means to make an imaginary unfalsifiable distinction between what LLMs do and what the super special humans do.
One problem for your argument is that transformer networks are not, and weren't meant to be, calculators. Their raw numerical calculating abilities are shaky when you don't let them use external tools, but they are also entirely emergent. It turns out that language doesn't just describe logic, it encodes it. Nobody expected that.
To see another problem with your argument, find someone with weak reasoning abilities who is willing to be a test subject. Give them a calculator -- hell, give them a copy of Mathematica -- and send them to IMO, and see how that works out for them.
Well being able to extrapolate solutions to "novel" mathematical exercises based on a very large sample of similar tasks in your dataset seems like a reasonable explanation.
Question is how well it would do if it was trained without those samples?
Gee, I don't know. How would you do at a math competition if you weren't trained with math books? Sample problems and solutions are not sufficient unless you can genuinely apply human-level inductive and deductive reasoning to them. If you don't understand that and agree with it, I don't see a way forward here.
A more interesting question is, how would you do at a math competition if you were taught to read, then left alone in your room with a bunch of math books? You wouldn't get very far at a competition like IMO, calculator or no calculator, unless you happen to be some kind of prodigy at the level of von Neumann or Ramanujan.
> A more interesting question is, how would you do at a math competition if you were taught to read, then left alone in your room with a bunch of math books?
But that isn't how an LLM learnt to solve math olympiad problems. This isn't a base model just trained on a bunch of math books.
The way they get LLMs to be good at specialized things like math olympiad problems is to custom train them for this using reinforcement learning - they give the LLM lots of examples of similar math problems being solved, showing all the individual solution steps, and train on these, rewarding the model when (due to having selected an appropriate sequence of solution steps) it is able itself to correctly solve the problem.
So, it's not a matter of the LLM reading a bunch of math books and then being expert at math reasoning and problem solving, but more along the lines of "monkey see, monkey do". The LLM was explicitly shown how to step by step solve these problems, then trained extensively until it got it and was able to do it itself. It's probably a reflection of the self-contained and logical nature of math that this works - that the LLM can be trained on one group of problems and the generalizations it has learnt work on unseen problems.
The dream is to be able to teach LLMs to reason more generally, but the reasons this works for math don't generally apply, so it's not clear that this math success can be used to predict future LLM advances in general reasoning.
> The dream is to be able to teach LLMs to reason more generally, but the reasons this works for math don't generally apply
Why is that? Any suggestions for further reading that justifies this point?
Ultimately, reinforcement learning is still just a matter of shoveling in more text. Would RL work on humans? Why or why not? How similar is it to what kids are exposed to in school?
An important difference between reinforcement learning (RL) and pre-training is the error feedback that is given. For pre-training the error feedback is just next token prediction error. For RL you need to have a goal in mind (e.g. successfully solving math problems) and the training feedback that is given is the RL "reward" - a measure of how well the model output achieved the goal.
With RL used for LLMs, it's the whole LLM response that is being judged and rewarded (not just the next word), so you might give it a math problem and ask it to solve it, then when it was finished you take the generated answer and check if it is correct or not, and this reward feedback is what allows the RL algorithm to learn to do better.
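A toy sketch of the contrast described above (not any real RL library's code): the RL signal is a single scalar per whole response, produced by an automatic check. `verifiable_reward` here is a hypothetical checker that just compares the last number in the output against the known answer - the kind of cheap, automatic verification that exists for math but not for open-ended questions:

```python
import re

def verifiable_reward(response: str, correct_answer: str) -> float:
    """Hypothetical RL reward for math problems: 1.0 if the last number
    in the response matches the known answer, else 0.0. One scalar judges
    the entire multi-step response."""
    numbers = re.findall(r"-?\d+", response)
    return 1.0 if numbers and numbers[-1] == correct_answer else 0.0

# Unlike pre-training (per-token error signal), an RL update then adjusts
# the whole sampled response in proportion to this single scalar.
print(verifiable_reward("2 + 2: carry nothing... so the answer is 4", "4"))  # 1.0
print(verifiable_reward("I think it's 5", "4"))                              # 0.0
```

Note how little the reward knows about the reasoning: it only scores the final answer, which is exactly why this recipe needs domains where final answers are checkable.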
There are at least two problems with trying to use RL as a way to improve LLM reasoning in the general case.
1) Unlike math (and also programming) it is not easy to automatically check the solution to most general reasoning problems. With a math problem asking for a numerical answer, you can just check against the known answer, or for a programming task you can just check if the program compiles and the output is correct. In contrast, how do you check the answer to more general problems such as "Should NATO expand to include Ukraine?"?! If you can't define a reward then you can't use RL. People have tried using "LLM as judge" to provide rewards in cases like this (give the LLM response to another LLM, and ask it if it thinks the goal was met), but apparently this does not work very well.
2) Even if you could provide rewards for more general reasoning problems, and therefore were able to use RL to train the LLM to generate good solutions for those training examples, this is not very useful unless the reasoning it has learnt generalizes to other problems it was not trained on. In narrow logical domains like math and programming this evidently works very well, but it is far from clear how learning to reason about NATO will help with reasoning about cooking or cutting your cat's nails, and the general solution to reasoning can't be "we'll just train it on every possible question anyone might ever ask"!
I don't have any particular reading suggestions, but these are widely accepted limiting factors to using RL for LLM reasoning.
I don't think RL for humans would work too well, and it's not generally the way we learn, or the way kids are mostly taught in school. We mostly learn or are taught individual skills and when they can be used, then practice and learn how to combine and apply them. The closest to using RL in school would be if the only feedback an English teacher gave you on your writing assignments was a letter grade, without any commentary, and you had to figure out what you needed to improve!
Going back to the grandparent reply, there's a phrase that carries a LOT of water:
> The LLM was explicitly shown how to step by step solve these problems, then trained extensively until it got it
Again, that's all we do. We train extensively until we "get it." Monkey-see, monkey-do turns out not only to be all you need, so to speak... it's all there is.
> In contrast, how do you check the answer to more general problems such "Should NATO expand to include Ukraine?"
If you ask a leading-edge model a question like that, you will find that it has become diplomatic enough to remain noncommittal. If this ( https://gemini.google.com/share/9f365513b86f ) isn't adequate, what would you expect a hypothetical genuinely-intelligent-but-not-godlike model to say?
There is only one way to check the answer to that question, and that's to sign them up and see how Russia reacts. (Frankly I'd be fine with that, but I can see why others aren't.)
Also see the subthread at https://news.ycombinator.com/item?id=45483938 . I was really impressed by that answer; it wasn't at all what I was expecting. I'm much more impressed by that answer than by the HN posters I was engaging with, let's put it that way.
Ultimately it's not fair to judge AI by asking it for objective answers in questions requiring value judgement. Especially when it's been "aligned" to within an inch of its simulated life to avoid bias. Arguably we are not being given the access we need to really understand what these things are capable of.
> It's not fair to judge AI by asking it for objective answers in questions requiring value judgement... especially when it's been "aligned" to within an inch of its simulated life to avoid bias.
They aren’t aligned to avoid bias (which is an incoherent concept - avoiding bias is like not having priors); they are aligned to incorporate the preferred bias of the entity doing the alignment work.
(That preferred bias may be for a studious neutrality on controversial viewpoints in the surrounding society as perceived by the aligner, but that’s still a bias, not the absence of bias.)
> Again, that's all we do. We train extensively until we "get it." Monkey-see, monkey-do turns out not only to be all you need, so to speak... it's all there is.
Which is fine for us humans, but would only be fine for LLMs if they also had continual learning and whatever else was necessary for them to be able to learn on the job and be able to pick up new reasoning skills by themselves, post-deployment.
Obviously right now this isn't the case, so therefore we're stuck with the LLM companies trying to deliver models "out of the box" that have some generally useful reasoning capability that goes beyond whatever happened to be in their pre-training data, and the way they are trying to do that is with RL ...
Agreed, memory consolidation and object permanence are necessary milestones that haven't been met yet. Those are the big showstoppers that keep current-generation LLMs from serving as a foundation for something that might be called AGI.
It'll obviously happen at some point. No reason why it won't.
Just as obviously, current LLMs are capable of legitimate intelligent reasoning now, subject to the above constraints. The burden of proof lies on those who still claim otherwise against all apparent evidence. Better definitions of 'intelligence' and 'reasoning' would be a necessary first step, because our current ones have decisively been met.
Someone who has lost the ability to form memories is still human and can still reason, after all.
I think continual learning is a lot different than memory consolidation. Learning isn't the same as just stacking memories, and anyways LLMs aren't learning the right thing - to create human/animal-like intelligence requires predicting the outcomes of actions, not just auto-regressive continuations.
Continual learning, resulting in my AI being different from yours, because we've both got them doing different things, is also likely to turn the current training and deployment paradigm on its head.
I agree we'll get there one day, but I expect we'll spend the next decade exploiting LLMs before there is any serious effort to move on to new architectures.
In the meantime, DeepMind for one have indicated they will try to build their version of "AGI" with an LLM as a component of it, but it remains to be seen exactly what they end up building and how much new capability that buys. In the long term building in language as a component, rather than building in the ability to learn language, and everything else that humans are capable of learning, is going to prove a limitation, and personally I wouldn't call it AGI until we do get to that level of being able to learn everything that a human can.