> And on a more prosaic note, Google's LaMDA is clearly ahead of ChatGPT (it's just not public), and explicitly tackles the bullshit/falsehood problem by having a second layer that fact-checks the LLM by querying a fact database / knowledge-graph.
Isn't that more-or-less what he's proposing, though? It does feel intuitive to me that something based on probabilistic outcomes (neural nets) would have a very hard time consistently returning accurate deterministic answers.
Of course (some) humans get there too, but that assumes what we're doing now with ML can ever reach human-brain level, which is of course very much an open question.
I think he's proposing that the LLM should know how to call out to a knowledge engine at inference time. He thinks the knowledge engine continues to be its own (human-curated) system of knowledge that is valuable.
I am suggesting the LLM will (effectively) call out to a knowledge engine at training time, learn everything the knowledge engine knows, and render it obsolete.
So it's similar in some sense (collaboration between the two systems), but crucially, a diametrically opposed prediction in terms of the long-term viability of Wolfram Alpha.
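To make the training-time idea concrete, here's a minimal sketch of the kind of thing I mean (a trivial arithmetic function stands in for the real knowledge engine, and the JSONL dump is a generic fine-tuning format, not any particular vendor's API): use the engine as a teacher to mass-produce exact question/answer pairs, then fine-tune on them.

```python
# Minimal sketch: distill a deterministic knowledge engine into training data.
# knowledge_engine() is a toy stand-in for something like Wolfram Alpha.
import json
import random

def knowledge_engine(query: str) -> str:
    """Stand-in for a structured engine: answers "a + b" queries exactly."""
    a, _, b = query.partition(" + ")
    return str(int(a) + int(b))

def make_training_pairs(n: int, seed: int = 0) -> list:
    """Use the engine as a teacher to generate exact question/answer pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        a, b = rng.randint(0, 999), rng.randint(0, 999)
        query = f"{a} + {b}"
        pairs.append({"prompt": f"What is {query}?",
                      "completion": knowledge_engine(query)})
    return pairs

if __name__ == "__main__":
    with open("distilled_pairs.jsonl", "w") as f:
        for pair in make_training_pairs(10_000):
            f.write(json.dumps(pair) + "\n")
```

Whether a model fine-tuned on pairs like these generalizes to unseen queries, rather than just memorizing them, is of course the real empirical question.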
Crucially, he says "[an LLM] just isn’t a good fit in situations where there are structured computational things to do", but I think it's dubious to claim this; LLMs can learn structured domains too, if they are well-represented in the training set.
edit to add: I see that you're specifically noting the LaMDA point, yes, you're right that this is more like what he's proposing. My main claim is that things will not move in that direction, rather the direction of the Mind's Eye paper I linked.
Isn't this an effectively infinite set? Wolfram Alpha could be said to know "all the numbers", and "all the formulas".
> LLMs can learn structured domains too if they are well-represented in the training set
But can they learn how to apply structured knowledge in precise ways? In mathematical or computational ways? I don't follow the field in great detail but the commentary I read seems to be saying this is not at all the case. And my own experiments with ChatGPT show it has no systematic grasp of logic.
No, the thing you'd want the LLM to be learning would be the rules.
> But can they learn how to apply structured knowledge in precise ways?
I personally believe: clearly yes, already. You can already get an LLM to generate code for simple logical problems. You can ask ChatGPT to modify a solution in a particular way, showing it has some understanding of the underlying logic, rather than just regurgitating solutions it saw.
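As a toy illustration of the kind of request I mean (written out by hand here to show the shape of the task, not pasted model output): ask for code that solves a simple logical problem, then ask for a targeted modification to it.

```python
# The baseline "simple logical problem": classic FizzBuzz.
def fizzbuzz(n: int) -> list:
    out = []
    for i in range(1, n + 1):
        word = ("Fizz" if i % 3 == 0 else "") + ("Buzz" if i % 5 == 0 else "")
        out.append(word or str(i))
    return out

# The targeted modification: "also say Boing on multiples of 7". Getting this
# right requires tracking the underlying rule, not recalling a memorized snippet.
def fizzbuzz_boing(n: int) -> list:
    out = []
    for i in range(1, n + 1):
        word = (("Fizz" if i % 3 == 0 else "")
                + ("Buzz" if i % 5 == 0 else "")
                + ("Boing" if i % 7 == 0 else ""))
        out.append(word or str(i))
    return out

print(fizzbuzz_boing(21)[-1])  # "FizzBoing": 21 is divisible by 3 and 7
```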
I'd just note that a lot of commentators make quite simple errors of either goalpost-moving or a failure to extrapolate capabilities a year or two ahead. Of course, no linear or exponential growth curve continues indefinitely. But betting against this curve, now, seems to me a good way of losing money.
> I am suggesting the LLM will (effectively) call out to a knowledge engine at training time, learn everything the knowledge engine knows, and render it obsolete.
And when facts change, your only option will be to retrain. Since facts are always changing, you’ll always be training.
I don't think most of the interesting knowledge encoded in Wolfram Alpha changes. Mathematics and pure Logic are true, and immutable. Most of Physics, ditto.
That's true for the physics and math, but some of the data is updated all the time. For example, you can get a current weather report. And they have structured data about movies, TV shows, music, and notable people [1]. Every time any country has an election, are you going to retrain your model? That gets really expensive really fast.
On top of that, the training process isn’t that trustworthy. There’s no guarantee your model won’t accidentally say that e.g. Obama is the president.
All of this is to say that the best path forward is to translate questions into queries to an auditable knowledge base and then integrate the responses back into the conversation. It’s a couple years old but the best I’ve seen in this area is Retrieval Augmented Generation [2]. And even that’s imperfect, in my experience.
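To make that concrete, here's a rough sketch of the pattern I mean, with a toy in-memory knowledge base standing in for a real auditable store and a placeholder generate() in place of the actual model call; every fact carries an "as of" date so it can be audited and updated without touching the model.

```python
from datetime import date
from typing import Callable, Optional

# Toy auditable knowledge base: each fact is sourced with an "as of" date.
# The weather entry is a placeholder value for illustration only.
KNOWLEDGE_BASE = {
    "president of the united states": ("Joe Biden", date(2021, 1, 20)),
    "weather in london": ("8 C, light rain", date(2023, 1, 15)),
}

def retrieve(question: str) -> Optional[str]:
    """Translate the question into a lookup against the knowledge base."""
    q = question.lower()
    for key, (fact, as_of) in KNOWLEDGE_BASE.items():
        if key in q:
            return f"{fact} (as of {as_of.isoformat()})"
    return None

def answer(question: str, generate: Callable[[str], str]) -> str:
    """Retrieve first, then let the model phrase an answer around the fact."""
    fact = retrieve(question)
    if fact is None:
        return generate(question)       # fall back to the bare model
    prompt = (f"Retrieved fact: {fact}\n"
              f"Using only that fact, answer: {question}")
    return generate(prompt)

if __name__ == "__main__":
    echo_model = lambda prompt: prompt  # stand-in for a real LLM call
    print(answer("Who is the president of the United States?", echo_model))
```

Updating who the president is then becomes a one-row change to the knowledge base rather than a retraining run.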
Right, I see what you're getting at. I do agree that AI systems will need to be able to use oracles, "current weather" is a great example of something a human also looks up.
The reason I want the model itself to learn Physics, Maths, etc. is that I think it is going to end up being critical to the challenge of actually developing logical reasoning on a par or above humans, and to gain a true "embodied understanding" of the real world.
But yeah, it would be nice to have your architecture support updating facts without full retraining. One approach is to use an oracle as you note. Another would be to have systems do some sort of online learning (as humans do). (Why not both?) The advantage of the latter approach is that it allows an agent to deeply update their model of the world in response to new facts, as humans can sometimes do. Anything that I'm just pulling statelessly from an oracle cannot update the rest of my "mind". But this is perhaps a bit speculative; I agree in the short-term at least we'll see better performance with hybrid LLM + Oracle models. (As I noted, LaMDA already does this.)
However I think that a big part of Wolfram's argument in the OP is that he thinks that an LLM can't learn Physics or Maths, or reliably learn static facts that a smart human might have memorized like distances between cities. And that's the position I was really trying to argue against. I think more scale and more data likely gets us way further than Wolfram wants to give credit for.
>> I am suggesting the LLM will (effectively) call out to a knowledge engine at training time, learn everything the knowledge engine knows, and render it obsolete.
You are suggesting that the LLM would "learn" to apply the rules built into the knowledge engine. I'm not so optimistic as to think that a statistical algorithm fed random inputs would be reliable at applying deterministic rules. But for the sake of argument, let's assume you are correct and that we can have an LLM replicate deterministic logic and exact computation, or at least be right 95% of the time. That's basically the extent of human intellect[^1]: statistical processing for most common life situations, and deeper analytical thinking for some rather atypical cases (e.g., analyzing algorithm complexity, demonstrating that the L4 Lagrange point sits at the vertex of an equilateral triangle, or applying logic to forecast next week's stock market or how many degrees the global temperature will rise in the next two decades).
Crucially, we are good at having computers check a proof, or simulate a system to ensure correctness, but before LLMs we were the only ones that could create that proof (by using socially learned heuristics to probe the proof space), design the system (by using knowledge passed down by fellow humans and assimilated by our "internal rule engines"), or come up with a definition of correctness (by doing a lot of inner and outer argumentation and possibly obtaining peer consensus). If we get an LLM to do that as well, for most practical purposes we would have achieved AGI.
If we are there already (or if we will be, in a few months or years), the world is going to look very different. Not necessarily in an apocalyptic way, but some priorities are going to shift[^2].
[^1]: Admittedly, there are also some biological characteristics that bias our intellectual processes in a certain way, but that's next-level madness and nobody is talking--for now--about giving those to an LLM.
[^2]: If you could have at your beck and call a general intellect engine that could build entire programs and systems for you, what would you have it build?
Sure, but as usual (just like the cellular automata business) Wolfram gives the impression that he is discussing something novel. And it ain't novel, to say nothing of the fact that it is also a fairly obvious thing to do. Symbolic AI folks are not taking this LLM business well. They are all coping.
For decades now, whenever the name "Wolfram" comes up, the discussion reliably centers on his tone and style. My advice: don't confuse the message with the messenger.
Mathematica (yes even I can't bring myself to call it "Wolfram Language" or whatever) is an exquisite and indispensable software system, a true aid to thought. Likewise for Wolfram Alpha.
(And yes, agree his cellular automata stuff is unconvincing.)
How is the guy who says "combine symbolic with probabilistic" the one who is "coping" with his system not being powerful enough, while the team that deployed a bot that is almost always wrong is not "coping"?
You're right. How could I know this? I have formed an opinion about a person I have never personally met. Mea culpa. My impression then is that Stephen Wolfram, in spite of his considerable and quite justifiably impressive brain, refuses to apply the necessary corrective measures to adjust for the existence of external agencies in the world when formulating his personal theory of the world.
> a bot that is almost always wrong
"It’s always amazing when things suddenly “just work”. It happened to us with Wolfram|Alpha back in 2009. It happened with our Physics Project in 2020. And it’s happening now with OpenAI’s ChatGPT. "
It is possible I missed the widespread excitement about Wolfram|Alpha and the Physics Project. The former did make waves in geek circles, I remember that. The latter did not make it to the New York Times, did it?