Yes, applied research has yielded the modern expert system, which is really usef...

wizzwizz4 · 2025-11-06T15:00:42 1762441242

It's not the "modern expert system", unless you're throwing away the existing definition of "expert system" entirely, and re-using the term-of-art to mean "system that has something to do with experts".

HarHarVeryFunny · 2025-11-06T20:16:38 1762460198

I don't know what the parent was referring to, but IMO "expert system" is one of the more accurate and insightful ways of describing LLMs.

An expert system is generically a system of declarative rules, capturing an expert's knowledge, that can be used to solve problems.

Traditionally, expert systems are symbolic systems that represent the rules in a language such as Prolog, with the rules laboriously derived by hand, but none of this seems core to the definition.
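
For a concrete picture of the "system of declarative rules" being described, here is a minimal forward-chaining sketch in Python. The rule contents and the names `RULES` and `forward_chain` are made up for illustration; real expert system shells are far richer than this.

```python
# Hypothetical toy rule base: each rule is (set of premises, conclusion).
# The rules are plain data; nothing here says *how* to apply them.
RULES = [
    ({"has_fever", "has_cough"}, "possible_flu"),
    ({"possible_flu", "short_of_breath"}, "refer_to_doctor"),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known
```

Calling `forward_chain({"has_fever", "has_cough", "short_of_breath"}, RULES)` derives both `possible_flu` and `refer_to_doctor`, while a single symptom on its own derives nothing new.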

A pre-trained LLM can be considered an expert system that captures the rules of auto-regressive language generation needed to predict the training data. These rules are represented by the weights of a transformer, and were learnt by SGD rather than hand-coded, but so what?

wizzwizz4 · 2025-11-06T21:14:29 1762463669

If you can extract anything resembling a declarative rule from the weights of a transformer, I will put you in for a Turing award.

Expert systems are a specific kind of thing (see https://en.wikipedia.org/wiki/Expert_system#Software_archite...): any definition you've read is a description. If the definition includes GPT models, the definition is imprecise.

HarHarVeryFunny · 2025-11-06T21:54:15 1762466055

Well, OK, perhaps not a declarative rule, more a procedural one (induction heads copying data around, and all that) given the mechanics of transformer layers, but does it really make a conceptual difference?

Would you quibble if an expert system was procedurally coded in C++ rather than in Prolog? "You see this pattern, do this".

wizzwizz4 · 2025-11-07T14:29:21 1762525761

Yes, it makes a conceptual difference. Expert systems make decisions according to an explicit, explicable world model consisting of a database of facts, which can be cleanly separated from the I/O subsystems. This does not describe a transformer-based generative language model. The mathematical approaches for bounding the behaviour of a language model are completely different to those involved in bounding the behaviour of an expert system. (And I do mean completely different: computer programs and formal logic are unified in fields like descriptive complexity theory, but I'm not aware of any way to sensibly unify mathematical models of expert systems and LLMs under the same umbrella – unless you cheat and say something like cybernetics.)
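
To make "explicit, explicable" a bit more concrete, here is a toy sketch in Python. All names, facts and rules are hypothetical. The knowledge base is plain data, the inference step records which rule justified each conclusion, and the text rendering lives in a separate function.

```python
# Hypothetical knowledge base: plain data, independent of engine and I/O.
FACTS = {"engine_cranks", "no_spark"}
RULES = {
    "R1": ({"engine_cranks", "no_spark"}, "ignition_fault"),
    "R2": ({"ignition_fault"}, "check_coil"),
}


def infer(facts, rules):
    """Forward-chain, recording the rule that justified each derived fact."""
    known = set(facts)
    why = {}  # derived fact -> (rule name, sorted premises used)
    changed = True
    while changed:
        changed = False
        for name, (premises, conclusion) in rules.items():
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                why[conclusion] = (name, sorted(premises))
                changed = True
    return known, why


def explain(fact, why):
    """I/O layer: render the justification chain for a fact as text lines."""
    if fact not in why:
        return [f"{fact}: given"]
    name, premises = why[fact]
    lines = [f"{fact}: by {name} from {', '.join(premises)}"]
    for premise in premises:
        lines.extend(explain(premise, why))
    return lines
```

With `known, why = infer(FACTS, RULES)`, the call `explain("check_coil", why)` walks back through R2 and R1 to the given facts. That kind of trace comes directly from the data structures here, which is the property being contrasted with transformer weights.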

You could compile an expert system into C++, and I'd still call it an expert system (even if the declarative version was never written down), but most C++ programs are not expert systems. Heck, a lot of Prolog programs aren't! To the extent a C++ program representing GPT inference is an expert system, it's the trivial expert system with one fact.
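
A rough sketch of the "compile it" point, written in Python rather than C++ for brevity, with hypothetical rules and function names: the same tiny rule set appears once as data read by a generic engine and once hard-coded as branches. Both give the same conclusions, so the compiled form is still recognisably the same expert system.

```python
# Hypothetical rule set, declarative form: (premises, conclusion).
RULES = [
    ({"wet_ground", "clouds"}, "rained"),
    ({"rained"}, "take_umbrella"),
]


def interpreted(facts):
    """Generic engine: reads RULES as data and chains to a fixed point."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def compiled(facts):
    """The same two rules hand-compiled into branches, in dependency order."""
    known = set(facts)
    if "wet_ground" in known and "clouds" in known:
        known.add("rained")
    if "rained" in known:
        known.add("take_umbrella")
    return known
```

An arbitrary procedural program has no such declarative counterpart to recover, which is the sense in which most C++ programs are not expert systems.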