I'll need to work on the diagram to make it clearer next time.
What it's trying to communicate is that, in general, a human operating a computer has to turn their imprecise thinking into "specific and exact commands", and then understand the "specific and exact output" in whatever terms they're thinking in, prioritizing and filtering the data based on situational context. LLMs enter the picture in two places:
1) In many situations, they can do the "imprecise thinking" -> "specific and exact commands" step for the user;
2) In many situations, they can do the "specific and exact output" -> contextualized output step for the user;
In such scenarios, LLMs are not replacing software; they're being slotted in as an intermediary between the user and classical software, so the user can operate closer to what's natural for them instead of translating between that and rigid computer language.
This is not applicable everywhere, but then, this is also not the only way LLMs are useful - it's just one broad class of scenarios in which they are.
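To make the "intermediary" idea concrete, here's a rough sketch of that loop - the llm() helper is a stand-in for whatever model/API you'd actually call, and the prompts are invented for the example:

```python
import subprocess

def llm(prompt: str) -> str:
    """Placeholder: send `prompt` to whatever model/API you use, return its text reply."""
    raise NotImplementedError

def answer(user_request: str) -> str:
    # imprecise thinking -> "specific and exact commands"
    command = llm(
        "Turn this request into a single safe, read-only shell command. "
        "Reply with the command only.\n\nRequest: " + user_request
    )

    # classical software does the actual work, exactly as instructed
    raw_output = subprocess.run(
        command, shell=True, capture_output=True, text=True
    ).stdout

    # "specific and exact output" -> contextualized answer
    return llm(
        "The user asked: " + user_request + "\n"
        "Running `" + command + "` printed:\n" + raw_output + "\n"
        "Summarize only what matters for the user's question."
    )
```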
That's true, but that is the input section of the diagram, not the output section where [specific and exact output] is labeled, so I believe there was legitimate confusion I was responding to.
To your point, which I think is separate but related: that IS a case where LLMs are good at producing specific and exact commands. The models plus the right prompt are pretty reliable at tool calling by themselves, because you give them a list of specific and exact things they can do. And they can be fully specific and exact at inference time with constrained output (although you may still wish it had called a different tool).
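For instance, here's a minimal sketch of declaring a tool and forcing a tool call with the OpenAI Python SDK - the tool, model name, and prompt are just placeholders, and other inference stacks expose the same idea under different names:

```python
from openai import OpenAI

client = OpenAI()

# The "list of specific and exact things it can do"
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Is it raining in Oslo?"}],
    tools=tools,
    tool_choice="required",  # force a tool call instead of free-form text
)

call = response.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)  # e.g. get_weather {"city": "Oslo"}
```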
The tool may not even exist. LLMs are really terrible at admitting where the limits of their training are. They will imagine a tool into being. They will also claim the knowledge is within their realm when it isn't.
That would only be possible if you could prevent hallucinations from ever occurring, which you can't. Even if you supply a strict schema, the model will sometimes act outside of it - and infer the existence of "something similar".
That's not true. You say the model will sometimes act outside of the schema, but models don't act at all - they don't hallucinate by themselves, and they don't produce text by themselves; they do all of this in conjunction with your inference engine.
The model's output is a probability for every token; constrained output is a feature of the inference engine. With a strict schema, the inference engine can mask out every token that doesn't adhere to the schema and select the top token that does.
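As a toy sketch of that mechanism (made-up vocabulary and scores, greedy decoding only):

```python
import math

# The model only emits a score for every token in the vocabulary;
# the inference engine decides which of those tokens are even eligible.
vocab = ["get_weather", "get_time", "make_up_a_tool", "{", "}"]

def pick_token(logits, allowed):
    """Greedy constrained decoding: highest-scoring token the schema permits."""
    best, best_logit = None, -math.inf
    for token, logit in zip(vocab, logits):
        if token in allowed and logit > best_logit:  # mask anything the schema forbids
            best, best_logit = token, logit
    return best

# The model may "prefer" a tool that doesn't exist...
logits = [1.2, 0.3, 2.5, -1.0, -1.0]
# ...but the engine only lets it pick from the declared tool names.
print(pick_token(logits, allowed={"get_weather", "get_time"}))  # -> get_weather
```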
Yes, we've been discussing "specific and exact" output. As I said, you might wish it called a different tool; nothing in this discussion is addressing that.