Isn't it a matter of training? This is how I think about LLMs: they've "learned" so much context that they spit out tokens one after the other based on that context. So when they hallucinate, it's because they've misread the context or missed some nuance in it. The more complicated the task, the higher the chance of hallucination. I don't know whether this can be improved with more training, but training is the only tool we have.
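To make the mental model concrete, here's a toy sketch of "spitting out tokens one after the other based on context." It is nothing like a real LLM: `toy_next_token_probs` is a made-up stand-in for whatever the training actually baked into the weights, and the probabilities are invented for illustration.

```python
import random

def toy_next_token_probs(context):
    # Hypothetical "learned" distribution over next tokens, conditioned on
    # the context so far. A real model computes this from billions of
    # parameters; here it's hard-coded purely to illustrate the idea.
    if context[-1] == "capital":
        return {"Paris": 0.6, "London": 0.3, "Atlantis": 0.1}
    return {"the": 0.5, "capital": 0.3, "is": 0.2}

def generate(context, steps=3):
    for _ in range(steps):
        probs = toy_next_token_probs(context)
        tokens, weights = zip(*probs.items())
        # Sample the next token from the learned distribution. A plausible-
        # sounding but wrong continuation ("Atlantis") can still be drawn,
        # which is one crude way to picture a hallucination when the context
        # is misread or underspecified.
        context.append(random.choices(tokens, weights=weights)[0])
    return context

print(generate(["the", "capital"]))
```

The point of the sketch: there's no separate "fact lookup" step, just a learned distribution over next tokens given context, so "better training" mostly means reshaping that distribution.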