Interesting read. Got me thinking, I’d love to see what happens when modern AI meets open world simulation. Not just prettier graphics, but actual reasoning NPCs. Imagine arguing with a World of Warcraft innkeeper about the price of ale. Priceless.
Wiring a chatbot to dialogue is less interesting to me than the possibility of AI directing scenes and orchestrating reactivity across multiple characters. A reasoning model can ensure that the world responds to the player in a reasonable and narratively interesting way, without having to script everything or make individual characters particularly intelligent.
We're used to thinking of game AI as a property of the entity it's attached to (the NPC, the enemy, the opposing player) but an LLM can sit above that, more like a dungeon master.
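The "dungeon master sitting above the entities" idea can be sketched as a director that consumes world events and hands out directives to individual NPCs, so no single character needs to be intelligent. Everything here (the event and directive shapes, the theft rule) is a hypothetical illustration, not any shipping engine's API:

```python
from dataclasses import dataclass

@dataclass
class WorldEvent:
    actor: str
    verb: str
    target: str

@dataclass
class Directive:
    npc: str
    action: str

class Director:
    """Sits above individual NPCs, like a dungeon master:
    it sees every event and decides who should react."""

    def __init__(self):
        self.log: list[WorldEvent] = []

    def observe(self, event: WorldEvent) -> list[Directive]:
        self.log.append(event)
        return self._react(event)

    def _react(self, event: WorldEvent) -> list[Directive]:
        # Toy orchestration rule: one theft ripples out to several
        # characters at once; no per-NPC reasoning required.
        if event.verb == "steals":
            return [
                Directive("guard", f"confront {event.actor}"),
                Directive("shopkeeper", f"refuse to trade with {event.actor}"),
            ]
        return []

director = Director()
orders = director.observe(WorldEvent("player", "steals", "ale"))
```

In a real system the `_react` step is where an LLM would go, reasoning over the whole log instead of one hand-written rule; the point is the topology: one director, many dumb puppets.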
Wasn't this the goal of the Director AI in Left 4 Dead?[1] Monitoring player progress (or lack of it) and tailoring how zombies and items spawned outside of script events, and in L4D2 how the map, pathing, and weather worked in order to maximize tension or encourage progress?
Years ago, when I was a bit obsessed with the Holy Grail of a living, breathing CRPG world, the approach that seemed most promising to me was an expert-system-style AI module running on top of the complex but mechanical and boring low-level simulation. This GM module would find and tie together predefined hierarchical abstract patterns from the engine's event log, adding some narration and meaning to it all and gently nudging things along toward hopefully more interesting and meaningful paths.
I have been thinking that current LLMs might actually make something like this more feasible: a kind of GM in a Chinese Room that translates game events into potential narrative arcs, which the player is then free to follow if they wish. Since the LLM's output would be both inspired and limited by the game engine, this would probably also tone down the problems with hallucination and slop.
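The "GM in a Chinese Room" can be sketched mechanically: pattern matching over the engine's event log proposes candidate arcs, and an LLM would only phrase them. The pattern name, event fields, and characters below are all hypothetical, a minimal sketch of the idea rather than a real quest system:

```python
# The engine's event log: plain mechanical facts, no narrative yet.
EVENT_LOG = [
    {"verb": "killed", "actor": "bandit_chief", "target": "farmer"},
    {"verb": "met", "actor": "player", "target": "farmers_widow"},
]

def find_revenge_arcs(log: list[dict]) -> list[str]:
    """Tie two otherwise unrelated engine events into one candidate arc."""
    killings = [e for e in log if e["verb"] == "killed"]
    meetings = [e for e in log if e["verb"] == "met" and e["actor"] == "player"]
    arcs = []
    for k in killings:
        for m in meetings:
            arcs.append(
                f"{m['target']} asks the player to avenge "
                f"{k['target']}, slain by {k['actor']}"
            )
    return arcs

arcs = find_revenge_arcs(EVENT_LOG)
```

The arc text would become the only facts the LLM is allowed to narrate from, which is what bounds hallucination: the model dresses up engine truth instead of inventing its own.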
Not possible, because it can't be guardrailed with 100% accuracy. You'll ask it something outside of the Warcraft world (e.g. US politics), and it'll happily oblige. I imagine NPCs will generate really weird, immersion-breaking stuff even if you can't freeform interact with them anyway.
You can do that while playing a traditional tabletop RPG, too. Players typically don't, because why would they ruin their own immersion?
I understand that in multiplayer with strangers it would be a problem because you could affect other players' experiences, but in a single-player game I don't see this as a big issue, as long as the NPC doesn't spontaneously bring immersion-breaking topics into the conversation without the player starting it (which I suppose could be achieved with a suitable system prompt and some fine-tuning on in-lore text).
If it's the player that wants to troll the game and break immersion by "jailbreaking" the NPCs, it's on them, just like if they use a cheat code and make the game trivial.
It's still gonna be hallucinatory AI slop. For the same reasons it makes uninteresting quests and boring planets. It's lazy and it can't replace actual writing and art.
AI is great for getting tasks done where you can pull the information you need out of the slop. For quality immersive entertainment it's not there.
I’m not at all sure of this. You can use classifiers, fine tuning, and prompting to mitigate the issue both on user input and model output. And you’d probably want a bunch of fine tuning anyway to get their voice right.
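The layered mitigation described here can be sketched as a pipeline: classify the player's input, generate, then classify the model's output before it reaches the screen. The keyword check stands in for a real trained classifier, and `generate` stands in for an actual model call; both are assumptions for illustration:

```python
# Stand-in topic classifier: a real system would use a trained
# classifier, not a keyword list.
OFF_TOPIC = {"election", "president", "iphone", "bitcoin"}

def on_topic(text: str) -> bool:
    return not any(word in text.lower() for word in OFF_TOPIC)

def innkeeper_reply(
    player_line: str,
    generate=lambda prompt: "The ale is two coppers, friend.",  # model stub
) -> str:
    if not on_topic(player_line):            # input-side guardrail
        return "I know nothing of such matters, traveler."
    reply = generate(player_line)
    if not on_topic(reply):                  # output-side guardrail
        return "Enough talk. Will you be drinking or not?"
    return reply

deflection = innkeeper_reply("Who won the election?")
normal = innkeeper_reply("How much for the ale?")
```

Neither layer is 100% accurate on its own, which is the point of stacking them: a jailbreak has to slip past the input check, the fine-tuned model's own priors, and the output check.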
> Not possible, because it can't be guardrailed with 100% accuracy. You'll ask it something outside of the Warcraft world (e.g. US politics), and it'll happily oblige. I imagine NPCs will generate really weird, immersion-breaking stuff even if you can't freeform interact with them anyway.
> Not to mention the current token cost.
You of course have to train the AI from the ground up, on material that relates as exclusively as possible to topics in the game world (i.e. exclude real-world events from the training data if they have no in-universe implications).
You don't, for example, expect some ordinary farmer or tramp in the game world to know a lot about the (in-game) world or be capable of holding deep conversations about complicated topics.
So I don't think the amount of text you need to train the AI on is as insanely large as you imagine (though a lot of text nevertheless has to be written - that's the price of having "much more dynamic" AI characters in the game).
Write a couple of lore books, an in-universe encyclopedia, and some character sheets, and train exclusively on them. Maybe add some out-of-game lore for crossover universes!
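The corpus-assembly step suggested above could look something like the following: gather only in-universe sources and emit one record per document, tagged by source so narrower characters (the farmer, the tramp) can be trained on a narrower slice. The file names and the JSONL shape are assumptions for illustration, not a real pipeline:

```python
import json

# Hand-written, strictly in-universe sources (hypothetical names).
LORE_SOURCES = {
    "lore_book_1.txt": "The sundering of the Eastern Kingdoms...",
    "cyclopedia.txt": "Ale: a fermented drink favored in taverns...",
    "innkeeper_sheet.txt": "Gruff but fair; haggles over every copper...",
}

def build_corpus(sources: dict[str, str]) -> list[dict]:
    """One training record per document; the source tag lets you carve
    out per-character subsets later."""
    return [{"source": name, "text": text} for name, text in sources.items()]

records = build_corpus(LORE_SOURCES)
# Each record becomes one line of a fine-tuning JSONL file.
jsonl = "\n".join(json.dumps(r) for r in records)
```

Whether a corpus this small supports pretraining from scratch is exactly the open question raised downthread; it is clearly enough for fine-tuning an existing base model, at the cost of that base model knowing about US politics.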
The question this poses for me is how much writing you need for training before you can reasonably expect a generation system to produce something new and interesting, how much work it takes to get the right knowledge into the right place, and whether that's worth the cost, given how you expect the player to interact with the game, compared with just doing the manual work.
I doubt there's telemetry in the Elder Scrolls games, but I'd love to know how many players go around the world exploring everything the characters have to say, or reading all the books - and how many get the lore from secondary media, wikis, or a retelling or summary on YouTube. On a certain level it's important that they're there as an opt-in way to convey the "secondary" world lore to the player without a "sit down and listen" info dump, and to give the impression it was written by someone, so that these objects would exist organically in the world, or certain characters would talk about those topics. But I wonder how much of the illusion would remain if each book were just a title.
Is that feasible? I was under the impression that fully training an LLM requires untold mountains of data, way more than a game dev company could reasonably create.
You are correct. The fact that so many people are saying “lol just train it on text about the game bro” reveals how little people understand how these models work, how they are trained, etc.
Microsoft's Phi models are trained on a much smaller, curated dataset. They generally aren't as impressive as the models that get talked about more, but they're more than enough to get the job done for NPC lines in a game.
For this to work you pretty much have to start from scratch, putting in "obvious" things like "the sun exists, and when it's out it casts light and shadow" and "water is a liquid (what's a liquid?) and flows downhill". Is there a corpus of information like this that's also free of facts that might be anachronistic in-universe?
With the advent of unoptimized UE5 releases becoming the norm and the mentality of shipping badly broken games by default and them only being in a good state years later if at all, I’m not sure running an LLM on device would be a good idea.
I enjoy getting my ale at the click of a button, and save my arguing capabilities for strangers online.
There may be a place for AI-driven games, but there's literally no reason to shove it everywhere. Pre-written dialogue is much more enjoyable to engage with in the long term, compared with having to think about phrasing for an NPC that spouts generic fantasy speak.