The context required to write real software is just way too big for LLMs. Software is the business, codified. How is an LLM supposed to know about all the rules in all the departments plus all the special agreements promised to customers by the sales team?
Right now the scope of what an LLM can solve is pretty generic and focused. Anytime more than a class or two is involved or if the code base is more than 20 or 30 files, then even the best LLMs start to stray and lose focus. They can't seem to keep a train of thought which leads to churning way too much code.
If LLMs are going to replace real developers, they will need to accept significantly more context, they will need a way to gather context from the business at large, and some way to persist a train of thought across the life of a codebase.
I'll start to get nervous when these problems are close to being solved.
> Anytime more than a class or two is involved or if the code base is more than 20 or 30 files, then even the best LLMs start to stray and lose focus. They can't seem to keep a train of thought which leads to churning way too much code.
Looking at just this specific portion: you can take a 20-30 file code base, put 5-10 classes into the context in full, and include the rest filtered in a sensible way. Even a model with a 200k context window can handle this.
The output definitely can stray, but that's not the norm in my experience. Of course, if the output does start to stray, it needs to be nipped in the bud. And the fixes can range anywhere from working but bad code to very close to how you'd have written it yourself, if you've clearly described how you want the code to be written.
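To make "in full, and the rest filtered" concrete, here is a rough sketch of one way to pack a small code base into a prompt. Everything in it is an assumption for illustration: the names `build_context` and `signatures_only` are made up, the 4-characters-per-token estimate is only a crude heuristic, and "filtered" is taken to mean "keep just the class/def lines", which is one sensible filter among many.

```python
import re


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return len(text) // 4 + 1


def signatures_only(source: str) -> str:
    # Hypothetical filter: keep only class/def lines of a Python file.
    keep = [line for line in source.splitlines()
            if re.match(r"\s*(class|def)\s", line)]
    return "\n".join(keep)


def build_context(files: dict[str, str], focus: set[str], budget: int) -> str:
    """Pack focus files in full and the rest as signatures, within a token budget."""
    parts = []
    used = 0
    # Focus files come first so they get budget before the filtered ones.
    ordered = sorted(files, key=lambda name: (name not in focus, name))
    for name in ordered:
        body = files[name] if name in focus else signatures_only(files[name])
        chunk = f"# file: {name}\n{body}\n"
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            continue  # skip anything that does not fit
        parts.append(chunk)
        used += cost
    return "\n".join(parts)
```

Calling something like `build_context(files, focus={"orders.py"}, budget=200_000)` would give you one string to paste ahead of your actual request. The point is only that the selection step is cheap and mechanical; how you pick the focus set is where the judgment goes.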
If you're trying to fix a specific bug, for example, but don't provide thorough logs on what is happening in the code, it's much more likely the output will stray towards some average of what the problem could be, rather than what it actually is in the current code.
And this is absolutely not to say that LLMs could do what Antirez is doing. There is a massive amount of variation in how deeply people think about the code they're reading or writing.
Yeah, I’m saying that I expect us to have a world where we can cram almost all business knowledge into the context window for coding LLMs. We can get an early glimpse of that today by pasting the full contents of 50+ files into Gemini 2.5 and using only 10% of its context window, and today is the worst it’ll ever be.