Instead of downvotes I would appreciate some insightful comments on this, as I'm currently struggling with this problem. In the last week I've vibe-coded (vibe-engineered?) a TypeScript project with 230+ commits, 64 TypeScript files, and 27k+ lines of code. Too much to actually read. Validation is mostly through testing: automated tests and architecture reviews (generating mermaid diagrams). I'm mostly reviewing the code structure and architecture, the libraries it uses, etc. It has 600+ unit and integration tests, but even reviewing those is too much...
Our problem is not coding. Our problem is knowledge. If no one reads the code, no one knows how it works, and that's what the company wants because "we need to ship fast", then the company doesn't understand what software is all about.
Code is a language; we write stories that make sense and have consequences. If a company does not care that humans need to know the story in detail and decide how it's written, then let it accept the consequences of a statistically generated story with no human supervision. Let it trust the statistics when there's a bug and no one knows how the code works, because no one read it and no one who wrote it is around anymore to debug it.
We’ll see in the end if it’s cheaper to let the code be written and only understood by statistical algorithms.
Otherwise, just work differently: instead of generating thousands of lines of code, accept that it's your responsibility to review and understand it, no matter how long it takes.
> In the last week I've vibe-coded (vibe-engineered?) a TypeScript project with 230+ commits, 64 TypeScript files, and 27k+ lines of code. Too much to actually read.
Congratulations, you discovered that generating code is only part of the software development process. If you don't understand what the code is actually doing, good luck maintaining it. If the tests are never reviewed, how do you know they even test anything? Because they say "test passed"? I can write you a script that prints "test passed" a billion times - would you believe it is a billion unit tests? If you didn't review them, you don't have tests. You have a pile of code that looks like tests. And "it takes too long to review" is not an excuse - it's like saying "it's too hard to make a car, so I just took a cardboard box, wrote FERRARI on it and sat inside it making car noises". Fine, but it's not a car. If it's not properly verified, what you have is not tests, it's just pretending.
I’m well aware, thank you. I have been coding for 40+ years (including 6502 and 68000 assembly), have a master’s in computer science, and have built healthcare software where bugs can lead to death. But with LLMs enabling us to generate source code faster, our review process is becoming an increasingly large bottleneck for productivity. We need to start thinking about how we can scale this process.
It's as much a bottleneck for productivity as cars being made of metal are a bottleneck for speed. Sure, you can make a paper car. It would probably be faster. Until you collide with something and discover why the metal frame was a good idea. If you generate code that you cannot verify or test, sure, it's faster. Until something goes wrong.
Yeah, you aren't wrong... I predict two things will happen with this.
1. A more biological approach to programming - instead of reviewing every line of code in a self-contained way, the system would be viewed holistically: observe its behaviour and test whether it works for the inputs you care about. If it does, great, ship it; if not, fix it. This includes a greater openness to just throwing code away or massively rewriting it instead of tinkering with it. The "small, self-contained PRs" culture worked well when coding was harder and humans needed to retain knowledge of all the details. This leads to the next point, which is
2. Smaller teams and less fungibility-oriented practices. Most software engineering practices are basically centred around raising the bus factor, speeding up onboarding and decreasing the volatility in programmers' practices. With LLM-assisted programming this changes quite a bit: a smaller, more skilled team can more easily match the output of a larger, more sluggish one, thanks to reduced communication overhead and the ability to skip the practices that slow development velocity down in favour of just doing things.
A day ago, the good old Arthur Whitney-style C programming was posted to this site (https://news.ycombinator.com/item?id=45800777) and most commenters were horrified. Yes, it's definitely a mouthful on first read, but this style of programming does have value - it's easier to take in as a whole, easier to modify than a 10 KLOC interpreter spanning 150 separate files, and it's quite token-efficient too. Personally, I'd add some comments, but I see why this style is the way it is.
Same with style guides and whatnot - the value of having a code style guide (beyond basic stuff like whitespace formatting or word-wrapping at 160 columns) drops drastically when you don't have to ask people to maintain the same part for years. You see this discussion playing out: "my code formatter destroyed my code and made it much more unreadable" - "don't despair, it was for the greater good, for the sake of codebase consistency!". Again, way less of a concern when you can just tell an LLM to reformat/rename/add comments if you want.
I'd definitely say that getting the architecture right is way more important, and that you should let the details play out in an organic way, unless you're talking about safety-critical software. LLM-written code is "eventually correct", and that is a huge paradigm shift from "I write code and I expect the computer to do what I have written".