These posts are funny to me because prompt engineers point at them as evidence of the fast-approaching obsolescence of software engineers, but the amount of software engineering experience needed to even guide an AI in this way is very high.
The reason he keeps adjusting the prompts is because he knows how to program. He knows what it should look like.
The argument is that this stuff will so radically improve senior engineer productivity that the demand for junior engineers will crater. And without a pipeline of junior engineers, the junior-to-senior trajectory will radically atrophy.
Essentially, the field will get frozen in place: existing senior engineers will be able to use AI to outship traditional senior-junior teams, even as junior engineers fail to secure employment.
I don’t think anything in this article counters that argument.
Right. I don’t understand why everyone thinks this will make it impossible for junior devs to learn. The people I had around to answer my questions when I was learning knew a whole lot less than Claude, and they also had full-time jobs doing something other than answering my questions.
It won't make it impossible for junior engineers to learn.
It will simply reduce the number of opportunities to learn (and not just for juniors), by virtue of companies' beancounters concluding that "two for one" (several juniors) doesn't return as much as "buy one get one free" (existing staff + an AI license).
I dread the day we all "learn from AI". The social interaction part of learning is just as important as the content of it, really, especially when you're young; none of that comes across yet in the pure "1:1 interaction" with AI.
I learnt programming on my own, without any social interaction involved. In fact, I loved programming because it does not involve any social interaction.
Programming has become more of a "social game" in the last 15 years or so. AI is a new superpower for people like me, bringing balance to the Force.
But it is not a social interaction. An LLM is a machine.
I think there is also a big difference between being forced to use an LLM in a certain way and being able to design your interaction with the LLM entirely yourself. The former, I imagine, can indeed be tedious and frustrating; the latter is just miraculous.
No one is forcing me to use LLMs, so that’s not it. The interaction is social in the sense that it is natural-language based and nondeterministic, and that LLMs exhibit a certain “character”. They have been trained to mimic certain kinds of human social interaction.
It probably also depends on what your weapon of choice is. Mine was always written language, and code is just a particular manifestation of it.
Junior devs using AI can get a lot better at using AI and learn the existing patterns it generates, but I notice, for myself, that if I let AI write a lot of the code, I remember it less and therefore understand it less well later on. This applies in school and when trying to learn new things: the act of writing down the solution and working out the details yourself is what trains your brain. I'd say this has been the practice for over a thousand years, and I'm skeptical that AI will make junior devs grow their own skills faster.
I think asking the AI questions for your own understanding totally makes sense, but there is a benefit to actually writing the code yourself versus asking the AI to do it.
I'm sure there is when you're just getting your sea legs in some environment, but at some point most of the code you write in a given environment is rote. Rote code is both depleting and mutagenic --- if you're fluent and also interested in programming, you'll start convincing yourself to do stupid stuff to make the code less rote ("DRY it up", "make a DSL", &c) that makes your code less readable and maintainable. It's a trap I fall into constantly.
> but at some point most of the code you write in a given environment is rote
"Most of the code one writes in a given environment is rote" is true in the same sense that most of the words one writes in any given bit of text are rote e.g. conjunctions, articles, prepositions, etc.
Some writers I know are convinced this is true, but I still don't think the comparison is completely apt, because deliberately rote code with modulated expressiveness is often (even usually) a virtue in coding, and not always so with writing. For experienced or enthusiastic coders, that is to say, the effort is often in not doing stupid stuff to make the code more clever.
Straight-line, replacement-grade, mid code that just does the things a prompt tells it to in the least clever, most straightforward way possible is usually a good thing; that long clunky string of modifiers goes by the name "maintainability".
See, I get what you're saying, but this is my whole thing. No. Abstracting code out or building a bespoke codegen system is not always or even usually an improvement on straight-line code that just does what it says it does.
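A toy illustration of the trap, in hypothetical TypeScript (made-up example, not from any real codebase):

    // Straight-line version: rote to type, but anyone can read it and change one rule safely.
    interface User { name: string; email: string; age: number }

    function validateUser(u: User): string[] {
      const errors: string[] = [];
      if (u.name.trim() === "") errors.push("name is required");
      if (!u.email.includes("@")) errors.push("email is invalid");
      if (u.age < 0 || u.age > 150) errors.push("age is out of range");
      return errors;
    }

    // "Clever" version: a mini rules DSL that saves a few lines and costs readability.
    type Rule = [keyof User, (u: User) => boolean, string];
    const rules: Rule[] = [
      ["name", (u) => u.name.trim() !== "", "name is required"],
      ["email", (u) => u.email.includes("@"), "email is invalid"],
      ["age", (u) => u.age >= 0 && u.age <= 150, "age is out of range"],
    ];
    const validateUserDsl = (u: User): string[] =>
      rules.filter(([, ok]) => !ok(u)).map(([, , msg]) => msg);

The second version is shorter, but the first is the one you want to maintain: add a rule with a different shape and the DSL is where you end up fighting your own abstraction.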
You learn by doing, e.g. typing the code. It's not just knowledge; it's the intuition you develop when you write code yourself. Just like physical exercise, or playing an instrument: it's not enough to know the theory, practice is key.
AI makes it very easy to avoid typing and hence makes learning this skill less attractive.
But I don't necessarily see it as doom and gloom. What I think will happen is that juniors will develop advanced intuition about using AI to get the functionality they need (if not quality code), while at the same time the models get increasingly better and write higher-quality code.
This is not necessarily true in practical terms when it comes to hiring or promoting. Often a dev becomes a senior because of an advanced skillset rather than years on the job. Similarly, developers who have been on the job for many years often aren't ready for senior because they lack soft and hard skills.
Maybe you could enlighten the rest of us then. According to your favorite definition, what does senior mean, what does seniority mean, and what's a term for someone who knows what they're doing?
Time is required to be a senior engineer, but time does not _make_ you a senior engineer.
You need time to accumulate experience. You need experience, time in the proverbial trenches, to be a senior engineer.
You need to be doing different things too, not just implementing the same cookie-cutter code repeatedly. If you are doing that, and haven't automated it, you are not a senior engineer.
There's what "senior"-level developers say about themselves, and there's what's actually generally true about them. The two notions are, of course, not the same.
> The argument is that this stuff will so radically improve senior engineer productivity that the demand for junior engineers will crater.
What makes people think that an increase in senior engineer productivity causes demand for junior engineers to decrease?
I think it will have the opposite effect: an increase in senior engineer productivity enables the company to add more features to its products, making it more valuable to its customers, who can therefore afford to pay more for the software. With this increase in revenue, the company is able to hire more junior engineers.
> It just blurs the line between engineer and tool.
I realise you meant it as “the engineer and their tool blend together”, but I read it like a funny insult: “that guy likes to think of himself as an engineer, but he’s a complete tool”.
I think it makes sense that GP is skeptical of this article considering it contains things like:
> this tool is improving itself, learning from every interaction
which seem to indicate a fundamental misunderstanding of how modern LLMs work: the 'improving' happens by humans training/refining existing models offline to create new models, and the 'learning' is just filling the context window with more stuff, not an enhancement of the actual model. It will forget everything if you drop the context, and as the context grows it can 'forget' things it previously 'learned'.
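To make that concrete, here's a rough sketch (hypothetical names, not any particular vendor's API) of where the 'learning' actually lives:

    type Message = { role: "user" | "assistant"; content: string };

    // Stand-in for a real model API call; the weights behind it never change between calls.
    async function callModel(messages: Message[]): Promise<string> {
      return "..."; // placeholder response
    }

    const history: Message[] = [];

    async function chat(userInput: string): Promise<string> {
      history.push({ role: "user", content: userInput });
      // Everything the model "knows" about this session is whatever fits in `history`.
      // Clear it and the model has "forgotten" everything; let it grow past the context
      // window and earlier messages get truncated away.
      const reply = await callModel(history);
      history.push({ role: "assistant", content: reply });
      return reply;
    }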
When you consider the "tool" as more than just the LLM itself, i.e. including the stuff wrapped around calling that model, then I feel like you can make a good argument that it is improving when it keeps context in a file on disk and constantly updates and edits that file as you work through the project.
I do this routinely for large initiatives I'm kicking off through Claude Code: it writes a long, detailed plan into a file, and as we work through the project I have it constantly updating and rewriting that document to add information we have jointly discovered from each bit of the work. That means every time I come back and fire it back up, it has more information than when it started, which looks a lot more like improvement from my perspective.
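The loop is roughly this (a minimal sketch with a made-up file name and helpers; Claude Code manages its plan file through prompting, not through an API like this):

    import { readFileSync, writeFileSync, existsSync } from "node:fs";

    const PLAN_PATH = "PLAN.md"; // hypothetical name for the living plan document

    // Session start: whatever was learned last time is fed back into the agent's context.
    export function loadPlan(): string {
      return existsSync(PLAN_PATH) ? readFileSync(PLAN_PATH, "utf8") : "# Plan\n";
    }

    // After each chunk of work: the agent rewrites the plan with what we jointly discovered,
    // so the next session starts with more information than this one did.
    export function savePlan(updatedPlan: string): void {
      writeFileSync(PLAN_PATH, updatedPlan);
    }

The model itself is the same every session; the accumulated plan file is what makes the overall tool feel like it's improving.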
You're letting Claude do your programming for you, and then sweeping up whatever it does afterwards. Bluntly, you're off-loading your cognition to the machine. If that's fine by you then that's fine enough, it just means that the quality of your work becomes a function of your tooling rather than your capabilities.
I don't agree. The AI largely does the boring and obvious parts. I'm still deciding what gets built and how it is designed, which is the interesting part.
> I'm still deciding what gets built and how it is designed, which is the interesting part.
How, exactly? Do you think that you're "deciding what gets built and how it's designed" by iterating on the prompts that you feed to the LLM that generates the code?
Or are you saying that you're somehow able to write the "interesting" code, and can instruct the LLM to generate the "boring and obvious" code that needs to be filled-in to make your interesting code work? (This is certainly not what's indicated by your commit history, but, who knows?)
My prompts specify very precisely what should be implemented. I specified the public API and high-level design upfront. I let the AI come up with its own storage schema initially but then I prompted it very specifically through several improvements (e.g. "denormalize this table into this other table to eliminate a lookup"). I designed the end-to-end encryption scheme and told it in detail how to implement it. I pointed out bugs and explained how to fix them. And so on.
All the thinking happened in those prompts. With the details I provided, combined with the OAuth spec, there was really very little room left for any creativity in the code. It was basically connect-the-dots at that point.
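To give a rough idea of the kind of change that denormalization prompt describes (hypothetical record shapes, not the project's actual schema):

    // Before: resolving a token requires a second lookup to fetch the grant.
    interface TokenRecord { tokenHash: string; grantId: string; expiresAt: number }
    interface GrantRecord { grantId: string; userId: string; scopes: string[] }

    // After: the grant fields needed on the token-validation hot path are copied
    // ("denormalized") into the token record, so a single lookup is enough, at the
    // cost of duplicating that data whenever a grant is written or updated.
    interface DenormalizedTokenRecord {
      tokenHash: string;
      grantId: string;
      userId: string;
      scopes: string[];
      expiresAt: number;
    }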
Right, so -- 'you think that you're "deciding what gets built and how it's designed" by iterating on the prompts that you feed to the LLM that generates the code'
> My prompts specify very precisely what should be implemented.
And the precision of your prompt's specifications has no reliable impact on exactly what code the LLM returns as output.
> With the details I provided, combined with the OAuth spec, there was really very little room left for any creativity in the code. It was basically connect-the-dots at that point.
I truly don't know how you can come to this conclusion, if you have any amount of observed experience with any of the current-gen LLM tools. No amount of prompt engineering gets you a reliable mapping from input query to output code.
> I designed the end-to-end encryption scheme and told it in detail how to implement it. I pointed out bugs and explained how to fix them. And so on.
I guess my response here is that if you think this approach to prompt engineering gets you generated code that is in any sense equivalent, or even comparable, in quality to the work you could produce yourself as a professional, senior-level software engineer, then, man, we're on different planets. Pointing out bugs and explaining how to fix them in your prompts in no way gets you deterministic, reliable, accurate, high-quality code as output. And actually, forget about high quality; I mean even just bare-minimum, table-stakes, requirements-satisfying stuff!
> My prompts specify very precisely what should be implemented. I specified the public API and high-level design upfront. I let the AI come up with its own storage schema initially but then I prompted it very specifically through several improvements (e.g. "denormalize this table into this other table to eliminate a lookup"). I designed the end-to-end encryption scheme and told it in detail how to implement it. I pointed out bugs and explained how to fix them. And so on.
OK. Replace "[expected] deterministic output" with whatever term best fits what this block of text is describing, as that's what I'm talking about. The claim is that a sufficiently-precisely-specified prompt can produce reliably-correct code. Which is just clearly not the case, as of today.
I don't even think anybody expects reliably-correct code. They expect code that can be made as reliably as they themselves could make code, with some minimal amount of effort. Which clearly is the case.
Forget about reliably-correct. The code that any current-gen LLM generates, no matter how precise the prompt it's given, is never even close to the quality standards expected of any senior-level engineer, in any organization I've been a part of, at any point in my career. They very much never produce code that is as good as what I can create. If the LLM-generated code you're seeing passes this level of muster, in your view, then that's really a reflection on your situation(s), and 100% not any kind of truth that you can claim as part of a blog post or whatever...
> The code that any current-gen LLM generates, no matter how precise the prompt it's given, is never even close to the quality standards expected of any senior-level engineer, in any organization I've been a part of, at any point in my career.
You are just making assertions here with no evidence.
If you prompt the LLM for code, and then you review the code, identify specific problems, and direct the LLM to fix those problems, and repeat, you can, in fact, end up with production-ready code -- in less time than it would take to write by hand.
Proof: My project. I did this. It worked. It's in production.
It seems like you believe this code is not production-ready because it was produced using an LLM which, you believe, cannot produce production-ready code. This is a circular argument.
> If you prompt the LLM for code, and then you review the code, identify specific problems, and direct the LLM to fix those problems, and repeat, you can, in fact, end up with production-ready code
I guess I will concede that this is possible, yes. I've never seen it happen, myself, but it could be the case, at some point, in the future.
> in less time than it would take to write by hand.
This is my point of contention. The process you've described takes ages longer than however much time it would take a competent senior-level engineer to just type the code from first principles. No meaningful project has ever been bottle-necked on how long it takes to type characters into editors.
All of that aside, the claim you're making here is that, speaking as a senior IC, the code that an LLM produces, guided by your prompt inputs, is more or less equivalent to any code that you could produce yourself, even controlling for time spent. Which just doesn't match any of my experiences with any current-gen LLM or agent or workflow or whatever. If your universe is all about glue code, where typing is enemy no. 1, and details don't matter, then fair enough, but please understand that this is not usually the domain of senior-level engineers.
"the code that an LLM produces, guided by your prompt inputs, is more or less equivalent to any code that you could produce yourself, even controlling for time spent"
That's been my personal experience over the past 1.5 years. LLMs, prompted and guided by me, write code that I would be proud to produce without them.
I have only claimed that for this particular project it worked really well, and was much faster than writing by hand. This particular project was arguably a best-case scenario: a greenfield project implementing a well-known standard against a well-specified design.
I have tried using AI to make changes to the Cloudflare Workers Runtime -- my usual main project, which I started, and know like the back of my hand, and which incidentally handles over a trillion web requests every day -- and in general in that case I haven't found it saved me much time. (Though I've been a bit surprised that it can find its way around the code at all; it's a pretty complicated C++ codebase.)
It's possible kiitos has (or had?) a higher standard in mind for what should constitute a senior/"lead engineer" at Cloudflare and how much they should be constrained by typing as part of implementation.
Out of interest: How much did the entire process take and how much would you estimate it to take without the LLM in the loop?
> It's possible kiitos has (or had?) a higher standard in mind for what should constitute a senior/"lead engineer" at Cloudflare and how much they should be constrained by typing as part of implementation.
See again here, you're implying that I or my code is disappointing somehow, but with no explanation for how except that it was LLM-assisted. I assert that the code is basically as good as if I'd written it by hand, and if you think I'm just not a competent engineer, like, feel free to Google me.
It's not the typing itself that constrains, it's the detailed but non-essential decision-making. Every line of code requires making several decisions, like naming variables, deciding basic structure, etc. Many of these fine-grained decisions are obvious or don't matter, but it's still mentally taxing, which is why nobody can write code as fast as they can type even when the code is straightforward. LLMs can basically fill in a bunch of those details for you, and reviewing the decisions -- especially the fine-grained ones that don't matter -- is a lot faster than making them.
> How much did the entire process take and how much would you estimate it to take without the LLM in the loop?
I spent about five days mostly focused on prompting the LLM (although I always have many things interrupting me throughout the day, so I wasn't 100% focused). I estimate it would have taken me 2x-5x as long to do by hand, but it's of course hard to say for sure.
> See again here, you're implying that I or my code is disappointing somehow, but with no explanation for how except that it was LLM-assisted. I assert that the code is basically as good as if I'd written it by hand, and if you think I'm just not a competent engineer, like, feel free to Google me.
I think you're reading a bit too deeply into what I wrote; I explained what I interpreted kiitos' posts as essentially saying. I realize that you've probably had to deal with a lot of people being skeptical to the point of "throwing shade", as it were, so I understand the defensive posture. I am skeptical, but the reason I'm asking questions (alongside the previous bit) is because I'm actually curious about your experiment.
> It's not the typing itself that constrains, it's the detailed but non-essential decision-making. Every line of code requires making several decisions, like naming variables, deciding basic structure, etc. Many of these fine-grained decisions are obvious or don't matter, but it's still mentally taxing, which is why nobody can write code as fast as they can type even when the code is straightforward. LLMs can basically fill in a bunch of those details for you, and reviewing the decisions -- especially the fine-grained ones that don't matter -- is a lot faster than making them.
In your estimation, what is your mental code coverage of the code you ended up with? Do you feel like you have a complete mapping of it, i.e. you could get an external request for change and map it quickly to where it needs to be made and why exactly there?
> In your estimation, what is your mental code coverage of the code you ended up with? Do you feel like you have a complete mapping of it, i.e. you could get an external request for change and map it quickly to where it needs to be made and why exactly there?
I know the code structure about as well as if I had written it.
Honestly the code structure is not very complicated. It flows pretty naturally from the interface spec in the readme, and I'd expect anyone who knows OAuth could find their way around pretty easily.
But yes, as part of prompting improvements to the code, I had to fully understand the implementation. My prompts are entirely based on reading the code and deciding what needed to be changed -- not based on any sort of black-box testing of the code (which would be "vibe coding").
Personally, I spend _more_ time thinking with Claude. I can focus on the design decisions while it does the mechanical work of turning that into code.
Sometimes I give the agent a vague design ("make XYZ configurable") and it implements it the wrong way, so I'll tell it to do it again with more precise instructions ("use a config file instead of a CLI argument"). The best thing is you can tell it after it wrote 500 lines of code and updated all the tests, and its feelings won't be hurt one bit :)
It can be useful as a research tool too, for instance I was porting a library to a new language, and I told the agent to 1) find all the core types and 2) for each type, run a subtask to compare the implementation in each language and write a markdown file that summarizes the differences with some code samples. 20 min later I had a neat collection of reports that I could refer to while designing the API in the new language.
I was recently considering a refactoring in a work codebase. I had an interesting discussion with Claude about the tradeoffs, then had it show me what the code would look like after the refactor, both in a very simple case, and in one of the most complex cases. All of this informed what path I ended up taking, but especially the real-world examples meant this was a much better informed decision than just "hmm, yeah seems like it would be a lot of work but also probably worth it."
I mean yeah, the very first prompt given to the AI was put together by an experienced developer: a bunch of code telling the AI exactly what the API should look like and how it would be used. The very first step in the process already required an experienced developer to be involved.
The reason he keeps adjusting the prompts is because he knows how to program. He knows what it should look like.
It just blurs the line between engineer and tool.