Here is the prompt I gave both the Claude Code CLI and the VSCode agent for my TS project:
```
I have modified the type signature and behaviour of how jobs are created. Previously, job definition create took a batch argument (created from a queue). Now it takes the queue directly, is async, requires the databaseClient to be passed in at creation (vs. when the batch is executed). It no longer returns anything - which is fine because the result was only being used for logging - which is now done for us so we don't have to worry. Can we refactor the codebase to make use of the new JobDefinition.create? Remove the vestigial "Job created" log please.
Perform this task and this task only. If you see something unrelated that you believe needs to be refactored - DO NOT MODIFY IT. ONLY PERFORM ACTIONS DIRECTLY RELEVANT TO THIS TASK
```
So there are two instructions:
1. Do the task
2. Don't do stuff that isn't the task (added in frustration on subsequent attempts)
My experience:
The agent flow started well - it found all the files that needed to change and began making edits.
By about file #5 I noticed that, on top of the requested refactor, it had started re-ordering the object keys passed to `JobDefinition.create`. Although semantically a no-op, this was incredibly frustrating as it made the diffs much harder to review.
A little later, it started to modify log messages it wasn't happy with, before eventually going completely off the rails and adding arguments to my function definitions that it _thought_ they needed (introducing type and run-time errors).
VSCode would periodically pause and ask for a confirmation in order to continue. Each time I used the opportunity to re-prompt the agent to stay on target:
Me: "STOP GOING OFF TASK - STOP RENAMING VARIABLES, REORDERING PARAMS. JUST DO AS THE TASK TELLS YOU AND NOTHING ELSE"
Agent: "You're absolutely right. I apologize for going off task. Let me focus solely on the task: refactoring JobDefinition.create calls to use the new signature and removing vestigial "Job created" logs"
And each time the bad behavior would return after some time.
I'm not sure what I'm doing wrong. I assumed this sort of mechanical monkey work would be bread and butter for an agentic workflow - but it just keeps losing coherence.
I ended up reverting all the changes as I had absolutely 0 trust in the quality of the generated code.
I apologise for the wall of text but I'm quite frustrated about all the time wasted and am desperate to know what I'm doing wrong!
Thanks in advance!
The way to get around this is to never have the model just “do the thing”. Have it create a plan and then a todo list from that plan (Claude Code will typically do this on its own). Then you “approve” the plan, and it starts working against that todo list and plan.
This ensures that the “task” is never very large (it is always just the next thing on the todo list, which has already been scoped to be small) and there is never any ambiguity over what to do next.
So for your prompt I would ask it to find all locations that use the old job API and put them in a planning document. For each location, have it note in the planning document whether it anticipates any difficulty transitioning to the new API. If you want to get fancy, have it use the Task tool so that a subagent does the analysis; this keeps the context of the main model less cluttered. I usually use planning mode for this in Claude Code. Then look at the plan, approve it (or tweak it) and have it execute that plan.
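To give a rough idea of what that planning document could contain, here is a minimal sketch in TypeScript. It is not something Claude Code or VSCode produces for you: `findCallSites`, `renderPlan`, the file paths and the sample source strings are all made up for illustration. It works on in-memory strings rather than your real project. The point is only that each call site becomes one small, unchecked todo item.

```typescript
// Hypothetical helper for illustration only; not part of Claude Code or VSCode.
interface CallSite {
  file: string;
  line: number; // 1-based line number within the file
  text: string; // the trimmed source line that matched
}

// Scan in-memory sources for a literal call pattern.
// `sources` maps a file path to that file's contents.
function findCallSites(sources: Record<string, string>, needle: string): CallSite[] {
  const sites: CallSite[] = [];
  for (const file of Object.keys(sources).sort()) {
    sources[file].split("\n").forEach((text, index) => {
      if (text.includes(needle)) {
        sites.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return sites;
}

// Render one unchecked todo item per call site, with a slot for a difficulty note.
function renderPlan(sites: CallSite[]): string {
  const header = `# Migration plan: ${sites.length} call site(s)`;
  const items = sites.map(
    (s) => `- [ ] ${s.file}:${s.line} ${s.text} (difficulty: TBD)`
  );
  return [header, ...items].join("\n");
}

// Made-up sample input standing in for a real codebase.
const exampleSources: Record<string, string> = {
  "src/jobs/email.ts":
    "const batch = makeBatch(queue);\nJobDefinition.create(batch, { name: 'email' });",
  "src/jobs/report.ts": "// no jobs here",
};

const plan = renderPlan(findCallSites(exampleSources, "JobDefinition.create("));
console.log(plan);
```

Each `- [ ]` line is then a task small enough that "do only this" is unambiguous, and you can review the list before any file gets edited.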