> AI needs appropriate and sufficient guidance to be able to write code that does the job.
Note that the example (shitty Microsoft) implementation was not able to properly run tests during its work, not even tests it had written itself.
If you have an existing codebase that already has plenty of tests and you ask AI to refactor something whilst giving it the access it needs to run them, it can sometimes do a great job all by itself.
Good specification and documentation also do a lot, of course, but the iterative approach, with feedback on whether things are actually working as intended, is a game changer. Unsurprisingly, it is also a lot closer to how humans do things.
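A minimal sketch of what such a feedback loop might look like, assuming pytest as the test runner; `ask_model()` is a hypothetical wrapper around whichever LLM API you use, stubbed out here because the point is the loop, not the API:

```python
import subprocess

def ask_model(task: str, feedback: str) -> None:
    """Hypothetical: send the task plus the latest test output to
    your LLM of choice and apply its proposed edits to the repo."""
    raise NotImplementedError

def run_tests() -> tuple[bool, str]:
    """Run the project's test suite and capture its output."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def iterate(task: str, max_rounds: int = 5) -> bool:
    """Ask the model for a change, run the tests, and feed any
    failures back in, until the suite passes or we give up."""
    feedback = ""
    for _ in range(max_rounds):
        ask_model(task, feedback)
        passed, output = run_tests()
        if passed:
            return True
        feedback = output  # the model reacts to real failures, not guesses
    return False
```

The key design point is that the model's next attempt is conditioned on actual test output rather than on its own assumptions about whether the code works.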
The iterative approach has one problem -- it is onerous to repeat the lengthy process with a different model, as doing so leads to an entirely different conversation. In contrast, when the spec is well-written up-front, it is trivial to switch models and see how another model implements it differently.