Hacker News

I think most of this is good stuff, but I disagree with not letting Claude touch tests or migrations at all. Hand-writing tests from scratch is the part I hate most. Having an LLM do a first pass on tests, which I then add to and adjust as I see fit, has been a big boon on the testing front. The difference between me and the author seems to be that I believe the human still takes ownership and responsibility for code whether or not an LLM generated it. Not letting Claude touch tests and migrations says you (rightfully) don't trust Claude, yet hands ownership of Claude-generated code over to Claude. Either that, or he doesn't trust his employees not to blindly accept AI slop, and the strict rules around tests and migrations are there to keep that slop from breaking everything or causing data loss.



True, but in my experience a few major pitfalls came up:

1. We ran into real minefields when we came back later to manually edit the generated tests. Claude tended to mock everything because it didn't have context about how we run services, build environments, etc.

2. And this was the worst part: all of the devs on the team, including me, got really lazy with testing. Bugs in production increased significantly.
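The over-mocking in point 1 tends to look something like this. A toy sketch (the `apply_discount` function and both tests are invented for illustration, not from the actual codebase): the LLM-written test patches out the very logic under test, so it passes no matter what the implementation does, while the second test actually exercises it.

```python
import unittest
from unittest import mock


def apply_discount(price: float, rate: float) -> float:
    """Toy domain function standing in for real service code."""
    if not 0 <= rate <= 1:
        raise ValueError("rate must be between 0 and 1")
    return round(price * (1 - rate), 2)


class OverMockedTest(unittest.TestCase):
    # The pattern an LLM without project context often produces:
    # the real function is patched out, so the test only verifies
    # the mock's canned return value, never the implementation.
    def test_discount(self):
        with mock.patch(__name__ + ".apply_discount", return_value=90.0):
            self.assertEqual(apply_discount(100.0, 0.1), 90.0)  # passes even if apply_discount is broken


class RealTest(unittest.TestCase):
    # Exercises the actual implementation, including the edge case.
    def test_discount(self):
        self.assertEqual(apply_discount(100.0, 0.1), 90.0)
        with self.assertRaises(ValueError):
            apply_discount(100.0, 1.5)
```

Coming back later to "fix" a suite full of the first kind is where the minefields show up: the tests are green, but they pin down the mocks, not the behavior.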


Did you try putting all of this (complex, external) context into the context file (claude.md or whatever), with instructions on how to do proper TDD, before asking for the tests? I know that may be more work than just coding it yourself, since you know it all by heart and the external world is always bigger than the internal one. But in the long term, and for teams/codebases without good TDD practices, that might yield useful test iterations. Of course the developer committing the code is responsible for it regardless, so what I would ban is putting "AI did it" in commits - it can mentally work as a "get out of jail" card for some.
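For concreteness, a context file along those lines might look like this (a hypothetical sketch; the paths and commands are illustrative, not from the thread):

```markdown
# Testing conventions

- Test runner: `ward`, NOT pytest. Never add pytest imports or rewrite
  existing tests to pytest style.
- Run the suite with `ward` from the repo root.
- Do not mock internal services; use the shared fixtures instead
  (hypothetical location: `tests/fixtures/`).
- TDD flow: write a failing test first, show it failing, then implement.
```

The point is to front-load the context the model otherwise lacks (how services run, what the build environment looks like) so it stops defaulting to mocks and to the most popular runner.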


we tried a few different variations but tbh had universally bad results. for example, we use the `ward` test runner in our python codebase, and claude sonnet (both 3.7 and 4) kept trying to force-switch it to pytest lol. every. single. time.

maybe we could either try this with opus 4 and hope that cheaper models catch up, or just drink the kool-aid and switch to pytest...
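A crude stopgap against that runner-swapping, independent of which model you use, is a pre-merge check that flags any diff reintroducing pytest. A hypothetical sketch (the function and the banned-runner list are invented for illustration):

```python
import re

# Hypothetical pre-merge check: flag added lines that import a test
# runner other than the one the repo standardizes on (ward, here).
BANNED_RUNNERS = {"pytest", "nose"}
IMPORT_RE = re.compile(r"^\+\s*(?:import|from)\s+(\w+)")


def runner_violations(diff_text: str) -> list[tuple[int, str]]:
    """Return (line_number, module) pairs for banned runner imports
    found among the added ('+') lines of a unified diff."""
    hits = []
    for lineno, line in enumerate(diff_text.splitlines(), 1):
        m = IMPORT_RE.match(line)
        if m and m.group(1) in BANNED_RUNNERS:
            hits.append((lineno, m.group(1)))
    return hits


diff = """\
+from ward import test
+import pytest
 unchanged line
"""
print(runner_violations(diff))  # flags only the pytest import
```

Wiring something like this into CI turns the model's habit into a loud failure instead of a silent conversion you have to catch in review.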


I literally LOLed at #2, haha! LLMs are making devs lazy at scale :)

Devs almost universally hate 3 things:

1. writing tests;

2. writing docs;

3. manually updating dependencies;

and LLMs are a big boon in helping us avoid all 3, but forcing your team to keep writing tests themselves is a sensible trade-off in this context, since, as you say, bugs in prod increased significantly.


yeah, this might change in the future, but I've also found that since building features has become faster, asking devs to write the tests themselves sort of demands that they take responsibility for the code and the potential bugs



