Most of my AI coding experience is through GitHub Copilot (GHCP), mostly because that's what is available to me professionally. GHCP has improved greatly over the past half year, in my opinion. I use it a lot, burning through my enterprise allowance almost every month working on complex Python codebases.
When it comes to models in GHCP, I vastly prefer Claude over Codex. It's not that Codex is bad; it just feels tone-deaf to me. It writes code in its own preferred style and doesn't adapt to the context of the codebase. Additionally, for me, Sonnet and Opus are much less prone to getting stuck in loops on longer or more complex agentic tasks.
I do like Codex for review tasks. When I'm working on something complex, both planning and implementation, I frequently ask Codex to review Claude's work, and it does a good job of that, often catching a mistake or offering a different angle.
I've toyed with Kilo Code, Cline, and related forks using the Claude Opus 4.5 API, but I'd argue my experience with Claude Sonnet/Opus through Copilot has just been... better. More consistent. Faster.
Sometimes I code with local models, when I'm working on highly confidential projects or data. I prefer GPT-OSS 20B or Qwen3-Coder-30B for that, but without an agentic harness, since the prompts get big and slow.
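To give an idea of what I mean by "without an agentic harness": a minimal sketch of a one-shot prompt against an OpenAI-compatible local server (e.g. a llama.cpp server or Ollama). The endpoint URL, port, and model name here are assumptions for illustration, not a specific recommendation.

```python
# One-shot request to a locally hosted model via an OpenAI-compatible
# /v1/chat/completions endpoint (llama.cpp server, Ollama, etc.).
# No tool calls, no multi-turn loop, just prompt in, code out.
import requests

def ask_local_model(prompt: str, model: str = "qwen3-coder-30b") -> str:
    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",  # assumed local server
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        },
        timeout=300,  # local inference on big prompts can take a while
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_local_model("Refactor this function to use pathlib: ..."))
```

Keeping it to a single request like this sidesteps the long tool-use transcripts an agentic harness builds up, which is where local models really start to crawl.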
I would find it a nice read to work through a case and watch two models/harnesses duke it out, and see whether the result matches your expectations and gut feeling.