Sure, that can happen, but it hasn’t been my experience. I just spent a whole day using it for some pretty hefty refactors: many rounds of back-and-forth, thousands of lines of code changes, reviews, investigations, many subagents running parallel tasks, the works. Total cost: $0.95, all in.
I had attempted this with Opus 4.6 in the past and it burned through the $10 budget I’d given it before it returned from my initial prompt.
Even if that price is heavily discounted, it would still have cost me single digits for a complete solution vs double digits for exactly nothing.
I didn't mean to say that they're not cheaper to run; Artificial Analysis also shows that they are. My main point was that it's important to look at token efficiency as well, not only cost per token, to get the full picture.
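To put rough numbers on that: what you pay for a task is tokens used times price per token, so a model that is cheaper per token can still cost more overall if it burns through more tokens. A minimal sketch, where both the prices and the token counts are made up purely for illustration, and a single blended price stands in for separate input and output rates:

```python
def task_cost(tokens_used: int, price_per_million: float) -> float:
    """Dollar cost of one task: tokens consumed times the per-token price."""
    return tokens_used / 1_000_000 * price_per_million


# Hypothetical figures, not measurements of any real model.
cheap_but_chatty = task_cost(tokens_used=4_000_000, price_per_million=1.0)
pricey_but_terse = task_cost(tokens_used=500_000, price_per_million=5.0)

# The model with the lower per-token price ends up costing more for the task.
print(f"cheap per token:  ${cheap_but_chatty:.2f}")
print(f"pricey per token: ${pricey_but_terse:.2f}")
```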
I agree! I don't find Claude models to be particularly efficient anyway, though. Maybe when running through Claude Code? I don't know. I tried it a while back, but it didn't suit me and I kept hitting bugs, so I dropped it in favour of something that does something closer to what I want rather than what the provider wants!
Mostly OpenCode but I've been experimenting with Pi a bit lately.
I use Agent Hive [0] for more complex tasks. It sends off subagents with models and parameters I can configure separately for each agent (e.g. a low-temperature coder, a higher-temperature one with some top_k / top_p for research and architecture, etc).
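Roughly what I mean by per-agent parameters, as a plain Python sketch. This is not Agent Hive's actual config format or API; the class, role names, model names, and values are all hypothetical and only show the idea of one sampling profile per role:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentSampling:
    """Hypothetical per-agent sampling profile (not a real tool's schema)."""
    model: str
    temperature: float
    top_p: Optional[float] = None
    top_k: Optional[int] = None


# Placeholder model names and values, for illustration only.
AGENTS = {
    "coder": AgentSampling(model="example-coder-model", temperature=0.1),
    "researcher": AgentSampling(
        model="example-research-model", temperature=0.8, top_p=0.95, top_k=40
    ),
}


def request_params(role: str) -> dict:
    """Build the sampling parameters for a role, omitting any that are unset."""
    cfg = AGENTS[role]
    params = {"model": cfg.model, "temperature": cfg.temperature}
    if cfg.top_p is not None:
        params["top_p"] = cfg.top_p
    if cfg.top_k is not None:
        params["top_k"] = cfg.top_k
    return params
```

The coder stays near-deterministic, while the research/architecture agent gets more room to explore.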