I think it's much better than Opus. I would describe its code output as boring and to the point, with no fluff. o3-pro is better at abstraction, but Grok Heavy is better at bug hunting and at doing exactly what is needed and nothing more. I swapped my OpenAI Pro subscription for Grok, and it's good. Another big advantage is the context window size. Honestly, I use these models all day long, and I feel that while Sonnet 3.5 was groundbreaking, Anthropic is now behind Google, OpenAI, and xAI.
For tools, I use Repo Prompt plus the Grok website. Personally, I think Claude Code is overrated, and hand-building the context by selecting the files is far better for complicated tasks.
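For anyone who wants to try the hand-built context approach without a dedicated tool, here is a minimal sketch in Python. It is not how Repo Prompt works internally; the function name `build_context`, the `--- path ---` header format, and the example file paths are all hypothetical, chosen only for illustration.

```python
from pathlib import Path


def build_context(root, selected_files, task):
    """Concatenate hand-picked files into one prompt string.

    The header format is an arbitrary choice for this sketch,
    not a convention any particular model or tool requires.
    """
    root = Path(root)
    parts = [f"Task: {task}", ""]
    for rel in selected_files:
        # Read each selected file relative to the repo root.
        text = (root / rel).read_text(encoding="utf-8")
        parts.append(f"--- {rel} ---")
        parts.append(text.rstrip("\n"))
        parts.append("")
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical paths; replace with files from your own repo.
    print(build_context(".", ["src/auth.py", "src/session.py"],
                        "Audit these files for security bugs."))
```

The point of doing it by hand is that you decide which files the model sees, in which order, and you paste the result into whichever chat window you prefer.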
How do you find o3-pro for coding? I've also been taking the approach of hand-building context and pasting it in for complicated tasks where I want lots of reasoning, like bug or security audits.
I found o1-pro unbelievably good for coding, but when o3-pro was released, I saw that the response length in ChatGPT was severely gimped compared to o1-pro, so I didn't find it all that useful: it couldn't output long enough responses. I actually cancelled my ChatGPT subscription because it seemed like such a downgrade, though I'll probably try it again via OpenAI's API at some point, so long as the response length isn't capped. I'm tempted to try out Grok 4 Heavy.
o3-pro is really good, but the context is really constrained, so it's hard to use and doesn't output enough. This makes it suitable for ideating on good abstractions, but it can't really make broad, sweeping changes. Grok will output a full file. If you use the o3-pro API, it's actually great, but it gets really expensive.
It happens all the time; I have seen it in NYC. Usually it's an early-stage thing: a cofounder leaves after one year, etc. It's much harder to do with a complicated cap table. Investors I could name even suggest it.
All that "OCaml, we only hire the smartest" talk is often a veil for what is really a simpler operation that is borderline illegal. Probably a lot of employees don't even really understand the systems they work on.
I know someone who made 10 million a year for a long time on Wall Street. They said that, generally, you can assume anyone making a large amount of money is a criminal. Any large deviation from the typical returns you would see in an asset class was suspicious.
I've been coding 5+ hours a day almost every day for 15 years. I think AI will replace 70% of SWE in the near future. Not employment, but 70% of the current work done by engineers.
I don’t even spend 70% of my time coding. I suspect that’s common and looking at data it’s more like 25% on average. So even if it replaces 100% of coding (unlikely) that’s the extent of the gain.
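That ceiling is just Amdahl's law applied to a work week. A quick sketch, taking the 25% coding share above as given (it is the commenter's figure, not something measured here) and using a hypothetical helper name `overall_speedup`:

```python
def overall_speedup(coding_fraction, coding_speedup):
    """Amdahl's law: total speedup when only the coding share gets faster."""
    if not 0.0 <= coding_fraction <= 1.0:
        raise ValueError("coding_fraction must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time left = untouched work + coding work shrunk by the speedup factor.
    remaining = (1.0 - coding_fraction) + coding_fraction / coding_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Coding becomes free (infinite speedup): at most 1 / 0.75 overall.
    print(round(overall_speedup(0.25, float("inf")), 2))  # 1.33
    # Coding becomes twice as fast: 1 / 0.875 overall.
    print(round(overall_speedup(0.25, 2.0), 2))  # 1.14
```

So with a 25% coding share, even eliminating coding entirely frees up a quarter of the time, which is about a 1.33x overall speedup.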
Agreed, it seems like a great day if I get close to spending 50% of my time coding. The rest is various meetings, communication, and code review.
And even with reviews, you can currently plausibly automate only the code-correctness check. The juicy part of a review is always manually testing the change and reasoning about whether it does something meaningful. And no, the ticket with the spec is not a reliable source of this info for an LLM, as it always captures just a partial understanding of the concept.
Some of my biggest productivity gains with LLMs come from areas that aren’t coding. Research, summarization, communication, and operational issues have all seen pretty dramatic improvements for me when adding LLMs.
I don’t think AI will replace the career of software development, but I do think the tools we will be using to do it will be dramatically different.
Agreed. I see AI as a major tool upgrade in the same way the IDE was an upgrade from text editors. It will quickly replace the need to do trivial things and greatly reduce the time needed to do complex things.
And I agree. Because ultimately we don’t need that much code in the first place. We need robust data sets.
AI models will enable the data-driven machine-state dream. Chips that self-improve models will bootstrap from them and rely on humans to iteratively improve updates.
Coding like it’s 1970 in the 2020s and beyond is not that high tech.
At which point you're potentially looking at the Jevons paradox.
Software developers do X and Y. The AI thing can now do X, so it's used for that, and it's cheaper, so the number of projects increases because you get more demand at a lower price. Those projects each need someone to do Y.
I’ll be surprised if it does. Software jobs are slumping for several reasons, and the Section 174 hack fixes one for a while but causes between one and four other problems, depending on where you live.