
Do you really?

Frontier models seem remarkably similar in performance.

Yeah, some nuances for sure, but the whole article could apply to every model.



4o on ChatGPT.com vs. Opus in an IDE is like cooking food without kitchen tools vs. using them. 4o is neither a coding-optimized model nor a reasoning model in general.


You're not pushing them hard enough if you're not seeing a vast difference between 4o and Opus. Or possibly they're equivalent in the field you're working in, but I suspect it's the former.


Opus, in my opinion, is steps away from AGI. 4o doesn't come close.



