
Bold to launch this before they roll out their own reasoning models / deep research. Seems like table stakes if you want to capture power users, but maybe that's just my workflow.


Anthropic have recently said that they don't like the "reasoning" label and don't intend to have separate base and reasoning models, but rather one model that does it all. The current Claude 3.7 already seems to be doing some reasoning, displaying "pondering" and "analyzing" messages between different stages of output generation.


Regardless of naming, they don't have a model competitive with o1 Pro for coding (or with Gemini 2.5, for that matter). As someone who pays $200/month for GPT Pro I definitely wouldn't pay the same for Claude (and am considering cancelling GPT Pro if they don't deliver something better than Gemini 2.5 soon).


Gemini 2.5 is awesome, but when it screws the pooch, it really screws the pooch (e.g., https://gemini.google.com/share/374ac006497d ). It has delivered some of the best responses I've seen lately but also some of the worst.

I have been tempted several times to kill my GPT Pro account, but it is still valuable for cases where Gemini and Claude don't get the job done for whatever reason.


I tried Gemini 2.5 just once for coding and it immediately spat out hilariously invalid Python, not even clearing the lowest of bars.

What (language/topic) are you coding in that Gemini (or even ChatGPT) is better than Claude? Very surprised to hear this.


For Golang and Haskell I found o1 Pro and Gemini 2.5 are much better at generating code that works first try than Claude. Gemini 2.5 in particular can generate thousands and thousands of lines of code that correctly use types and functions that were generated earlier in the context, with minimal errors.


I've been using Gemini 2.5 Pro over Claude 3.5 in Cursor all week - some people have Gemini do a PRD and tasks and Claude do the actual coding. I switch between them but have been pretty impressed with Gemini.

Gemini sometimes fucks up diffs or doesn't actually apply the edits - Claude is rock solid at that. They're both very good though - but 3.7 really likes refactoring unneeded shit and removing code sometimes.


Their reasoning is called "extended thinking". It was released alongside Claude Sonnet 3.7.

They have web search, but it's true, no "deep research". The web search honestly is not very good, and the model is WAY too trigger-happy with it. As a result, if you accidentally leave it on, you get terrible answers to simple questions that non-search mode would have answered well.

(And for context Claude Sonnet 3.7 is my model of choice)


They already have a reasoning model out. I don't think deep research is on the agenda atm.



