As a test, I asked Gemini 2.5 Flash and Gemini 2.5 Pro to decode a single Base64 string.

Flash answered correctly in at most ~2 seconds. Pro gave a very wrong answer after thinking and elaborating for ~5 minutes.

In the past, Flash also gave a wrong answer for the same string, but it has since improved.

Prompt was the same: "Hey, can you decode $BASE64_string?"
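
For anyone who wants to reproduce the test, here's a rough sketch. It assumes the google-genai Python SDK and the public "gemini-2.5-flash" / "gemini-2.5-pro" model IDs; the secret string and the containment check are illustrative choices of mine, not part of the original test:

  import base64

  from google import genai  # assumes the google-genai Python SDK is installed

  SECRET = "hello world"
  PROMPT = f"Hey, can you decode {base64.b64encode(SECRET.encode()).decode()}?"

  client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

  for model_id in ("gemini-2.5-flash", "gemini-2.5-pro"):
      reply = client.models.generate_content(model=model_id, contents=PROMPT)
      verdict = "decoded it" if SECRET in (reply.text or "").lower() else "missed it"
      print(f"{model_id}: {verdict}")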

I have no further comments.




Well, that's not a very convincing argument. That's just a failure to recognize when a tool (a Base64 decoder) is needed, not a reasoning problem at all, right?


A moderately smart human who understands how Base64 works can decode it by hand without external tools other than pen and paper. Coming up with the exact steps to perform is a reasoning problem.
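
To make those "exact steps" concrete, here's a sketch of the pen-and-paper procedure in Python (the example string is mine, and the standard base64 module would of course do this in one call):

  # Manual Base64 decode, mirroring the by-hand steps:
  # look up each character's 6-bit value, concatenate the bits,
  # then re-slice the bit string into 8-bit bytes.
  ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

  def decode_by_hand(s: str) -> bytes:
      s = s.rstrip("=")                   # padding carries no data
      bits = "".join(format(ALPHABET.index(c), "06b") for c in s)
      usable = len(bits) - len(bits) % 8  # drop the leftover zero bits
      return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

  print(decode_by_hand("SGVsbG8sIHdvcmxkIQ=="))  # b'Hello, world!'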


That's not really a cop-out here: both models had access to the same tools.

Realistically, there are many problems that non-reasoning models do better on, especially when the answer can't be reached by a thought process, like recalling internal knowledge.

You can try to teach the model the concept of a problem where thinking will likely steer it away from the right answer, but at some point it becomes like the halting problem... how does the model reliably think its way to the realization that a given problem is too complex to be thought out?


Translating to Base64 is a good test of how well a model works as a language translator without changing things, because it's the same skill for an AI model.

If the model changes things, it means it didn't really capture the translation patterns for Base64, and then who knows what it will miss when translating between natural languages if it can't even do Base64?
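
One way to make that testable: Base64 has exactly one correct decoding, so a model's answer can be checked for drift against the ground truth. A minimal sketch (the helper name and sample strings are just illustrative):

  import base64

  def matches_ground_truth(encoded: str, model_answer: str) -> bool:
      # There is exactly one right answer; any "small" change is a miss.
      truth = base64.b64decode(encoded).decode("utf-8", errors="replace")
      return model_answer.strip() == truth

  print(matches_ground_truth("aGVsbG8gd29ybGQ=", "hello world"))   # True
  print(matches_ground_truth("aGVsbG8gd29ybGQ=", "hello, world"))  # False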


If the reasoning model were truly reasoning while the Flash model was not, then by definition shouldn't it be better at knowing when to use the tool than the non-reasoning model? Otherwise it's not really "smarter" as claimed, which seems to line up perfectly with the paper's conclusion.


I don't know whether Flash uses a tool or not, but it answers pretty quickly. Pro, however, opts to use its own reasoning rather than a tool. When I look at the reasoning trace, it keeps pulling in knowledge endlessly, refining it and drifting further away.



