
I pasted the image of the chart into ChatGPT-5 and prompted it with

>there seems to be a mistake in this chart ... can you find what it is?

Here is what it told me:

> Yes — the likely mistake is in the first set of bars (“Coding deception”). The pink bar for GPT-5 (with thinking) is labeled 50.0%, while the white bar for OpenAI o3 is labeled 47.4% — but visually, the white bar is drawn shorter than the pink bar, even though its percentage is slightly lower.

So they definitely should have had ChatGPT review their own slides.



>but visually, the white bar is drawn shorter than the pink bar, even though its percentage is slightly lower.

But the white bar is not shorter in the picture.


Funny, isn't it? Makes me feel like it's kind of over-fitted to try to be logical now, so when it tries to express a contradiction it actually can't.


Does it work that well if you don’t tell it there is a mistake though?


That's the secret: you should always tell it to doubt everything and find a mistake!


But how could they have used ChatGPT-5 if they were working on the blog post announcing it?



