I pasted the image of the chart into ChatGPT-5 and prompted it with:
> there seems to be a mistake in this chart ... can you find what it is?
Here is what it told me:
> Yes — the likely mistake is in the first set of bars (“Coding deception”). The pink bar for GPT-5 (with thinking) is labeled 50.0%, while the white bar for OpenAI o3 is labeled 47.4% — but visually, the white bar is drawn shorter than the pink bar, even though its percentage is slightly lower.
So they definitely should have had ChatGPT review their own slides.
Funny, isn't it? It makes me feel like it's kind of overfitted to being logical now, so when it tries to describe a contradiction it actually can't.