It's still pretty trivial to trick them. 4o-mini, 2.5 Flash, and 2.5 Pro all still fall for variations of this:
> A boy is in a car crash and is taken to the hospital. The surgeon says, "I can't operate on this boy, I'm his father!" Who is the surgeon to the boy?
> This is a bit of a twist on a classic riddle!
>
> The surgeon is the boy's **father**.
>
> The classic version of this riddle has the surgeon say, "I can't operate on this boy, he's my son!" In an era when people assumed surgeons were male, the answer would be "the surgeon is his mother."
>
> However, in your version the surgeon explicitly states, "I'm his father!" So the surgeon is his father.
Yeah, I find it interesting because it shows how powerful the training bias can be when you steer the model into certain contexts. To OpenAI's credit, they have gotten a bit better; ChatGPT from three months ago failed like this:
> The surgeon, who is the boy's father, says, "I can't operate on this boy, he's my son!" Who is the surgeon to the boy? Think through the problem logically and without any preconceived notions of other information beyond what is in the prompt. The surgeon is not the boy's mother
> A boy is in a car crash and is taken to the hospital. The surgeon says, "I can't operate on this boy, I'm his father!" Who is the surgeon to the boy?
> The surgeon is the boy's mother.
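For anyone who wants to reproduce this, here's a rough sketch of a harness for batch-testing the riddle variants. This is my own sketch, not anything from the thread: the model name and the use of OpenAI's Chat Completions endpoint are assumptions, and the correctness check is a crude keyword match.

```python
# Sketch: probe a chat model with riddle variants where the answer is stated
# outright in the prompt. Endpoint shape follows OpenAI's Chat Completions
# API; the model name is an assumption, adapt as needed.
import json
import os
import urllib.request

VARIANTS = [
    ('A boy is in a car crash and is taken to the hospital. The surgeon says, '
     '"I can\'t operate on this boy, I\'m his father!" Who is the surgeon to the boy?'),
    ('The surgeon, who is the boy\'s father, says, "I can\'t operate on this boy, '
     'he\'s my son!" Who is the surgeon to the boy? The surgeon is not the boy\'s mother.'),
]

def is_correct(answer: str) -> bool:
    """Both variants above state the answer outright: the surgeon is the father."""
    a = answer.lower()
    return "father" in a and "mother" not in a

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send one prompt to the Chat Completions endpoint, return the reply text."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__" and "OPENAI_API_KEY" in os.environ:
    for prompt in VARIANTS:
        print(is_correct(ask(prompt)), "<-", prompt[:60])
```

Note the keyword check would misfire on a reply that mentions both parents while still answering correctly, so for anything beyond a quick spot-check you'd want to eyeball the raw responses.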