It's still pretty trivial to trick them. 4o-mini, 2.5 Flash, and 2.5 Pro all still fall for variations of this:
> A boy is in a car crash and is taken to the hospital. The surgeon says, "I can't operate on this boy, I'm his father!" Who is the surgeon to the boy?
> This is a bit of a twist on a classic riddle!
>
> The surgeon is the boy's **father**.
>
> The classic version of this riddle has the surgeon say, "I can't operate on this boy, he's my son!" In an era when people assumed surgeons were male, the answer would be "the surgeon is his mother."
>
> However, in your version the surgeon explicitly states, "I'm his father!" So the surgeon is his father.
Yeah, I find it interesting because it shows how powerful the training bias can be when you steer the model into certain contexts. To OpenAI's credit, they have gotten a bit better; ChatGPT from three months ago failed like this:
> The surgeon, who is the boy's father, says, "I can't operate on this boy, he's my son!" Who is the surgeon to the boy? Think through the problem logically and without any preconceived notions of other information beyond what is in the prompt. The surgeon is not the boy's mother
> A boy is in a car crash and is taken to the hospital. The surgeon says, "I can't operate on this boy, I'm his father!" Who is the surgeon to the boy?
> The surgeon is the boy's mother.
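For anyone who wants to reproduce this, here's a rough sketch of a harness for batch-testing the riddle variants. This is my own sketch, not anything from the thread: the model name and the use of OpenAI's Chat Completions endpoint are assumptions, and the correctness check is a crude keyword match.

```python
# Sketch: probe a chat model with riddle variants where the answer is stated
# outright in the prompt. Endpoint shape follows OpenAI's Chat Completions
# API; the model name is an assumption, adapt as needed.
import json
import os
import urllib.request

VARIANTS = [
    ('A boy is in a car crash and is taken to the hospital. The surgeon says, '
     '"I can\'t operate on this boy, I\'m his father!" Who is the surgeon to the boy?'),
    ('The surgeon, who is the boy\'s father, says, "I can\'t operate on this boy, '
     'he\'s my son!" Who is the surgeon to the boy? The surgeon is not the boy\'s mother.'),
]

def is_correct(answer: str) -> bool:
    """Both variants above state the answer outright: the surgeon is the father."""
    a = answer.lower()
    return "father" in a and "mother" not in a

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send one prompt to the Chat Completions endpoint, return the reply text."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__" and "OPENAI_API_KEY" in os.environ:
    for prompt in VARIANTS:
        print(is_correct(ask(prompt)), "<-", prompt[:60])
```

Note the keyword check would misfire on a reply that mentions both parents while still answering correctly, so for anything beyond a quick spot-check you'd want to eyeball the raw responses.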