Surely we do know why: reinforcement learning for reasoning. These systems are trained to generate reasoning steps that lead to verified correct conclusions. There are no guarantees about how they'll perform on different problems, of course, but in relatively narrow, closed domains like math and programming, it isn't surprising that, done at scale, there are enough similar problems that similar reasoning will transfer and succeed. A toy sketch of that training loop is below.
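A minimal sketch of the loop, assuming a binary verifier reward and a crude preference update standing in for a real policy-gradient step. The canned "strategies" and names like verify() are hypothetical stand-ins for sampling token-level reasoning chains from an LLM:

    import random

    # Toy RL-with-verifiable-reward loop. The "policy" here is a weighted
    # choice over a few canned reasoning strategies; a real system samples
    # token-level reasoning chains from a model instead. All names are
    # illustrative, not from any particular framework.
    STRATEGIES = ["add_then_check", "subtract", "guess"]
    weights = {s: 1.0 for s in STRATEGIES}  # unnormalized policy

    def solve(a: int, b: int, strategy: str) -> int:
        """Apply a reasoning strategy to the toy task: compute a + b."""
        if strategy == "add_then_check":
            return a + b                  # sound reasoning
        if strategy == "subtract":
            return a - b                  # systematically wrong
        return random.randint(0, 20)      # unprincipled guess

    def verify(a: int, b: int, answer: int) -> float:
        """Verifier: reward 1.0 iff the final answer is correct."""
        return 1.0 if answer == a + b else 0.0

    def sample_strategy() -> str:
        """Sample a strategy with probability proportional to its weight."""
        r = random.uniform(0, sum(weights.values()))
        for s, w in weights.items():
            r -= w
            if r <= 0:
                return s
        return STRATEGIES[-1]

    LR = 0.1
    for _ in range(2000):
        a, b = random.randint(0, 10), random.randint(0, 10)
        s = sample_strategy()
        reward = verify(a, b, solve(a, b, s))
        # Reinforce strategies whose conclusions the verifier accepted;
        # 0.5 is a fixed baseline so wrong answers are penalized.
        weights[s] = max(0.01, weights[s] + LR * (reward - 0.5))

    print(weights)  # mass concentrates on "add_then_check"

The point falls out directly: the verifier only checks final answers, never the reasoning itself, yet the strategy that generalizes (actually adding) is the one that keeps getting rewarded across similar problems.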


We don't know why that is sufficient to enable the models to develop the capability, and we don't know what they are actually doing under the hood when they employ the capability.



