
Isn’t that the point of the hidden chain-of-thought tokens, rather than the visible cruft?

I think the fluff, the emojis, and the sycophancy are all symptomatic of the training process and human feedback.





I thought PP was saying that the "Thinking" text is only used for one turn, and the response text is the compressed thinking that survives into future turns.


