
The "cross-linked internals" only go in one direction and only produce one token at a time: slide the window and repeat. The RL layer then picks which few sequences of words are best based on human feedback in a single step. Even "thinking" is just doing this in a loop with a "think" token. It is such a ridiculously simplistic model that it is vastly closer to an adder than to a human brain.
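To make the "one token at a time, slide the window, repeat" loop concrete, here is a rough sketch. Everything in it is made up for illustration: `toy_next_token` and the `BIGRAMS` table are hypothetical stand-ins for a trained network, and `window_size` stands in for the context length. It only shows the shape of the outer loop, not how any real model computes the next token.

```python
from typing import Callable, List

# Hypothetical stand-in for a trained network: a fixed bigram lookup table.
BIGRAMS = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}


def toy_next_token(window: List[str]) -> str:
    """Return one next token given the visible window (toy rule: look at the last token)."""
    if not window:
        return "<eos>"
    return BIGRAMS.get(window[-1], "<eos>")


def generate(
    prompt: List[str],
    next_token: Callable[[List[str]], str] = toy_next_token,
    window_size: int = 4,
    max_new_tokens: int = 5,
) -> List[str]:
    """Autoregressive loop: predict one token, append it, slide the window, repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        window = tokens[-window_size:]  # only the most recent tokens are visible
        tok = next_token(window)        # one pass produces exactly one token
        if tok == "<eos>":
            break
        tokens.append(tok)              # the output becomes part of the next input
    return tokens


if __name__ == "__main__":
    print(" ".join(generate(["the"])))
```

In this picture, "thinking" would just be the same loop run for more steps, with the extra tokens fed back in as context before the final answer tokens are produced.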



