
Isn't this factually wrong? Grok 4 used about as much compute on RL as it did on pre-training. I'd guess GPT-5 was the same (or even more).


That was true for models up to o3, but there isn't enough public info to say much about GPT-5. Grok 4 seems to be the first major model to scale RL compute roughly 10x, to near the level of pre-training compute.



