am17an | 21 days ago | on: Mistral raises 1.7B€, partners with ASML
Isn't this factually wrong? Grok 4 used as much compute on RL as it did on pre-training. I'm sure GPT-5 was the same (or even more).
sigmoid10 | 21 days ago
That was true for models up to o3, but there isn't enough public info to say much about GPT-5. Grok 4 seems to be the first major model to scale RL compute roughly 10x, to near pre-training effort.