
Reward hacking is a well-known and tracked problem at frontier labs; Claude 4’s system card reports on it, for instance. It’s not surprising that a framework built on current LLMs would have reward hacking tendencies.

For this part of the stack, the interesting question to me is how to identify and mitigate it.





