Well yes, but there is no better way to measure it without resorting to pure hearsay. How would you make an accurate assessment of something so inherently vague?


Alter the benchmark space that we care about: for example, focus only on ARC-AGI-2, and suddenly the gains are no longer diminishing but accelerating.



