Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I always thought it seemed likely that most needle in a haystack tests might run into the issue of the model just encoding some idea of 'out of place-ness' or 'significance' and querying on that, rather than actually saying something meaningful about generalized retrieval capabilities. Does that seem right? Is that the motivation for this test?


Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: