One thing I'd like to see is an apples-to-apples benchmark against e.g. aider's edit formats, on the same set of tasks. There is a published benchmark on your site, but it isn't apples-to-apples: it only establishes the relative superiority of the fine-tuned model within this patching framework -- it's not a comparison across patching frameworks.
You're super right -- this is probably the one crack in our narrative, and one that I sorely need to address. Hope to be back with something positive on this front soon; we're setting up all the benchmark harnesses to make that comparison more equitable.