And by a sample that has become increasingly known as a benchmark. Newer training data will contain more articles like this one, which naturally improves an LLM's ability to estimate what's considered a good "pelican on a bike".
Would it, though? There really aren't that many valid answers to that prompt online. When it does get discussed, the examples posted are more often broken than reasonable, so I feel like any talk about it actually sabotages future training a bit.
I actually don't think I've seen a single correct SVG drawing for that prompt.
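For what it's worth, here's the kind of thing I'd call a bare-minimum passable attempt: a hand-rolled sketch of my own (purely illustrative, not taken from any model output), just Python writing out an SVG string with two wheels, a frame, a body, and the oversized beak.

```python
# Purely illustrative: a crude hand-written "pelican on a bicycle" SVG,
# just to show what a recognizable answer could minimally contain.
svg = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 120">
  <!-- bicycle: two wheels plus a rough frame and handlebar post -->
  <circle cx="50" cy="95" r="20" fill="none" stroke="black" stroke-width="2"/>
  <circle cx="140" cy="95" r="20" fill="none" stroke="black" stroke-width="2"/>
  <path d="M50 95 L85 60 L140 95 M85 60 L120 95 M120 95 L125 55"
        fill="none" stroke="black" stroke-width="2"/>
  <!-- pelican: body, neck, head, and the oversized beak -->
  <ellipse cx="95" cy="50" rx="22" ry="14" fill="white" stroke="black" stroke-width="2"/>
  <path d="M112 45 Q120 30 118 20" fill="none" stroke="black" stroke-width="2"/>
  <circle cx="118" cy="18" r="6" fill="white" stroke="black" stroke-width="2"/>
  <path d="M124 18 L150 22 L124 24 Z" fill="orange" stroke="black" stroke-width="1"/>
</svg>"""

# Write the markup to a file so it can be opened in a browser.
with open("pelican.svg", "w") as f:
    f.write(svg)
```

It's ugly, but at least the parts are all there and roughly where they belong, which is more than I can say for most of the outputs that get shared.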