"World's best OCR model" - that is quite a statement. Are there any well-known b...

themanmaran · 2025-03-06T18:31:43 1741285903

We published this benchmark the other week. We'll update it and run it with Mistral today!

https://github.com/getomni-ai/benchmark

themanmaran · 2025-03-06T23:14:02 1741302842

Update: Just ran our benchmark on the Mistral model and the results are... surprisingly bad?

Mistral OCR:

- 72.2% accuracy

- $1/1000 pages

- 5.42s / page

Which is a pretty far cry from the 95% accuracy they were advertising from their private benchmark. The biggest thing I noticed is how it skips anything it classifies as an image/figure. So charts, infographics, some tables, etc. all get lifted out and returned as [image](image_002). Compare that to the other VLMs, which are able to interpret those images into a text representation.

https://github.com/getomni-ai/benchmark

https://huggingface.co/datasets/getomni-ai/ocr-benchmark

https://getomni.ai/ocr-benchmark

Thaxll · 2025-03-07T02:14:06 1741313646

Are you benchmarking the right thing, though? It seems to focus a lot on images, charts, etc.

The 95% from their benchmark: "we evaluate them on our internal “text-only” test-set containing various publication papers, and PDFs from the web; below:"

Text only.

themanmaran · 2025-03-07T07:22:30 1741332150

Our goal is to benchmark on real-world data, which is often more complex than plain text. If we have to make the benchmark data easier for the model to perform better, it's not an honest assessment of reality.

kergonath · 2025-03-06T18:37:04 1741286224

Excellent. I am looking forward to it.

cdolan · 2025-03-06T18:52:49 1741287169

Came here to see if you all had run a benchmark on it yet :)

WhitneyLand · 2025-03-06T18:39:32 1741286372

It’s interesting that none of the existing models can decode a Scrabble board screen shot and give an accurate grid of characters.

I realize it’s not a common business case; I came across it while testing how well LLMs can solve simple games. On a side note, if you bypass OCR and give models a text layout of a board, standard LLMs cannot solve Scrabble boards, but the thinking models usually can.

xnx · 2025-03-06T18:25:53 1741285553

https://huggingface.co/spaces/echo840/ocrbench-leaderboard

ChemSpider · 2025-03-06T18:34:13 1741286053

Interesting. But no mistral on it yet?

resource_waste · 2025-03-06T21:33:37 1741296817

It's Mistral; they are the only homegrown AI Europe has, so people pretend they are meaningful.

I'll give it a try, but I'm not holding my breath. I'm a huge AI Enthusiast and I've yet to be impressed with anything they've put out.