serjester on April 8, 2024 | on: Show HN: Beyond text splitting – improved file par...

Author here. I’m very open to alternatives to PyMuPDF / tesseract because I agree the OCR results are suboptimal and PyMuPDF has a restrictive license. I tried a few basic alternatives and found the results to be poor.

mcbetz on April 8, 2024

This article compares multiple solutions and recommends docTR (Apache License 2.0): https://source.opennews.org/articles/our-search-best-ocr-too...
