I honestly can't tell if you are being serious. Is there any doubt that the "OCR pipeline" will just be an LLM and it's just a matter of time?
What you are describing is similar to how computers used to detect cats. You first extract edges, texture, and gradients. Then you slide a window across the image and run a classifier at each position. Then you use NMS (non-maximum suppression) to collapse the overlapping bounding boxes.
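For anyone who hasn't seen the old pipeline, here is a rough sketch of just the last step, greedy NMS, in plain Python. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 overlap threshold are my own assumptions for illustration, not taken from any particular library.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, best score first.

    The threshold value is an assumed default for this sketch.
    """
    # Visit boxes from highest to lowest classifier score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

You would call it as `nms(boxes, scores)` on the windows the classifier flagged. Strictly speaking it throws away the lower-scoring duplicates rather than averaging boxes together, which is the sort of hand-tuned stage that end-to-end models made unnecessary.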