In my open-source tool http://docrouter.ai I run both OCR and an LLM such as Gemini, using litellm to support multiple LLMs. Users can configure extraction schemas and prompts, and use tags to select which prompt/LLM combination runs on which uploaded PDF.
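To make the tag idea concrete, here is a minimal sketch of tag-based routing. It is not the actual docrouter code: `PromptConfig`, `prompts_for_document`, the model strings, and the rule "a prompt runs when it shares at least one tag with the document" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PromptConfig:
    """Hypothetical prompt/LLM pairing; field names are illustrative only."""
    name: str
    model: str   # a litellm-style model string (example values, not a recommendation)
    prompt: str
    tags: set[str] = field(default_factory=set)


def prompts_for_document(doc_tags, configs):
    """Return the configs that share at least one tag with the document (assumed rule)."""
    doc_tags = set(doc_tags)
    return [c for c in configs if c.tags & doc_tags]


CONFIGS = [
    PromptConfig("invoice_fields", "gemini/gemini-1.5-flash",
                 "Extract the invoice number and total.", {"invoice"}),
    PromptConfig("contract_parties", "gpt-4o-mini",
                 "Extract the contracting parties.", {"contract"}),
]

# A PDF uploaded with the tags "invoice" and "2024" selects only the invoice prompt.
selected = prompts_for_document({"invoice", "2024"}, CONFIGS)
print([c.name for c in selected])
```

Each selected config would then be passed to the LLM call along with the document text; that call is left out here so the sketch runs without API keys.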
Each LLM extraction is then searched for in the OCR output, and if it matches, its bounding box is displayed using the OCR coordinates.
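Roughly, the matching step can be sketched like this. It is a simplified illustration, not the code in the repo: `OcrWord`, `normalize`, and `find_bounding_box` are hypothetical names, and it assumes the OCR engine returns words in reading order with `(x0, y0, x1, y1)` boxes and that an exact match after normalization is good enough.

```python
import re
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One OCR token with its box as (x0, y0, x1, y1); assumed format."""
    text: str
    box: tuple[float, float, float, float]


def normalize(s):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", s.lower())


def find_bounding_box(value, words):
    """Find the first run of OCR words matching `value`; return the union box or None."""
    target = [t for t in (normalize(p) for p in value.split()) if t]
    if not target:
        return None
    tokens = [normalize(w.text) for w in words]
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            span = words[i:i + n]
            # Union of the matched word boxes.
            return (
                min(w.box[0] for w in span),
                min(w.box[1] for w in span),
                max(w.box[2] for w in span),
                max(w.box[3] for w in span),
            )
    return None


words = [
    OcrWord("Invoice", (10, 10, 60, 20)),
    OcrWord("No:", (65, 10, 85, 20)),
    OcrWord("INV-0042", (90, 10, 150, 20)),
    OcrWord("Total", (10, 30, 40, 40)),
    OcrWord("$1,250.00", (45, 30, 100, 40)),
]

print(find_bounding_box("inv-0042", words))         # single-word match
print(find_bounding_box("Total $1,250.00", words))  # multi-word match, union box
print(find_bounding_box("not on the page", words))  # no match, nothing to highlight
```

If the LLM reformats a value (say, a date rewritten in another format), an exact match like this finds nothing and no box is shown; a fuzzy comparison would be the natural extension.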
Demo: app.github.ai (just register an account and try it). GitHub: https://github.com/analytiq-hub/doc-router
Reach out to me at [email protected] with questions. I'm looking for feedback and collaborators.