I work in the healthcare domain. We've had great success converting printed lab reports (95%) to JSON format using the Gemini 1.5 Flash model. This post is really exciting for me; I will definitely try out the 2.0 models.
The struggle that almost every OCR use case faces is handwritten documents (doctor prescriptions with bad handwriting). With Gemini 1.5 Flash we've had ~75-80% accuracy (based on random sampling by pharmacists). We're planning to improve this further by fine-tuning Gemini models with medical data.
What other services or models could be alternatives for accurate handwriting OCR?
I'm guessing that human accuracy may be at or below that value, given that handwritten notes are generally difficult to read. A better metric for document parsing might be accuracy relative to human performance (how much better the LLM performs compared to a human).
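To make the "relative to human" idea concrete, here is a rough sketch of how one could compute it. Everything in it is an assumption for illustration: the function names, the exact-match scoring after lowercasing and trimming, and the toy transcriptions, which are made up and not real results. In practice the reference would be a verified reading of each sampled prescription line.

```python
def accuracy(predictions, references):
    """Fraction of items whose transcription matches the reference exactly
    (case-insensitive, surrounding whitespace ignored)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one reference item")
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return matches / len(references)


def relative_accuracy(model_acc, human_acc):
    """Model accuracy as a multiple of human accuracy (1.0 means parity)."""
    if human_acc <= 0:
        raise ValueError("human accuracy must be positive")
    return model_acc / human_acc


# Toy, made-up data: verified readings of four prescription lines.
references = ["amoxicillin 500 mg", "metformin 850 mg",
              "atorvastatin 10 mg", "ibuprofen 400 mg"]
# Hypothetical LLM transcriptions (one dosage misread).
model_out = ["Amoxicillin 500 mg", "metformin 850 mg",
             "atorvastatin 10 mg", "ibuprofen 600 mg"]
# Hypothetical transcriptions by a second human reader (two misreads).
human_out = ["amoxicillin 500 mg", "metformin 500 mg",
             "atorvastatin 10 mg", "ibuprofen 200 mg"]

model_acc = accuracy(model_out, references)
human_acc = accuracy(human_out, references)
print(f"model={model_acc:.2f} human={human_acc:.2f} "
      f"relative={relative_accuracy(model_acc, human_acc):.2f}")
```

Exact match is a strict choice. A character- or field-level score would be more forgiving, but the ratio is read the same way: above 1.0 the model beats the human reader on that sample.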
I've been in the software engineering game for the last 8+ years, and this is the best lesson I've learned in the past two years. I've been facing a lot of health issues due to my workaholic nature and ignoring my physical health. Somehow my mental health has been good thanks to the support systems I have in my life, and I'm super grateful for that.
Having a balanced life is my biggest goal right now, as other career-related things are mostly on autopilot. I have a system set up for that.
I just met an Azure representative at an event, and they said managed MySQL is in private beta right now. It will take them around 3 months to get it out of beta. I'm waiting for it too.
Doesn't anyone think about performance anymore? I mean, people just add more VMs or increase their config; no one talks about efficient resource utilization, AFAIK.
And there might be times when containerizing everything is overkill.
Funnily enough, people do. In fact, containerization came from Google caring a lot about performance! A good intro to how containers help, and to the probable future of computation: http://m.youtube.com/watch?v=7MwxA4Fj2l4
I saw some of the videos and I agree with the suggestions provided here, but I really like the simple way of explaining and the less mathematical, more programmatic approach. It'd be awesome if you could make some neural networks with CUDA or another library like TensorFlow or Theano. Good luck.
I use a Lenovo Z500 with a 3rd gen Core i5, 6 GB of RAM, and a 250 GB Samsung 850 EVO SSD, running Windows 7 and Linux Mint. With the SSD it runs like a charm. You gotta love the stability and simplicity of Windows 7, plus I experiment with stuff on Linux Mint, which is really awesome with MATE as the desktop environment and Compiz as the window manager (you can do a lot of awesome window tweaks with it).
I have been using asmallorange.com; they're great both with support and technically. They have a no-bullshit policy, as opposed to many other hosting providers who offer "unlimited" plans but still have restrictions. They use MariaDB for databases, and they also give you jailed shell access to your server even on a shared hosting plan, which is pretty awesome.