
Just tried this and it did not appear to work for me. Prompt:

>Please provide me strict bounding boxes that encompasses the following text in the attached image? I'm trying to draw a rectangle around the text.

> - Use the top-left coordinate system

>this input document is 1080 x 1236 px. return the bounding boxes as integers



https://github.com/google-gemini/cookbook/blob/a916686f95f43...

They say there's no magic prompt, but I'd start with their default, since there is usually some specific format used in post-training to improve performance on tasks like this.
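For what it's worth, if you do go with the default format, my understanding is that the boxes come back as [ymin, xmin, ymax, xmax] normalized to a 0-1000 grid rather than as pixels, so you'd rescale them yourself. A rough sketch under that assumption (the function name and the `scale` parameter are mine, not from the cookbook; the 1080 x 1236 size is from the prompt upthread):

```python
def to_pixel_box(box, width, height, scale=1000):
    """Convert one normalized box to top-left-origin pixel coordinates.

    Assumes `box` is [ymin, xmin, ymax, xmax] on a 0..scale grid.
    Returns (left, top, right, bottom) as integers.
    """
    ymin, xmin, ymax, xmax = box
    left = round(xmin * width / scale)
    top = round(ymin * height / scale)
    right = round(xmax * width / scale)
    bottom = round(ymax * height / scale)
    return left, top, right, bottom


if __name__ == "__main__":
    # Hypothetical box covering the middle band of a 1080 x 1236 image.
    print(to_pixel_box([500, 250, 750, 500], width=1080, height=1236))
```

If the rectangles come out transposed or far too small, check the axis order and the scale first; treating the values as x-first pixels is an easy mistake to make.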


"Might" being the operative word, particularly with models that have weaker prompt adherence. There are a few other prompt-massaging tricks beyond the scope of an HN comment; the decimal issue is just one optimization.



