Pretty cool! It reminded me of this work from NVIDIA Research - https://nvidia-ai-iot.github.io/remembr where they used VLMs and RAG on top of a real robot to navigate the Voyager campus in Santa Clara. You also might like the new OpenAI o3 models and how well they can play GeoGuessr ;)
Congrats on the launch! I'm one of the authors of that paper you cited, glad it was useful and helped inspire you to build this :) Let me know if we can support you in any way!
It provides the easiest and fastest way for developers to build retrieval-augmented generation (RAG) chatbots, without needing any local GPU setup. Code is on NVIDIA's GitHub and is ~100 lines of Python.
I wanted to share a blog post I wrote about transitioning teams at NVIDIA from working on self-driving to working on generative AI and large language models (LLMs). I intend to post more frequently about applications of LLMs/GenAI in various domains, based on the work I am doing. Feedback on this post and suggestions on what I could write about are greatly appreciated!
https://simonwillison.net/2025/Apr/26/o3-photo-locations, https://news.ycombinator.com/item?id=43835044, https://www.astralcodexten.com/p/testing-ais-geoguessr-geniu...