
Awesome job launching, guys! We used Browser Use last week to order burgers from our smart glasses:

https://x.com/caydengineer/status/1889835639316807980

One thing I'm hoping for is an increase in speed. Right now, the agent is slow for complex tasks, so we're still in an era where it might be better to codify popular tasks (e.g., sending a WhatsApp message) instead of handling them with browser automation. Have y'all looked into Groq / Cerebras?



One option could be for the main apps like WhatsApp to have defined custom actions, which are almost like an API to the service. I think the interplay between LLM and automation scripts will succeed here:

Agent call 1: Send WhatsApp message (to=Magnus, text=hi)
Inside, you open WhatsApp and search for Magnus (without LLM)

Agent call 2: Select contact from all possible Magnus contacts

Script 3: Type the message and click send

So in total, 2 calls - with Gemini, you could already achieve this in 10-15 seconds.
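
A rough sketch of that split, in case it helps picture it. Everything here is hypothetical: `FakeWhatsApp` stands in for a scripted browser session, `send_whatsapp_message` is the custom action, and `pick_contact` is where a model call would go. None of these names come from Browser Use or WhatsApp. One assumption beyond the steps above: the model is skipped when the search returns exactly one match.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeWhatsApp:
    """Hypothetical stand-in for a scripted browser session (no real browser)."""
    contacts: List[str]
    log: List[str] = field(default_factory=list)

    def search(self, name: str) -> List[str]:
        # Scripted step: open the app and search for the name (no LLM).
        self.log.append(f"search:{name}")
        return [c for c in self.contacts if name.lower() in c.lower()]

    def send(self, contact: str, text: str) -> None:
        # Scripted step: type the message and click send (no LLM).
        self.log.append(f"send:{contact}:{text}")


def send_whatsapp_message(
    app: FakeWhatsApp,
    to: str,
    text: str,
    pick_contact: Callable[[List[str]], str],
) -> str:
    """Custom action: scripts do the clicking, the model only disambiguates."""
    candidates = app.search(to)
    if not candidates:
        raise LookupError(f"no contact matching {to!r}")
    # Assumption: only ask the model when the match is ambiguous.
    chosen = candidates[0] if len(candidates) == 1 else pick_contact(candidates)
    if chosen not in candidates:
        raise ValueError(f"picker returned an unknown contact: {chosen!r}")
    app.send(chosen, text)
    return chosen


if __name__ == "__main__":
    app = FakeWhatsApp(contacts=["Magnus M", "Magnus K", "Gregor"])
    # A real picker would call an LLM; this one just takes the first match.
    print(send_whatsapp_message(app, "Magnus", "hi", lambda options: options[0]))
    print(app.log)
```

The agent sees a single action with two arguments, and the slow model round trip is limited to the one step that actually needs judgment.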


That was such a cool demo, man! We are working on speed; we are already 3-4x faster than Operator with GPT-4o.



