Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Google has two interlocked monopolies, one is the search index

The index is the farthest thing from a monopoly Google has - anyone can recreate it. Heck, you can even just download Commoncrawl to get a massive head start.



I see it a bit differently, many (most?) web sites explicitly deny scraping execept for Google. Further Google has the infrastructure to crawl several trillion web pages and create a relevant index out of the most authoritative 1.5 trillion. To re-create that on your own, you would need both the web to allow it, and the infrastructure to do it. I would agree that this isn't an insurmountable moat but it is a good one.


Most websites only explicitly deny scraping by bad bots (robots.txt). Things like Cloudflare are a completely different matter, and I have a whole batch of opinions about how they are destroying the web.

I'd love to compete directly with OpenAI, but the cost of a half million GPUs is a me problem - not a them problem. Google can't be faulted for figuring out how to crawl the web in an economically viable way.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: