
Right, I saw that. ChatGPT does the same.

My question is: how can you confirm that the entity referenced in each source is actually the entity you're looking for?

An example I ran into recently is Vast (https://www.vastspace.com/). There are a number of other notable startups named Vast (https://vast.ai/, https://www.vastdata.com/).

I understand Clay, which your Websets product is clearly inspired by, does a fair amount of matching based on domain name or LinkedIn url.

If Websets is doing fuzzy or naive matching, that's okay. I'm just trying to understand the limitations and potential use cases of your current system.
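
To make the domain-based matching idea concrete, here is a minimal sketch of what I mean. It is only an illustration: `domain_key` and `same_entity` are hypothetical helper names, and this is not a description of how Clay or Websets actually work. It treats two sources as the same company only when their URLs reduce to the same host.

```python
from urllib.parse import urlparse


def domain_key(url: str) -> str:
    """Reduce a URL to a lowercase host without a leading 'www.' (hypothetical helper)."""
    # urlparse only fills in the host when a scheme or a leading '//' is present.
    parsed = urlparse(url if "://" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def same_entity(url_a: str, url_b: str) -> bool:
    """Assumption: an identical host means the same company. Empty hosts never match."""
    key_a, key_b = domain_key(url_a), domain_key(url_b)
    return key_a != "" and key_a == key_b


if __name__ == "__main__":
    vast_sites = [
        "https://www.vastspace.com/",
        "https://vast.ai/",
        "https://www.vastdata.com/",
    ]
    for site in vast_sites:
        print(site, "->", domain_key(site))
    # Three companies named Vast, three distinct keys.
    print(same_entity(vast_sites[0], vast_sites[1]))
```

Keying on the domain keeps the three Vasts apart even though a name match would merge them. It would still miss a company that uses several domains, which is one reason a LinkedIn URL is a useful second key.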



Deduplication is mainly driven by LLMs with search results as context. Our entity resolution works well because Exa’s main business is crawling and indexing the web at scale, and within Websets we can control how we search across that index.

As far as I know, ChatGPT’s search is primarily a wrapper around another company’s search engine, which is why it often feels like it’s just summarizing a page of search results and sometimes hallucinates badly.


Thanks for the info. That makes sense.

Looking forward to trying out the product more when I have a moment.



