
Congrats on the launch!

How do you dedupe entities, like companies and people? I've noticed ChatGPT tends to provide "great" results when asked about different entities, but in reality it just groups similar-sounding entities together in its answer.

For example, I asked ChatGPT about a well-known startup. It gave me a confident answer about how much they raised, their current status, etc. When I looked at the 3 sources it cited, though, they were actually 3 different companies with similar-sounding names that it had grouped together to form its answer.

Basically, how do I trust the output of your system?




We find supporting references when evaluating the search criteria / enrichments of each result, and you can view these citations:

https://imgur.com/dsGK5dS


Right, I saw that. ChatGPT does the same.

My question is: how can you confirm that the entity you're referencing in each source is actually the entity you're looking for?

An example I ran into recently is Vast (https://www.vastspace.com/). There are a number of other notable startups named Vast (https://vast.ai/, https://www.vastdata.com/).

I understand Clay, which your Websets product is clearly inspired by, does a fair amount of matching based on domain name or LinkedIn url.

If Websets is doing fuzzy or naive matching, that's okay. I'm just trying to understand the limitations and potential use cases of your current system.
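
To make the distinction concrete, here's a rough sketch of what I mean by naive name matching versus keying on domain. The records and the helper names (`domain_key`, `group_by`) are made up for illustration, and this is only my assumption about what domain-based matching looks like in general, not how Clay or Websets actually implements it.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domain_key(url: str) -> str:
    """Normalize a URL to a bare hostname for use as an entity key."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def group_by(records, key_fn):
    """Group records into candidate entities using the given key function."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(record)
    return dict(groups)


# Hypothetical records: the names collide, the domains do not.
records = [
    {"name": "Vast", "url": "https://www.vastspace.com/"},
    {"name": "Vast", "url": "https://vast.ai/"},
    {"name": "Vast", "url": "https://www.vastdata.com/"},
]

# Naive matching: everything called "Vast" collapses into one entity.
by_name = group_by(records, lambda r: r["name"].lower())

# Domain matching: each company keeps its own bucket.
by_domain = group_by(records, lambda r: domain_key(r["url"]))

print(len(by_name), len(by_domain))
```

Name-only grouping merges all three into a single "Vast", which is the failure I saw with ChatGPT, while the domain key keeps them apart. The obvious limitation is sources that mention a company without linking to its domain.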


Deduplication is mainly driven by LLMs with search results as context. Our entity resolution works well because Exa’s main business is crawling and indexing the web at scale, and we can control how we search across that within Websets.

As far as I know, ChatGPT’s search is primarily a wrapper around another company’s search engine, which is why it often feels like it’s just summarizing a page of search results and sometimes hallucinates badly.


Thanks for the info. That makes sense.

Looking forward to trying out the product more when I have a moment.



