Deduplication is mainly driven by LLMs with search results as context. Our entity resolution works well because Exa’s main business is crawling and indexing the web at scale, and we can control how we search across that within Websets.
As far as I know ChatGPT’s search is primarily a wrapper around another company’s search engine, which is why it often feels like it’s just summarizing a page of search results and sometimes hallucinates badly.
As far as I know ChatGPT’s search is primarily a wrapper around another company’s search engine, which is why it often feels like it’s just summarizing a page of search results and sometimes hallucinates badly.