
How usable would the document embedding be for a nearest neighbor search if the dimensions were reduced to three?


Might be usable for low-column-count tabular data, but it would be pretty terrible for any other semantically dense modality, e.g. video, molecules, or geospatial data.


Nearly useless for most applications unless there's been a major improvement in the SOTA that I missed.


Can I use the three dimensions to encode a space-filling curve over a 1000-dimensional embedding?


Not precisely. But if you had 50 documents in that 1000-dimensional embedding, reduced them to three dimensions, and still got exactly the same nearest-neighbor ordering, then it would at least still function, right?

I guess the problem is taking a new document (like a search term) in the higher-dimensional embedding, reducing it to three dimensions to search in that reduced space, and expecting that to also maintain the same nearest-neighbor ordering.
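
For anyone who wants to poke at this, here is a rough sketch of the experiment described above: 50 documents in 1000 dimensions, a reduction to three, and a new query pushed through the same reduction. Everything here is an assumption for illustration: the "embeddings" are random Gaussian vectors (real embeddings have more structure, so this is a pessimistic case), the reduction is a plain Gaussian random projection rather than any particular method from the thread, and the helper names are made up.

```python
import math
import random


def random_projection(dim_in, dim_out, rng):
    """Build a dim_out x dim_in Gaussian random projection matrix (assumed method)."""
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(vec, matrix):
    """Map a high-dimensional vector to len(matrix) dimensions."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def neighbor_order(query, docs):
    """Indices of docs sorted by Euclidean distance to query, nearest first."""
    return sorted(range(len(docs)), key=lambda i: math.dist(query, docs[i]))


def top_k_overlap(order_a, order_b, k):
    """Fraction of the k nearest neighbors shared by two orderings."""
    return len(set(order_a[:k]) & set(order_b[:k])) / k


rng = random.Random(0)  # fixed seed so runs are repeatable

# Synthetic stand-ins for 50 document embeddings and one search term.
docs = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(50)]
query = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Reduce documents and query with the SAME projection.
matrix = random_projection(1000, 3, rng)
docs_3d = [project(d, matrix) for d in docs]
query_3d = project(query, matrix)

full_order = neighbor_order(query, docs)
reduced_order = neighbor_order(query_3d, docs_3d)
overlap = top_k_overlap(full_order, reduced_order, k=5)

print("top-5 in 1000-d:", full_order[:5])
print("top-5 in 3-d:   ", reduced_order[:5])
print("top-5 overlap:  ", overlap)
```

Swapping in real embeddings for docs and query, or a learned reduction in place of random_projection, would give a fairer picture; the overlap number is only a crude proxy for whether the ordering survives.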



