Might be usable for tabular data with a low column count, but it would be pretty terrible for any semantically dense modality, e.g. video, molecules, or geospatial data.
Not precisely, but if you had 50 documents in that 1000-dimensional embedding, reduced it to three dimensions, and still got exactly the same nearest-neighbor ordering, then it would at least still function, right?
I guess the problem is taking a new document (like a search term) in the higher-dimensional embedding, reducing it to three dimensions to search in that reduced space, and expecting that to maintain the same nearest-neighbor ordering too.
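For anyone who wants to poke at this, here is a rough sketch of the experiment. It assumes PCA as the reduction method (the thread doesn't name one) and uses random Gaussian vectors as stand-ins for real embeddings, so it only shows how to run the comparison, not what real data would do. The helper names (`fit_pca`, `project`, `neighbor_order`, `top_k_overlap`) are made up for this example.

```python
import numpy as np

def neighbor_order(points, query):
    """Indices of `points` sorted by Euclidean distance to `query`, nearest first."""
    dists = np.linalg.norm(points - query, axis=1)
    return np.argsort(dists, kind="stable")

def fit_pca(points, n_components):
    """Fit a PCA projection via SVD; returns (mean, components)."""
    mean = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(x, mean, components):
    """Map one vector or a batch of vectors into the reduced space."""
    return (x - mean) @ components.T

def top_k_overlap(order_a, order_b, k):
    """Fraction of the k nearest neighbors that two orderings share."""
    return len(set(order_a[:k].tolist()) & set(order_b[:k].tolist())) / k

# Synthetic stand-ins (assumption): 50 "documents" and one unseen "query".
rng = np.random.default_rng(0)
docs = rng.normal(size=(50, 1000))
query = rng.normal(size=1000)

# Fit the reduction on the documents only, then push the query through it.
mean, comps = fit_pca(docs, n_components=3)
docs_3d = project(docs, mean, comps)
query_3d = project(query, mean, comps)

full_order = neighbor_order(docs, query)
reduced_order = neighbor_order(docs_3d, query_3d)
overlap = top_k_overlap(full_order, reduced_order, k=5)

print("same full ordering:", bool(np.array_equal(full_order, reduced_order)))
print("top-5 overlap:", overlap)
```

The part that matters is that the projection is fitted on the 50 documents alone and the query is only pushed through it afterwards, which is exactly the out-of-sample case described above. Swapping in real embeddings and a real query would show how much of the ordering actually survives.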