
Easiest example is taking three words: Universe, University, College.

- University and Universe are similar in spelling (they sort next to each other alphabetically).

- University and College are similar in meaning.

Take embeddings for those three words and `University` will be near `College`, while `Universe` will be further away, because embeddings capture meaning:

University<-->College<-------------->Universe


With old-school keyword search you'd need to special-case University and College as similar terms, but embeddings already handle it.

With embeddings you can compute how similar two results are from how close their vectors are (typically via cosine similarity or a dot product). The closer the embeddings, the closer the meaning.
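A minimal sketch of that comparison, using the sentence-transformers library (the model name is just an illustrative choice; any text-embedding model behaves the same way):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedding model

words = ["University", "College", "Universe"]
vectors = model.encode(words)  # one embedding vector per word

# Cosine similarity: closer to 1.0 means closer in meaning.
print("University vs College :", util.cos_sim(vectors[0], vectors[1]).item())
print("University vs Universe:", util.cos_sim(vectors[0], vectors[2]).item())
```

With most models the University/College pair scores noticeably higher than University/Universe.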



Another interesting point is that math can be performed on embedding vectors: emb("king") - emb("man") + emb("woman") = emb("queen").
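For what it's worth, the classic way to reproduce that analogy is gensim's `most_similar`, which performs exactly this add-and-subtract on the vectors (the pretrained model name below is one common choice, and it's a large download):

```python
import gensim.downloader as api

# Pretrained Word2Vec vectors trained on Google News.
w2v = api.load("word2vec-google-news-300")

# emb("king") - emb("man") + emb("woman") ~= ?
print(w2v.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# "queen" is typically the top hit.
```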


That's a property of Word2Vec specifically, due to how it's trained (a shallow network where most of the "logic" ends up contained in the embeddings themselves). Trying the same arithmetic on embeddings generated by LLMs or embedding layers won't give results that are as fun; in practice the only things you can do are average or cluster them.
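For example, a common pattern with modern embedding models is to average sentence embeddings into a single document vector, or to cluster them. A rough sketch (the model choice and cluster count are arbitrary here):

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The universe is expanding.",
    "Galaxies drift apart over time.",
    "The university opened a new college of engineering.",
    "Students enrolled at the college this fall.",
]
vectors = model.encode(sentences, normalize_embeddings=True)

# Averaging: one crude "document" vector for the whole set.
doc_vector = vectors.mean(axis=0)

# Clustering: group the sentences by topic (two clusters assumed).
labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(vectors)
print(labels)  # e.g. [0, 0, 1, 1] -- astronomy vs. education
```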


> That's a property of Word2Vec specifically due to how it's trained (a shallow network where most of the "logic" would be contained within the embeddings themselves).

Is it, though? I thought LLM-based embeddings are even more fun for this, since you have many more interesting directions to move in. I.e. not just:

emb("king") - emb("man") + emb("woman") = emb("queen")

But also e.g.:

emb(<insert a couple-paragraph-long positive book review>) + a*v(sad) + b*v(short) - c*v(positive) = emb(<a single-paragraph, negative and depressing review>)

Where a, b, c are some constants to tweak, and v(X) is a vector for quality X, which you can get by embedding a bunch of texts expressing quality X and averaging them out (or doing some other dimensionality-reduction trickery).
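A rough sketch of building such a v(X) by averaging embeddings (the model and the 0.8 coefficient are arbitrary assumptions; whether the shifted vector decodes back into coherent text depends entirely on the model, as discussed below):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def direction(examples, counter_examples):
    """Average embedding of texts with a quality, minus the average without it."""
    pos = model.encode(examples).mean(axis=0)
    neg = model.encode(counter_examples).mean(axis=0)
    return pos - neg

v_sad = direction(
    ["This is heartbreaking.", "I felt hollow and miserable."],
    ["This is delightful.", "I felt light and happy."],
)

review = "A warm, generous novel that left me grinning for days."
shifted = model.encode([review])[0] + 0.8 * v_sad  # a = 0.8, tweak to taste

# `shifted` is just a vector; turning it back into text needs an
# encoder-decoder embedding model (see the talk in [0]).
print(shifted.shape)
```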

I suggested this on HN some time ago, but was only told that I'm confused and the idea is not even wrong. But then there was a talk at an AI conference recently[0], where the speaker demonstrated exactly this kind of latent-space translation of text in a language model.

--

[0] - https://www.youtube.com/watch?v=veShHxQYPzo&t=13980s - "The Hidden Life of Embeddings", by Linus Lee from Notion.


That talk used a novel embedding model trained by the speaker which does exhibit this kind of property - but that was a new (extremely cool) thing, not something other embedding models can do.


Interesting video. When he says "we decode the embedding", does he essentially mean that he is searching a vector database or something else?


The model is an encoder-decoder, which encodes some text into a latent embedding, and can then decode it back into text. It’s a feature of the model itself.
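Conceptually it's an autoencoder-style round trip. A sketch of the interface involved (this is a hypothetical Protocol for illustration, not the actual API of the model from the talk):

```python
from typing import Protocol
import numpy as np

class LatentTextModel(Protocol):
    """Hypothetical encode/decode interface, for illustration only."""
    def encode(self, text: str) -> np.ndarray: ...   # text -> latent embedding
    def decode(self, z: np.ndarray) -> str: ...      # latent embedding -> text

# Intended usage, assuming some `model` implementing the protocol:
#   z = model.encode("A glowing, upbeat review of the book.")
#   z = z + 0.8 * v_sad          # edit the vector in latent space
#   print(model.decode(z))       # text regenerated from the edited embedding
```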



