Anthropic found embeddings + BM25 (keyword search) gave the best results. (Well, after contextual summarization, and fusion, and reranking, and shoving the whole thing into an LLM...)
But sadly they didn't say how BM25 did on its own, which is the really interesting part to me.
In my own (small-scale) tests with embeddings, I found that I'd be looking right at the page that contained the literal words in my query and embeddings would fail to find it... Ctrl+F wins again!
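For reference, the hybrid setup they describe boils down to running BM25 and embedding search in parallel and fusing the ranked lists. A minimal sketch, assuming the rank_bm25 and sentence-transformers packages; the corpus, model choice, and fusion constant are illustrative, not Anthropic's actual pipeline:

    # Hybrid retrieval sketch: BM25 + embeddings, merged with reciprocal rank fusion.
    # Assumes `pip install rank_bm25 sentence-transformers`; corpus/model are illustrative.
    import numpy as np
    from rank_bm25 import BM25Okapi
    from sentence_transformers import SentenceTransformer

    docs = [
        "BM25 is a bag-of-words ranking function based on term frequency.",
        "Embedding models map text to dense vectors for semantic search.",
        "Reciprocal rank fusion merges several ranked lists into one.",
    ]

    bm25 = BM25Okapi([d.lower().split() for d in docs])
    model = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vecs = model.encode(docs, normalize_embeddings=True)

    def search(query, k=3, rrf_k=60):
        # Rank every doc by BM25 (lexical) and by cosine similarity (semantic)...
        bm25_rank = np.argsort(-bm25.get_scores(query.lower().split()))
        emb_rank = np.argsort(-(doc_vecs @ model.encode(query, normalize_embeddings=True)))
        # ...then fuse: score(d) = sum over both lists of 1 / (rrf_k + rank of d).
        scores = {}
        for ranking in (bm25_rank, emb_rank):
            for rank, doc_id in enumerate(ranking):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank + 1)
        return sorted(scores, key=scores.get, reverse=True)[:k]

    print([docs[i] for i in search("keyword term frequency ranking")])

The fusion step is why you can't tell from the headline number how BM25 would have done alone: the lexical and semantic rankings are blended before evaluation.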
It would also blow up the price and latency of Claude Code if every chunk of every file had to be read by Haiku, summarized, sent to an embedding model, reindexed into a project index, and that index stored somewhere. Since a lot of context is inherent in things like the file structure, storing the central context in CLAUDE.md is a lot simpler. I don't think their not using vector embeddings in the project space is anything other than an indication that embeddings are hard to manage in Claude Code.
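To make the cost concern concrete, a back-of-envelope sketch of that hypothetical pipeline; every number here is a made-up placeholder, not real pricing or real Claude Code internals:

    # Back-of-envelope cost for a hypothetical chunk -> Haiku summary -> embedding
    # -> index pipeline. All numbers are illustrative placeholders, not real pricing.
    n_files = 2_000                 # hypothetical repo size
    chunks_per_file = 10
    tokens_per_chunk = 500
    summary_tokens = 100

    n_chunks = n_files * chunks_per_file
    llm_in = n_chunks * tokens_per_chunk       # tokens the summarizer must read
    llm_out = n_chunks * summary_tokens        # tokens it writes
    embed_in = llm_out                         # summaries then go to the embedder

    # Placeholder $/1M-token rates for a small LLM and an embedding model.
    cost = llm_in * 0.80 / 1e6 + llm_out * 4.00 / 1e6 + embed_in * 0.10 / 1e6
    print(f"{n_chunks:,} chunks, ~{llm_in / 1e6:.0f}M tokens in -> ${cost:,.2f} per full index")
    # ...and the index has to be rebuilt (or incrementally patched) on every edit.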
I think your last point captures it: for various reasons (RL, the inherent structure of code), iterative grepping is unreasonably effective. Interestingly, Cursor does use embedding vectors for codebase indexing.
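"Iterative grepping" here is roughly the loop the agent runs: grep for a symbol, read the hits, and pick the next pattern based on what came back. A minimal sketch using plain subprocess + grep; the queries and the fixed two-round loop are illustrative (in the real tool the model chooses each next pattern itself), and load_config is a hypothetical symbol:

    # Iterative grepping sketch: search, inspect hits, refine the pattern.
    import subprocess

    def grep(pattern, path="."):
        # -r recurse, -n line numbers, -I skip binaries; returns "file:line:text" hits.
        out = subprocess.run(
            ["grep", "-rnI", pattern, path],
            capture_output=True, text=True,
        )
        return out.stdout.splitlines()

    # Round 1: find where the (hypothetical) symbol is defined.
    hits = grep("def load_config")
    print(hits[:5])

    # Round 2: follow up on what round 1 revealed, e.g. its call sites.
    for line in grep("load_config("):
        print(line)

No index to build or keep fresh: the "retrieval" is just the filesystem, which is part of why it scales so cheaply inside an agent loop.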
Sometimes Cursor seems to have a better understanding of the vibe of my codebase than Claude Code; maybe this is part of it. Or maybe embeddings are only marginally important for codebase indexing. Vector DBs still have a huge benefit in less verifiable domains.