Something not obvious to me with these VC diagrams: whether the memory tier is just vector DBs, or whether it also includes knowledge graphs
Good: We're (of course) doing a lot of these architectures behind the scenes for louie.ai and related client projects. Vector embeddings are an easy way to do direct recall for data that's bigger than context. As long as the user has a simple question that just needs a text snippet recalled that fairly directly overlaps with the question, vector embeddings are magical. Conversational memory for sharing DB queries across teammates, simple discussion of decades of PDF archives and internal wikis... amazing.
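For concreteness, the "direct recall" path is roughly the sketch below, assuming sentence-transformers for the embeddings; the toy corpus and the llm_summarize placeholder are made up for illustration, not our actual stack:

```python
# Minimal sketch of the "good" path: query -> vector index -> LLM summary.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Postmortem: the 2021-03 billing outage was caused by a bad config push.",
    "Team X owns the billing and invoicing services.",
    "Wiki: how to rotate API keys for the payments gateway.",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

def recall(query: str, k: int = 2) -> list[str]:
    """Return the k snippets whose embeddings are closest to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q                     # cosine similarity (vectors are normalized)
    return [docs[i] for i in np.argsort(-scores)[:k]]

snippets = recall("What caused the billing outage?")
# answer = llm_summarize(question, snippets)  # placeholder: stuff snippets into the prompt
```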
Not so good: What happens when the text data to answer your question isn't a direct semantic-search match away? "Why does Team X have so many outages?" => "What projects is Team X on?" + "Outages for those projects" + "Analysis for each outage". AFAICT, this gets into:
A. Failure: Stick with query -> vector DB -> LLM summary, and get the wrong answer over the wrong data.
B. AutoGPT: Go the AutoGPT/LangChain agent route: iteratively query the vector DB, reason over the results, and re-plan, until it finds what it wants (a minimal loop sketch follows this list). But AutoGPT seems to be more excitement than production use, with many open questions around speed, cost, & quality...
C. Knowledge graphs: Use the LLM to generate a higher-quality knowledge graph of the data that is more amenable to LLM querying. The above question then becomes a simple multi-hop query over the KG (see the query sketch after this list), so it's both fast and cost-effective... if you've indexed correctly and taught your LLM to generate the right queries.
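The B route looks roughly like the loop below; llm() and recall() are placeholders rather than any particular framework's API. Each hop costs an LLM call plus a retrieval, which is exactly where the speed/cost/quality questions come from:

```python
# Rough shape of option B: an agent loop that keeps planning follow-up
# vector searches until it thinks it can answer.
def llm(prompt: str) -> str: ...          # placeholder: call your completion API here
def recall(query: str) -> list[str]: ...  # placeholder: nearest-neighbor search over the vector DB

def agent_answer(question: str, max_steps: int = 5) -> str:
    notes: list[str] = []
    for _ in range(max_steps):
        # Ask the model what it still needs to look up, given what it has so far.
        next_query = llm(
            f"Question: {question}\nNotes so far: {notes}\n"
            "Reply with the next search query, or DONE if you can answer."
        )
        if next_query.strip() == "DONE":
            break
        notes.extend(recall(next_query))   # one more round trip per hop
    return llm(f"Question: {question}\nNotes: {notes}\nAnswer:")
```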
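And a sketch of the C route: if the LLM has already extracted team/project/outage edges into a graph DB at index time, the example question collapses into one multi-hop query. The schema, labels, and connection details below are invented for illustration (shown with the neo4j Python driver):

```python
# Option C: (Team)-[:WORKS_ON]->(Project)<-[:AFFECTS]-(Outage) edges extracted at
# index time make the example question a single multi-hop traversal.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

CYPHER = """
MATCH (t:Team {name: $team})-[:WORKS_ON]->(p:Project)<-[:AFFECTS]-(o:Outage)
OPTIONAL MATCH (o)-[:HAS_POSTMORTEM]->(pm:Postmortem)
RETURN p.name AS project, o.id AS outage, pm.summary AS analysis
"""

with driver.session() as session:
    rows = session.run(CYPHER, team="Team X").data()
# rows (plus the postmortem text) go to the LLM once, instead of N rounds of
# vector search + re-planning.
```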
(Related: If you're into this kind of topic, we're hiring here to build out these systems + help apply them for our customers in investigative areas like cyber, misinfo, & emergency response. See new openings up @ https://www.graphistry.com/careers !)
As a black box, it's a generalization of word2vec to sequence2vec. For example, simply summing or averaging the word vectors in a sentence can give you a fast & cheap sentence embedding.
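For instance, a minimal version of that sum/average baseline, with word_vec standing in for any pretrained word-vector lookup (word2vec, GloVe, ...):

```python
# Cheap sentence embedding: average the per-word vectors.
import numpy as np

def sentence_embedding(sentence: str, word_vec: dict[str, np.ndarray], dim: int = 300) -> np.ndarray:
    vecs = [word_vec[w] for w in sentence.lower().split() if w in word_vec]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)   # averaging throws away word order, e.g. where "not" sits
```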
But natural language sentences have more structure than natural language words. Ex: it matters precisely where "not" goes in a sentence. So a lot of impressive scientific experimentation went into making these models smarter, with many evolutions. Impressively, this is so black-boxed now that it doesn't super matter.
Implicit to my post here... that's powerful, and easy to use... but not necessarily a great knowledge representation for someone who wants good Q&A over enterprise-scale data. One of our customer scenarios: "What is known vs believed about incident X?" We can index each paragraph as multiple sentence embeddings, so if any phrase matches a query, the full paragraphs get thrown into GPT as part of our answer. Easy. However, if information in the paragraph leads to wanting information from elsewhere in the system (a mention of another team, project, incident, ...), then either a planning agent needs to notice that and recursively generate more vector search queries (mini-AutoGPT)... or we need to index on more than the sentence embedding.
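A rough sketch of that per-sentence index with paragraph back-pointers; the paragraph text and model choice are illustrative, not the customer's data:

```python
# Each sentence gets its own embedding, but a hit pulls in its whole paragraph.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

paragraphs = [
    "Incident X began at 09:12. The billing service returned 500s. Team X rolled it back.",
    "Team Y runs the on-call rotation for payments. Handoff notes live in the internal wiki.",
]
# index entries: one embedding per sentence, plus the id of the paragraph that owns it
sentences, owners = zip(*[(s, pid)
                          for pid, para in enumerate(paragraphs)
                          for s in para.split(". ")])
sent_vecs = model.encode(list(sentences), normalize_embeddings=True)

def retrieve_paragraphs(query: str, k: int = 3) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    best = np.argsort(-(sent_vecs @ q))[:k]
    hit_ids = {owners[i] for i in best}          # any matching sentence -> full paragraph
    return [paragraphs[pid] for pid in sorted(hit_ids)]

# retrieve_paragraphs("what is known about incident X?") returns the incident paragraph,
# ready to go into GPT; the "Team Y" mention inside it is where the recursion /
# richer-indexing question starts.
```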
Again, super interesting problems, and we're hiring for folks interested in helping work on them!