I've basically dedicated my professional life to helping developers build better applications, and build applications better, by leveraging not just individual data points retrieved one at a time, or summed up, or averaged across groups, but individual data points connected by relationships, right?
And today I'm going to talk about that applied in the world of LLMs and Gen AI. But before I do that, I'm going to take a little detour and talk about search, the evolution of search. Everyone here in this room knows that the vast majority of web searches today are handled by Google, but some of you know that it didn't start that way.
It started this way. Who here recognizes this web page? Right, yeah. Who here recognizes AltaVista as a name? A few people, right? Back in the mid-90s, there were dozens of web search companies, dozens plural, like 30, 40, 50 web search companies, and they all used basically the same technology.
They used keyword-based text search, inverted-index search, BM25-like, for those of you who know what that means. And it worked really, really well until it didn't, and the AltaVista effect kicked in, which was the notion that you searched for something, got a thousand or thousands of hits back, and had to look through page after page until you found the result that was relevant to you.
The AltaVista effect: you got too much back from the internet. That wasn't a problem in the beginning, because most of the things I searched for when I first went onto the internet got zero results back; there was simply no content about them yet, right?
But the AltaVista effect, too many search results, was solved by Google. This is Google's press release from mid-2000. They talk about the billion URLs they've indexed, right? But they also talk about the technology they use behind the scenes, the technology called PageRank that delivers the most important search results really early on.
In fact, in the top ten blue links on that first page, right? That technology, PageRank, is actually a graph algorithm, a form of eigenvector centrality, and Google's innovation was applying it at the scale of the internet, the scale of the web, right?
PageRank. That ushered in, and honestly created, the most valuable company on the planet for quite a while: the PageRank era. That lasted for about a dozen years, until in 2012 Google wrote this amazing blog post introducing the knowledge graph, "Things, not strings," where they said, you know what, guys?
We've done an upgrade on the back end of our search technology, the biggest one since we invented PageRank. We're moving from storing just the text and the links between documents to also storing the concepts embedded in those documents: things, not just strings. And we all know what the knowledge graph looks like visually.
When you search for something on Google today, say Moscone Center, just around the corner from here, you get this little panel on the right-hand side. If you look at that panel, it combines unstructured text, in this case from Wikipedia, with structured data: the address, the owner of the Moscone building, that kind of stuff.
This thing is backed on the back end by a data structure that looks like this, right? It has these concepts, the circles, that we call nodes, which are connected to other nodes through relationships, and both the nodes and the relationships have key-value properties. You can attach two, three, a thousand, ten thousand of them to the nodes and, very importantly, also to the relationships.
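To make that concrete, here's a minimal sketch of that structure in plain Python. The labels, property names, and values are invented for illustration; this is just the shape of a property graph, not any particular product's API.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                       # e.g. "Venue", "City"
    properties: dict = field(default_factory=dict)   # arbitrary key-value pairs

@dataclass
class Relationship:
    start: Node
    end: Node
    type: str                                        # e.g. "LOCATED_IN"
    properties: dict = field(default_factory=dict)   # relationships carry properties too

# Roughly, the Moscone Center panel as a tiny property graph (values invented)
moscone = Node("Venue", {"name": "Moscone Center", "address": "747 Howard St"})
sf = Node("City", {"name": "San Francisco"})
located_in = Relationship(moscone, sf, "LOCATED_IN", {"since": 1981})
```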
That structure is a knowledge graph, and it powered the next decade or so, another 12 years, of Google's dominance. Until a few months ago. A few months ago, at Google I/O, they took the next step, ushered in by the AI Engineer conference a year ago. Well, not quite, but certainly by the entire craze around Gen AI.
And this is one of the examples they showed, the classic travel itinerary: help me plan out this trip. Everyone here in this room knows that this is backed by an LLM. And it is backed by an LLM in combination with that knowledge graph data structure: GraphRAG. This is ushering in the next era of web search, the GraphRAG era.
What I'm going to talk to you about today is, first of all, should you, and if so, how can you use GraphRAG for your own RAG-based applications? So what is GraphRAG, right? It is very, very simple. GraphRAG is RAG where, on the retrieval path, you use a knowledge graph.
Very, very simple. It doesn't mean you only use a knowledge graph, just that you use one, maybe in combination with other technologies like vector search. So let's take the classic example of a customer service bot, right? And let's say that you work at a company that builds Wi-Fi routers, for example.
And you have a bunch of support articles, right? They've been stored in text files. And you're tasked with building a bot that gives either end users or your own customer service agents, your employees, access to this information. And you know how to do this because you live in the LLM world, in the Gen AI world, so you're going to use RAG for this, right?
And so you have that data. It's text documents. You've put that text onto the properties of particular nodes, right? So you have a node per article. But then you've also said, you know what, this article is about this particular Wi-Fi product, so you have a relationship to that product.
And that Wi-Fi product sits in a hierarchy of other Wi-Fi products. And it's written by this particular customer service engineer, you know, that kind of stuff.
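As a concrete sketch, that model might look like the Cypher below, run through the official Neo4j Python driver. The labels (:Article, :Product, :ProductFamily, :Engineer), relationship types, and sample values are hypothetical stand-ins, not taken from the talk's actual demo.

```python
from neo4j import GraphDatabase  # official Neo4j Python driver

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Hypothetical model: one node per support article, linked to the product it
# covers, the product's family, and the engineer who wrote it.
SETUP = """
MERGE (a:Article {id: $article_id})
  SET a.title = $title, a.body = $body            // raw article text lives on the node
MERGE (p:Product {name: $product})
MERGE (f:ProductFamily {name: $family})
MERGE (e:Engineer {name: $author})
MERGE (a)-[:ABOUT]->(p)
MERGE (p)-[:IN_FAMILY]->(f)
MERGE (a)-[:WRITTEN_BY]->(e)
"""

with driver.session() as session:
    session.run(SETUP,
                article_id="kb-42",
                title="Router LED flashing yellow",
                body="If the LED flashes yellow and the connection drops, ...",
                product="X200 Wi-Fi Router",
                family="X-Series",
                author="Dana")
```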
And then the end user has a question: hey, my Wi-Fi lights are flashing yellow and my connection drops, what should I do? Something like that. I think we all know how we do this. We vectorize the question, right? We get a vector embedding back. We use vector search to get the core documents. But here's where the GraphRAG part kicks in. You get those core articles back, which are linked to the nodes.
Actually, the text is on the nodes. But then you use the graph to traverse from there and retrieve more context around it. Maybe it's not just that particular article for that particular Wi-Fi router, but other articles in that product family. Maybe you use the fact that this particular engineer has very highly ranked content.
And then you rank that higher, right? You retrieve more context than what you get out of the ANN-based search from your vector store, and you pass that on to the LLM along with the question. You get an answer back, and you hand it to the user. So the core pattern is actually really, really simple, but really, really powerful.
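Sketched in code, that retrieval step might look like the query below. It assumes Neo4j 5.x's db.index.vector.queryNodes procedure, a vector index named article_embeddings over the article nodes, and the hypothetical schema from the earlier sketch; treat it as an illustration, not the one true implementation.

```python
# Hypothetical GraphRAG retrieval for the Wi-Fi support bot:
#  1) vector search finds the most similar articles,
#  2) a graph traversal pulls in related articles from the same product family
#     plus author info, so the LLM sees more context than ANN search alone.
RETRIEVE = """
CALL db.index.vector.queryNodes('article_embeddings', $k, $question_embedding)
YIELD node AS article, score
MATCH (article)-[:ABOUT]->(:Product)-[:IN_FAMILY]->(f:ProductFamily)
OPTIONAL MATCH (related:Article)-[:ABOUT]->(:Product)-[:IN_FAMILY]->(f)
  WHERE related <> article
OPTIONAL MATCH (article)-[:WRITTEN_BY]->(e:Engineer)
RETURN article.body AS text, score,
       collect(DISTINCT related.body)[..3] AS related_texts,
       e.name AS author
ORDER BY score DESC
"""

def retrieve_context(session, question_embedding, k=5):
    """Return vector hits expanded with graph context, ready to hand to the LLM."""
    result = session.run(RETRIEVE, k=k, question_embedding=question_embedding)
    return [record.data() for record in result]
```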
All right? You start with doing a vector search. I think of this almost as a primary key. It's, of course, not a primary key, but almost like a primary key lookup into the graph. You use that vector search. You get an initial set of nodes. Then you walk the graph, and you expand that and find relevant content based on the structure of the graph.
Then you take that and you return it to the LLM. Or, optionally, maybe that gives you 1,000 or 10,000 nodes back, and then you do what Google did: you rank it. You take the top K based on the structure of the graph. Maybe you even use PageRank, right?
You take that and pass it on to the LLM. It's really, really simple, but really, really powerful. And then there are a number of more advanced patterns, more sophisticated graph retrieval patterns, but that's kind of the next talk, the one I'll do in a year. The core one is very, very simple.
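That optional ranking step can be as simple as scoring the expanded candidates by how well connected they are before taking the top K. A minimal sketch, where the weights and fields are my own invented heuristic, not Google's PageRank (though you could swap a PageRank run over the retrieved subgraph into the same slot):

```python
def rank_candidates(candidates, top_k=10):
    """Rank graph-expanded candidates with a simple structural heuristic.

    `candidates` is assumed to be a list of dicts like
    {"text": ..., "vector_score": ..., "degree": ..., "author_rating": ...}
    produced by the graph expansion step. The weights are arbitrary illustrations.
    """
    def score(c):
        return (0.6 * c["vector_score"]               # how similar the chunk was
                + 0.3 * min(c["degree"], 20) / 20     # how connected the node is
                + 0.1 * c.get("author_rating", 0.0))  # e.g. highly ranked engineer
    return sorted(candidates, key=score, reverse=True)[:top_k]
```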
Okay. So if that's what GraphRAG is, what are the benefits? When should you use it? When should you not use it? The first and most stark benefit is accuracy, which is directly correlated with the quality of the answer. There have been a ton of research articles about this in the last six months or so.
I believe the first one was this one by data.world, a data cataloging company built on a knowledge graph; I just picked out three here that I like. They showed, across I think 43 different questions, that on average the response quality, the accuracy, was three times higher when they used a knowledge graph in combination with vector search.
I love this paper by LinkedIn. It shows a very similar result, I think a 75% or 77% increase in accuracy, but it also has a great architecture view. You can take the QR code right there and look at the paper; it lays out the various components and the flow through them in a way I thought was really pedagogical.
But by and large it shows the same thing, with somewhat different numbers: significantly higher accuracy when a graph is used in combination with vector search. And then Microsoft had a fantastic blog post and subsequently, I think, two academic papers. The blog post was in February of this year, and it also talks about the increased quality of responses.
But beyond that, it says, hey, you know what, GraphRAG enables us to answer another important class of questions that we couldn't even address with baseline vector search or baseline RAG alone. So, first benefit: higher-quality responses. The second one is easier development, and this one is a little interesting because there's an asterisk on it.
Because what we hear very clearly from our users is that it's easier to build RAG applications with GraphRAG compared to baseline RAG. But we also hear that it's actually hard. What's the nuance? The nuance is whether you already have a knowledge graph up and running.
So there's a learning curve: people need to learn how to create the knowledge graph in the first place. Once you have that, it's a lot easier. But how do you create that knowledge graph? Let's put a little pin in that; if I rush through the next few slides quickly enough, I'm hopefully going to show you a demo of exactly that.
But let's put a pin in it for now. So here's an example. This is from a very high-growth fintech company that is very cutting-edge in AI. They started playing around with GraphRAG about six months ago, took an existing application, and said, you know what?
We are going to port this from a vector database to Neo4j. And most of the operations yielded a better result. They can calculate the embeddings at the database level. Getting related actions is as simple as following the relationships between nodes. And this one I love: the cache, and the cache here is their application, they call it the cache, can be visualized.
This is an extremely valuable debugging tool. And in the parentheses they actually say that they've already fixed a couple of bugs just thanks to this. Right? Amazing. Once you've been able to create that graph, it's a lot easier to build your RAG application. And why is that? Right?
So let's talk a little bit about representation. Let's say we have the phrase "apples and oranges are both fruit," and we want to represent it in vector space and in graph space. In graph space, we already talked about this: apple is a fruit, orange is a fruit.
Pretty easy. That's the representation in graph space. In vector space, it looks like this. Maybe. Or maybe it's something else entirely; we actually don't know. Two different ways of representing the same phrase. And then we can run similarity calculations in different ways using both of these representations, which I'm not going to go through right now.
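For the curious, here's a toy side-by-side of the two representations. The embedding values are made-up stand-ins for whatever your embedding model would actually produce; the point is only the contrast between explicit facts and opaque floats.

```python
import numpy as np

# Graph-space representation: explicit, human-readable facts.
graph = [("Apple", "IS_A", "Fruit"),
         ("Orange", "IS_A", "Fruit")]

# Vector-space representation: opaque arrays of floats (values invented here;
# a real model gives you hundreds or thousands of dimensions).
apple = np.array([0.12, -0.83, 0.44, 0.07])
orange = np.array([0.10, -0.79, 0.51, 0.11])

# Similarity in vector space: cosine similarity between embeddings.
cosine = apple @ orange / (np.linalg.norm(apple) * np.linalg.norm(orange))

# Similarity in graph space: do the two nodes share a neighbor?
shared = ({o for s, _, o in graph if s == "Apple"}
          & {o for s, _, o in graph if s == "Orange"})

print(cosine)  # a single opaque number close to 1.0
print(shared)  # the explicit, inspectable answer: {'Fruit'}
```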
We can search in different ways. These are not competing approaches; they're complementary. Right? One is not better than the other. Except I will make one statement, which is that when you sit down and build your application, one of them is actually superior.
This vector-space representation is completely opaque to a human being. But the graph representation is very, very clear. It is explicit. It's deterministic. It's visual. You can see it. You can touch it as you build your applications. This is the "we've already fixed a couple of bugs thanks to this."
Just by porting from a vector-only store to GraphRAG, they were able to see and work with the data. And that is really freaking powerful. That shows up in development time as you're building your applications. It also shows up for our friends in IT, who worry about things that maybe aren't directly related to building the application: explainability, auditability, governance.
That explicit data structure has knock-on effects over there that are really, really powerful. Once you're up and running in production, you need to be able to explain why something happened. So, higher accuracy, better answers, easier to build once you're through the hump of creating the knowledge graph, and then increased explainability and governance for IT and the business, right?
Those are the three things. So, how do you get started with GraphRAG? Well, I've talked a lot about this already: how do you create the knowledge graph in the first place? A little bit of nuance here. Basically, there are three types of data out in the world that I care about when I think about knowledge graph creation.
The first one is structured data: your data in Snowflake or Postgres or something like that. The second is unstructured data: PDF files, raw text from a web page. And the third one is mixed. People tend to call this semi-structured, but it's not.
Hit me up afterwards and I'll tell you why it's not. But basically, this is structured data where some of the fields are long-form text, right? In the graph world, we're great at the first bucket. It's very easy to go from Snowflake or Postgres or MySQL or Oracle into a property graph model.
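As a rough sketch of that first bucket, here's one way to move rows from a relational table into a property graph: each row becomes a node, each foreign key becomes a relationship. The table, columns, labels, and connection strings are all invented for illustration.

```python
import psycopg2                      # hypothetical Postgres source
from neo4j import GraphDatabase      # property graph target

pg = psycopg2.connect("dbname=shop user=postgres")
neo = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

with pg.cursor() as cur, neo.session() as session:
    # Hypothetical relational table: orders(id, customer_id, total)
    cur.execute("SELECT id, customer_id, total FROM orders")
    for order_id, customer_id, total in cur.fetchall():
        # Row -> node, foreign key -> relationship.
        session.run(
            """
            MERGE (c:Customer {id: $cid})
            MERGE (o:Order {id: $oid}) SET o.total = $total
            MERGE (c)-[:PLACED]->(o)
            """,
            cid=customer_id, oid=order_id, total=total)
```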
The unstructured one is really freaking hard, right? It's hard to do in theory, and it's also had immature tooling for a long time. The middle one is actually where the majority of, at least, enterprise production use cases are in the real world. So, man, two and a half minutes. This is rough.
There are two types of graphs that I'm not going to talk about, even though I want to: lexical graphs and domain graphs. The distinction is actually really relevant, but I really want to get to this demo. So, I've talked about creating graphs from unstructured information. We just built this new tool, launched a few weeks ago, called the Knowledge Graph Builder.
And you see it here. Can you see the screen? Okay. So, basically, here you can drag and drop your PDF files. You can put in YouTube links, Wikipedia links. You can point it to your cloud storage buckets, right? And it's going to extract the data from there and create the graph.
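Under the hood, a tool like this boils down to a pipeline you could sketch yourself: chunk the text, ask an LLM to extract entities and relationships from each chunk, and write both the lexical layer (documents and chunks) and the extracted entities into the graph. Here's a heavily simplified, hypothetical version; the prompt, model name, and labels are mine, not the Knowledge Graph Builder's actual implementation.

```python
import json
from openai import OpenAI
from neo4j import GraphDatabase

llm = OpenAI()
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

def extract_triples(chunk: str) -> list[dict]:
    """Ask an LLM for (subject, relation, object) triples as JSON."""
    resp = llm.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model would do
        messages=[{"role": "user", "content":
                   "Extract entities and relationships from the text below. Respond "
                   'with a JSON object {"triples": [{"subject": ..., "relation": ..., '
                   '"object": ...}, ...]}.'
                   "\n\n" + chunk}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content).get("triples", [])

def write_chunk(session, doc_id: str, chunk: str):
    # Lexical layer: document -> chunk. Domain layer: extracted entities.
    session.run("MERGE (d:Document {id: $d}) "
                "MERGE (c:Chunk {text: $t}) "
                "MERGE (d)-[:HAS_CHUNK]->(c)", d=doc_id, t=chunk)
    for t in extract_triples(chunk):
        session.run("MERGE (a:Entity {name: $s}) "
                    "MERGE (b:Entity {name: $o}) "
                    "MERGE (a)-[:RELATED {type: $r}]->(b)",
                    s=t["subject"], o=t["object"], r=t["relation"])
```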
So, here I added a few things. I added a PDF of Andrew Ng's newsletter, The Batch. I added the Wikipedia page for OpenAI. And I added the YouTube video of swyx and Alessio's Four Wars episode of the Latent Space podcast. I added all that and uploaded it into this Knowledge Graph Builder.
And when I do that, it creates -- let's see here. I knew the Ethernet connection was going to do this. It automagically created a little knowledge graph. If it renders -- wait for it. It says one minute here, so it had better render pretty soon. All right. Let me do this again.
Please work. And I was like trying to keep it alive in the -- in the thing, too. All right. Okay. Let's see. I think we are here. And then it says show me a graph. And it's not going to show me the graph. Oh, yeah, it will. Come on.
You can do it. All right. Yes. So, what we have here -- check this shit out. I would love to sit here and just drink in your applause, but we need to look at this data. So, check this out. This is the document, the Four Wars document. Here are the various chunks.
And then you can take a chunk and expand it. This one, I put in the embedding. And you can -- I'll zoom out here -- you can see that it takes the logical concept elements out of that chunk. Right? Machine learning. They talk about something that is developed in a similar fashion.
I don't even know, there's some company in there, right? And you get that entire graph of all this information. On top of that -- I really don't have time to show it -- there's also a chatbot in here that you can use.
And you can introspect the results it gives back. One more second: take out your phones. If you think this looks cool, take a photo of this QR code, and you'll get an amazing landing page where you have access to all of this information and can get up and running yourself.
Thank you for the additional minutes. Thanks, everyone, for paying attention. Thank you, Emil. Thanks, everyone.