
Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j


Transcript

We are talking about GraphRAG today. That's the GraphRAG track, of course. And we want to look at patterns for successful GraphRAG applications, for making LLMs a little bit smarter by putting a knowledge graph into the picture. My name is Michael Hunger. I lead developer relations at Neo4j. Actually, we are both co-authoring.

This is fun because we are both already authors, and we have been friends for years. We are co-authoring GraphRAG: The Definitive Guide for O'Reilly. Basically, we didn't sleep this past weekend because we had a book deadline. I am going to talk a little bit, at a high level, about what GraphRAG is, why it is important, and what we are seeing in the media.

And then Michael is going to drill down into all of the details and patterns and give you a bunch of takeaways and things you can do. If you want to know how to do GraphRAG, Michael's quick deep dive on this is probably the best introduction you can get.

So, I'm also excited. Awesome. Okay. So, the case for GraphRAG is where we are going to start. The challenge with using LLMs, and with other patterns for this, is basically that they don't have the enterprise domain knowledge. They don't verify or explain the answers. They are subject to hallucinations.

And they have ethical and data bias concerns. You can see that very much in our friendly parrot here: LLMs behave and act like parrots in all of those ways, except they're not a cute bird. So, we want to do better than this with GraphRAG and figure out how we can use domain-specific knowledge to get accurate, contextual, and explainable answers.

And really, I think what a lot of companies and what the industry are figuring out is that it's really a data problem. You need good data. You need data you can power your system with. One of the patterns for doing this is RAG: you can stick your external data into a RAG system.

You can get data back from a database as part of that pattern. But vector databases and plain RAG fall short, because they lack your full data set. They only pull back a fraction of the information via vector-similarity algorithms. And a lot of the modern vector databases everyone is using are easy to start with, but they're not robust.

They're not mature. They don't have the scalability and fallback you need to build a strong, robust enterprise system. And vector similarity is not the same as relevance. So, the results you get back from a basic RAG system are related to the topic, but they're not complete, and typically they're also not very relevant.

And then it's very hard to explain what's coming out of the system. So, we need an answer. A lifeline. Yeah: GraphRAG. And what GraphRAG is, is bringing the knowledge and the context and the environment to what LLMs are good at. You can think of this kind of like the human brain.

Our right brain is the more creative side: it builds things, it extrapolates information. Whereas our left brain is the logical part: that's what actually does reasoning, has facts, and can enrich data. And that side is built off of knowledge graphs. So, a knowledge graph is a collection of nodes, relationships, and properties.

Here's a really simple example of a knowledge graph where you have two people, they live together, and you have a car. But when you look into the details, it's actually a little bit more complex than it seems at first, because they both have a car, but the owner of the car is not the person who drives it.

This is kind of like my family. My wife does all the bills, but then she hands me the keys whenever we get on the freeway. She hates driving. So, knowledge graphs are also a great way of capturing really rich data. Here's an example of Stack Overflow data built into a knowledge graph, where you can see all of the rich metadata and the complexity of the results.

And we can use this to evolve RAG into a more complete system, basically GraphRAG, where we get better relevancy: we're getting more relevant results, and we get more context, because now we can actually pull back all of the related information by graph closeness algorithms. We can explain what's going on, because it's no longer just vectors, no longer statistical probabilities coming out of a vector database.

We actually have nodes, we have structure, we have semantics we can look at, and we can add security and role-based access on top of this. So, it's context-rich and grounded. This gives us a lot of power, and it gives us the ability to start explaining what we're doing: we can visualize it, we can analyze it, and we can log all of this.

Now, this is one of the initial papers, the GraphRAG paper from Microsoft Research, where they showed that you could actually get not only better results but lower token costs; it was actually less expensive to do a GraphRAG approach. There have been a lot of papers since then showing all of the different research and interesting work going on in the GraphRAG area.

This is just a quick view of the different studies and results that are coming out, even from the early data.world study, which showed a three-times improvement with GraphRAG. And the analysts are showing how GraphRAG is trending up. This is the Gartner hype cycle from 2024, and you can see generative AI is kind of on the downtrend.

RAG is getting over the hump, but GraphRAG and a bunch of these things are actually breathing more life into the AI ecosystem. So, a lot of great reports from Gartner showing that it's grounded in facts, that it resolves hallucinations, and that together knowledge graphs and AI are solving these problems. And it's getting a lot of adoption by industry leaders, by big organizations who are taking advantage of this, producing production applications, and making it work. Like LinkedIn customer support, where they wrote a great research paper showing that using a knowledge graph for customer support scenarios gave them better results and allowed them to improve the quality and reduce the response time for getting back to customers.

Median per-issue resolution time was reduced by 28.6%. I mentioned the data.world study, which was basically a comparison of doing RAG on SQL versus RAG on graph databases, and they showed a three-times improvement in the accuracy of LLM responses. And let's chat about patterns, Michael, because I think everyone's here to learn how to do this.

Exactly. So, let's look at how to do this, actually, right? If you look at GraphRAG, there are actually two sides to the coin. One is, of course, you don't start in a vacuum; you have to create your knowledge graph first. As you'll see, there are basically multiple steps to get there.

Initially, you take unstructured information, you structure it, and you put it into a lexical graph, which represents documents, chunks, and their relationships. In a second step, you can then extract entities, using, for instance, LLMs with a graph schema, to pull entities and their relationships out of those chunks. And in a third phase, you enrich this graph, for instance with graph algorithms, doing things like PageRank, community summarization, and so on.

And then, when you have this built-up knowledge graph, you do GraphRAG as the search mechanism, either with local search or global search and other approaches, right? So, let's first look at the first phase, knowledge graph construction, a little bit. As always in data engineering, if you want higher quality outputs, you have to put in more effort at the beginning, right?

So, basically nothing comes for free; there's no free lunch after all. But what you do at the beginning pays off multiple times, because what you get out of your unstructured documents is actually high-quality, highly structured information, which you can then use to extract contextual information for your queries, which allows rich retrieval at the end.

So, after seeing GraphRAG being used by a number of users and customers, and after looking at research papers, we saw a number of patterns emerging in terms of how we structure these graphs, how we query these graphs, and so on. And so, we started to collect these patterns and put them on GraphRAG.com.

And I wanted to show what this looks like. So, we have, basically, example graphs in each pattern; the pattern has a name, description, and context, and we also show queries that are used for extracting this information, right? So, for instance, here's a mix of a lexical graph and a domain graph, and then we have the query that fetches this information.

Let's look at the three steps in a little bit more detail on the graph model side. On one side, for lexical graphs, you represent documents and their elements. That could be something as simple as a chunk. But if you have structured elements in documents, you can also do something like: I have a book, which has chapters, which have sections, which have paragraphs, where the paragraph is the semantically cohesive unit that you would, for instance, create a vector embedding from, which you can use later for vector search.

But what's really interesting in the graph is that you can connect all these things up, right? You know exactly which chunk is the predecessor, which is the successor, and which element is the parent of another. And using something like vector or text similarity, you can also connect these chunks into a neighbour or similarity graph, where you basically store similarities between chunks as relationships with a weighted score for how similar two chunks are.

And then you can use all these relationships when you extract the context in the retrieval phase, to, for instance, find related chunks by document, by temporal sequence, by similarity, and other things. Right? So, that's only the lexical side, and it looks like this. For instance, you have an RFP and you want to break it up in a structured way: you create the relationships between these chunks or subsections, add the text, do the vector embeddings, and then you do it at scale, and you get a full lexical graph out of that.
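As a rough illustration of that lexical graph, here is a minimal sketch using the official Neo4j Python driver; the labels and relationship types (Document, Chunk, PART_OF, NEXT_CHUNK) are illustrative assumptions, not the exact schema of any particular tool.

```python
# Minimal sketch: building a tiny lexical graph with the official Neo4j Python driver.
# Labels and relationship types here are illustrative assumptions.
from neo4j import GraphDatabase

URI, AUTH = "neo4j://localhost:7687", ("neo4j", "password")

chunks = [
    {"id": 0, "text": "Section 1 of the RFP ...", "embedding": [0.1, 0.20, 0.30]},
    {"id": 1, "text": "Section 2 of the RFP ...", "embedding": [0.1, 0.25, 0.28]},
]

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    # Create the document node and its chunks, linked by PART_OF.
    driver.execute_query(
        """
        MERGE (d:Document {name: $doc})
        WITH d
        UNWIND $chunks AS c
        MERGE (ch:Chunk {id: c.id})
          SET ch.text = c.text, ch.embedding = c.embedding
        MERGE (ch)-[:PART_OF]->(d)
        """,
        doc="rfp.pdf", chunks=chunks,
    )
    # Link chunks in reading order so retrieval can walk predecessors/successors.
    driver.execute_query(
        """
        MATCH (ch:Chunk)
        WITH ch ORDER BY ch.id
        WITH collect(ch) AS cs
        UNWIND range(0, size(cs) - 2) AS i
        WITH cs[i] AS a, cs[i + 1] AS b
        MERGE (a)-[:NEXT_CHUNK]->(b)
        """
    )
```

A similarity graph would be added the same way: compute pairwise scores between chunk embeddings and MERGE a weighted relationship (for example SIMILAR with a score property) between the closest pairs.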

The next phase is entity extraction, which is also something that has been around for quite some time with NLP, but LLMs take this to the next level with their multi-language understanding, their high flexibility, and their good language skills for extraction. So, you basically provide a graph schema and an instruction prompt to the LLM, plus your pieces of text.

Now, with large context windows, you can put in 10,000 tokens for extraction. You can also put in already existing ground truth. So, for instance, if you have existing structured data where your entities, let's say products, or genes, or partners, or clients, already exist, then you can pass these in as part of the prompt, so that the LLM doesn't do a pure extraction, but more a recognition-and-matching approach: it finds your entities, and then you extract relationships between them, and you can store additional facts and information on the relationships and entities as well.
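To make that concrete, here is a small sketch of schema-guided extraction with an LLM; the prompt wording, the JSON shape, and the use of the OpenAI client are my own illustrative choices, not the exact prompt from the talk.

```python
# Sketch of schema-guided entity/relationship extraction from one chunk of text.
# The prompt wording and the JSON shape are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

schema = {
    "nodes": ["Person", "Company", "Technology", "Location"],
    "relationships": ["WORKS_FOR", "DEVELOPS", "LOCATED_IN"],
}

chunk_text = "Demis Hassabis co-founded DeepMind, which developed AlphaFold in London."

prompt = f"""Extract entities and relationships from the text.
Only use these node labels: {schema['nodes']}
Only use these relationship types: {schema['relationships']}
Return JSON: {{"nodes": [{{"id": ..., "label": ...}}], "relationships": [{{"start": ..., "type": ..., "end": ...}}]}}

Text: {chunk_text}"""

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)
graph_doc = json.loads(resp.choices[0].message.content)
print(graph_doc["nodes"], graph_doc["relationships"])
```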

So, basically, in the first part you have the lexical graph, which represents the document structure, and in the second part you extract the relevant entities and their relationships. You can also connect this to an existing knowledge graph. Imagine you have a CRM where you already have customers, clients, and leads in your knowledge graph, but you want to enrich this with, for instance, protocols from call transcripts; then you connect the extraction to your existing structured data as well.
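A minimal sketch of what that connecting step might look like, assuming extracted entities are matched to existing nodes on a simple name property (real entity resolution would be more careful than this):

```python
# Sketch: merging extracted entities onto an existing knowledge graph instead of
# creating duplicates. Matching on a plain `name` property is an assumption; in
# practice you'd resolve entities more carefully (ids, embeddings, fuzzy matching).
from neo4j import GraphDatabase

URI, AUTH = "neo4j://localhost:7687", ("neo4j", "password")

extracted = [
    {"name": "Acme Corp", "chunk_id": 0},
    {"name": "Jane Doe", "chunk_id": 0},
]

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.execute_query(
        """
        UNWIND $entities AS e
        // Reuse the existing CRM node if it is already there, otherwise create it.
        MERGE (n:Entity {name: e.name})
          ON CREATE SET n.source = 'extraction'
        WITH n, e
        MATCH (c:Chunk {id: e.chunk_id})
        MERGE (c)-[:MENTIONS]->(n)
        """,
        entities=extracted,
    )
```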

So, that's also a possibility. And then in the next phase, you can run graph algorithms for enrichment, which can, for instance, do clustering on the entity graph to generate something like communities, where an LLM can then generate summaries across them.

Especially that last one is interesting, because what you identify are actually cross-document topics, right? Each document is basically a vertical representation of information, but this looks at which topics recur across many different documents, so you find these kinds of topic clusters across documents as well.
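As a sketch of that enrichment step, this is roughly what community detection could look like with the Neo4j Graph Data Science library; it assumes the GDS plugin is installed and that the entity graph uses a RELATES_TO relationship type, both of which are assumptions for illustration.

```python
# Sketch: enriching the entity graph with communities using Neo4j Graph Data Science.
# Assumes the GDS plugin is installed; the RELATES_TO type is an illustrative assumption.
from neo4j import GraphDatabase

URI, AUTH = "neo4j://localhost:7687", ("neo4j", "password")

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    # Project the entity graph into GDS's in-memory graph catalog.
    driver.execute_query(
        "CALL gds.graph.project('entities', 'Entity', 'RELATES_TO')"
    )
    # Run Louvain clustering and write a community id back onto each node.
    records, _, _ = driver.execute_query(
        "CALL gds.louvain.write('entities', {writeProperty: 'community'}) "
        "YIELD communityCount"
    )
    print(records[0]["communityCount"])
    # An LLM pass over the text linked to each community can then produce the
    # cross-document topic summaries described above.
```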

Cool. So, let's look at the second phase, the search phase, which is basically the retrieval part of RAG. What we see here is that in a GraphRAG retriever, you don't just do a simple vector lookup to get results back; instead, you do an initial index search.

It could be vector search, full-text search, hybrid search, spatial search, or other kinds of searches to find the entry points in your graph. And then, as you can see here, starting from these entry points, you follow the relationships up to a certain degree, or up to a certain relevancy, to fetch additional context.

And this context can come from the user question. It can also be external user context that comes in: for instance, when someone from, let's say, your finance department is looking at your data, you return different information than when someone from the engineering department is looking at it, right?

So, you also take this external context into account for how much and which context you retrieve. And then you return it to the LLM to generate the answer: not just text fragments, like you would in vector search, but also a more complete subset of the contextual graph.
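Here is a hedged sketch of such a retrieval query: a vector index search for entry points, followed by a graph expansion. The index name and relationship types are assumptions carried over from the earlier sketches.

```python
# Sketch of a GraphRAG retrieval query: vector search for entry points, then a
# graph expansion to pull in neighbouring chunks and mentioned entities.
# The index name 'chunk_embeddings' and the relationship types are assumptions.
from neo4j import GraphDatabase

URI, AUTH = "neo4j://localhost:7687", ("neo4j", "password")
question_embedding = [0.1, 0.2, 0.3]  # embed the user question with your embedder

retrieval_query = """
CALL db.index.vector.queryNodes('chunk_embeddings', 5, $embedding)
YIELD node AS chunk, score
// Expand from the entry points: neighbouring chunks and mentioned entities.
OPTIONAL MATCH (chunk)-[:NEXT_CHUNK]-(neighbour:Chunk)
OPTIONAL MATCH (chunk)-[:MENTIONS]->(e:Entity)-[r]-(other:Entity)
RETURN chunk.text AS text, score,
       collect(DISTINCT neighbour.text) AS neighbours,
       collect(DISTINCT e.name + ' ' + type(r) + ' ' + other.name) AS facts
ORDER BY score DESC
"""

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    records, _, _ = driver.execute_query(retrieval_query, embedding=question_embedding)
    context = "\n".join(f"{r['text']}\n{r['facts']}" for r in records)
    # `context` is what gets passed to the LLM together with the user question.
```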

And modern LLMs are actually trained more on graph processing as well, so they can deal with these additional structures, node-relationship-node patterns, that you provide as additional context to the LLM. And then, of course, as I mentioned, you can enrich it using graph algorithms.

So, basically, you can do things like clustering, link prediction, PageRank, and other things to enrich your data. Cool. Let's look at some practical examples. We don't have too much time left. One is knowledge graph construction from unstructured sources. There are a number of libraries; you've already heard about some today from people that do these kinds of things.

So, one thing that we built is a tool that allows you to take PDFs, YouTube transcripts, local documents, web articles, and Wikipedia articles, and it extracts your data into a graph. Let me just switch over to the demo here. So, this is the tool. I uploaded information from different Wikipedia pages, YouTube videos, articles, and so on.

And here's, for instance, a Google DeepMind extraction. You can use a lot of different LLMs here. And then, if you want to, you can also provide a graph schema under graph enhancement. So you can say, for instance, a Person works for a Company, and add these patterns to your schema.

The LLM then uses this information to drive the extraction as well. And if you look at the data that has been extracted for DeepMind, it is this one here. We can actually see two aspects from the Wikipedia article. One is the document with the chunks, which is this part of the graph, right?

And the second part is the entities that have been extracted from this article as well. So, you see the connected knowledge graph of entities, which are companies, locations, people, and technologies. It followed our schema to extract this. And then, if I want to run GraphRAG, you have a number of different retrievers here.

So, we have the vector retriever, graph plus vector, full text, and others that you can select. All of this is also an open source project, so you can just go to GitHub and have a look at it. And I just ran this before, because the internet is not so reliable here.

So: what has DeepMind worked on? And I get a detailed explanation. And then, if I want to, I can look at the details here. It shows me which sources it used: a folder, Wikipedia, another PDF. I see the chunks that have been used, which is basically the full-text and hybrid search.

But then, I also see which entities it used from the graph. So, from an explainability perspective, I can really see: these are the entities that have been retrieved by the GraphRAG retriever and passed to the LLM, in addition to the text that's connected to these entities. So, it gets a richer response.

And then, you can also run evals on that as well. While I'm on the screen, let me just show you another thing that we worked on, which is more of an agentic approach, where you basically put these individual retrievers into a configuration of domain-specific retrievers that run individual Cypher queries.

So, for instance, if you look at, let's say, this one, it has the query here, and basically a tool with inputs and a description. And then you can have an agentic loop using these tools, basically doing GraphRAG with each individual tool, taking the responses, and then doing deeper tool calls.

I'll show you a deeper example in a minute. So, this is basically what I showed you. This is all available as open source libraries; you can use it yourself from Python as well. And what's interesting in the agentic approach is that you don't just use vector search to retrieve your data; you break down the user question into individual tasks, extract parameters, and run these individual tools, which then run in sequence or in a loop to return the data, and you get these outputs back.
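A minimal sketch of what such a retriever-as-tool setup could look like; the tool name, parameters, and Cypher query are invented for illustration, and a real agent loop would hand these tool definitions to an LLM with tool calling.

```python
# Sketch of domain-specific retrievers exposed as tools for an agent loop.
# Tool names, parameters, and the Cypher query are illustrative assumptions.
from neo4j import GraphDatabase

URI, AUTH = "neo4j://localhost:7687", ("neo4j", "password")
driver = GraphDatabase.driver(URI, auth=AUTH)

def company_technologies(company_name: str) -> list[str]:
    """Retriever tool: which technologies has a given company worked on?"""
    records, _, _ = driver.execute_query(
        """
        MATCH (c:Company {name: $name})-[:DEVELOPS]->(t:Technology)
        RETURN t.name AS tech
        """,
        name=company_name,
    )
    return [r["tech"] for r in records]

# Tool descriptions an LLM with tool calling could choose between; the agent loop
# extracts parameters from the user question, calls a tool, inspects the result,
# and decides whether to make deeper follow-up calls.
TOOLS = {
    "company_technologies": {
        "description": "List technologies developed by a given company.",
        "parameters": {"company_name": "string"},
        "function": company_technologies,
    },
}
```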

And basically, for each of these things, individual tools are called and used here. The last thing I want to show is the GraphRAG Python package, which basically encapsulates all of this, construction and retrieval, into one package. So, you can build a knowledge graph, you can implement retrievers, and you can create pipelines here.
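For orientation, here is a rough sketch of the retrieval side with the neo4j-graphrag Python package, following its quickstart as I recall it; class and parameter names may differ, so check the package documentation rather than treating this as exact.

```python
# Rough sketch of retrieval with the neo4j-graphrag package; class and parameter
# names follow the package docs as I recall them and may differ in your version.
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.retrievers import VectorRetriever
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.generation import GraphRAG

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Retriever over the chunk vector index; the index name is an assumption.
retriever = VectorRetriever(
    driver,
    index_name="chunk_embeddings",
    embedder=OpenAIEmbeddings(model="text-embedding-3-small"),
)
rag = GraphRAG(retriever=retriever, llm=OpenAILLM(model_name="gpt-4o"))

response = rag.search(
    query_text="What has DeepMind worked on?",
    retriever_config={"top_k": 5},
)
print(response.answer)
```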

And here's an example where I pass in PDFs plus a graph schema, and it runs the import into Neo4j. And then I can visualize the data later on in a Python notebook. And with that, I leave you with one last takeaway: on GraphRAG.com you'll find all of these resources and a lot of the patterns, and we'd love to have contributions and to talk more.

I'm outside at the booth if you have more questions. Yeah. So, that was great. And I think you're getting it all from the expert behind a lot of the tooling. Actually, Michael's team builds a lot of the tools, like the Knowledge Graph Builder. Very excited you all came to the GraphRAG track, and I hope to chat with you all more.

If you have questions for me and Michael, just meet us in the Neo4j booth across the way. Thank you. Thank you.