Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j

We are talking about GraphRAG today; that's the GraphRAG track, of course. And we want to look at patterns for successful GraphRAG applications, for making LLMs a little bit smarter by putting a knowledge graph into the picture. My name is Michael Hunger. I lead developer relations at Neo4j. We are actually co-authoring a book together, which is fun because we are both already authors, and we have been friends for years. We are co-authoring GraphRAG: The Definitive Guide for O'Reilly. Basically, we didn't sleep this past weekend because we had a book deadline.

I am going to talk at a high level about what GraphRAG is, why it is important, and what we are seeing in the media. Then Michael is going to drill down into all of the details and patterns and give you a bunch of takeaways and things you can do. If you want to know how to do GraphRAG, Michael's quick deep dive on this is probably the best introduction you can get. So I'm also excited. Awesome. Let's get going.
Okay. So the case for GraphRAG is where we are going to start. The challenge with using LLMs, and with the other patterns for this, is basically that they don't have your enterprise domain knowledge. They don't verify or explain their answers. They are subject to hallucinations. And they have ethical and data-bias concerns. They are very much like our friendly parrot here: they behave and act like a parrot in all of those ways, except for being a cute bird.

So we want to do better than this with GraphRAG and figure out how we can use domain-specific knowledge to get accurate, contextual, and explainable answers. And really, I think what a lot of companies and the industry are figuring out is that it's a data problem. You need good data; you need data you can power your system with. One of the patterns for this is RAG: you stick your external data into a RAG system and pull relevant pieces back from a database at query time. But vector databases and basic RAG fall short, because they only see a fraction of your full data set, pulled back by vector-similarity algorithms. And a lot of the modern vector databases that everyone is using are easy to start with, but they're not robust, they're not mature; they don't give you the scalability and the fallbacks you need to build a strong, robust enterprise system. On top of that, vector similarity is not the same as relevance: the results you get back from a basic RAG system are related to the topic, but they're not complete, and they're typically not very relevant either. And then it's very hard to explain what's coming out of the system. So we need a lifeline.
Yeah: GraphRAG. What GraphRAG does is bring the knowledge, the context, and the environment to what LLMs are good at. You can think of it kind of like the human brain. Our right brain is the more creative side: it builds things and extrapolates from information. Our left brain is the logical part: that's what actually does reasoning, holds facts, and can enrich data. And that side is built on knowledge graphs.

A knowledge graph is a collection of nodes, relationships, and properties. Here's a really simple example of a knowledge graph with two people who live together and a car. When you look into the details, it's actually a little more complex than it seems at first, because they both have the car, but the owner of the car is not the person who drives it. This is kind of like my family: my wife does all the bills, but then she hands me the keys whenever we get on the freeway. She hates driving.
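To make that concrete, here is a minimal sketch of how such a graph could be created with the Neo4j Python driver; the names, labels, and relationship types are illustrative, not the exact ones from the slide.

```python
from neo4j import GraphDatabase

# Connection details are placeholders; point them at your own instance.
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Two people live together and share a car, but the person who owns it
# is not the person who drives it.
driver.execute_query(
    """
    MERGE (owner:Person {name: $owner_name})
    MERGE (driver:Person {name: $driver_name})
    MERGE (owner)-[:LIVES_WITH]->(driver)
    MERGE (car:Car {brand: 'Volvo'})
    MERGE (owner)-[:OWNS]->(car)
    MERGE (driver)-[:DRIVES]->(car)
    """,
    owner_name="Ann", driver_name="Dan",
)
driver.close()
```

Modeling OWNS and DRIVES as separate relationships is exactly the kind of nuance that gets lost in flat text but is explicit in a graph.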
Knowledge graphs are also a great way of capturing really rich data. Here's an example of the Stack Overflow data modeled as a knowledge graph, where you can see all of the rich metadata and the complexity of the connections. And we can use this to evolve RAG into something more capable, graph RAG, where we get better relevancy: we're getting more relevant results, and we get more context, because now we can actually pull back all of the related information by traversing the graph. We can explain what's going on, because it's no longer just vectors, no longer statistical probabilities coming out of a vector database: we have nodes, we have structure, we have semantics we can look at, and we can add security and role-based access on top of it. So it's context-rich and it's grounded. This gives us a lot of power, and it gives us the ability to start explaining what we're doing: we can visualize it, we can analyze it, and we can log all of it.
Now, this is one of the initial papers, the GraphRAG paper from Microsoft Research, where they showed that you could actually get not only better results but also lower token costs: it was actually less expensive to run a graph RAG approach. There have been a lot of papers since then showing all of the different research and interesting work going on in the graph RAG area. This is just a quick view of the different studies and results that are coming out, going back to the early data.world study, which showed a three-times improvement from graph RAG.

The analysts are also showing graph RAG trending up. This is the Gartner hype cycle from 2024, and you can see generative AI is kind of on the downtrend, RAG is getting over the hump, but graph RAG and a bunch of related techniques are providing and breathing more life into the AI ecosystem. There are a lot of good reports from Gartner showing that it's grounded in facts and that, together, knowledge graphs and AI resolve hallucinations and solve these problems. And it's getting a lot of adoption by industry leaders, by big organizations that are taking advantage of this and actually shipping production applications. LinkedIn customer support is one example: they wrote a great research paper showing that using a knowledge graph for customer support scenarios gave them better results, improved quality, and reduced the response time for getting back to customers; the median per-issue resolution time was reduced by 28.6%. I also mentioned the data.world study, which compared doing RAG on SQL versus RAG on a graph database and showed a three-times improvement in the accuracy of LLM responses. And with that, let's chat about patterns, Michael, because I think everyone's here to learn how to do this.
Exactly. So let's look at how to actually do this, right? If you look at GraphRAG, there are two sides to the coin. One, of course: you don't start in a vacuum; you have to create your knowledge graph first. And as you can see, there are basically multiple steps to get there. Initially, you take unstructured information, you structure it, and you put it into a lexical graph, which represents documents, chunks, and their relationships. In a second step, you can then extract entities and their relationships from that text, using, for instance, LLMs guided by a graph schema. And in a third phase, you enrich this graph, for instance with graph algorithms, doing things like PageRank, community summarization, and so on. Then, when you have this built-up knowledge graph, you do GraphRAG as the search mechanism, with local search, global search, or other approaches.

So let's first look at the first phase, knowledge graph construction, a little bit. As always in data engineering, if you want higher-quality outputs, you have to put in more effort at the beginning. Nothing comes for free; there's no free lunch after all. But what you do at the beginning pays off multiple times, because what you get out of your unstructured documents is high-quality, highly structured information, which you can then use to extract contextual information for your queries, and that allows rich retrieval at the end.
After seeing GraphRAG being used by a number of users and customers, and after looking at research papers, we saw a number of patterns emerging in terms of how we structure these graphs, how we query them, and so on. So we started to collect these patterns and put them on graphrag.com, and I wanted to show what this looks like. Each pattern has a name, a description, and a context, along with example graphs, and we also show the queries used to retrieve this information. So, for instance, here's a mix of a lexical graph and a domain graph, and then we have the query that fetches this information.
Let's look at the three steps in a little more detail on the graph-model side. On one side, for lexical graphs, you represent documents and their elements. An element could be something as simple as a chunk. But if your documents have structured elements, you can also model something like: I have a book, which has chapters, which have sections, which have paragraphs, where the paragraph is the semantically cohesive unit that you would, for instance, create a vector embedding of, to use later for vector search. What's really interesting in the graph is that you can connect all of these things up, right? You know exactly who is the predecessor and the successor of a chunk, and who is the parent of an element. And using something like vector or text similarity, you can also connect these chunks into a neighbor or similarity graph, where you store the similarity between chunks as a weighted score on the relationship between them. You can then use all these relationships when you extract context in the retrieval phase, for instance to find related chunks by document, by temporal sequence, by similarity, and so on.
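To make that a bit more concrete, here's a rough sketch of building such a lexical graph with the Neo4j Python driver. The Document and Chunk labels and the PART_OF, NEXT_CHUNK, and SIMILAR relationship types follow a common convention but are assumptions here, as are the placeholder embeddings; the similarity function assumes a recent Neo4j 5.x with vector functions available.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

chunks = ["GraphRAG combines graphs and vector search.",
          "A lexical graph links documents to their chunks."]
embeddings = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]]  # stand-ins for real model output

# Document -> Chunk structure with PART_OF links, plus NEXT_CHUNK
# relationships so the original reading order can be traversed later.
driver.execute_query(
    """
    MERGE (d:Document {source: $source})
    WITH d
    UNWIND range(0, size($chunks) - 1) AS i
    MERGE (c:Chunk {source: $source, index: i})
    SET c.text = $chunks[i], c.embedding = $embeddings[i]
    MERGE (c)-[:PART_OF]->(d)
    WITH collect(c) AS cs
    UNWIND range(0, size(cs) - 2) AS i
    WITH cs[i] AS a, cs[i + 1] AS b
    MERGE (a)-[:NEXT_CHUNK]->(b)
    """,
    source="rfp.pdf", chunks=chunks, embeddings=embeddings,
)

# Similarity graph: connect chunk pairs whose embeddings are close, and
# keep the score on the relationship. Pairwise comparison is fine for a
# sketch; at scale you would use a vector index instead.
driver.execute_query(
    """
    MATCH (a:Chunk), (b:Chunk)
    WHERE a.index < b.index
    WITH a, b, vector.similarity.cosine(a.embedding, b.embedding) AS score
    WHERE score > 0.9
    MERGE (a)-[s:SIMILAR]->(b)
    SET s.score = score
    """
)
```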
So that's the lexical side, and it looks like this: for instance, you have an RFP, and you want to break it up in a structured way. You create the relationships between these chunks or subsections, add the text, compute the vector embeddings, and then you do it at scale, and you get a full lexical graph out of that.
The next phase is entity extraction, which is something that has been around for quite some time in NLP, but LLMs take it to the next level with their multi-language understanding, their high flexibility, and their good language skills for extraction. You basically provide a graph schema and an instruction prompt to the LLM, plus your pieces of text. With today's large context windows, you can put in 10,000 tokens for extraction. You can also put in already-existing ground truth: for instance, if you have existing structured data where your entities, say products, genes, partners, or clients, already exist, you can pass those in as part of the prompt, so that the LLM doesn't do a pure extraction but more of a recognition-and-matching approach, where it finds your known entities, extracts relationships between them, and then stores additional facts and information on the relationships and entities as well.
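Here's a minimal sketch of that schema-guided extraction step using the OpenAI Python client; the model name, labels, relationship types, and prompt wording are all illustrative choices, not the exact ones used in the tooling shown in this talk.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The graph schema constrains what the LLM is allowed to extract.
SCHEMA = {
    "labels": ["Person", "Company", "Technology", "Location"],
    "relationships": ["WORKS_FOR", "DEVELOPS", "LOCATED_IN"],
}

PROMPT = """Extract entities and relationships from the text below.
Use only these node labels: {labels}
Use only these relationship types: {relationships}
Return JSON with keys "nodes" (each with "id" and "label") and
"relationships" (each with "start", "type", "end").

Text: {text}"""

def extract(chunk_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any JSON-capable chat model works
        response_format={"type": "json_object"},
        messages=[{"role": "user", "content": PROMPT.format(
            labels=SCHEMA["labels"],
            relationships=SCHEMA["relationships"],
            text=chunk_text,
        )}],
    )
    return json.loads(response.choices[0].message.content)

print(extract("Demis Hassabis co-founded DeepMind, which is based in London."))
```

The returned nodes and relationships would then be merged into the graph and linked back to the chunk they came from.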
So basically, in the first part you have the lexical graph, which represents document structure, and in the second part you extract the relevant entities and their relationships. If you already have an existing knowledge graph, you can also connect to it. Imagine you have a CRM where you already have customers, clients, and leads in your knowledge graph, and you want to enrich it with, for instance, protocols from call transcripts: then you connect the extraction to your existing structured data as well. So that's also a possibility.
Then, in the next phase, you can run graph algorithms for enrichment. These can, for instance, do clustering on the entity graph to generate communities, and an LLM can then generate summaries across them. That last one is especially interesting, because what you identify are cross-document topics. Each document is basically a vertical slice of information, but this looks at which topics recur across many different documents, so you find these topic clusters across documents as well.
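As a sketch, the community-detection part of that enrichment could be run with the Neo4j Graph Data Science library along these lines; the projected graph name, the __Entity__ label, and the community property are assumptions, and it presumes the GDS plugin is installed.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Project the extracted entity graph into GDS's in-memory format.
driver.execute_query(
    "CALL gds.graph.project('entities', '__Entity__', '*')"
)

# Louvain clustering writes a community id back onto each entity node;
# an LLM can later summarize the members of each community into a topic.
driver.execute_query(
    """
    CALL gds.louvain.write('entities', {writeProperty: 'community'})
    YIELD communityCount
    RETURN communityCount
    """
)
```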
Cool. So if you look at the second phase, the search phase, which is basically the retrieval part of RAG: with a graph RAG retriever, you don't just do a simple vector lookup and return the results. What you do instead is an initial index search, which could be vector search, full-text search, hybrid search, spatial search, or other kinds of search, to find the entry points into your graph. Then, starting from these entry points, you follow the relationships, up to a certain degree or up to a certain relevancy, to fetch additional context. This context can come from the user question, and it can also be external user context: for instance, when someone from, let's say, your finance department looks at your data, you return different information than when someone from the engineering department looks at it, right? So you take this external context into account for how much and which context you retrieve. And what you return to the LLM to generate the answer is then not just text fragments, like you would with vector search; you can also return a more complete subset of the contextual graph to the LLM. Modern LLMs are actually trained more on graph processing as well, so they can deal with these additional structures, node-relationship-node patterns, that you provide as extra context.
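Here's a sketch of that entry-point-plus-traversal pattern using the neo4j-graphrag Python package's VectorCypherRetriever; the index name, the HAS_ENTITY relationship, the property names, and the model choices are assumptions about how the graph was built, so adjust them to your own schema.

```python
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorCypherRetriever

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Step 1: vector search over chunk embeddings finds the entry points
# (bound to "node" below). Step 2: the Cypher part walks from those
# chunks to the entities they mention, and one hop further for context.
retriever = VectorCypherRetriever(
    driver,
    index_name="chunk_embeddings",  # assumed vector index name
    embedder=OpenAIEmbeddings(),
    retrieval_query="""
    MATCH (node)-[:HAS_ENTITY]->(e)-[r]-(neighbor)
    RETURN node.text AS chunk, e.name AS entity,
           type(r) AS rel, neighbor.name AS related
    """,
)

rag = GraphRAG(retriever=retriever, llm=OpenAILLM(model_name="gpt-4o"))
print(rag.search(query_text="What has DeepMind worked on?").answer)
```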
And then, of course, as I mentioned, you can enrich the graph using graph algorithms: things like clustering, link prediction, PageRank, and so on. Cool. Let's look at some practical examples; we don't have too much time left.
The first one is knowledge graph construction from unstructured sources. There are a number of libraries; you've already heard from some people today who do these kinds of things. One thing that we built is a tool that allows you to take PDFs, YouTube transcripts, local documents, web articles, and Wikipedia articles, and it extracts your data into a graph. Let me just switch over to the demo here. So, this is the tool. I uploaded information from different Wikipedia pages, YouTube videos, articles, and so on. Here's, for instance, a Google DeepMind extraction. You can use a lot of different LLMs here. And if you want, you can also provide a graph schema under graph enhancement: for instance, a person works for a company, and you add these patterns to your schema. The LLM then uses this information to drive the extraction.
And so, if you look at the data that has been extracted for DeepMind, it is this one here. From the Wikipedia article we can actually see two aspects. One is the document with its chunks, which is this part of the graph. The second part is the entities that have been extracted from the article: you actually see the connected knowledge graph of entities, which are companies, locations, people, and technologies. So it followed our schema for the extraction. And then, if I want to run graph RAG, you have a number of different retrievers here: vector, vector plus graph, full-text, and others you can select. All of this is also an open-source project, so you can just go to GitHub and have a look. I ran this before, because the internet is not so reliable here: "What has DeepMind worked on?" And I get a detailed explanation. If I want to, I can look at the details here. It shows me which sources it used, a Wikipedia article and a PDF, and I see the chunks that were used, which is basically the full-text and hybrid search. But I also see which entities it used from the graph. So from an explainability perspective, I can really see that these are the entities that were retrieved by the graph RAG retriever and passed to the LLM, in addition to the text connected to those entities, so it gets a richer response. And you can run evals on that as well.
So, while I'm on the screen, let me show you another thing we worked on, which is more of an agentic approach, where you put these individual retrievers into a configuration of domain-specific retrievers that run individual Cypher queries. For instance, if you look at, let's say, this one: it has the query here, and it's basically a tool with inputs and a description. You can then have an agentic loop using these tools, doing graph RAG with each individual tool, taking the responses, and making deeper tool calls. I'll show you a deeper example in a minute. This is all available as open-source libraries, and you can use it yourself from Python as well. What's interesting in the agentic approach is that you don't just use vector search to retrieve your data: you break the user question down into individual tasks, extract parameters, and run these individual tools, which then run in sequence or in a loop to return the data, and you get these outputs back. And for each of these steps, individual tools are called and used.
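To illustrate the shape of that setup, here's a minimal sketch of domain-specific retriever tools in Python; the tool names, the Cypher queries, and the graph schema are hypothetical, and a real agent loop would let the LLM choose the tool and fill in its parameters.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Each domain-specific retriever is a tool: a name, a description the
# agent can read, and a parameterized Cypher query it executes.
TOOLS = {
    "company_technologies": {
        "description": "List technologies a given company works on.",
        "cypher": """
            MATCH (c:Company {name: $company})-[:DEVELOPS]->(t:Technology)
            RETURN t.name AS technology
        """,
    },
    "person_employer": {
        "description": "Find which company a person works for.",
        "cypher": """
            MATCH (p:Person {name: $person})-[:WORKS_FOR]->(c:Company)
            RETURN c.name AS company
        """,
    },
}

def call_tool(name: str, **params) -> list[dict]:
    records, _, _ = driver.execute_query(TOOLS[name]["cypher"], **params)
    return [r.data() for r in records]

# An agent loop would pick tools and parameters from the user question;
# here we call one directly to show the shape of a tool invocation.
print(call_tool("company_technologies", company="DeepMind"))
```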
And the last thing I want to show is the GraphRAG Python package, which encapsulates all of this, construction and retrieval, in one package. You can build a knowledge graph, implement retrievers, and create the pipelines here. Here's an example where I pass in PDFs plus a graph schema, and it runs the import into Neo4j; afterwards I can visualize the data in a Python notebook.
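A sketch of that construction pipeline with the neo4j-graphrag package could look like the following; the argument names follow the package's experimental SimpleKGPipeline API, and the model, file name, and schema lists are placeholders, so check the current package docs before relying on them.

```python
import asyncio

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
from neo4j_graphrag.llm import OpenAILLM

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# PDF in, lexical graph plus entity graph out, written straight to Neo4j.
pipeline = SimpleKGPipeline(
    llm=OpenAILLM(
        model_name="gpt-4o",
        model_params={"response_format": {"type": "json_object"}},
    ),
    driver=driver,
    embedder=OpenAIEmbeddings(),
    entities=["Person", "Company", "Technology"],  # graph schema: nodes
    relations=["WORKS_FOR", "DEVELOPS"],           # graph schema: edges
    from_pdf=True,
)

asyncio.run(pipeline.run_async(file_path="deepmind_article.pdf"))
```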
And with that, one takeaway: on graphrag.com you'll find all of these resources and a lot of the patterns, and we'd love to have contributions, and we'd love to talk more. I'm outside at the booth if you have more questions. Yeah, that was great, and I think you got it all from the expert, with all the tooling; Michael's team actually builds a lot of the tools, like the Knowledge Graph Builder. We're very excited you all came to the GraphRAG track and hope to chat with you more. If you have questions for me or Michael, just meet us at the Neo4j booth across the way. Thank you. Thank you.