
Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j



00:00:00.000 | We are talking about GraphRAG today. That's the GraphRAG track, of course. And we want to look
00:00:19.760 | at patterns for successful GraphRAG applications, for making LLMs a little bit smarter by putting a
00:00:26.400 | knowledge graph into the picture. My name is Michael Hunger. I lead developer relations
00:00:34.720 | at Neo4j. Actually, we are both co-authoring. This is fun because we are both already authors
00:00:41.680 | and we have been friends for years. We are co-authoring GraphRAG: The Definitive
00:00:47.560 | Guide for O'Reilly. Basically, we didn't sleep this past weekend because we had a book deadline.
00:00:52.800 | I am going to talk a little bit, at a high level, about what GraphRAG is, why it is important,
00:01:01.120 | and what we are seeing in the media. And then Michael is going to drill down into all of the details
00:01:05.880 | and patterns and give you a bunch of takeaways and things you can do. If you want to know how
00:01:10.480 | to do GraphRAG, Michael's quick deep dive on this is probably the best introduction
00:01:17.640 | you can get. So, I'm also excited. Awesome. That's good going.
00:01:20.600 | Okay. So, the case for GraphRAG is where we are going to start. And the challenge with
00:01:27.000 | using LLMs and other patterns for this is basically that they don't have the enterprise domain
00:01:32.600 | knowledge. They don't verify or explain the answers. They are subject to hallucinations.
00:01:37.160 | And they have ethical and data bias concerns. And you can see that very much with our friendly
00:01:43.640 | parrot here: LLMs behave and act like parrots in all of these ways, except for being a cute bird.
00:01:49.640 | So, we want to do better than this with GraphRAG and figure out how we can use domain-specific
00:01:55.560 | knowledge to get accurate, contextual, and explainable answers. And really, I think what a lot of companies
00:02:02.760 | and the industry are figuring out is that it's really a data problem. You need good data. You need to have
00:02:08.120 | data you can power your system with. One of the patterns for doing this is RAG. So, you can
00:02:13.480 | stick your external data into a RAG system and get relevant pieces back from a database as part of the pattern.
00:02:21.080 | But vector databases and plain RAG fall short, because they lack your full data set. They only
00:02:29.400 | pull back a fraction of the information via vector-similarity algorithms. Typically,
00:02:34.280 | especially the modern vector databases everyone is using are easy to start with, but they're not
00:02:38.920 | robust. They're not mature. They don't have the scalability and fallback that you need to
00:02:44.520 | build a strong, robust enterprise system. And vector similarity is not
00:02:50.840 | the same as relevance. So, the results you get back from a basic RAG system are things
00:02:57.720 | which are related to the topic, but they're not complete, and they're typically also not very relevant.
00:03:04.040 | And then it's very hard to explain what's coming out of the system. So, we need an answer. A lifeline.
00:03:10.520 | Yeah. GraphRAG. And what GraphRAG is, is we're bringing
00:03:14.920 | the knowledge, the context, and the environment to what LLMs are good at. So, you can think of this kind of like
00:03:22.360 | the human brain. Our right brain is the more creative side: it builds things, it
00:03:30.280 | does more extrapolation of information. Whereas our left brain is the logical part. That's what actually
00:03:35.880 | does reasoning, has facts, and can enrich data. And it's built off of knowledge graphs. So, a knowledge
00:03:42.440 | graph is a collection of nodes, relationships, and properties. Here's a really simple example of a
00:03:48.440 | knowledge graph where you have two people. They live together. You have a car. But when you look
00:03:53.400 | into the details, it's actually like a little bit more complex than it seems at first, because
00:03:57.720 | they both have a car, but the owner of the car is not the person who drives it. This is kind of like my
00:04:05.080 | family. My wife does all the bills, but then she hands me the keys whenever we get on the freeway. She
00:04:11.000 | hates driving. So, knowledge graphs also are a great way of getting really rich data. Here's an example of
00:04:17.720 | the Stack Overflow data built into a knowledge graph, where you can see all of the rich metadata and the
00:04:22.600 | complexity of the results. And we can use this to evolve RAG into a more complex system, basically GraphRAG,
00:04:29.320 | where we get better relevancy, more relevant results. We get more context, because now we can
00:04:35.000 | actually pull back all of the related information via graph-closeness algorithms. We can explain what's going
00:04:40.680 | on because it's no longer just vectors, it's no longer statistical probabilities coming out of a vector
00:04:46.600 | database. We actually have nodes, we have structure, we have semantics we can look at, and we can add in
00:04:52.280 | security and role-based access on top of this. So, it's context-rich, it's grounded. This gives us a lot of power, and it
00:04:59.560 | gives us the ability to start explaining what we're doing, where now we can visualize it, we can analyze it, and we can log all of this.
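To make the two-people-and-one-car example from above concrete, here is a minimal sketch of that graph using the official neo4j Python driver. The names, property keys, relationship types, and connection details are illustrative assumptions, not taken from the talk.

```python
# A minimal sketch of the "two people, one car" knowledge graph, using the
# official neo4j Python driver. Names, labels, and credentials are assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Nodes carry labels and properties; relationships are typed and directed.
# Note the subtlety from the talk: the person who OWNS the car is not the
# person who DRIVES it.
driver.execute_query("""
    MERGE (a:Person {name: 'Ann'})
    MERGE (b:Person {name: 'Dan'})
    MERGE (c:Car {make: 'Volvo'})
    MERGE (a)-[:LIVES_WITH]->(b)
    MERGE (a)-[:OWNS]->(c)
    MERGE (b)-[:DRIVES]->(c)
""")

# Ask the graph: who drives a car they don't own?
records, _, _ = driver.execute_query("""
    MATCH (p:Person)-[:DRIVES]->(car:Car)
    WHERE NOT (p)-[:OWNS]->(car)
    RETURN p.name AS driver, car.make AS car
""")
for r in records:
    print(r["driver"], "drives a", r["car"], "they do not own")
driver.close()
```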
00:05:06.360 | Now, this is one of the initial papers, the GraphRAG paper from Microsoft Research, where they went
00:05:12.120 | through this and showed that you could actually get not only better results, but lower token costs. It
00:05:17.720 | was actually less expensive to run a GraphRAG algorithm. There have been a lot of papers since then, which
00:05:24.440 | show all of the different research and interesting work going on in the GraphRAG area. This is
00:05:31.480 | just a quick view of the different studies and results which are coming out, even from the early
00:05:35.720 | data.world study, where they showed a three-times improvement with GraphRAG, and the
00:05:42.440 | analysts are even showing how GraphRAG is trending up. This is the Gartner hype cycle from 2024, and you can see
00:05:52.040 | generative AI is kind of on the downtrend. RAG is getting over the hump, but GraphRAG and a bunch of
00:05:58.120 | these things are actually breathing more life into the AI ecosystem. So, there are a lot of great
00:06:03.560 | reports from Gartner showing that it's grounded in facts, it resolves hallucinations, and that together, knowledge
00:06:09.800 | graphs and AI are solving these problems. And it's getting a lot of adoption by different industry leaders,
00:06:15.560 | by big organizations who are taking advantage of this and actually producing production applications
00:06:21.080 | and making it work, like LinkedIn customer support, where they actually wrote this great research paper,
00:06:27.480 | where they showed that using a knowledge graph for customer support scenarios actually gave them
00:06:32.520 | better results and allowed them to improve the quality and reduce the response time for getting back to
00:06:39.000 | customers. Median per issue resolution time was reduced by 28.6%. I mentioned the data.world study,
00:06:46.920 | which basically was a comparison of doing RAG on SQL versus RAG on graph databases, and they showed a
00:06:52.920 | three-times improvement in accuracy of LLM responses. And let's chat about patterns, Michael,
00:06:58.520 | because I think everyone's here to learn how to do this.
00:07:01.000 | Exactly. So, let's look at how to actually do this, right? So, if you look at GraphRAG,
00:07:07.320 | there are actually two sides to the coin. One, of course, is that you don't start in a vacuum;
00:07:12.200 | you have to create your knowledge graph first. So, as you can see, there are basically multiple steps to get there.
00:07:16.520 | Initially, you take unstructured information, you structure it, and you put it into a lexical graph,
00:07:21.640 | which represents documents, chunks, and their relationships. In a second step, you can then
00:07:27.240 | extract entities and their relationships from those chunks, using, for instance, LLMs with a graph schema.
00:07:32.760 | And in a third phase, you would enrich this graph, for instance,
00:07:36.840 | with graph algorithms, doing things like PageRank, community summarization, and so on.
00:07:42.760 | And then, when you have this built-up knowledge graph, you do GraphRAG as the search mechanism,
00:07:49.560 | either with local search or global search or in other ways, right? So, let's first look at the first
00:07:55.880 | phase, knowledge graph construction, a little bit. As always in data engineering,
00:08:01.400 | if you want higher quality outputs, you have to put in more effort at the
00:08:05.560 | beginning, right? Basically, nothing comes for free; there's no free lunch after all. But what
00:08:09.800 | you do at the beginning basically pays off multiple times, because what you get out of your
00:08:14.520 | unstructured documents is high-quality, highly structured information, which you can then use to
00:08:20.200 | extract contextual information for your queries, which allows for rich retrieval at the end.
00:08:25.560 | So, after seeing GraphRAG being used by a number of users and customers, and after looking at
00:08:35.400 | research papers, we saw a number of patterns emerging in terms of how people structure their
00:08:40.680 | graphs, how they query these graphs, and so on. And so, we started to collect these patterns and put them on
00:08:45.560 | GraphRAG.com. And I wanted to show what this looks like. So, we have, basically,
00:08:51.720 | example graphs in each pattern; a pattern has a name, a description, and a context, and we also see queries
00:08:59.400 | that are used for extracting this information, right? So, for instance, here's a mix of a lexical graph and
00:09:04.920 | a domain graph, and then we have the query that fetches this information. Let's look at the three
00:09:10.200 | steps in a little bit more detail on the graph model side. So, on one side, for lexical graphs,
00:09:17.080 | you represent documents and their elements. That could be something as simple as a chunk. But if you have
00:09:21.800 | structured elements in documents, you can also do something like: okay, I have a book, which has chapters,
00:09:26.520 | which have sections, which have paragraphs, where the paragraph is the semantically cohesive unit
00:09:32.120 | that you would use to, for instance, create a vector embedding that you can use later for vector search.
00:09:37.000 | But what's really interesting in the graph is that you can connect all of these things up,
00:09:40.600 | right? So, you know exactly who's the predecessor and who's the successor of a chunk, who's the parent
00:09:45.560 | of an element. And using something like vector or text similarity, you can also connect these chunks
00:09:51.960 | in a neighbor or similarity graph, where you basically store similarities between chunks: you
00:09:59.880 | put a weighted score on the relationship between them, basically how similar two chunks are. And then you
00:10:05.080 | can use all these relationships when you extract the context in the retrieval phase to, for instance,
00:10:09.960 | find related chunks by document, by temporal sequence, by similarity, and other things.
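The lexical graph just described can be sketched with plain Cypher through the Python driver. The labels and relationship types below (Document, Chunk, PART_OF, NEXT_CHUNK, SIMILAR) are common conventions, not the talk's exact model, and the similarity step assumes the Graph Data Science plugin is installed; treat all of it as an assumption-laden outline.

```python
# A sketch of a lexical graph: chunks linked to their document (PART_OF),
# to their successor (NEXT_CHUNK), and to similar chunks (SIMILAR) with a
# weighted score. Labels and relationship names are conventional assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

def load_lexical_graph(doc_id: str, chunks: list[str], embeddings: list[list[float]]):
    # Create the document, its chunks, and the PART_OF relationships.
    rows = [{"i": i, "text": t, "emb": e}
            for i, (t, e) in enumerate(zip(chunks, embeddings))]
    driver.execute_query("""
        MERGE (d:Document {id: $doc_id})
        WITH d
        UNWIND $rows AS row
        MERGE (c:Chunk {id: $doc_id + '-' + toString(row.i)})
        SET c.text = row.text, c.index = row.i, c.embedding = row.emb
        MERGE (c)-[:PART_OF]->(d)
    """, doc_id=doc_id, rows=rows)

    # Link each chunk to its successor so the reading order is kept in the graph.
    driver.execute_query("""
        MATCH (c:Chunk)-[:PART_OF]->(:Document {id: $doc_id})
        WITH c ORDER BY c.index
        WITH collect(c) AS cs
        UNWIND range(0, size(cs) - 2) AS i
        WITH cs[i] AS a, cs[i + 1] AS b
        MERGE (a)-[:NEXT_CHUNK]->(b)
    """, doc_id=doc_id)

    # Store a weighted similarity score between chunk pairs. Pairwise comparison
    # is fine for a sketch; gds.similarity.cosine needs the GDS plugin.
    driver.execute_query("""
        MATCH (a:Chunk)-[:PART_OF]->(d:Document {id: $doc_id})
        MATCH (b:Chunk)-[:PART_OF]->(d)
        WHERE a.index < b.index
        WITH a, b, gds.similarity.cosine(a.embedding, b.embedding) AS score
        WHERE score > 0.9
        MERGE (a)-[s:SIMILAR]->(b)
        SET s.score = score
    """, doc_id=doc_id)
```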
00:10:16.520 | Right? So, that's only the lexical side, and it looks like this. For instance, you have an RFP,
00:10:21.800 | and you want to break it up in a structured way. Then you basically create the relationships between
00:10:26.200 | these chunks or subsections, add the text, do the vector embeddings, and then you do it at
00:10:32.680 | scale, and you get a full lexical graph out of that. The next phase is entity extraction, which is also
00:10:41.320 | something that has been around for quite some time with NLP, but LLMs actually take this to the next level
00:10:46.680 | with their multi-language understanding, their high flexibility, and their good language skills for extraction.
00:10:52.280 | So, you basically provide a graph schema and an instruction prompt to the LLM, plus your pieces of
00:11:01.240 | text. Now, with large context windows, you can put in 10,000 tokens for
00:11:07.560 | extraction. You can also put in already existing ground truth. So, for instance, if you have
00:11:13.560 | existing structured data where your entities, let's say products, or genes, or partners, or clients, are
00:11:20.920 | already present, then you can also pass this in as part of the prompt, so that the LLM doesn't do
00:11:25.480 | pure extraction, but more a recognition-and-finding approach, where you find your entities, then you
00:11:32.120 | extract relationships between them, and then you can store additional facts and information
00:11:38.120 | as part of the relationships and entities as well. So, basically, in the first part,
00:11:42.680 | you have the lexical graph, which represents the document structure, and then in the second part,
00:11:46.440 | you extract the relevant entities and their relationships. If you already have an existing
00:11:51.160 | knowledge graph, you can also connect this to it. So, imagine you have a
00:11:54.360 | CRM where you already have customers, clients, and leads in your knowledge graph, but you want to
00:12:01.480 | enrich this with, for instance, protocols from call transcripts; then you basically
00:12:06.680 | connect these to your existing structured data as well. So, that's also a possibility.
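Here is a rough sketch of that schema-guided extraction-and-recognition flow in Python. The schema, prompt wording, model name, and JSON shape are all assumptions for illustration; the talk itself demonstrates this through the Knowledge Graph Builder and the GraphRAG package shown later.

```python
# A sketch of schema-guided entity extraction: the prompt carries an explicit
# graph schema plus known entities, the LLM returns JSON, and we MERGE (not
# CREATE) so existing entities are recognized rather than duplicated.
# Schema, model name, and prompt wording are illustrative assumptions.
import json
from openai import OpenAI
from neo4j import GraphDatabase

SCHEMA = """Allowed node labels: Person, Company, Technology, Location.
Allowed relationships: (Person)-[:WORKS_FOR]->(Company),
(Company)-[:DEVELOPS]->(Technology), (Company)-[:LOCATED_IN]->(Location)."""

PROMPT = """Extract entities and relationships from the text below.
Follow this schema strictly:
{schema}
Known entities (prefer recognizing these over inventing new ones): {known}
Return JSON: {{"nodes": [{{"label": ..., "name": ...}}],
"rels": [{{"start": ..., "type": ..., "end": ...}}]}}
Text: {text}"""

llm = OpenAI()
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

def extract(text: str, known_entities: list[str]) -> None:
    resp = llm.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any capable LLM works
        response_format={"type": "json_object"},
        messages=[{"role": "user", "content": PROMPT.format(
            schema=SCHEMA, known=", ".join(known_entities), text=text)}],
    )
    graph = json.loads(resp.choices[0].message.content)
    for n in graph["nodes"]:
        # MERGE is idempotent: existing entities are found, not recreated.
        # Labels come from the constrained schema; sanitize them in production.
        driver.execute_query(
            f"MERGE (e:`{n['label']}` {{name: $name}})", name=n["name"])
    for r in graph["rels"]:
        driver.execute_query(
            f"MATCH (a {{name: $s}}), (b {{name: $e}}) "
            f"MERGE (a)-[:`{r['type']}`]->(b)", s=r["start"], e=r["end"])
```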
00:12:11.480 | And then, in the next phase, what you can do is run graph algorithms for enrichment, which, for instance, can do
00:12:16.600 | clustering on the entity graph, so that you generate something like communities, for which an LLM can generate
00:12:23.320 | summaries. Especially this last one is interesting, because what you identify is
00:12:30.600 | actually cross-document topics, right? Each document is basically a
00:12:35.960 | vertical, self-contained representation of information, but this looks at which topics are
00:12:42.600 | recurring across many different documents. So, you find these kinds of topic clusters across documents as well.
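A sketch of this enrichment phase with the graphdatascience Python client might look as follows; the graph name, labels, and thresholds are assumptions, and it requires the Neo4j Graph Data Science plugin.

```python
# A sketch of graph-algorithm enrichment with the Neo4j Graph Data Science
# client: project the entity graph, run PageRank and Louvain clustering, then
# pull out community members for LLM summarization. Names are assumptions.
from graphdatascience import GraphDataScience

gds = GraphDataScience("neo4j://localhost:7687", auth=("neo4j", "password"))

# Project entities and all their relationships into an in-memory graph.
G, _ = gds.graph.project("entities", ["Person", "Company", "Technology"], "*")

# PageRank scores importance; Louvain assigns each node a community id.
gds.pageRank.write(G, writeProperty="pagerank")
gds.louvain.write(G, writeProperty="communityId")

# Collect the members of each community; feeding each list to an LLM yields
# community summaries, i.e. the cross-document topics mentioned in the talk.
communities = gds.run_cypher("""
    MATCH (e) WHERE e.communityId IS NOT NULL
    RETURN e.communityId AS community, collect(e.name)[..25] AS members
    ORDER BY community
""")
for _, row in communities.iterrows():
    print(row["community"], row["members"])
```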
00:12:49.160 | Cool. So, let's look at the second phase, the search phase, which is basically the retrieval
00:12:56.360 | part of RAG. What we see here is that in a GraphRAG retriever, you don't just do a simple
00:13:02.520 | vector lookup to get results returned. Instead, you do an initial index search. It could be
00:13:09.480 | vector search, full-text search, hybrid search, spatial search, or other kinds of searches, to find the entry
00:13:14.200 | points in your graph. And then, as you can see here, starting from these entry points,
00:13:21.240 | you then follow the relationships, up to a certain degree or up to a certain relevancy, to fetch in
00:13:27.480 | additional context. And this context can be coming from the user question. It can be external user
00:13:32.760 | context that comes in: for instance, when someone from, let's say, your finance department is looking at
00:13:37.640 | your data, you return different information than when someone from, let's say, the engineering department is
00:13:43.320 | looking at your data, right? So, you also take this external context into account for how much and which
00:13:47.720 | context you retrieve. And then, for the LLM to generate the answer, you don't just return text
00:13:53.960 | fragments, like you would do in vector search; you also return a more complete
00:14:00.440 | subset of the contextual graph to the LLM. And modern LLMs are actually trained more on
00:14:07.960 | graph processing as well, so they can actually deal with these additional pattern structures, where you have
00:14:13.800 | node-relationship-node patterns that you provide as additional context to the LLM.
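Put together, a GraphRAG retrieval can be sketched as one Cypher query: an index search for entry points followed by a bounded expansion. The index name and relationship types are assumptions that follow the lexical-graph sketch above.

```python
# A sketch of GraphRAG retrieval: a vector index finds entry-point chunks,
# then the graph is expanded around them (sequence neighbors, mentioned
# entities) before everything is handed to the LLM as context.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

def retrieve(question_embedding: list[float], k: int = 5):
    records, _, _ = driver.execute_query("""
        // 1) Index search for entry points (could also be full-text or hybrid).
        CALL db.index.vector.queryNodes('chunk_embeddings', $k, $embedding)
        YIELD node AS chunk, score
        // 2) Follow relationships from the entry points to fetch extra context.
        OPTIONAL MATCH (chunk)-[:NEXT_CHUNK]-(neighbor:Chunk)
        OPTIONAL MATCH (chunk)-[:MENTIONS]->(entity)
        RETURN chunk.text AS text, score,
               collect(DISTINCT neighbor.text) AS neighbors,
               collect(DISTINCT entity.name) AS entities
        ORDER BY score DESC
    """, k=k, embedding=question_embedding)
    # Text fragments plus graph structure go into the LLM prompt together.
    return [r.data() for r in records]
```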
00:14:19.160 | And then, of course, I mentioned that you can enrich it using graph algorithms. So, basically, you can
00:14:24.920 | do things like clustering, link prediction, PageRank, and other things to enrich your data.
00:14:29.080 | Cool. Let's look at some practical examples. We don't have too much time left.
00:14:33.960 | So, one is knowledge graph construction from unstructured sources. There are a number of libraries;
00:14:39.480 | you've already heard about some today from people that do these kinds of things. One thing that we built
00:14:46.920 | is a tool that allows you to take PDFs, YouTube transcripts, local documents, web articles, and Wikipedia
00:14:55.000 | articles, and it extracts your data into a graph. And let me just switch over to the
00:15:02.840 | demo here. So, this is the tool. So, I uploaded information from different Wikipedia pages, YouTube
00:15:11.560 | videos, articles, and so on. And here's, for instance, a Google DeepMind extraction. So, you can use a lot of
00:15:17.800 | different LLMs here. And then, if you want to, you can also provide a graph schema in the graph enhancement step.
00:15:23.960 | So, you can say, for instance, a person works for a company, and add these patterns to your
00:15:33.960 | schema. And then, the LLM uses this information to drive the extraction as well. And so, if you look at
00:15:39.400 | the data that has been extracted for DeepMind, it is this one here. From the Wikipedia article, we can actually see
00:15:50.520 | two aspects. One is the document with the chunks, which is this part of the graph, right? And then,
00:15:56.280 | the second part is the entities that have been extracted from this article as well. So, you
00:16:01.320 | see, actually, the connected knowledge graph of entities, which are companies, locations, people,
00:16:05.800 | and technologies. So, it followed our schema to extract this. And then, if I want to run
00:16:12.760 | GraphRAG, you have here a number of different retrievers. So, we have the vector retriever, graph plus full-text,
00:16:18.120 | and others that you can select. All of this is also an open source project. So, you can just go to
00:16:23.960 | GitHub and have a look at this. And so, I just ran this before because the internet is not so reliable
00:16:28.680 | here. So, what has DeepMind worked on? And I get a detailed explanation. And then, if I want to,
00:16:34.040 | I can look at the details here. So, it shows me which sources it used: the Wikipedia article,
00:16:40.600 | another PDF. I see the chunks that have been used, which basically come from the full-text and hybrid search. But then,
00:16:45.640 | I also see which entities it used from the graph. So, from an explainability
00:16:49.400 | perspective, I can actually really see: these are the entities that have been retrieved by the GraphRAG retriever
00:16:54.600 | and passed to the LLM, in addition to the text that's connected to these entities. So, it gets a richer
00:17:00.040 | response as such. And then, you can also do evals on that as well. So, while I'm on the screen,
00:17:09.080 | let me just show you another thing that we worked on, which is more like an agentic approach, where
00:17:13.960 | you basically put these individual retrievers into a configuration where you have
00:17:19.720 | domain-specific retrievers that are running individual Cypher queries. So, for instance,
00:17:26.600 | if you look at, let's say, this one, it has the query here, and basically a tool with inputs and a
00:17:34.120 | description. And then, you can have an agentic loop using these tools, basically doing GraphRAG
00:17:40.440 | with each individual tool, taking the responses, and then doing deeper tool calls. I'll show you
00:17:46.760 | a deeper example in a minute. So, this is basically what I showed you. This is all available as open source
00:17:54.120 | libraries. You can use it yourself from Python as well. And what's interesting in the
00:18:08.920 | agentic approach is that you don't just use vector search to retrieve your data; you basically break down
00:18:14.280 | the user question into individual tasks, extract parameters, and run these individual tools,
00:18:20.280 | which then run in sequence or in a loop to return the data, and you basically get these
00:18:25.400 | outputs back. And basically, for each of these things, individual tools are called and used here.
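In code, the agentic setup described here can be sketched as a registry of retriever tools, each wrapping one parameterized Cypher query with a description the LLM can choose from. Tool names and queries below are hypothetical examples, not the ones in the demo.

```python
# A sketch of the agentic pattern: each domain-specific retriever is a
# parameterized Cypher query exposed as a tool with a name, description, and
# inputs; an LLM-driven loop picks tools, runs them, and feeds results back.
from dataclasses import dataclass
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

@dataclass
class RetrieverTool:
    name: str
    description: str   # shown to the LLM so it can choose the right tool
    cypher: str        # the parameterized retrieval query this tool runs

    def run(self, **params):
        records, _, _ = driver.execute_query(self.cypher, **params)
        return [r.data() for r in records]

# Hypothetical domain-specific retrievers, one per question shape.
TOOLS = {
    "products_by_customer": RetrieverTool(
        name="products_by_customer",
        description="List products a customer bought. Input: customer name.",
        cypher="""MATCH (c:Customer {name: $name})-[:BOUGHT]->(p:Product)
                  RETURN p.name AS product""",
    ),
}

def agent_step(tool_call: dict):
    """One step of the loop: the LLM has decided on a tool and its parameters
    (tool_call would come from the LLM's function-calling response); we execute
    it and return the observation for the next LLM turn."""
    tool = TOOLS[tool_call["name"]]
    return tool.run(**tool_call["arguments"])
```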
00:18:32.520 | And the last thing that I want to show is the GraphRAG Python package, which basically
00:18:38.920 | encapsulates all of this, construction and retrieval, into one package. So, you can build a knowledge
00:18:44.120 | graph, you can implement retrievers, and you can create the pipelines here. And here's an example where I
00:18:49.880 | pass in PDFs plus a graph schema, and then, basically, it runs the import into Neo4j. And then I can
00:18:58.360 | visualize the data in the Python notebook later on; a minimal sketch of that flow follows below.
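A minimal outline of that construction-plus-retrieval flow with the neo4j-graphrag package might look like this. The exact class names and signatures evolve between releases, so treat it as an assumption-laden sketch and check the package documentation.

```python
# A sketch of the neo4j-graphrag Python package: build a knowledge graph from
# a PDF guided by a schema, then answer questions with a retriever + LLM.
# Class names and signatures are assumptions; verify against the current docs.
import asyncio
from neo4j import GraphDatabase
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
from neo4j_graphrag.retrievers import VectorRetriever
from neo4j_graphrag.generation import GraphRAG

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))
embedder = OpenAIEmbeddings()

# Construction: PDF in, lexical graph plus extracted entities out.
pipeline = SimpleKGPipeline(
    llm=OpenAILLM(model_name="gpt-4o"),
    driver=driver,
    embedder=embedder,
    entities=["Person", "Company", "Technology"],   # the graph schema
    relations=["WORKS_FOR", "DEVELOPS"],
    from_pdf=True,
)
asyncio.run(pipeline.run_async(file_path="my_document.pdf"))

# Retrieval: vector entry points, then generation grounded in the graph.
retriever = VectorRetriever(driver, index_name="chunk_embeddings", embedder=embedder)
rag = GraphRAG(retriever=retriever, llm=OpenAILLM(model_name="gpt-4o"))
print(rag.search(query_text="What has DeepMind worked on?").answer)
```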
00:19:05.960 | And with that, I leave you with one takeaway: on GraphRAG.com, you find all of these resources, a lot of the patterns,
00:19:13.400 | and we'd love to have contributions, and love to talk more. I'm outside at the booth if you have more
00:19:20.440 | questions. Yeah. So, that was great. And I think you're getting it all from the expert with all the
00:19:25.400 | tooling. Actually, Michael's team builds a lot of the tools like Knowledge Graph Builder. Very excited you
00:19:30.440 | all came to the GraphRAG track and hope to chat with you all more. If you have questions for me and
00:19:35.080 | Michael, just meet us in the Neo4j booth across the way. Thank you. Thank you.