Q&A Document Retrieval With DPR



00:00:00.000 | Okay, so in the previous video what we did was set up our Elasticsearch document store to contain
00:00:07.680 | all of our paragraphs from Meditations. We did that in this script here, and
00:00:15.760 | altogether we only have, it's not that much data, 508 paragraphs or documents within our
00:00:22.240 | document store. So what we now want to do is set up the next part of our retriever-reader stack, which
00:00:32.640 | is the retriever. What the retriever will do is, given a query, communicate with our
00:00:41.680 | Elasticsearch document store and return a certain number of contexts, which are the paragraphs in our
00:00:49.040 | case, that it thinks are most relevant to our query. So that's what we are going to be doing here, and
00:00:59.280 | the first thing that we need to do is initialize our document store again, so I'm just going to copy
00:01:06.800 | those lines and paste them here. This just initializes it from what we've already built,
00:01:18.880 | so it's using the same index that already exists.
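For reference, that initialization looks roughly like this. It's a minimal sketch: the import path matches older Haystack releases (it moves around between versions), and the index name here is a stand-in for whichever index was created in the previous video.

```python
# Sketch of re-initializing the document store built in the previous video.
# Import path and defaults vary across Haystack versions.
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore

doc_store = ElasticsearchDocumentStore(
    host="localhost",   # assumes Elasticsearch running locally
    index="aurelius"    # hypothetical index name; use the one created earlier
)
```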
00:01:26.640 | Okay, cool, we have our document store now. Now what we want to do is set up our DPR, which is a Dense
00:01:35.120 | Passage Retriever. It essentially uses dense vectors and a type of efficient similarity search:
00:01:46.960 | it embeds the documents in our index as dense vectors, and then once it comes to actually searching
00:01:54.080 | and finding the most similar, or the most relevant, documents later on, it will use those dense
00:02:01.360 | vectors and find the most similar ones. I'll explain that a little bit better in a moment.
00:02:12.160 | So first, what we want to do is actually initialize that,
00:02:15.360 | so we do from haystack.retriever.dense import DensePassageRetriever
00:02:28.080 | (it's retriever.dense, the other way around from what I first typed),
00:02:39.040 | and then we'll put it into a
00:02:46.320 | variable called retriever, which uses the DensePassageRetriever we imported up here,
00:02:55.280 | and in here we need to pass a few parameters. The first thing is the document store,
00:03:02.080 | and the document store is just what we've already initialized: doc_store.
00:03:08.400 | Then we need to specify two different models: the query embedding model
00:03:17.600 | and the passage embedding model.
00:03:25.520 | Now, behind the scenes Haystack is using the Hugging Face transformers library, so what we'll do is
00:03:35.040 | head over to the models page over there and see which embedding models we can use for DPR.
00:03:41.760 | Okay, so here let's just search for DPR, and you'll find we have all these models from Facebook AI.
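As an aside, since Haystack is wrapping transformers, you could load these encoders directly yourself. A minimal sketch, assuming the facebook/dpr-*-single-nq-base checkpoints we end up picking below:

```python
# Loading the DPR encoder pair directly from the Hugging Face hub.
# Haystack does the equivalent of this internally.
from transformers import DPRContextEncoder, DPRQuestionEncoder

question_encoder = DPRQuestionEncoder.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base"
)
context_encoder = DPRContextEncoder.from_pretrained(
    "facebook/dpr-ctx_encoder-single-nq-base"
)
```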
00:04:01.040 | Now, with DPR, the reason that it's so useful for question answering is that we have
00:04:09.920 | two different models that encode the text that we pass into it. So we have
00:04:18.160 | this sort of setup during training, and what we see down here
00:04:27.200 | are these two models: we have this EP BERT encoder and we also have this EQ BERT encoder. Now, the EP
00:04:35.760 | BERT encoder encodes the passages, or the contexts, so essentially the paragraphs that we have fed
00:04:44.640 | into our Elasticsearch index; this is what will be encoding them into these vectors here.
00:04:53.680 | Now, this whole graph is what happens during training. All we will actually see
00:04:58.960 | when we're encoding these vectors is the
00:05:03.440 | EP encoder,
00:05:08.000 | and this will create the EP vectors,
00:05:22.640 | and all we're going to do is feed all of the documents from Elasticsearch into it.
00:05:31.120 | Now, once all of these have been encoded, we have a new set of dense vectors,
00:05:46.480 | and all of those will be fed back into our document store, so back into Elasticsearch.
00:05:54.240 | Now, when it comes to performing similarity search later on,
00:06:01.520 | we're going to ask a question, and that question will be processed by the EQ encoder.
00:06:10.560 | So here we have our EQ encoder and we have our question, so that will go in here,
00:06:21.520 | and that will encode our question and then send it over to Elasticsearch and say, okay, what are the most
00:06:32.080 | similar vectors to this vector that we created from the question? And the reason that DPR is so good
00:06:41.200 | is that, if you look at the training down here, we are creating these EP vectors and these EQ vectors
00:06:47.920 | that are matching. So where we have a question matched to a context, we are training
00:06:55.440 | them to maximize the dot product, because the dot product measures the alignment between those two
00:07:05.120 | vectors. So what happens is that a relevant passage and a relevant question will come out with
00:07:15.440 | very similar vectors. One example that I like to use: if our question was "What is the capital of
00:07:34.720 | France?",
00:07:41.680 | the embedding that DPR creates from that will be close to the embedding of a context that looks something like "The
00:07:50.320 | capital
00:07:52.720 | of France
00:07:56.720 | is..." and, you know, something here. We don't know what would fill that slot, because the model doesn't actually know what
00:08:05.840 | the capital of France is; it's just doing linguistic transformations to try and figure out what sort
00:08:12.400 | of context the answer would come from. Then, of course, when you feed this into Elasticsearch,
00:08:18.880 | the most similar vector will be the one which contains the answer
00:08:30.000 | to our question, because the answer to our question is something like "The capital of
00:08:35.360 | France is Paris." Now, we don't have "Paris" here, but it will be able to figure that out, because that passage will
00:08:41.600 | be the most similar sequence to the context that DPR has produced.
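To make that dot-product scoring concrete, here is a toy sketch with made-up three-dimensional vectors (real DPR embeddings are 768-dimensional); the passage whose embedding best aligns with the question embedding gets the highest score:

```python
import numpy as np

# Toy illustration: DPR scores question/passage pairs by dot product,
# so the passage that best aligns with the question wins.
q = np.array([0.8, 0.1, 0.6])  # hypothetical question embedding
passages = {
    "The capital of France is Paris.": np.array([0.7, 0.2, 0.5]),
    "Marcus was taught by his grandfather.": np.array([0.1, 0.9, 0.2]),
}

scores = {text: float(np.dot(q, vec)) for text, vec in passages.items()}
print(max(scores, key=scores.get))  # the France passage scores highest
```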
00:08:53.680 | Now, back to Hugging Face here: you can see we have these multiple DPR models, and what we want is a pair. We want a question encoder
00:09:01.360 | and a ctx, which is the context encoder. We'll be using this single-nq-base pair, so what I'll do is just
00:09:09.760 | copy this, and in here we just add in our model. Okay, so that's the question encoder.
00:09:24.640 | Now, what we also need is the context encoder, which is the same name except instead of question we just have ctx.
00:09:35.920 | Now, we have two other parameters that we need to add in here. The first is use_gpu: if you're
00:09:42.640 | using a GPU, obviously you set this to True; if not, you go with False, and it will take a little bit of
00:09:49.200 | time to process this if you're not using a GPU. Then we also add embed_title=True as well.
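Putting all of that together, the retriever initialization looks roughly like this; a sketch, assuming the older Haystack import path and argument names used in this series:

```python
from haystack.retriever.dense import DensePassageRetriever

retriever = DensePassageRetriever(
    document_store=doc_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
    use_gpu=True,      # set to False without a GPU; it will just be slower
    embed_title=True   # include each document's title in the passage embedding
)
```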
00:10:00.240 | Now, what we should see is that this will execute without error, hopefully. Okay, great. And
00:10:10.320 | then what we need to do is update the embeddings within Elasticsearch. What we've done here is
00:10:16.160 | set up the process, and now what we need to do is update the documents that we have in
00:10:22.560 | Elasticsearch to have DPR embeddings. To do that, we go doc_store.update_embeddings,
00:10:32.400 | and then in here we pass our retriever.
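That call is just, under the same naming assumptions as above:

```python
# Run every document in the index through the DPR context encoder and
# write the resulting dense vectors back into Elasticsearch.
doc_store.update_embeddings(retriever=retriever)
```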
00:10:37.680 | Okay, now this may take a while in general, but this should be really quick for me: we don't have that many documents,
00:10:46.160 | and even on CPU, actually, with the few documents we have, it should be pretty quick.
00:10:51.120 | So what we see here is that we created these embeddings and then we posted them back to our
00:10:59.520 | index. So that is pretty cool, and now what we need to do is just test that it actually works.
00:11:12.880 | Now, let's go with retriever, and this is how we
00:11:17.680 | get contexts from our Elasticsearch document store: so we write retriever.retrieve,
00:11:24.640 | and then we pass in a query here. So let me just find something here,
00:11:38.560 | like "What did you learn from your great-grandfather?" maybe, or from Verus. Yeah, let's go
00:11:47.600 | with grandfather, let's go with grandfather. So:
00:11:52.480 | "What did you, what did your grandfather teach you?" I don't know if this is going to work, but let's see.
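That retrieval step, sketched with the same assumed names (older Haystack returns Document objects with a text attribute; newer versions call it content):

```python
# Ask the retriever for the passages most relevant to the query
results = retriever.retrieve("What did your grandfather teach you?")

for doc in results:
    print(doc.text[:80])  # preview the start of each returned context
```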
00:12:05.200 | Okay, so you see that we return quite a few contexts here. Now, we haven't set up the full
00:12:18.720 | stack, so we're just returning what it sees as being the relevant contexts; we are not actually
00:12:25.360 | extracting an answer out yet, because that will be the job of our reader model. So what we have is
00:12:31.920 | "From my great-grandfather...", we have that one, so that's okay. Some other ones here...
00:12:42.960 | Let's type in just "grandfather".
00:12:49.360 | Okay, so it's just returning that one, which is fine. It's not perfect, but what we would expect
00:13:00.240 | in reality is for it to return more. So let's try another one as well,
00:13:07.280 | and let's say, who taught you about freedom of will?
00:13:23.600 | "Who taught the freedom of will?"
00:13:26.640 | And we see here, okay, in the first one we don't get out the correct answer that we want, or the
00:13:37.840 | correct context, and we go down and, ah, there it is. So here is the context that we wanted to return.
00:13:50.320 | It returns that as the fourth-best context, which is fine, because when we build our reader
00:13:57.840 | model later on, we kind of expect that to sort those a little bit better than our
00:14:03.040 | retriever model does. This is pretty cool and, I think, definitely a good start. So now we have
00:14:12.640 | retrieved Meditations, set up our document store, and now we have also set up our retriever,
00:14:21.760 | so we can also cross that off, and the next thing is our reader model.
00:14:31.360 | So I think that's it for this video. In the next one, of course, we'll move on to that reader model,
00:14:40.160 | and we'll just see how that goes. But so far I'm pretty happy with that, so
00:14:46.160 | thank you for watching, and I'll see you again in the next one.