Q&A Document Retrieval With DPR
00:00:00.000 |
Okay so in the previous video what we did was set up our Elasticsearch document store to contain 00:00:07.680 |
all of our paragraphs from Meditations. So we did that in this script here and 00:00:15.760 |
all together we only have, it's not that much data, 508 paragraphs or documents within our 00:00:22.240 |
document store. So what we now want to do is set up the next part of our retriever-reader stack, which 00:00:32.640 |
is the Retriever and what the Retriever will do is given a query it will communicate with our 00:00:41.680 |
Elasticsearch document store and return a certain number of contexts which are the paragraphs in our 00:00:49.040 |
case that it thinks are most relevant to our query. So that's what we are going to be doing here and 00:00:59.280 |
the first thing that we need to do is initialize our document store again so I'm just going to copy 00:01:06.800 |
these and paste them here and this would just initialize it from what we've already built so 00:01:18.880 |
it's using the same index that already exists. So we just initialize that. 00:01:26.640 |
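In code, that re-initialization looks roughly like this. This is a minimal sketch assuming Haystack 0.x import paths, a local Elasticsearch instance on the default port, and 'aurelius' as a stand-in for whatever index name was created in the previous video:

```python
# A sketch of re-initializing the existing Elasticsearch document store.
# Assumes Haystack 0.x import paths, Elasticsearch running locally on the
# default port, and 'aurelius' as a placeholder for the index name used before.
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore

doc_store = ElasticsearchDocumentStore(
    host='localhost',
    username='',
    password='',
    index='aurelius'  # placeholder: use the index created in the previous video
)
```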
Once we have our document store, okay cool, we have that now. Now what we want to do is set up our DPR, which is a Dense 00:01:35.120 |
Passage Retriever, which essentially embeds the documents in our index as dense vectors 00:01:46.960 |
and uses a type of efficient similarity search, so that once it comes to actually searching 00:01:54.080 |
and finding the most similar or the most relevant documents later on it will use those dense 00:02:01.360 |
vectors and find the most similar ones. So I'll explain that a little bit better in a moment. 00:02:12.160 |
So first what we want to do is actually initialize that 00:02:15.360 |
so we do from haystack dense retriever import dense passage retriever, 00:02:28.080 |
sorry, it's the other way around here, so it's retriever then dense. Then we create a 00:02:46.320 |
variable called retriever, which uses the DensePassageRetriever we imported up here. 00:02:55.280 |
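As a quick sketch, the import (assuming the Haystack 0.x module layout) is:

```python
# The class lives under haystack.retriever.dense, i.e. retriever first,
# then dense (assuming the Haystack 0.x module layout).
from haystack.retriever.dense import DensePassageRetriever
```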
And into the DensePassageRetriever initialization we need to pass a few parameters, so the first thing is the document store, 00:03:02.080 |
and the document store is just what we've already initialized, doc_store. 00:03:08.400 |
and then we need to initialize two different models so it's the query embedding model 00:03:25.520 |
now behind the scenes Haystack is using the Hugging Face transformers library so what we'll do is we'll 00:03:35.040 |
head over to the models over there and see which embedding models we can use for DPR 00:03:41.760 |
okay so here let's just search for DPR and you'll find we have all these models from Facebook AI 00:04:01.040 |
so now with DPR, the reason that it's so useful for question answering is that we have 00:04:09.920 |
what are essentially two different models that encode the text that we pass into it, so we have 00:04:18.160 |
this sort of setup during training, and what we see down here 00:04:27.200 |
are these two models: we have this E_P encoder and we also have this E_Q encoder. Now the E_P 00:04:35.760 |
encoder encodes the passages or the contexts, so essentially the paragraphs that we have fed 00:04:44.640 |
into our Elasticsearch document store, this is what will be encoding them into these vectors here. 00:04:53.680 |
Now this whole graph is during training, so all we will actually see 00:04:58.960 |
when we're encoding these vectors is this E_P, the context encoder, 00:05:22.640 |
and all we're going to do is feed in all of the documents from Elasticsearch into this. 00:05:31.120 |
now once all of these have been encoded we then have a new set of dense vectors 00:05:46.480 |
and all of those will be fed back into our document store so back into elastic 00:05:54.240 |
now when it comes to performing similarity search later on 00:06:01.520 |
we're going to ask a question and that question will be processed by the E_Q encoder 00:06:10.560 |
so here we have our E_Q encoder and we have our question so that will go into here 00:06:21.520 |
and that will encode our question and then send it over to elastic and say okay what are the most 00:06:32.080 |
similar vectors to this vector that we created from a question. And the reason that DPR is so good 00:06:41.200 |
is that if you look at the training down here, we are creating these E_P vectors and these E_Q vectors 00:06:47.920 |
that are matching, so where we have a matching question and a matching context we are training 00:06:55.440 |
them to maximize the dot product, because the dot product measures the alignment between those two 00:07:05.120 |
vectors. So what happens is that a relevant passage and a relevant question will come out with very similar vectors. 00:07:15.440 |
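To make the dot-product idea concrete, here is a toy sketch with made-up numbers (not real DPR outputs, which are 768-dimensional): the question vector scores higher against a matching passage vector than against an unrelated one.

```python
import numpy as np

# Toy, made-up 4-dimensional vectors; real DPR embeddings are 768-dimensional.
question_vec = np.array([0.9, 0.1, 0.4, 0.2])     # E_Q("What is the capital of France?")
relevant_vec = np.array([0.8, 0.2, 0.5, 0.1])     # E_P(a passage containing the answer)
unrelated_vec = np.array([-0.3, 0.9, -0.2, 0.7])  # E_P(an unrelated passage)

# DPR training pushes matching question/passage pairs toward a high dot product,
# so the relevant passage scores higher than the unrelated one.
print(np.dot(question_vec, relevant_vec))   # higher score -> more relevant
print(np.dot(question_vec, unrelated_vec))  # lower score -> less relevant
```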
So one example that I like to use is: if our question was "What is the capital of France?", 00:07:41.680 |
the embedding that DPR will create from that will represent a context that looks something like "the capital of France 00:07:56.720 |
is ..." and, you know, something goes here; we don't know what it will put there, because it doesn't actually know what 00:08:05.840 |
the capital of France is, it's just doing linguistic transformations to try and figure out what sort 00:08:12.400 |
of context the answer would come from. Then of course when you feed this context into Elastic, 00:08:18.880 |
the most similar vector will be the one which contains the answer to our question, 00:08:30.000 |
because the answer to our question is something like "the capital of 00:08:35.360 |
France is Paris". Now we don't have Paris here, but it will be able to figure that out because it will 00:08:41.600 |
be the most similar sequence to the context that DPR has produced. Now back to Hugging Face, here 00:08:53.680 |
you can see we have these multiple DPR models, and what we want is a pair: we want a question encoder 00:09:01.360 |
and a ctx, which is the context encoder. Now we'll be using this single-nq-base pair, so what I'll do is just 00:09:09.760 |
copy this, and in here we just add in our model, okay, so that's the question encoder. 00:09:24.640 |
Now what we also need is the context encoder, which is the same except instead of question we just have ctx. 00:09:35.920 |
Now we have two other parameters that we need to add in here. The first is use_gpu: if you're 00:09:42.640 |
using a GPU obviously you set this to True, if not you go with False, although it will take a little bit 00:09:49.200 |
of time to process this if you're not using a GPU. Then we also add embed_title=True as well. 00:10:00.240 |
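Putting those pieces together, the initialization looks roughly like this, a sketch assuming the Haystack 0.x parameter names:

```python
# A sketch of the full DPR retriever initialization (Haystack 0.x parameter names),
# using the question and ctx (context) encoder pair from Facebook AI.
retriever = DensePassageRetriever(
    document_store=doc_store,
    query_embedding_model='facebook/dpr-question_encoder-single-nq-base',
    passage_embedding_model='facebook/dpr-ctx_encoder-single-nq-base',
    use_gpu=True,      # set to False if you don't have a GPU (it will just be slower)
    embed_title=True   # include document titles in the passage embeddings
)
```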
Now what we should see is that this will execute without error, hopefully, okay great, and 00:10:10.320 |
then what we need to do is update the embeddings within Elasticsearch. So what we've done here is 00:10:16.160 |
kind of set up the process, and now what we need to do is update the documents that we have in 00:10:22.560 |
Elasticsearch to have DPR embeddings. To do that we go doc_store.update_embeddings. 00:10:37.680 |
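That call is a one-liner, sketched here with the keyword argument spelled out:

```python
# Compute DPR embeddings for every document in the index and write them
# back into Elasticsearch (quick here, with only ~508 short documents).
doc_store.update_embeddings(retriever=retriever)
```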
okay now this may take well this would be really quick for me we don't have that many documents 00:10:46.160 |
and even on cpu actually with the lack of documents we have it should be pretty quick 00:10:51.120 |
so what we see here we created these embeddings and then we posted them again to our 00:10:59.520 |
index, so that is pretty cool, and now what we need to do is just test that it actually works. 00:11:12.880 |
Now let's go with retriever, and this is how we 00:11:17.680 |
get contexts from our Elasticsearch document store, so we write retrieve 00:11:24.640 |
and then we pass in a query here. So let me just find something here, 00:11:38.560 |
like "what did you learn from your great grandfather?" maybe, or from Verus, uh yeah, let's go with 00:11:52.480 |
"what did your grandfather teach you?" I don't know if this is going to work, but let's see. 00:12:05.200 |
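As a sketch, the retrieval call looks like this (the query string is just the one used in the video):

```python
# Ask the retriever for the contexts it thinks are most relevant to the query;
# it returns a list of Haystack Document objects rather than an extracted answer.
results = retriever.retrieve('What did your grandfather teach you?')
```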
Okay, so you see that we return quite a few contexts here. Now we haven't set up the full 00:12:18.720 |
thing, so we're just returning what it sees as being relevant contexts, we are not actually 00:12:25.360 |
extracting an answer out yet, because that will be the job of our reader model. So what we have is 00:12:31.920 |
"from my great grandfather", we have that one, so that's okay, and some other ones here. 00:12:49.360 |
okay so it's just returning that one which is fine it's not perfect but what we would expect 00:13:00.240 |
to do in reality is return more so let's try another one as well 00:13:07.280 |
and let's say "who taught you about freedom of will?". 00:13:26.640 |
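Roughly, and assuming Haystack 0.x where each returned Document exposes its paragraph on a .text attribute, we could inspect the ranking like this:

```python
# Second test query; print the start of each retrieved paragraph to see
# where the context we actually want ends up in the ranking.
results = retriever.retrieve('Who taught you about freedom of will?')
for doc in results:
    print(doc.text[:100])  # .text holds the paragraph in Haystack 0.x (assumption)
```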
And we see here, okay, in the first one we don't get out the correct answer that we want, or the 00:13:37.840 |
correct context, and we go down and, ah, there it is, so here is the context that we wanted to return. 00:13:50.320 |
So it returns that as the fourth best context, which is fine, because when we build our reader 00:13:57.840 |
model later on we kind of expect that to sort those a little bit better than our 00:14:03.040 |
retriever model. This is pretty cool and I think definitely a good start. So now, for question answering on 00:14:12.640 |
Meditations, we have set up our document store and now we have also set up our retriever, 00:14:21.760 |
so we can also cross that off, and the next thing is our reader model. 00:14:31.360 |
So I think that's it for this video. In the next one of course we'll move on to that reader model, 00:14:40.160 |
and let's just see how that goes, but so far I'm pretty happy with that. So, 00:14:46.160 |
thank you for watching and I'll see you again in the next one.