Best Indexes for Similarity Search in Faiss
Chapters
0:00 Intro
0:30 IVFADC
3:30 IVFADC in Faiss
7:29 Multi-D-ADC
9:17 Multi-D-ADC in Faiss
14:43 IVF-HNSW
21:39 IVF-HNSW in Faiss
25:58 Outro
Well, these composite indexes that you can see on the screen now are IVFADC, Multi-D-ADC, and IVF-HNSW. In the last video in the series we introduced composite indexes and the index factory in Faiss, and what we're going to do now is go through each of these and build them with the index factory, using what we learned in the last video.
Starting with IVFADC: the IVF stands for the inverted file index, and the ADC stands for asymmetric distance computation, which you'll recognize if you've watched the earlier videos or read the articles on those topics.
During indexing, when we first take our vectors and add them to the index, that's where the IVF part comes in. So if I just point that out: we're taking our vector over here, and the IVF assignment is this part, so this is our IndexIVF assigning the vectors to IVF cells, which are the Voronoi cells we usually visualize. Then the second step is that those vectors we've just assigned go through product quantization, all the way over here, and we create a PQ code, a quantized code, and those codes get stored together in the index.
During search is where the ADC part comes in, the asymmetric distance computation. When you quantize your vectors, you have two alternative ways of searching. The first, symmetric distance computation, over here, is not as popular: you take your query vector xq, quantize it, and then compare it to all of your pre-quantized, pre-indexed vectors, so both sides of the comparison are quantized, which is why it's called symmetric distance computation. The alternative, which tends to work better, is asymmetric distance computation, where you don't quantize your query vector; you just take xq as it is and compare it to your pre-quantized, pre-indexed vectors, the quantized database vectors. So that's where you get the IVF side of things and the ADC side of things.
In Faiss we can use this index factory string to pull that together: we have IVF on the left and product quantization on the right. So let's go to Python and put that together.
So I've already imported the data, again using the SIFT1M dataset; there'll be a link in the description explaining how you can get that dataset. From that we have our index vectors, the vectors we're searching through, and then xq, which is just a single vector we're using to search. We just write index = faiss.index_factory(...) as we have been doing: we have our number of dimensions, 128 for this dataset, and then we pass in the index factory string. So we have IVF as the first part, and we're going to split into 256 cells, and we're using PQ for the second part, which is the ADC part,
and here we're going to use an M value of 32. I'll include the x8 there, but you don't need it. The only reason you'd really need it is if you want to use a lower nbits value, say 4. If you're trying to go higher, to 12, well, you're not going to be able to unless Faiss has updated that; basically you're not allowed to go over 8 for now with the PQ, or IVFPQ, index, but that restriction should be removed in a future version.
And then, like we usually do, we just do index.train, because we need to train both the IVF clustering and the product quantization, so we really do need to train here. Then we add our vectors, and from there we just do a search as we normally would, so D, I = index.search(...).
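A minimal sketch of that in code, assuming xb and xq are the SIFT1M database and query vectors already loaded as float32 arrays (the variable names here are mine):

```python
import faiss

d = 128  # SIFT1M dimensionality
# xb: database vectors, e.g. shape (1_000_000, 128); xq: query vectors, e.g. shape (1, 128)

# IVF with 256 Voronoi cells, then PQ with M=32 subquantizers at nbits=8 each
index = faiss.index_factory(d, "IVF256,PQ32x8")

index.train(xb)  # trains both the IVF clustering and the PQ codebooks
index.add(xb)    # encodes and stores the database vectors

k = 10
D, I = index.search(xq, k)  # D: distances, I: ids of approximate nearest neighbours
print(I)
```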
Now, I don't really need to show you what that returns, but I will show you the performance of it. Let me explain the parameters first: M here is the number of subvectors we split each vector into for the product quantization step, and the nbits value of 8 is the number of bits used by each subquantizer within the product quantization step. You can increase those to improve recall, or accuracy. Okay, and then you see we have our nearest indexes returned there.
Let me show you the actual performance metrics for this. What we have here are a few things to compare: the blue line is what we just built, the IVFADC index, and then we also have the magenta line, which is the same index with an optimization step added to it; for me there was no real performance increase there. And I just want to compare that to a simple flat index. Now, if we use a flat index, its memory usage is something like half a gigabyte, whereas the IVFPQ index is, I think, around 40 megabytes. That's why you'd use PQ in the first place: to minimize the memory usage of the index. But in terms of recall, of course, you can't really beat a flat index.
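Those memory figures are easy to check yourself by writing the index to disk and looking at the file size; a rough sketch (the helper name is mine):

```python
import os
import faiss

def index_size_mb(index, path="tmp.index"):
    """Write the index to disk and report its size, a rough proxy for memory usage."""
    faiss.write_index(index, path)
    size = os.path.getsize(path) / 1024 ** 2
    os.remove(path)
    return size

print(f"IVFADC index: {index_size_mb(index):.1f} MB")
```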
So let's move on to the next one, Multi-D-ADC, which is multi-index asymmetric distance computation. The ADC part is the same, so we know there's product quantization in there. This one is based on IMI, the inverted multi-index. Like IVF, IMI clusters vectors into Voronoi cells, but it first uses product quantization to split your vectors into subvectors and then clusters each subspace separately. So it's like IVF but split across multiple vector subspaces, somewhat like a tiered Voronoi cell structure, which is what I'm trying to visualize here.
This is just IMI, not the full Multi-D-ADC index. Our vectors come in, they get split into two subvectors in this case, and they get clustered based on that. Then when we introduce our query vector, we split it into two as well, find the cells that each half belongs to, and take the crossover between those cells. And this is using an nprobe value of just one in this case.
We can put that together quite easily using this index factory string: we have the IMI index on the left and then, again, PQ on the right, because it's ADC. So let's go ahead and put that together.
Okay, so I'm going to come up here and just reuse the same script we had. I'm going to change the string to IMI2x8, and let me remove the x8 from the PQ part. Now, in this case it's actually better if we include the OPQ pre-processing as well; for this index it does improve the performance.
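A sketch of the Multi-D-ADC build, reusing the same xb and xq as before; the commented-out OPQ variant is my reading of the optimization step mentioned here, not something confirmed by the video:

```python
import faiss

d = 128
# Inverted multi-index with 2 subspaces of 8 bits each, plus PQ with M=32
index = faiss.index_factory(d, "IMI2x8,PQ32")

# Possible variant with OPQ pre-processing (my assumption of the optimization
# step that improved results here):
# index = faiss.index_factory(d, "OPQ32,IMI2x8,PQ32")

index.train(xb)  # IMI training is typically slower than IVF256
index.add(xb)

k = 100
D, I = index.search(xq, k)
```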
I'm also going to add a recall function so we can compare the matches between this index and a flat index. All we're doing is creating a flat index and writing this function to count how many of the returned IDs match. What I'm also going to do is change k to 100; remember we were using 10, which is a little low, so let's change it to 100 and then we can compare results a little more easily.
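The recall function isn't shown in full in the transcript; a minimal version along those lines, using a flat index as ground truth (the function and variable names are mine):

```python
import faiss
import numpy as np

# Exhaustive flat index to produce ground-truth neighbours
flat_index = faiss.IndexFlatL2(d)
flat_index.add(xb)

k = 100
_, I_true = flat_index.search(xq, k)  # exact top-k ids for the query

def recall(I_pred):
    """Fraction of the approximate top-k ids that appear in the exact top-k."""
    return np.isin(I_pred[0], I_true[0]).sum() / I_true.shape[1]

_, I = index.search(xq, k)
print(f"recall@{k}: {recall(I):.2f}")
```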
The reason I want to do that is to show you the performance difference we get when we start modifying another parameter, the nprobe value. So let me wait for this to finish training and we'll jump into that.
Okay, so that's finished now, and we've run the search, so I can compare the results. I'm going to call recall on the returned IDs and we'll see what we get.
The number of cells in our index that we search is controlled by the nprobe value; normally we'd write index.nprobe and set it to a higher number. But we can't actually do that directly here, because our index is composed of multiple indexes. If we were just using the IMI and PQ part it would be fine, we could do that, but because we've added the OPQ step we can't access nprobe in the normal way. So we have to use something else, and that is, if I write it out, faiss.extract_index_ivf. I know we're not using IVF here, we're using IMI, but we use this same function for both of them, IMI or IVF, whichever one you're using. Then in here we just pass our index, and this extracts the IMI part, or the IVF part, of our index. That means we can now set the nprobe value, so let's set it to 14, and then we do D, I = index.search again.
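In code that looks roughly like this, reusing the recall helper from above:

```python
import faiss

# The OPQ pre-transform wraps the IVF/IMI index, so extract the inner index first
ivf = faiss.extract_index_ivf(index)
ivf.nprobe = 14  # number of cells to visit per query

D, I = index.search(xq, k)  # search still goes through the top-level index
print(f"recall@{k}: {recall(I):.2f}")
```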
Now, what happens if we increase that to 100? We get 69 percent, so you can see that as we keep increasing this number, the recall improves. Okay, so 69 looks like about the max we're going to get there, although in reality, and I've tested this before, you can definitely get slightly higher; you should be able to reach about 75 percent. With this index we do get very fast search times, so you see here, around 0.2 milliseconds. Okay, let's reduce nprobe, say to 50, and see if we still get a decent result. Okay, so I'm going to go with 60 for a 66 percent recall, and then time index.search; this loops through multiple times to get the average.
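In a notebook this is just %timeit index.search(xq, k); outside Jupyter, a rough equivalent that averages over a few runs might look like this (a sketch, not the exact code from the video):

```python
import time

def avg_search_time(index, xq, k=100, runs=20):
    """Average wall-clock time of a single query over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        index.search(xq, k)
    return (time.perf_counter() - start) / runs

print(f"avg search time: {avg_search_time(index, xq) * 1000:.3f} ms")
```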
So that's the second of our composite indexes. Before we move on to the next one, let me actually show you the performance you can expect from this one. This is what I got when I was testing this index: in the middle there, the magenta line, that's what we just built, alongside the IVFADC index from before, which is the blue line, and then I also wanted to include the flat index in there for comparison. And again, the memory requirements for that flat index are sort of through the roof, whereas this one is incredibly small, though not as small as the IVFPQ, or IVFADC, index.
Okay, so this is our final composite index, and I think the best-performing one from what I've seen, as long as you don't mind increasing the memory usage quite a bit. Before, we were using product quantization; we can use product quantization with this one too and still get pretty good results, but the memory usage when we use it with flat vectors is of course higher. So what we have here is that we still have the IVF component; we're not using IMI this time, we're using IVF like we were in the first index.
This time we pair it with HNSW, the hierarchical navigable small worlds graph, which is used to identify which cells are closest to your search query. So rather than an exhaustive search across all of those centroids, we use HNSW to approximate that and speed it up. Let's go into HNSW very quickly first.
HNSW is based on small-world graph theory. In the graph you can see on the screen right now, the circles are the nodes, or vertices, and then we have all the edges, or links, between them. The theory is that in very large networks, something like Facebook with billions of users, you find that the average degrees of separation, the average number of links between any two people, is very small; I think for Facebook it's something incredibly small, like four to six. That means any two random people on Facebook are connected by only around four to six links.
In these small-world graphs we find both long-range and short-range links. When we're traversing around the graph, from one node or vertex to another, we obviously move across the graph more quickly when it's a long-range link. So let's say you're American, your friend has a friend over in India, and that friend knows someone important in one of the villages in India. Between you and that important person in a village in India, even though you don't know anyone in India, you have your friend, their friend over in India, and then him, so there are three steps to reach him. Now, the step between you and your friend: you're pretty close, you have very high proximity, so it's a short-range link. Between your friend and their friend over in India,
that's a long-range link. Now, HNSW takes this theory, and what it does is take those long-range and short-range links and split them across different layers. In this case we define long range as geometrically far apart, and on the highest layer of our HNSW graph we have all of the long-range links. It's on that top layer that we enter with our query vector and first look for its nearest neighbors. We take our first few traversals across those links, and as we go down through the layers the links become shorter and shorter range, the search becomes finer, and at the bottom layer we finally arrive at our approximate nearest neighbor. That's what you can see here: there's the entry point on layer 2, which is our entry layer and only has long-range links, so we can make a big jump.
Then we're on to the next step: we move down a layer, down here, and ask, okay, where's the nearest neighbor now? It's over here, so we go there, and then we're on to the next step, where we again move down a layer. Now we're on the bottom layer, and this is our final step: where's our nearest neighbor? It's over here. And that's what we assume: we assume that this vertex, or node, is the nearest neighbor to our query vector. The reason we do this is that up here we only have a few nodes with long-range links, whereas down here we probably have loads of nodes. If we searched only down here we'd be making loads of small steps, whereas doing it like this we can make big steps early on and smaller steps later, which minimizes the number of steps we need to take, or the number of comparisons we make.
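As an aside, before combining it with IVF: a plain standalone HNSW index in Faiss looks roughly like this (the parameter values are just illustrative, not taken from the video):

```python
import faiss

d = 128
M = 32  # links per vertex in the HNSW graph
hnsw_index = faiss.IndexHNSWFlat(d, M)

hnsw_index.hnsw.efConstruction = 64  # build-time search breadth (higher = better graph, slower build)
hnsw_index.hnsw.efSearch = 32        # query-time search breadth (higher = better recall, slower)

hnsw_index.add(xb)  # HNSW with flat vectors needs no train() step
D, I = hnsw_index.search(xq, 10)
```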
Coming back to the composite index: we said this is an IVF and HNSW composite index, and that's what you can see now. Take that HNSW process we just went through and imagine that all of those points, those vertices, are our IVF cell centroids. Rather than exhaustively comparing all of those cell centroids, we're just comparing a few of them, finding the approximately nearest centroid very quickly, and then we limit our exhaustive search scope to that single cell, if the nprobe value is one; obviously it usually won't be one, it will be greater than that.
But that's pretty much what we're doing with this IVF-HNSW index. Now, because HNSW focuses on optimizing the centroid search portion, it makes more sense to increase the number of centroids: if we increase the number of centroids, we reduce the number of vectors within each cell. Our optimized, approximate search here is the centroid comparison, and once we're inside those cells we're comparing all of the vectors in there, so we're performing an exhaustive search within that scope. So what we do is increase the number of centroids, because that's the optimized portion, and reduce the number of vectors that end up in our final search scope, because that's the unoptimized part of
this search. So, to pull that together, we're going to use this index factory string. In terms of increasing speed, like I said, we'll just increase the number of cells, which we'll see in a moment. I'm going to modify the string: we have IVF with the number of cells, followed by _HNSW32. What this means is that the number after IVF is how many cell centroids, or cells, we're creating, and the 32 is the number of connections, or links, each node or vertex has in the HNSW graph. After that we're storing Flat vectors. Now, I said before that we can use, for example, PQ32 here instead; the accuracy, the recall, won't be as good, but you can use it if you need to reduce the memory usage, and it is very good with PQ by the way, you still get very good recall. So you can do it like that, but for now let's run it with Flat.
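Pulled together, the build looks roughly like this; the transcript doesn't make the exact cell count clear, so the 4096 below is only a placeholder, and the commented PQ line is the memory-saving alternative just mentioned:

```python
import faiss

d = 128
# IVF whose coarse quantizer is an HNSW graph with 32 links per node, flat vector storage.
# The number of cells (4096) is a placeholder, not necessarily the value used in the video.
index = faiss.index_factory(d, "IVF4096_HNSW32,Flat")

# Memory-saving alternative (lower recall):
# index = faiss.index_factory(d, "IVF4096_HNSW32,PQ32")

index.train(xb)
index.add(xb)

k = 100
D, I = index.search(xq, k)
print(f"recall@{k}: {recall(I):.2f}")
```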
We'll do the search again, check the recall, and see what we get. Okay, so that has just finished, and we get a recall of 25 percent. Again, we're using the default nprobe value of 1, so we probably want to increase that. So come down here; we don't need to change the variable name, I'm just doing it because it bothers me leaving it named after IMI. Let's go with 60 again, although we do have a high number of cells here, so we probably want to increase it a bit more than that. Let's go with that and see what we get. I'm getting 93 percent, which, okay, fair enough, that's pretty good straight out of the box.
Let's time it and see how fast that is. Compared to what we were getting before with the Multi-D-ADC index, the difference is hardly anything, and we're getting 92 percent recall where before we were getting, I think, 66 percent. But of course, at the same time, the memory usage for this index is something like half a gigabyte.
If we wanted to reduce that, instead of using flat vectors we could just swap the Flat for, say, PQ32. That improves the memory usage a lot, but we're not doing that here; we're just going to leave it flat.
And, for example, we know that the maximum number of cells here is 1496, so let's set nprobe to an insanely high number, and that gives us the maximum performance we're going to get out of this index, which is 100 percent, because we're using flat vectors, so we can reach that 100 percent point. So yeah, we can get very good performance: 89 percent recall there at 432 microseconds, which, for me, is very good.
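To see that recall versus search-time trade-off yourself, a small nprobe sweep, reusing the recall and avg_search_time helpers sketched earlier (the nprobe values are arbitrary examples):

```python
ivf = faiss.extract_index_ivf(index)  # works whether or not the index is wrapped

for nprobe in (1, 16, 64, 256, 1024):
    ivf.nprobe = nprobe
    _, I = index.search(xq, k)
    t = avg_search_time(index, xq, k)
    print(f"nprobe={nprobe:5d}  recall@{k}={recall(I):.2f}  time={t * 1000:.2f} ms")
```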
For that reason this is definitely one of my favorite indexes; it's just, again, that the memory usage is high. So let's have a look at the actual performance, search time, and recall. I've included a few different numbers of cells here for us to compare. Like I said before, the optimized part is the IVF part, the centroid search, so we can increase the number of cells and that will decrease the search time; it also decreases the recall a little, but not by much. So you can modify it: you can increase the number of IVF Voronoi cells to improve the search time if you need to. And I mean, that's it.
I don't want to cover anything else in this video. We've covered those three composite indexes, IVFADC, Multi-D-ADC, and IVF-HNSW, and with those I think you can build some pretty cool indexes. So, thank you very much for watching, I hope it's been useful, and I will see you in the next one.