Supercharge eCommerce Search: OpenAI's CLIP, BM25, and Python
Chapters
0:00 Multi-modal hybrid search
1:05 Multi-modal hybrid search in e-commerce
5:14 How do we construct multi-modal embeddings
7:05 Difference between sparse and dense vectors
9:43 E-commerce search in Python
11:11 Connect to Pinecone vector db
12:04 Creating a Pinecone index
13:45 Data preparation
16:32 Creating BM25 sparse vectors
19:33 Creating dense vectors with sentence transformers
20:26 Indexing everything in Pinecone
24:41 Making hybrid queries
26:01 Mixing dense vs sparse with alpha
32:11 Adding product metadata filtering
34:13 Final thoughts on search
Today we're going to be taking a look at quite a few things, which we'll apply in the scenario of a multimodal e-commerce search engine. When I say there are quite a few technologies involved here, I say that because we're going to be covering something called hybrid search, which is a pretty recent thing, using sparse vectors and dense vectors. If that doesn't make sense, no problem, we're going to explain it in a moment. We're also going to be taking a look at multimodality, which is where you have multiple modalities of data within either your query or your search space. In our scenario we're going to have both images and text, which is a pretty typical scenario when it comes to e-commerce search, and we're actually going to be mixing both of those. So we're going to have a hybrid, multimodal search, which I think is pretty cool, and I think there are a lot of cool ideas that could come from this.
So let's jump straight into it. What you can see on the screen right now is a screenshot from Amazon, and we can see multimodality is definitely a thing here: we have images and we have text. There are probably a couple of other bits of information we could scrape as well, but I want to focus on the titles and the images. When we're searching for these things, our queries might describe different parts of them. This first one is basically a keyword-matching query: someone is looking for the brand French Connection, they want jeans, and they want them to be for men. We would expect the product description, or the actual product title, to contain all of those keywords. So in that case you might just do a keyword search, because with these keywords we can create what are called sparse vectors, using things like TF-IDF and BM25.
But then on this next one we have something a little bit different. We have some descriptive words here: "blue", which is descriptive, and then "jeans for men". "Jeans for men" is probably okay as a keyword. "Blue" might also be in the description; in fact, it almost definitely will be, so that one is kind of a mix: it's descriptive and visual, but it will probably also appear in the description. But then "faded, worn-out looking"? Maybe "faded" would be in there; "worn-out looking" probably not. These are more descriptive terms, and honestly, if we're going to search in this more descriptive, more human-like way of describing things, essentially a semantic search, then we ideally want dense vectors.
So you take a transformer model like BERT or something along those lines, and that will create a vector in a dense vector space. Or, if we're thinking about the multimodal side of things, then maybe we use something like CLIP instead. CLIP can encode both images and text into the same vector space, so you would create a vector for your text and, if you've also encoded your images, it would hopefully land close to an image vector that represents the meaning behind that text. If that's a little confusing, I do have some videos on CLIP you can take a look at; I'll make sure there's a link somewhere around here.
So we kind of have a mix of things here, and we see that even more so in this next example. "Blue", or maybe "faded", is a better example: "faded" is very descriptive. But "French Connection" is 100% a keyword match; we want to compare keywords in that scenario, and a model like BERT or CLIP probably won't capture that information. That makes things more difficult; it's almost like we need a mix of both sparse and dense. We have these images, and some of them do look kind of faded, so it would be great if we could encode those images with a model like CLIP. But at the same time, if the user's query says "American Eagle", which I think is a brand, we also want that considered within our search. We basically want to consider these different modalities, and we also want to consider these different ways of searching, both at the same time.
So how do we actually do that? I've kind of already mentioned it, but you take your image and encode it with a CLIP model to create a dense vector. Then down here we have our description, and with that we're going to use BM25 to create a sparse vector, which is a rather different kind of vector; I'll show you what they look like in a moment. So we're going to create both of those, and then the user is going to come along with their query; they're going to search for something like "dark blue…" something, I don't know, but essentially they're going to come along with this query. Because we want to search across both the CLIP and the BM25 vectors, we need to encode this query with both of those encoding methods. So we take our query and it goes through two encoders, one for CLIP and one for BM25, and we get two query vectors. We then need to take them to a vector database that can handle both of these, and we're going to use Pinecone for that; there are not many vector databases that can handle this right now, so Pinecone is one of the few. Within Pinecone, our vectors are going to be stored, and what we should hopefully see is that the vectors matched to this image, or this listing, are close to the vectors from our user's query, and we would then return that listing. I've laid this out very badly, but we would return this. So that's the general process.
Before we move on to code, let's just have a quick look at what a sparse vector and a dense vector are, to give you a quick idea of what they might look like. At the top here we have a sparse vector. The reason it's sparse is because the information within the vector is very sparsely located: there are only a few non-zero values, whereas the majority of the values are zeros, which don't carry any meaning. That is what a sparse vector is. A dense vector is different: the majority of the vector is non-zero values, so it has information packed throughout. There's also a difference in dimensionality. The dimensionality of a sparse vector can be in the tens of thousands; for example, if we use a BERT tokenizer as part of the process to create our sparse vectors, the dimensionality will match the tokenizer's vocabulary size. But it doesn't take a lot of space to store all of this information, because in reality you don't actually need to store all of it; it's mostly zeros. So what you end up doing is creating a dictionary or something along those lines, where you record that at one index we have the value 0.3, and at another index over here we have 0.1. It actually ends up being quite an efficient way to store these vectors.
You can't really do that with dense vectors; you just have to store them as is. The typical dimensionality here varies, but a very common embedding dimensionality at the moment is 768. You will see some others; it goes up further if you think about the OpenAI embeddings: the current ada-002 model is 1536 dimensions, I think, something along those lines, and some of their older models actually used over ten thousand embedding dimensions, but that's really kind of an edge case.
So that kind of explains what we're going to be looking at. We've just covered a lot of the theory behind all this stuff, but I think it will make a lot more sense if we actually go through the code to create it. I'm getting the notebook for this from the Pinecone examples repo, in the e-commerce search example, and I'm just going to open that in Colab. I'll make sure there is a link to this notebook, or to the GitHub repo, somewhere around here.
The first thing we come to is a pip install. If you notice that you have transformers and sentence-transformers at the top of the notebook, that's a good indication you'll want a GPU. So just head up to Runtime, change your runtime type, and make sure you're using a GPU. If you are running locally, try to make sure you have a CUDA-enabled GPU; if not, you'll just have to deal with the slowness of CPU a little bit. This isn't a huge dataset, so it will take a bit longer, but not too long. Now, this should be updated by the time I release this video, so rather than using this whole pip install, you can actually just install the Pinecone client. We're going to be using gRPC, which is essentially going to help us index our vectors more quickly, but at the moment that support is still in beta, so I'm using the beta install here.
Okay, cool, they're installed. Coming down to here, we need to initialize our connection to Pinecone, and for that we need to go to pinecone.io, so I'm going to head over there. You will probably see something like this, and you might actually need to create an account. Once you have created an account you will see something like this, and you need to head over to API Keys. Take the environment value, which for you will probably not be "internal-beta" like mine; it will probably be something like us-west1 or us-east1. Copy that and put it into your environment variable here. You also need to copy the key value, so you just click here and put that here; I've stored mine in a variable called YOUR_API_KEY. So I run that, and that will initialize the connection.
Then we come down to actually creating our index. Because we're going to be using sparse vectors and dense vectors in one index, we need to set a few specific items in the create-index call: the metric has to be dot product, and for the pod type we have to use either s1 or p1. Other than that, I think we're okay; we can change everything else. Now, dimensionality: because we're going to be using CLIP for the dense embeddings, we need to set it to the same dimensionality as CLIP, which is 512. The dimension here is always referring to your dense vector dimension. So we create that; the index name you can set to whatever you want, it doesn't really matter. Okay, cool.
So we've just created our sparse dense enabled index and then we connect to it 00:13:08.600 |
Now if you have followed any of these videos before if you're kind of aware pinecone 00:13:13.720 |
I think you've probably mostly seen me use this pinecone index in its name. You can use that as fine 00:13:19.420 |
You know, we don't have to change it because we're using sparse dense or anything like that. You can use index 00:13:23.980 |
it's just I'm going to be using GRPC index because the 00:13:32.940 |
It's also faster when you're doing upsets, right? So in reality the index behind this is still the same 00:13:38.980 |
It's just the connection is through GRPC rather than rest. So we run that cool and 00:13:46.300 |
What we're going to do now is get our data: the fashion product images dataset. This is a Kaggle dataset; you can see in the write-up here all these fashion images, and it also has descriptions and everything, which we'll see in a moment. We're using a subset of it that Ashraq has created, and it's currently stored on Hugging Face Datasets, so you can go on huggingface.co, type this in, and you'll be able to find the dataset. It just takes a moment to download.
Cool, so we can see what we have here: an ID, gender, master category, all these different things. In terms of the text, I think the main one we're focusing on is going to be the product display name; the majority of the sparse vector will be created from this. The entirety of the dense vector is going to be created from the image, because we're feeding the image through CLIP to get the dense vector. So we're going to create this metadata from everything except the image column, and the images themselves we're going to store in a list called images. Now we can take a look at one of the images, index 900: it's kind of like a dress, or a long top, or something along those lines. Then if we take a look at our metadata, which we've now put into a dataframe, we have a lot of useful information there: we have the gender; the master category, like apparel, with accessories and other things down here as well; sub-categories like topwear and shirts; all this different stuff, which is kind of useful. But actually the most descriptive part here, if I can get to it, is the product display name. It's not super descriptive, but it acts as a short product description, and that's what we're going to be relying on most for our sparse vectors. We're going to use all of these metadata fields except the ID and year to create our sparse vectors. So within our sparse vector we're going to have "men", "apparel", "topwear", "shirts", the color; season, maybe that would make sense in some cases, but in this case I don't think so; "casual"; and then the product display name. So within that sparse vector we're going to have a ton of different things, which is really useful for this sort of information. So let's go down.
We're going to create the sparse vectors first. For now I'm using this file here; it will probably get updated pretty soon, and when it does I'm going to leave a comment pinned to the top of the video giving you the new code. But essentially all that's in this file is a BM25 implementation that we're going to be using. So, coming down here, what this BM25 implementation needs is a tokenizer. We need a tokenizer essentially to split our text into either words or pieces of words, and we're going to do that using a BERT tokenizer. So we run that. Here's the tokenize function: we pass in our text, and it needs to output a list of tokens. It tokenizes the text using a Hugging Face transformers tokenizer, we extract the input IDs, and then we convert those IDs back into text tokens. We pass this tokenizer function into the BM25 implementation, so BM25 is going to use it to tokenize everything, and then with those tokens it's going to create our sparse BM25 vectors. Here I'm just showing you what this tokenize function is doing: it's essentially breaking everything apart into these little components.
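The notebook uses a Hugging Face BERT tokenizer; as a rough stdlib stand-in, a minimal word-level tokenizer shows the role it plays (a real BERT tokenizer would additionally split rare words into sub-word pieces):

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

print(tokenize("Turtle Check Men Navy Blue Shirt"))
# ['turtle', 'check', 'men', 'navy', 'blue', 'shirt']
```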
But in reality we're actually going to have a ton of different things in there; we're going to have all the other columns as well. Now, the way BM25 works is that there are a few parameters within the function that are based on your data, so you need to feed all of your data into BM25. That allows the model, or BM25, to essentially train on the data and update those parameters, which will then be used later. So we run that, and you can see a few of those statistics: the number of docs, the average document length, and so on. Then this is going to create the query vector for this particular prompt here; we run that, and you can see the BM25 query vector. Now, although we're running this across all of the documents that we're going to store in the vector database, we only need to run the document side of the calculation for them. BM25 weighs the term frequency against the inverse document frequency, but we only need to apply that weighting on one side for it to be effective, so you can simplify the calculation by splitting it: one side carries the term frequencies, the other carries the inverse document frequencies. That's what we're doing here; run that.
Now for the dense vectors, we're going to use something different: CLIP, and we're actually implementing CLIP through the sentence-transformers library; it just makes things easier for us. You can also implement it through the Hugging Face transformers library, but this is easier. We're using CUDA if we can, so you can print the device just to check what you are using here; try to use CUDA if you can. I believe if you're on a Mac you can also use MPS, but naturally CUDA is typically going to be faster. So we have our CLIP model there, and what we can do is encode some text, and we see we get a 512-dimensional dense vector. Okay, cool.
Now, the indexing. This is going to be a little bit different if you have watched previous videos where I'm using Pinecone. We use tqdm just so we can see the progress bar as we're going through everything. For the batch size we're going to go with 200; this will work on Colab, and you can even go higher. We're going to go through the fashion dataset in batches of 200 items, or rows, at a time, extracting our metadata in chunks of 200. Then here we're concatenating all of that metadata into a single string; this is what we're going to use to create our sparse vector, and you can see it there. So that's how we create the sparse vector. Now, for the dense vector we need to extract the images from the images list that we created earlier and convert them into dense embeddings with our dense model, which is CLIP. Then here we're just creating some unique IDs; the ID is literally just a count in this case. Obviously, if you have actual IDs that correlate to your items, maybe you want to use those; in fact, we did have IDs, we just took them out earlier, so maybe we could have used them, but it's not really so important in this example.
Now, a bit that's different: Pinecone is making things more organized so that we can do things more efficiently, so there are a few changes here. We've created our items; typically we would just feed them in as a list of tuples, which is kind of not that organized, so that is changing a little bit. First we create a structure for our metadata, and inside that structure we just need to add the metadata itself; that's why I imported it up here. So this is what we create for one row of our metadata. Note that here we're looping through the batch, and as we do, we're appending everything to this upserts list as a GRPCVector (if you're not using gRPC, use Vector instead). A GRPCVector expects at least two things: your ID and your values, where the ID is your record or vector ID and the values are your dense vector. Metadata and sparse values are optional, but obviously we do have metadata in our use case, and we do have sparse values as well. For the metadata we feed in the struct we just created, and for the sparse values we have our GRPCSparseValues here. (I realized after creating this that I could actually simplify that part.) Then we upsert everything; that will take a little bit of time, so I will skip forward and see you when it's ready. Okay, so it's just finished.
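As a sketch of the shape of each record (plain dictionaries here; with the gRPC client each record is wrapped in `GRPCVector`/`GRPCSparseValues` objects instead, and the field names follow Pinecone's upsert format):

```python
def make_record(rec_id, dense, sparse, metadata):
    """One sparse-dense record in the shape an upsert expects:
    dense values are required; sparse values and metadata are optional."""
    return {
        "id": rec_id,
        "values": dense,                # dense CLIP vector (512 dims in the video)
        "sparse_values": {
            "indices": sorted(sparse),  # token ids with non-zero BM25 weight
            "values": [sparse[i] for i in sorted(sparse)],
        },
        "metadata": metadata,
    }

record = make_record(
    "0",
    [0.1] * 4,                          # toy dense vector (real one: 512 dims)
    {2090: 0.45, 8101: 0.45},           # toy BM25 sparse vector
    {"productDisplayName": "Turtle Check Men Navy Blue Shirt"},
)
```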
Now we'll start having a look at actually querying and getting some results. Let's come to here; I think maybe I need to remove this. Okay, good. So let's start with "dark blue french connection jeans for men" as our query. When we're querying, we do need to create the BM25 query vector, so we're using transform_query here, and we also need to create our dense vector embedding: we encode the query with our model and convert it to a list, and that's what we want to search with. With our search, this is again slightly different from previous videos: we now have a sparse vector argument, which is where we pass in our sparse vector. What I'm going to do is just return the images based on the results that we get, so we get all of these PIL image objects, and to actually view them we're going to use this display_result function, which essentially displays some HTML within the notebook showing the actual images. So: "dark blue french connection jeans for men", and this is what we return. Okay, so it seems pretty accurate.
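The pieces of the query call can be sketched like this (argument names follow Pinecone's query API; the vectors here are toy stand-ins for the real CLIP and BM25 query vectors):

```python
def make_query(dense_vec, sparse_vec, top_k=14):
    """Assemble the arguments for a hybrid index.query() call."""
    return {
        "vector": dense_vec,          # dense CLIP query embedding
        "sparse_vector": sparse_vec,  # BM25 sparse query vector
        "top_k": top_k,               # number of results to return
        "include_metadata": True,     # so we can read back product info
    }

query_args = make_query(
    [0.1, 0.2, 0.3, 0.4],                             # toy dense vector
    {"indices": [2090, 8101], "values": [0.7, 0.7]},  # toy sparse vector
)
# With a live index: results = index.query(**query_args)
```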
Now, mixing sparse versus dense vectors. What we're going to do is use a function named hybrid_scale for this. Essentially, if we take a look here, we have this alpha value, which must be between 0 and 1. An alpha of 0 basically means you're doing a sparse vector search only: the sparse vector is multiplied by 1, while the dense vector is multiplied by 0, so you're basically making it meaningless. And if you want a dense-only search, you use an alpha value of 1, or 1.0.
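That weighting can be written as a small function, following the notebook's hybrid_scale idea: the dense query vector is scaled by alpha and the sparse one by (1 - alpha):

```python
def hybrid_scale(dense, sparse, alpha: float):
    """Scale a dense query vector by alpha and a sparse one by (1 - alpha).

    alpha = 1.0 -> pure dense (semantic) search
    alpha = 0.0 -> pure sparse (keyword) search
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

# Mostly sparse with a touch of dense, as in the video's 0.07 example.
hdense, hsparse = hybrid_scale(
    [0.2, 0.4], {"indices": [3, 7], "values": [0.5, 0.5]}, alpha=0.07
)
```

Because both result vectors are scored with a dot product, this is equivalent to taking a convex combination of the dense and sparse relevance scores.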
Okay, so first we do a pure sparse vector search. We set alpha to 0 and run it, and it looks like we get reasonably good results, although there are definitely a few women's jeans in here. So let's have a look at what's actually going on; we go to the product display names. On "French Connection" we're doing well: the keywords are being pulled in, we find that exact match, which is great. The issue is that "men" is a substring of the word "women", so the women's items match too. What we can do is go pure dense and see what happens. This bases everything purely on the images, so it's looking for dark blue jeans; the issue is, does it actually know what French Connection is? If we come down to the display names of these items, it actually returns other brands, like Locomotive, even Spykar and Wrangler. It does include a few French Connection items, but I think that's purely by chance rather than it actually knowing these are French Connection jeans. Nonetheless, it's not reliably pulling back the French Connection stuff. So rather than setting the alpha value to 0 or 1, we're going to use 0.07, which is basically a sparse search with a little bit of dense in there as well. This is heavily weighted towards sparse, but it still considers some dense, and the results look mostly okay. There is some black in here; maybe we could increase the dense weighting a little more to try and avoid that, but you can see that for the most part we are getting French Connection in here. There's some Locomotive in here as well, but it's mostly French Connection, and it's also not including any women's jeans, which is a bonus.
Cool. Let's try some more queries, which I think demonstrate this pretty well. We're going to go with "small beige handbag for women", pure sparse first, so looking at the keywords only. Okay, not really beige; we have pretty much every color in there. They're all handbags, which looks good, and they're all for women, just not beige whatsoever. So we're going to add a little bit of dense again, and even with that tiny bit of dense vector in there, the results are just ten times better: we get pure beige, all of these. It's kind of interesting that the sparse search didn't pull beige in, even though they all say beige in the actual sparse vector; but anyway, that's kind of cool. What happens if we go pure dense? I think this is kind of interesting too. For the most part they're beige, which is okay, but then we're getting women's purses in there as well, not just handbags. CLIP understands that a woman's handbag and a woman's purse are kind of similar, so it's giving us all of those that are also beige. So we can see the benefit of having a mix of sparse and dense in here, across both modalities as well.
All right, another thing, and this is kind of interesting: we're going to use this image to search, because CLIP can also handle images, and we're also going to add in a text query, which is going to become our sparse vector. So let's try those, using a mixed alpha of 0.3 for this one. What do we have here? These are all kind of similar to this image; I mean, to me they look like the same person. Obviously CLIP doesn't know that we just want to focus on the top; it's kind of looking at everything, so it's also going to return people that look similar. But the purple component isn't really there; maybe it's considered a little bit here and here, but it's not considered a huge amount, is it? So, rather than just relying on the query, let's try to use Pinecone's metadata filtering, because we know that in our metadata we have the color of these items. We do the same again, the same query, but this time we filter for the color purple. We run this, and straight away we get much better results. So with this we have an image query, a metadata filter, and also a text query; we're approaching this search from several directions at once. Basically, the point is that we can do search in a ton of different ways and get some pretty cool results.
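Pinecone metadata filters use a MongoDB-like syntax; the colour filter here might look like this (the `baseColour` field name is what this dataset's metadata uses, so treat it as an assumption about your own schema):

```python
# Simple equality filter: only return items whose colour metadata is Purple.
colour_filter = {"baseColour": "Purple"}

# Conditions can be combined, e.g. purple shirts only:
combined_filter = {
    "$and": [
        {"baseColour": "Purple"},
        {"articleType": "Shirts"},
    ]
}

# Passed into the query alongside the two vectors:
# index.query(..., filter=colour_filter)
```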
Another one here: we've got this guy in his shirt, and we're going to go with a green shirt, adding that to our filter straight away. Notice we don't even include "men's shirt" anywhere in the query; the CLIP model is kind of handling that for us, because it knows we're looking at a guy in the initial image. Some of these don't seem green to me, but we are actually filtering for green here, so it means this shirt is for some reason marked as green; I don't know why. For the most part, though, they're a green type of color, kind of faded, like this one here. Cool. So, if you're done with the index (obviously, if you want to play around with it a little more, feel free), once you are done you can delete the index just to save your resources. If you just have the one index it's free anyway, but if you have multiple you'd obviously be paying for them. So, that's it for this example of multimodal search for e-commerce items. As I mentioned at the start, we just kind of blazed through a ton of different technologies: we had metadata filtering in there, hybrid search in there, multimodality, and with the hybrid search we dived into the embedding methods, sparse and dense. I think it's kind of cool to see how many different techniques can come together in a single search pipeline, and I hope all of this has been interesting and helpful. Thank you very much for watching the video, and I will see you again in the next one. Bye.