Supercharge eCommerce Search: OpenAI's CLIP, BM25, and Python
Chapters
0:00 Multi-modal hybrid search
1:05 Multi-modal hybrid search in e-commerce
5:14 How do we construct multi-modal embeddings
7:05 Difference between sparse and dense vectors
9:43 E-commerce search in Python
11:11 Connect to Pinecone vector db
12:04 Creating a Pinecone index
13:45 Data preparation
16:32 Creating BM25 sparse vectors
19:33 Creating dense vectors with sentence transformers
20:26 Indexing everything in Pinecone
24:41 Making hybrid queries
26:01 Mixing dense vs sparse with alpha
32:11 Adding product metadata filtering
34:13 Final thoughts on search
Today we're going to be taking a look at quite a few things, which we'll apply in the scenario of a multimodal e-commerce search engine. When I say there are quite a few technologies involved here, I say that because we're going to be covering something called hybrid search, which is a pretty recent thing, using sparse vectors and dense vectors. If that doesn't make sense, no problem, we're going to explain it in a moment. We're also going to be taking a look at multimodality, which is where you have multiple modalities of data within either your query or your search space. In our scenario we're going to have both images and text, which is a pretty typical scenario when it comes to e-commerce search, and we're actually going to be mixing both of those. So we're going to have a hybrid, multimodal search, which I think is pretty cool, and I think there are a lot of cool ideas that could come from this.
So let's jump straight into it. What you can see on the screen right now is a screenshot from Amazon, and we can see multimodality is definitely a thing here: we have images and we have text. There are probably a couple of other bits of information we could scrape as well, but I want to focus on the titles and the images. When we're searching for these things, our queries might describe different parts of them. This first one is basically a keyword-matching query: someone is looking for the brand French Connection, they want jeans, and they want them to be for men. We would expect the product description, or the actual product title, to contain all of those keywords. So in that case you might just do a keyword search, because with these keywords we can create what are called sparse vectors, using things like TF-IDF and BM25.
But then on this next one we have something a little bit different. We have some descriptive words here: "blue", which is descriptive, and then "jeans for men". "Jeans for men" is probably okay as a keyword. "Blue" might also be in the description; in fact, it almost definitely will be, so that one is kind of a mix: it's descriptive and visual, but it will probably also appear in the description. But then "faded, worn-out looking"? Maybe "faded" would be in there; "worn-out looking" probably not. These are more descriptive terms, and honestly, if we're going to search in this more descriptive, more human-like way of describing things, essentially a semantic search, then we ideally want dense vectors.
So you take a transformer model like BERT or something along those lines, and that will create a vector in a dense vector space. Or, if we're thinking about the multimodal side of things, then maybe we use something like CLIP instead. CLIP can encode both images and text into the same vector space, so you would create a vector for your text and, if you've also encoded your images, it would hopefully land close to an image vector that represents the meaning behind that text. If that's a little confusing, I do have some videos on CLIP you can take a look at; I'll make sure there's a link somewhere around here.
So we kind of have a mix of things here, and we see that even more so in this next example. "Blue", or maybe "faded", is a better example: "faded" is very descriptive. But "French Connection" is 100% a keyword match; we want to compare keywords in that scenario, and a model like BERT or CLIP probably won't capture that information. That makes things more difficult; it's almost like we need a mix of both sparse and dense. We have these images, and some of them do look kind of faded, so it would be great if we could encode those images with a model like CLIP. But at the same time, if the user's query says "American Eagle", which I think is a brand, we also want that considered within our search. We basically want to consider these different modalities, and we also want to consider these different ways of searching, both at the same time.
So how do we actually do that? I've kind of already mentioned it, but you take your image and encode it with a CLIP model to create a dense vector. Then down here we have our description, and with that we're going to use BM25 to create a sparse vector, which is a rather different kind of vector; I'll show you what they look like in a moment. So we're going to create both of those, and then the user is going to come along with their query; they're going to search for something like "dark blue…" something, I don't know, but essentially they're going to come along with this query. Because we want to search across both the CLIP and the BM25 vectors, we need to encode this query with both of those encoding methods. So we take our query and it goes through two encoders, one for CLIP and one for BM25, and we get two query vectors. We then need to take them to a vector database that can handle both of these, and we're going to use Pinecone for that; there are not many vector databases that can handle this right now, so Pinecone is one of the few. Within Pinecone, our vectors are going to be stored, and what we should hopefully see is that the vectors matched to this image, or this listing, are close to the vectors from our user's query, and we would then return that listing. I've laid this out very badly, but we would return this. So that's the general process.
Before we move on to code, let's just have a quick look at what a sparse vector and a dense vector are, to give you a quick idea of what they might look like. At the top here we have a sparse vector. The reason it's sparse is because the information within the vector is very sparsely located: there are only a few non-zero values, whereas the majority of the values are zeros, which don't carry any meaning. That is what a sparse vector is. A dense vector is different: the majority of the vector is non-zero values, so it has information packed throughout. There's also a difference in dimensionality. The dimensionality of a sparse vector can be in the tens of thousands; for example, if we use a BERT tokenizer as part of the process to create our sparse vectors, the dimensionality will match the tokenizer's vocabulary size. But it doesn't take a lot of space to store all of this information, because in reality you don't actually need to store all of it; it's mostly zeros. So what you end up doing is creating a dictionary or something along those lines, where you record that at one index we have the value 0.3, and at another index over here we have 0.1. It actually ends up being quite an efficient way to store these vectors.
You can't really do that with dense vectors; you just have to store them as is. The typical dimensionality here varies, but a very common embedding dimensionality at the moment is 768. You will see some others; it goes up further if you think about the OpenAI embeddings: the current ada-002 model is 1536 dimensions, I think, something along those lines, and some of their older models actually used over ten thousand embedding dimensions, but that's really kind of an edge case.
So that kind of explains what we're going to be looking at. We've just covered a lot of the theory behind all this stuff, but I think it will make a lot more sense if we actually go through the code to create it. I'm getting the notebook for this from the Pinecone examples repo, in the e-commerce search example, and I'm just going to open that in Colab. I'll make sure there is a link to this notebook, or to the GitHub repo, somewhere around here.
The first thing we come to is a pip install. If you notice that you have transformers and sentence-transformers at the top of the notebook, that's a good indication you'll want a GPU. So just head up to Runtime, change your runtime type, and make sure you're using a GPU. If you are running locally, try to make sure you have a CUDA-enabled GPU; if not, you'll just have to deal with the slowness of CPU a little bit. This isn't a huge dataset, so it will take a bit longer, but not too long. Now, this should be updated by the time I release this video, so rather than using this whole pip install, you can actually just install the Pinecone client. We're going to be using gRPC, which is essentially going to help us index our vectors more quickly, but at the moment that support is still in beta, so I'm using the beta install here.
Okay, cool, they're installed. Coming down to here, we need to initialize our connection to Pinecone, and for that we need to go to pinecone.io, so I'm going to head over there. You will probably see something like this, and you might actually need to create an account. Once you have created an account you will see something like this, and you need to head over to API Keys. Take the environment value, which for you will probably not be "internal-beta" like mine; it will probably be something like us-west1 or us-east1. Copy that and put it into your environment variable here. You also need to copy the key value, so you just click here and put that here; I've stored mine in a variable called YOUR_API_KEY. So I run that, and that will initialize the connection.
Then we come down to actually creating our index. Because we're going to be using sparse vectors and dense vectors in one index, we need to set a few specific items in the create-index call: the metric has to be dot product, and for the pod type we have to use either s1 or p1. Other than that, I think we're okay; we can change everything else. Now, dimensionality: because we're going to be using CLIP for the dense embeddings, we need to set it to the same dimensionality as CLIP, which is 512. The dimension here is always referring to your dense vector dimension. So we create that; the index name you can set to whatever you want, it doesn't really matter. Okay, cool.
So we've just created our sparse dense enabled index and then we connect to it 00:13:08.600 |
Now if you have followed any of these videos before if you're kind of aware pinecone 00:13:13.720 |
I think you've probably mostly seen me use this pinecone index in its name. You can use that as fine 00:13:19.420 |
You know, we don't have to change it because we're using sparse dense or anything like that. You can use index 00:13:23.980 |
it's just I'm going to be using GRPC index because the 00:13:32.940 |
It's also faster when you're doing upsets, right? So in reality the index behind this is still the same 00:13:38.980 |
It's just the connection is through GRPC rather than rest. So we run that cool and 00:13:46.300 |
What we're going to do now is get our data: the fashion product images dataset. This is a Kaggle dataset; you can see in the write-up here all these fashion images, and it also has descriptions and everything, which we'll see in a moment. We're using a subset of it that Ashraq has created, and it's currently stored on Hugging Face Datasets, so you can go on huggingface.co, type this in, and you'll be able to find the dataset. It just takes a moment to download.
Cool, so we can see what we have here: an ID, gender, master category, all these different things. In terms of the text, I think the main one we're focusing on is going to be the product display name; the majority of the sparse vector will be created from this. The entirety of the dense vector is going to be created from the image, because we're feeding the image through CLIP to get the dense vector. So we're going to create this metadata from everything except the image column, and the images themselves we're going to store in a list called images. Now we can take a look at one of the images, index 900: it's kind of like a dress, or a long top, or something along those lines. Then if we take a look at our metadata, which we've now put into a dataframe, we have a lot of useful information there: we have the gender; the master category, like apparel, with accessories and other things down here as well; sub-categories like topwear and shirts; all this different stuff, which is kind of useful. But actually the most descriptive part here, if I can get to it, is the product display name. It's not super descriptive, but it acts as a short product description, and that's what we're going to be relying on most for our sparse vectors. We're going to use all of these metadata fields except the ID and year to create our sparse vectors. So within our sparse vector we're going to have "men", "apparel", "topwear", "shirts", the color; season, maybe that would make sense in some cases, but in this case I don't think so; "casual"; and then the product display name. So within that sparse vector we're going to have a ton of different things, which is really useful for this sort of information. So let's go down.
We're going to create the sparse vectors first. For now I'm using this file here; it will probably get updated pretty soon, and when it does I'm going to leave a comment pinned to the top of the video giving you the new code. But essentially all that's in this file is a BM25 implementation that we're going to be using. So, coming down here, what this BM25 implementation needs is a tokenizer. We need a tokenizer essentially to split our text into either words or pieces of words, and we're going to do that using a BERT tokenizer. So we run that. Here's the tokenize function: we pass in our text, and it needs to output a list of tokens. It tokenizes the text using a Hugging Face transformers tokenizer, we extract the input IDs, and then we convert those IDs back into text tokens. We pass this tokenizer function into the BM25 implementation, so BM25 is going to use it to tokenize everything, and then with those tokens it's going to create our sparse BM25 vectors. Here I'm just showing you what this tokenize function is doing: it's essentially breaking everything apart into these little components.
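The notebook uses a Hugging Face BERT tokenizer; as a rough stdlib stand-in, a minimal word-level tokenizer shows the role it plays (a real BERT tokenizer would additionally split rare words into sub-word pieces):

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

print(tokenize("Turtle Check Men Navy Blue Shirt"))
# ['turtle', 'check', 'men', 'navy', 'blue', 'shirt']
```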
But in reality we're actually going to have a ton of different things in there; we're going to have all the other columns as well. Now, the way BM25 works is that there are a few parameters within the function that are based on your data, so you need to feed all of your data into BM25. That allows the model, or BM25, to essentially train on the data and update those parameters, which will then be used later. So we run that, and you can see a few of those statistics: the number of docs, the average document length, and so on. Then this is going to create the query vector for this particular prompt here; we run that, and you can see the BM25 query vector. Now, although we're running this across all of the documents that we're going to store in the vector database, we only need to run the document side of the calculation for them. BM25 weighs the term frequency against the inverse document frequency, but we only need to apply that weighting on one side for it to be effective, so you can simplify the calculation by splitting it: one side carries the term frequencies, the other carries the inverse document frequencies. That's what we're doing here; run that.
Now for the dense vectors, we're going to use something different: CLIP, and we're actually implementing CLIP through the sentence-transformers library; it just makes things easier for us. You can also implement it through the Hugging Face transformers library, but this is easier. We're using CUDA if we can, so you can print the device just to check what you are using here; try to use CUDA if you can. I believe if you're on a Mac you can also use MPS, but naturally CUDA is typically going to be faster. So we have our CLIP model there, and what we can do is encode some text, and we see we get a 512-dimensional dense vector. Okay, cool.
Now, the indexing. This is going to be a little bit different if you have watched previous videos where I'm using Pinecone. We use tqdm just so we can see the progress bar as we're going through everything. For the batch size we're going to go with 200; this will work on Colab, and you can even go higher. We're going to go through the fashion dataset in batches of 200 items, or rows, at a time, extracting our metadata in chunks of 200. Then here we're concatenating all of that metadata into a single string; this is what we're going to use to create our sparse vector, and you can see it there. So that's how we create the sparse vector. Now, for the dense vector we need to extract the images from the images list that we created earlier and convert them into dense embeddings with our dense model, which is CLIP. Then here we're just creating some unique IDs; the ID is literally just a count in this case. Obviously, if you have actual IDs that correlate to your items, maybe you want to use those; in fact, we did have IDs, we just took them out earlier, so maybe we could have used them, but it's not really so important in this example.
Now, a bit that's different: Pinecone is making things more organized so that we can do things more efficiently, so there are a few changes here. We've created our items; typically we would just feed them in as a list of tuples, which is kind of not that organized, so that is changing a little bit. First we create a structure for our metadata, and inside that structure we just need to add the metadata itself; that's why I imported it up here. So this is what we create for one row of our metadata. Note that here we're looping through the batch, and as we do, we're appending everything to this upserts list as a GRPCVector (if you're not using gRPC, use Vector instead). A GRPCVector expects at least two things: your ID and your values, where the ID is your record or vector ID and the values are your dense vector. Metadata and sparse values are optional, but obviously we do have metadata in our use case, and we do have sparse values as well. For the metadata we feed in the struct we just created, and for the sparse values we have our GRPCSparseValues here. (I realized after creating this that I could actually simplify that part.) Then we upsert everything; that will take a little bit of time, so I will skip forward and see you when it's ready. Okay, so it's just finished.
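As a sketch of the shape of each record (plain dictionaries here; with the gRPC client each record is wrapped in `GRPCVector`/`GRPCSparseValues` objects instead, and the field names follow Pinecone's upsert format):

```python
def make_record(rec_id, dense, sparse, metadata):
    """One sparse-dense record in the shape an upsert expects:
    dense values are required; sparse values and metadata are optional."""
    return {
        "id": rec_id,
        "values": dense,                # dense CLIP vector (512 dims in the video)
        "sparse_values": {
            "indices": sorted(sparse),  # token ids with non-zero BM25 weight
            "values": [sparse[i] for i in sorted(sparse)],
        },
        "metadata": metadata,
    }

record = make_record(
    "0",
    [0.1] * 4,                          # toy dense vector (real one: 512 dims)
    {2090: 0.45, 8101: 0.45},           # toy BM25 sparse vector
    {"productDisplayName": "Turtle Check Men Navy Blue Shirt"},
)
```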
Now we'll start having a look at actually querying and getting some results. Let's come to here; I think maybe I need to remove this. Okay, good. So let's start with "dark blue french connection jeans for men" as our query. When we're querying, we do need to create the BM25 query vector, so we're using transform_query here, and we also need to create our dense vector embedding: we encode the query with our model and convert it to a list, and that's what we want to search with. With our search, this is again slightly different from previous videos: we now have a sparse vector argument, which is where we pass in our sparse vector. What I'm going to do is just return the images based on the results that we get, so we get all of these PIL image objects, and to actually view them we're going to use this display_result function, which essentially displays some HTML within the notebook showing the actual images. So: "dark blue french connection jeans for men", and this is what we return. Okay, so it seems pretty accurate.
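The pieces of the query call can be sketched like this (argument names follow Pinecone's query API; the vectors here are toy stand-ins for the real CLIP and BM25 query vectors):

```python
def make_query(dense_vec, sparse_vec, top_k=14):
    """Assemble the arguments for a hybrid index.query() call."""
    return {
        "vector": dense_vec,          # dense CLIP query embedding
        "sparse_vector": sparse_vec,  # BM25 sparse query vector
        "top_k": top_k,               # number of results to return
        "include_metadata": True,     # so we can read back product info
    }

query_args = make_query(
    [0.1, 0.2, 0.3, 0.4],                             # toy dense vector
    {"indices": [2090, 8101], "values": [0.7, 0.7]},  # toy sparse vector
)
# With a live index: results = index.query(**query_args)
```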
Now, mixing sparse versus dense vectors. What we're going to do is use a function named hybrid_scale for this. Essentially, if we take a look here, we have this alpha value, which must be between 0 and 1. An alpha of 0 basically means you're doing a sparse vector search only: the sparse vector is multiplied by 1, while the dense vector is multiplied by 0, so you're basically making it meaningless. And if you want a dense-only search, you use an alpha value of 1, or 1.0.
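That weighting can be written as a small function, following the notebook's hybrid_scale idea: the dense query vector is scaled by alpha and the sparse one by (1 - alpha):

```python
def hybrid_scale(dense, sparse, alpha: float):
    """Scale a dense query vector by alpha and a sparse one by (1 - alpha).

    alpha = 1.0 -> pure dense (semantic) search
    alpha = 0.0 -> pure sparse (keyword) search
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

# Mostly sparse with a touch of dense, as in the video's 0.07 example.
hdense, hsparse = hybrid_scale(
    [0.2, 0.4], {"indices": [3, 7], "values": [0.5, 0.5]}, alpha=0.07
)
```

Because both result vectors are scored with a dot product, this is equivalent to taking a convex combination of the dense and sparse relevance scores.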
Okay, so first we do a pure sparse vector search. We set alpha to 0 and run it, and it looks like we get reasonably good results, although there are definitely a few women's jeans in here. So let's have a look at what's actually going on; we go to the product display names. On "French Connection" we're doing well: the keywords are being pulled in, we find that exact match, which is great. The issue is that "men" is a substring of the word "women", so the women's items match too. What we can do is go pure dense and see what happens. This bases everything purely on the images, so it's looking for dark blue jeans; the issue is, does it actually know what French Connection is? If we come down to the display names of these items, it actually returns other brands, like Locomotive, even Spykar and Wrangler. It does include a few French Connection items, but I think that's purely by chance rather than it actually knowing these are French Connection jeans. Nonetheless, it's not reliably pulling back the French Connection stuff. So rather than setting the alpha value to 0 or 1, we're going to use 0.07, which is basically a sparse search with a little bit of dense in there as well. This is heavily weighted towards sparse, but it still considers some dense, and the results look mostly okay. There is some black in here; maybe we could increase the dense weighting a little more to try and avoid that, but you can see that for the most part we are getting French Connection in here. There's some Locomotive in here as well, but it's mostly French Connection, and it's also not including any women's jeans, which is a bonus.
Cool. Let's try some more queries, which I think demonstrate this pretty well. We're going to go with "small beige handbag for women", pure sparse first, so looking at the keywords only. Okay, not really beige; we have pretty much every color in there. They're all handbags, which looks good, and they're all for women, just not beige whatsoever. So we're going to add a little bit of dense again, and even with that tiny bit of dense vector in there, the results are just ten times better: we get pure beige, all of these. It's kind of interesting that the sparse search didn't pull beige in, even though they all say beige in the actual sparse vector; but anyway, that's kind of cool. What happens if we go pure dense? I think this is kind of interesting too. For the most part they're beige, which is okay, but then we're getting women's purses in there as well, not just handbags. CLIP understands that a woman's handbag and a woman's purse are kind of similar, so it's giving us all of those that are also beige. So we can see the benefit of having a mix of sparse and dense in here, across both modalities as well.
All right, another thing, and this is kind of interesting: we're going to use this image to search, because CLIP can also handle images, and we're also going to add in a text query, which is going to become our sparse vector. So let's try those, using a mixed alpha of 0.3 for this one. What do we have here? These are all kind of similar to this image; I mean, to me they look like the same person. Obviously CLIP doesn't know that we just want to focus on the top; it's kind of looking at everything, so it's also going to return people that look similar. But the purple component isn't really there; maybe it's considered a little bit here and here, but it's not considered a huge amount, is it? So, rather than just relying on the query, let's try to use Pinecone's metadata filtering, because we know that in our metadata we have the color of these items. We do the same again, the same query, but this time we filter for the color purple. We run this, and straight away we get much better results. So with this we have an image query, a metadata filter, and also a text query; we're approaching this search from several directions at once. Basically, the point is that we can do search in a ton of different ways and get some pretty cool results.
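Pinecone metadata filters use a MongoDB-like syntax; the colour filter here might look like this (the `baseColour` field name is what this dataset's metadata uses, so treat it as an assumption about your own schema):

```python
# Simple equality filter: only return items whose colour metadata is Purple.
colour_filter = {"baseColour": "Purple"}

# Conditions can be combined, e.g. purple shirts only:
combined_filter = {
    "$and": [
        {"baseColour": "Purple"},
        {"articleType": "Shirts"},
    ]
}

# Passed into the query alongside the two vectors:
# index.query(..., filter=colour_filter)
```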
Another one here: we've got this guy in his shirt, and we're going to go with a green shirt, adding that to our filter straight away. Notice we don't even include "men's shirt" anywhere in the query; the CLIP model is kind of handling that for us, because it knows we're looking at a guy in the initial image. Some of these don't seem green to me, but we are actually filtering for green here, so it means this shirt is for some reason marked as green; I don't know why. For the most part, though, they're a green type of color, kind of faded, like this one here. Cool. So, if you're done with the index (obviously, if you want to play around with it a little more, feel free), once you are done you can delete the index just to save your resources. If you just have the one index it's free anyway, but if you have multiple you'd obviously be paying for them. So, that's it for this example of multimodal search for e-commerce items. As I mentioned at the start, we just kind of blazed through a ton of different technologies: we had metadata filtering in there, hybrid search in there, multimodality, and with the hybrid search we dived into the embedding methods, sparse and dense. I think it's kind of cool to see how many different techniques can come together in a single search pipeline, and I hope all of this has been interesting and helpful. Thank you very much for watching the video, and I will see you again in the next one. Bye.