
Streamlit for ML #2 - ML Models and APIs


Chapters

0:00 Intro
0:47 Creating the Vector DB
8:56 Implementing Retrieval

Whisper Transcript

00:00:00.000 | Okay, in the last video we had a look at how to build what you can see on the screen right now, a very
00:00:06.720 | simple sort of interface using Streamlit. Now,
00:00:10.680 | what we want to do in this video is go through how we actually
00:00:15.520 | build the smart part
00:00:18.640 | behind the
00:00:21.400 | open-domain Q&A system that we're going to put together here. So as I
00:00:27.800 | said before, there are a few components to open-domain Q&A. We're going to stick to the first two for now,
00:00:33.120 | so the vector database, which we're going to use Pinecone for, and
00:00:36.960 | the retriever model, which we are going to download
00:00:40.640 | from the Hugging Face model hub, and we're going to use the sentence-transformers library to actually implement that.
00:00:46.920 | Now, the first thing we are going to want to do is
00:00:51.000 | create our vector database, or our index. Now, to do that, there are
00:00:57.640 | three
00:00:59.120 | parts or three steps we need to take. First, we need to download our data.
00:01:03.160 | We're going to be using the SQuAD dataset from Hugging Face datasets.
00:01:07.720 | Then we want to encode those
00:01:10.840 | paragraphs, or what we call contexts, into context vectors, and
00:01:16.420 | we use sentence-transformers and a retriever model for that. And then the next part is
00:01:23.080 | uploading or pushing all of those vectors into our
00:01:27.720 | Pinecone vector database.
00:01:29.720 | so to do all of that
00:01:32.440 | We're just going to very quickly go through that code because it is a lot and I don't want to focus on it too much
00:01:39.680 | So here we have the
00:01:47.240 | script. I'm going to maybe zoom out a little bit so you can see.
00:01:51.960 | So the first thing we do is import everything. You don't need tqdm here,
00:01:57.260 | but you can pip install tqdm if you do want to use that.
00:01:59.960 | So we are
00:02:03.520 | importing from datasets, so this is Hugging Face datasets. You will need to install this, so that is just a
00:02:09.840 | pip install
00:02:12.520 | datasets.
00:02:15.280 | We're going to first initialize our retriever model, so we're using the pinecone MPNet
00:02:22.280 | retriever model. So this is a retriever model that is based on the MPNet model from Microsoft,
00:02:30.040 | and it's been trained on the SQuAD 2 dataset. And
00:02:33.080 | the first thing we need to do is initialize our connection to
00:02:38.500 | Pinecone. So this is where we're going to
00:02:41.840 | store all of our vectors. Now, to do that, you do need an API key.
00:02:46.320 | So I wouldn't write it in your code, but I'm going to just do that
00:02:51.900 | for the sake of simplicity here. So I'm going to go to app.pinecone.io, and this is free,
00:02:58.400 | By the way, you don't have to pay anything
00:03:00.400 | so we just go to
00:03:02.760 | app.pinecone.io,
00:03:05.840 | And then you will have to sign up. So you create an account
00:03:09.040 | I already have one so I don't need to worry about that and
00:03:12.800 | I have this default API key over here. Like I could use that and
00:03:19.000 | Yeah, I'm just going to use that so
00:03:21.880 | We can see the key if we want. Let me zoom in a little bit, make it a little bit bigger, and
00:03:28.360 | so we can see that,
00:03:31.720 | and we can see the value there. We just press over here and copy that across, and
00:03:38.640 | Then I'm just going to paste in here
00:03:40.800 | Okay. Now this script by the way, I will leave a link to this in the description
00:03:47.560 | So you can just download it instead of writing it all out
00:03:49.800 | because this isn't essential to our app. It's just how we
00:03:54.280 | encode all of our contexts and
00:03:57.720 | actually
00:04:00.520 | store them in our vector database.
00:04:06.400 | So we have that, we have the cloud environment that we're using there.
00:04:09.840 | Let me switch this back to the app.
00:04:13.080 | We want to check if the index already exists; if not, we're going to create a new index. Now, actually, you can see in mine
00:04:20.600 | I already have it, because I've run this code already,
00:04:23.480 | so the QA index
00:04:26.120 | already exists. So it's not going to create a new index, and instead it's just going to connect to that index here.
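Putting that setup together, a minimal sketch of this part of the indexing script might look like the following. It assumes the legacy pinecone-client interface shown in the video (newer Pinecone SDKs use a different client object), a guessed model ID for the MPNet retriever, a placeholder API key, and an example environment name, so treat those as stand-ins rather than exact values.

```python
# Indexing script, part 1: retriever model + Pinecone index (a sketch, not the exact video code)
import pinecone
from sentence_transformers import SentenceTransformer

# retriever based on Microsoft's MPNet, fine-tuned on SQuAD 2 (model ID assumed)
model = SentenceTransformer("pinecone/mpnet-retriever-squad2")

# in practice, load the key from an env var or a secrets store rather than hard-coding it
pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")  # environment is an example value

# create the index only if it doesn't exist yet, then connect to it
if "qa-index" not in pinecone.list_indexes():
    pinecone.create_index(
        "qa-index",
        dimension=model.get_sentence_embedding_dimension(),  # 768 for MPNet
        metric="cosine",
    )
index = pinecone.Index("qa-index")
```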
00:04:33.520 | Right, so we've just connected to, or created, our index, our vector database index, and
00:04:40.200 | now what I want to do is I'm going to switch back to our
00:04:44.960 | data, and I'm going to run through that. So I'm going to load the dataset, so the SQuAD dataset, from
00:04:53.440 | Hugging Face.
00:04:56.440 | Now, I'm going to use the validation split, because the model has been trained on the training data of SQuAD, and
00:05:03.400 | I want to make it at least a little bit hard, so we're going to use the validation split that it hasn't seen before.
00:05:09.160 | I'm removing any
00:05:12.080 | duplicate
00:05:14.760 | contexts in there, so
00:05:16.760 | I'll zoom out a little bit here, and on squad_dev we're using this filter. So this is all Hugging Face datasets
00:05:24.160 | syntax here.
00:05:27.040 | And then we're encoding it, so model.encode, so this is our sentence transformer.
00:05:32.080 | We're encoding it to create a load of
00:05:34.080 | sentence vectors for our contexts, and we're converting these to a list, because we are going to be pushing these through an API request
00:05:42.560 | to Pinecone, and we need a list, not a NumPy array, or we are going to get an error.
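The loading, deduplication, and encoding step could be sketched roughly as below, continuing from the snippet above (it reuses `model`). The filter-based deduplication follows what is described in the video, but the exact helper function is an illustration.

```python
# Indexing script, part 2: load SQuAD, keep unique contexts, encode them
from datasets import load_dataset

# validation split, since the retriever was trained on the training split
squad_dev = load_dataset("squad", split="validation")

# keep only the first row for each unique context paragraph
seen = set()
def first_occurrence(row):
    if row["context"] in seen:
        return False
    seen.add(row["context"])
    return True

squad_dev = squad_dev.filter(first_occurrence)

# encode the context paragraphs and convert to plain Python lists,
# because the Pinecone API expects lists rather than NumPy arrays
embeddings = model.encode(squad_dev["context"], show_progress_bar=True).tolist()
```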
00:05:48.360 | Okay, then back to the pinecone side of things. We want to create a list of
00:05:58.280 | It's basically a list of tuples and those tuples include the ID of
00:06:03.160 | Each context so there's a unique ID for each context. We want the vector or the encoding the context vector
00:06:11.560 | And then we also have this dictionary here now. This is metadata, and metadata in Pinecone is like any other
00:06:18.800 | information about your vectors that you want to include, and this is really good if you want to use metadata filtering, which is super
00:06:26.520 | powerful in Pinecone.
00:06:28.520 | And I definitely want to, you know, leave the option open later on. I'm not sure if we'll use it or not.
00:06:35.060 | We'll probably put something in there, just so we can play around with it.
00:06:41.560 | That creates the format that we need to upsert everything, which just means push or upload everything to Pinecone.
00:06:47.840 | So then I do that in chunks of 50 at a time.
00:06:52.280 | It just makes things a little bit easier on the API requests, rather than sending everything at once. Okay?
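And the upsert step might look roughly like this, continuing from the two sketches above (it reuses `index`, `squad_dev`, and `embeddings`). The metadata fields mirror the title and text mentioned in the video, and the batch size of 50 matches the chunking described; the tuple format follows the legacy pinecone-client interface.

```python
# Indexing script, part 3: upsert (id, vector, metadata) tuples in batches of 50

# metadata enables filtering later and lets us return the title and text with each match
upserts = [
    (row["id"], emb, {"title": row["title"], "text": row["context"]})
    for row, emb in zip(squad_dev, embeddings)
]

# upsert in chunks of 50 to keep each API request small
batch_size = 50
for i in range(0, len(upserts), batch_size):
    index.upsert(vectors=upserts[i : i + batch_size])
```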
00:06:59.480 | So that's like how we create the index
00:07:02.880 | so now what we're going to do is actually
00:07:05.800 | integrate that a little bit in our app.
00:07:11.480 | So let's switch back to our app here. Let's view it
00:07:17.560 | So first, let's just remove this. We don't need that. Okay, it will automatically reload.
00:07:26.720 | The first thing we want to do here is initialize
00:07:29.720 | the pinecone connection, so I'm going to
00:07:33.560 | just take
00:07:36.680 | Let's just take this part of the code
00:07:39.400 | Just copy it and then we'll remove what we don't need in a minute
00:07:44.000 | So, we don't need... we do need sentence-transformers
00:07:47.800 | in a minute; we don't need datasets;
00:07:51.080 | we do need Pinecone. So actually here, we're initializing our retriever model.
00:07:58.240 | It's the same as what we did before. So we do want to keep that in there,
00:08:01.480 | and again, the
00:08:04.120 | API key, again, ideally store this somewhere else, or if you are using
00:08:09.840 | Streamlit Cloud, they have like a secrets management
00:08:13.760 | system, and it's something we'll look at in the future for sure. But for now, I'm just putting it in here.
00:08:19.440 | So we have our API key and environment, and we're just doing the same thing we did before, but actually we don't want to create an index.
00:08:28.240 | We're assuming we've already created the index when we're in our app, so we're just going to connect to it.
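So the app-side initialization might look roughly like this. It is a sketch with the same assumptions as before (legacy pinecone-client interface, guessed model ID, hard-coded key only for brevity), and the text input label is just a placeholder for whatever was set up in the previous video.

```python
# app.py: initialize retriever and connect to the existing Pinecone index (sketch)
import pinecone
import streamlit as st
from sentence_transformers import SentenceTransformer

# same retriever as in the indexing script (model ID assumed)
model = SentenceTransformer("pinecone/mpnet-retriever-squad2")

pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")  # example values
# the index was already created by the indexing script, so here we only connect to it
index = pinecone.Index("qa-index")

# the query box from the previous video; the label is a placeholder
query = st.text_input("Ask a question", "")
```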
00:08:32.600 | okay, so
00:08:35.120 | with that we've
00:08:37.120 | kind of set up the
00:08:39.200 | like the back-end part of our app, the smart part that's going to handle the open-domain Q&A.
00:08:44.960 | But it's going to be a little bit slow, and we will have a look at how to solve that
00:08:50.720 | pretty soon, but for now, what we're going to do is actually just implement this, and
00:08:56.480 | we're going to actually query and see what we return.
00:09:01.600 | So I'm gonna save this
00:09:04.360 | we won't see anything change in our app now other than the fact that it takes longer to load because it's
00:09:09.400 | downloading the
00:09:13.480 | retriever model. That's the main part of the
00:09:15.880 | slowness here, and then obviously connecting to Pinecone also takes a second as well.
00:09:24.080 | We're not going to deal with how slow it is now,
00:09:26.120 | but we will fix that pretty soon, and
00:09:31.440 | what I actually want to do now is say, okay, if the query is not empty, because by default it is empty.
00:09:38.680 | That's why we add that in there. So I'm going to actually remove this,
00:09:41.840 | and then, if it is not empty, so if query
00:09:46.600 | is not equal to nothing,
00:09:49.520 | we're going to
00:09:52.400 | query
00:09:54.200 | Pinecone for whatever is in that query. So the first thing we need to do is create our context vector
00:09:59.480 | So I'm gonna write XQ
00:10:01.480 | Just shorthand for context vector
00:10:03.640 | It's pretty standard
00:10:06.880 | especially if you've used Faiss before, they tend to use this. And I said context vector; I mean query vector.
00:10:13.440 | So we're going to do model
00:10:16.800 | encode and
00:10:18.520 | We need to put this in square brackets and we have query. Okay, and then we're going to convert that to a list
00:10:24.280 | Okay, so this is going to create our
00:10:28.280 | Query vector. Let's write it down
00:10:30.320 | create query vector
00:10:33.320 | And then the next thing we want to do is
00:10:37.040 | Query pinecone with this query vector
00:10:42.320 | To do that
00:10:44.160 | We want to write
00:10:46.160 | First, let's get the relevant
00:10:48.760 | contexts, and we're going to store these in xc. So xc, like context vector, is a
00:10:58.280 | similar thing to the
00:11:00.280 | query vector that we used with xq,
00:11:03.600 | But this time we're gonna write
00:11:07.160 | index dot query
00:11:09.760 | and we're going to pass xq, so our query vector, and we're going to say how many results we want to return. Now,
00:11:18.120 | later on we're going to use
00:11:20.360 | Streamlit, like a little slider, to decide how many we would like to return, but for now we will hard-code it.
00:11:28.920 | Another thing that we want to include here is we want to tell Pinecone to
00:11:34.200 | return the metadata, because by default it will not return metadata,
00:11:40.480 | so include_metadata equals true. So these are like the extra little bits I mentioned before, so we included our
00:11:46.800 | title, so
00:11:49.600 | like the Wikipedia topic that the context is coming from, and also the text itself.
00:11:56.960 | okay, so we're going to return the relevant context and
00:12:00.320 | Then we're gonna loop through each of those now
00:12:04.080 | When we do this, there's a particular format that we need to follow
00:12:13.960 | Our contexts are actually going to be stored... so for context in xc, results, and
00:12:24.040 | results is going to return a list, and we just want the first item in that list.
00:12:29.360 | The reason it returns a list is because, if you are querying Pinecone with multiple queries,
00:12:34.960 | it will return a list of, you know, your answers for each query.
00:12:40.080 | But in this case, we are only ever going to query with one
00:12:44.160 | query vector, so we always enter at position zero here, and then in there we will have
00:12:52.880 | all of our
00:12:54.800 | returned matches
00:12:56.640 | inside this matches
00:12:58.640 | key, so
00:13:02.640 | for each context in there, all we're going to do is write st.write,
00:13:09.160 | context, and
00:13:12.600 | then we want to go into the metadata that we are returning, and
00:13:15.840 | we have title and text here. We don't want title, we want text.
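Put together, the query-handling block might look roughly like this, continuing from the app sketch above (it assumes `model`, `index`, and `query` are defined there). The `queries=[...]` argument and the `results[0]['matches']` response shape follow the legacy client behaviour described in the video; newer Pinecone clients take a single `vector` and return `matches` directly, so adjust accordingly.

```python
# continues the app sketch above: `model`, `index`, and `query` are defined there
if query != "":
    # create the query vector (xq); Pinecone expects plain lists, not NumPy arrays
    xq = model.encode([query]).tolist()

    # get relevant contexts: top 5 matches, including their metadata
    xc = index.query(queries=xq, top_k=5, include_metadata=True)

    # we only sent one query vector, so take position 0 and loop over its matches
    for context in xc["results"][0]["matches"]:
        st.write(context["metadata"]["text"])
```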
00:13:21.240 | Okay, so let's save that and check that it actually works.
00:13:25.200 | So again, this is going to take a little while to load, because we're
00:13:28.880 | initializing, like, the full pipeline of our vector database and the retriever model.
00:13:34.800 | So every time we run this, it's downloading the full retriever model, which takes quite a bit of time.
00:13:39.600 | Okay, so this has just rerun our app, and now I can say, who are the Normans?
00:13:50.080 | Again, it's trying to reload everything, so it's going to take a while. We're going to fix this in the next video.
00:13:57.080 | We should be returning five contexts, and if we scroll down we can see we have these five paragraphs.
00:14:02.960 | Now, each one of these paragraphs is a single
00:14:04.960 | context, so maybe we can inspect the element.
00:14:10.880 | Okay, so we can see down here.
00:14:21.040 | It's pretty horrific to look at, but if I zoom in we can see each one of these is
00:14:29.160 | a single one of our contexts, right?
00:14:34.880 | These here. Cool, so I
00:14:40.600 | think that's it for this
00:14:45.080 | video. So we now have the back end working, and in the next one,
00:14:50.960 | what we'll do is fix this issue with it taking forever to reload everything every time,
00:14:56.120 | which is actually super easy
00:14:58.800 | but will make a big difference to our app.
00:15:05.000 | Thank you very much for watching and I will see you in the next one. Bye