
How to Build Custom Q&A Transformer Models in Python


Chapters

0:00
1:04 Download Our Data
1:09 SQuAD Dataset
6:40 Data Prep
28:53 Initialize Our Tokenizer
41:10 Create a PyTorch Dataset Object
47:56 Initialize the Optimizer
48:33 Initializing the Model
50:03 Initialize Our Data Loader
51:22 Training Loop
58:24 While Loop
61:03 Create a Data Loader
65:15 Check for an Exact Match
66:31 Calculate the Accuracy
67:47 Accuracy
68:31 Exact Match Accuracy

Transcript

Hi and welcome to the video. Today we're going to go through how we can fine-tune a Q&A transformer model. So for those of you that don't know, Q&A just means question answering, and it's one of the biggest topics in NLP at the moment. There are a lot of models out there where you ask a question and they will give you an answer.

And one of the biggest things that you need to know how to do when you're working with transformers, whether that's Q&A or any of the other transformer-based solutions, is how to actually fine-tune them. So that's what we're going to be doing in this video: we're going to go through how we can fine-tune a Q&A transformer model in Python.

So I think it's really interesting and I think you will enjoy it a lot. So let's just go ahead and get started. Okay, so the first thing we need to do is actually download our data. We're going to be using the SQuAD dataset, the Stanford Question Answering Dataset, which is essentially one of the better-known Q&A datasets out there that we can use to fine-tune our model.

So let's first create a folder. I'm just going to use os and os.makedirs. We'll just call it squad. Obviously, call this and organize it as you want; this is what I will be doing. Now, the URL that we are going to be downloading this from is this.

Okay, and there are actually two files here that we're going to be downloading, but both will be coming from the same URL. So because we're making a request to a URL, we're going to import requests. We could also use the wget library, or if you're on Linux you can just use wget directly in the terminal.

It's up to you; we're going to be using requests. Okay, and to request our data we're going to be doing this. So it's just a get request. Use an f-string, and we have the URL that we've already defined. And then the training data that we'll be using is this file here.

Okay, and we can see that we've successfully pulled that data in there. So like I said before, there are actually two of these files that we want to extract. What I'm going to do is just put this into a for loop which will go through both of them.

Just copy and paste this across and rename this file. And the other file is the same, but instead of train we have dev. Okay, so here we're making our request. And then the next thing we want to do after making our request is actually save this file to our drive.

Okay, and we want to put that inside this squad folder here. So to do that we use open, and again we're going to use an f-string pointing inside the squad folder. And then here we just put our file name, which is file.

Now, we're writing this in binary because it's JSON, so we put wb for our flags here. And then within this with block we're going to run through the file and download it in chunks. So we do for chunk, and then we iterate through the response like this. Let's use a chunk size of four.

And then we just want to write to the file like that. Just add the colon there, and that will download both files; we should be able to see them here now. So in here we have data, and within it essentially a lot of different topics.
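Putting the download step together, a minimal sketch might look like this. The exact URL and file names aren't read out in the transcript, so the standard SQuAD v2.0 hosting location is assumed here:

```python
import os
import requests

# make a folder for the data
os.makedirs('squad', exist_ok=True)

# assumed location of the SQuAD v2.0 files; both files come from the same URL
url = 'https://rajpurkar.github.io/SQuAD-explorer/dataset/'

for file in ['train-v2.0.json', 'dev-v2.0.json']:
    res = requests.get(f'{url}{file}')
    # write in binary, streaming the response in small chunks
    with open(f'squad/{file}', 'wb') as f:
        for chunk in res.iter_content(chunk_size=4):
            f.write(chunk)
```

Back to the JSON itself: each topic contains paragraphs, and each paragraph has a context with its question-answer pairs.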

So the first one is Beyonce. And then in here we will see, if we just come to here, we get a context. But alongside this context we also have qas, which is questions and answers, and each one of these contains a question-answer pair. So we have this question: when did Beyonce start becoming popular?

So this answer is actually within this context. And what we want our model to do is extract the answer from that context by telling us the start and end token of the answer within that context. So we go to index zero, and it is "in the late 1990s". And we have answer start 269.

So that means that at character 269 we get the "i" of "in the late 1990s". So if we go through here, we can find it here. Okay, so this is the extract, and that's what we will be aiming for our model to actually extract. There is a start point, and also an end point, which is not included in here, but we will add that manually quite soon.

So that's our data. And then we'll also be testing on the dev data as well, which is exactly the same. Okay, so we move on to the data prep. So now we have our files here, we're going to want to read them in. So we're going to use the JSON library for that.

And like we saw before, there's quite a complex structure in these JSONs; there are a lot of different layers. So we need to use a few for loops to work through each of these and extract what we want, which is the contexts, questions, and answers, all corresponding to each other.

So in the end, we're going to have lists of strings for each of these. And in the case of the answers, we will also have the starting position, so it will be a list of dictionaries, where one value is the text and one value is the starting position.

So to do that, we're going to define a function called read squad. We'll define our path here as well. And the first thing we need to do is actually open the JSON file. So we do with open path. And again, we are using a binary file. So we're going to have B as a flag.

But instead of writing, we are reading, so we use r here. So rb. And just do json.load(f) here. So now we have our dictionary within this squad dict here. So maybe whilst we're just building this function up, it's probably more useful to do it out here, so you can see what we're actually doing.

So let's copy that across. And then we'll fill this out afterwards. Of course, we do actually need to include the path. So let's take this. And then we can see what's inside here. Maybe we can load just a few rather than all of them. Or we can investigate it like this.

Okay, so we have the version and data, which we can actually see over here, version and data. So we want to access the data. Now, within data we have a list of all these different items, which is what I was trying to do before. So we go into data.

And just take a few of those. Okay, and then we get our different sections. For the first one, let's just take zero, which is Beyonce. And then we have all of these. So we're going to want to loop through each one of these, because we have this one next, and we're going to keep needing to just run through all of these.

So to do that, we want to do for group in squad dict. And remember, we need to include the data key here. Let's just say, print group title, so we can see a few of those. Okay, then we go through each one of those. So the second part of that is these paragraphs.

And within the paragraphs we have each one of our questions. So let's first go with paragraphs, and we'll dig in here. Sorry, it's a list. There we go. And the first thing we need to extract is the easiest one, which is our context. However, that is also within a list.

So now if we access the context, we get this. So we're essentially going to need to loop through each one of these here. Now we're going to need to access the paragraphs and loop through each one of those, and then here we're going to access the context.

So let's write that. So we already have one group here. So let's just stick with that. And we're going to run through the passage in the paragraphs. So already here, we're going through the for loop on this index. And now we're going to go through a loop on this index.

Let's keep that. So that means that we will be able to print the passage context. And there we go. So here we have all of our context. So that's one of our three items that we need to extract. Okay, so that's great. Let's put that all together. So we're going to take this, put it here.

And then we have our context. Okay, that's great. Obviously, for each context, we have a few different questions and answers, so we need to get those as well. Now, that requires us to go through another for loop. So in this passage, we need to go into the qas key and loop through this list of questions and answers.

So we have this, and then we have our list. So another layer in our for loop will be for question answer in that passage QAS. And then let's take a look at what we have there. Okay, great. So we have plausible answers, question and answers. So what we want in here is the question and answers.

So question is our first one. Perfect. So we have the questions now. And then after we have extracted the question, we can move on to our answers. As we see here, the answers come as another list. Now, each one of these lists just has one actual answer in there, which is completely fine.

So we can access that in two ways. We can either loop through or we can access the zero value of that array. Either way, it doesn't matter. So all we need to do here is loop through those answers. Or if we want, just go with QA answers zero. So in most cases, this should be completely fine.

As we can see here, most of these have a question and then the answers dictionary, which is fine. However, some of these are slightly different. So if we scroll right down to the end here, okay, we have this one, which is talking about physics. And then rather than having our answers array, we have these plausible answers, which is obviously slightly different.

And this is the case for a couple of these. So from what I've seen, the best way to deal with this is simply to have a check: if there is a plausible answers key within the dictionary, we will include that as the answer, rather than the actual answers dictionary.

So to do that, all we need to do is check if the qa keys contain plausible answers. If they do, we use that one; otherwise, we use answers. So let's just add all of that into our for loop here. So we have our context, and then we want to loop through the question answers.

And this is where we get our question. Then once we're here, we need to do something slightly different, which is this plausible answers check. Okay. And then we use this access variable to define what we're going to loop through next. So here we go for answer in qa access, because this will switch between plausible answers and answers.

And then within this for loop, this is where we can begin adding this context, question, and answer to a list of questions, context, and answers that we still need to define up here. So each one of these is just going to be an empty list. And then all we do, copy this across, and we just append everything that we've extracted in this loop and the context, question, and answer.

And that should work. So now let's take a look at a few of our contexts. Okay, we can see we have this, and because we have multiple question-answer pairs for each context, the context does repeat over and over again. But then we should see something slightly different when we go with answers.

And questions. Okay. So that's great. We have our data in a reasonable format now, but we want to do this for both the training set and the validation set. So what we're going to do is just going to put this into a function like we were going to do before, which is this read squad.

So here, we're going to read in our data, and then we run through it and transform it into our three lists. And all we need to do now is actually return those three lists: contexts, questions, and answers. So now what we can do is execute this function for both our training and validation sets.

So we're going to get train contexts, questions, and answers. Okay, so that is one of them, and we can just copy that, and we just want this to be our validation set. Like so. Okay, so that's great. We now have the training contexts and the validation contexts, which we can see right here.
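Assembled into the read_squad function described above, the whole data-reading step might look roughly like this sketch:

```python
import json

def read_squad(path):
    # load the raw SQuAD JSON into a dictionary
    with open(path, 'rb') as f:
        squad_dict = json.load(f)

    contexts, questions, answers = [], [], []
    # structure: 'data' -> topic groups -> 'paragraphs' -> 'qas' pairs
    for group in squad_dict['data']:
        for passage in group['paragraphs']:
            context = passage['context']
            for qa in passage['qas']:
                question = qa['question']
                # some samples keep answers under 'plausible_answers' instead
                access = 'plausible_answers' if 'plausible_answers' in qa else 'answers'
                for answer in qa[access]:
                    contexts.append(context)
                    questions.append(question)
                    answers.append(answer)
    return contexts, questions, answers

train_contexts, train_questions, train_answers = read_squad('squad/train-v2.0.json')
val_contexts, val_questions, val_answers = read_squad('squad/dev-v2.0.json')
```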

So here, let's hope that there is a slight difference in what we see between both. Okay. Great. That's what we would expect. Okay. So now we have our data almost in the right format. We just need to add the ending position. So we already have the start position. If we take a look in our train answers.

Okay. We have the answer start, but we also need the answer end, and that's not included within the data. So what we need to do here is actually define a function that will go through each one of our answers and contexts and figure out where that ending character actually is.

And of course we could just say, okay, it's the length of the text. We add that onto the answer start and we have our answer end. However, that unfortunately won't work because some of the answer starts are actually incorrect and they're usually off by one or two characters. So we actually need to go through and one, fix that and two, add our end indices.

So to do that, we're just going to define a new function, which is going to be add end index. And here we will have our answers and the context, and then we're going to just feed these in. So first thing we do is loop through each answer and context pair.

And then we extract something which is called the gold text, which is essentially the answer that we are looking for. It's called the golden text or gold text. So simply our answer and within that, the text. So we are pulling this out here. So we should already know the starting index.

So what we do here is simply pull that out as well. And then the end index ideally will be the start plus the length of the gold text. However, that's not always the case because like I said before, they can be off by one or two characters. So we need to add in some logic just to deal with that.

So in our first case, let's assume that the characters are not off. So if context start to end equals the gold text, this means everything is good and we don't need to worry about it. So we can modify the original dictionary and we can add answer end into there.

And we make that equal to our end index. However, if that's not the case, that means we have a problem; it's one of those dodgy question-answer pairs. And so this time what we can do is add an else statement. So we're just going to go through when the position is off by one or two characters, because it is not off by any more than that in the SQuAD dataset.

We loop through each of those and we'll say, okay, if the context, and then in here we need to add the start index and this again. So let's just copy and paste this across; it'll be easier. But this time we're checking to see if it is off by one or two characters.

So we just do minus n, and it's always minus n; the answer is never shifted to the right, always to the left, so this is fine. So in this case, the answer is off by n characters, and we need to update our answer start value and also add our answer end value.

So start index minus n, and we also have the end. So that's great. We can take that and we can apply it to our train and validation sets. So all we do here is call the function and feed in train answers and train contexts. And of course we can just copy this and do the same for our validation set.
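A sketch of that function, following the logic just described (the offset is only ever one or two characters, always shifted left):

```python
def add_end_idx(answers, contexts):
    for answer, context in zip(answers, contexts):
        gold_text = answer['text']
        start_idx = answer['answer_start']
        end_idx = start_idx + len(gold_text)

        if context[start_idx:end_idx] == gold_text:
            # the start index was correct, so just record the end index
            answer['answer_end'] = end_idx
        else:
            # the start index is off by one or two characters (left-shifted)
            for n in [1, 2]:
                if context[start_idx - n:end_idx - n] == gold_text:
                    answer['answer_start'] = start_idx - n
                    answer['answer_end'] = end_idx - n

add_end_idx(train_answers, train_contexts)
add_end_idx(val_answers, val_contexts)
```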

Okay. Perfect. So now if we have a quick look, we should be able to see that we have a few of these ending points as well. Okay. So I think that looks pretty good. And that means we can move on to actually encoding our text. To tokenize or encode our text, this is where we bring in a tokenizer.

So we need to import the transformers library for this. And from transformers, we are going to import the DistilBERT tokenizer. So DistilBERT is a smaller version of BERT, which is just going to run a bit quicker, though training will still take a fair amount of time. And we're going to import the fast version of this tokenizer, because this allows us to more easily convert our character start and end locations to token start and end locations later on.

So first we need to actually initialize our tokenizer. Which is super easy. All we're doing is loading it from a pre-trained model. Okay. And then all we do to create our encodings is to call the tokenizer. So we'll do the training set first, which is called tokenizer. And in here, we include our training context.

And the training questions. So what this will do is actually merge these two strings together. So what we will have is our context, and then there will be a separator token followed by the question. And this will be fed into distilbert during training. I just want to add padding there as well.

And then we'll copy this and do the same for our validation set. Okay. And this will convert our data into encoding objects. So what we can do here is print out different parts that we have within our encodings. So in here, so you have the input IDs. So let's access that.

And you'll find in here, we have a big list of all of our samples. So check that: we have 130K. And let's open one of those. Okay, and we have these token IDs, and this is what BERT will be reading. Now, if we want to have a look at what this actually is in sort of human-readable language, we can use the tokenizer to just decode it for us.

Okay. And this is what we're feeding in. So we have a couple of these special tokens. This one just means it's the start of a sequence. And in here, we have a processed form of our original context. Now, you'll find that the context actually ends here, and like I said before, we have this separator token.

And then after that, we have our actual question. And this is what is being fed into BERT, but obviously the token ID version. So it's just good to be aware of what is actually being fed in and what we're actually using here. But this is the format that BERT is expecting.

And then after that, we have another separator token followed by all of our padding tokens, because BERT is going to be expecting 512 tokens to be fed in for every single sample. So we just need to fill that space, essentially. So that's all that is doing. So let's remove those and we can continue.
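As a recap, the tokenization step might look like this sketch. Note that truncation=True is an assumption; the transcript only mentions padding explicitly, but context passages do get truncated later on, which implies it:

```python
from transformers import DistilBertTokenizerFast

# load the fast tokenizer for the checkpoint we'll fine-tune
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

# each sample becomes: [CLS] context [SEP] question [SEP] + padding
train_encodings = tokenizer(train_contexts, train_questions,
                            truncation=True, padding=True)
val_encodings = tokenizer(val_contexts, val_questions,
                          truncation=True, padding=True)

# decode one sample back into human-readable text as a sanity check
print(tokenizer.decode(train_encodings['input_ids'][0]))
```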

So the next thing we need to add to our encodings is the start and end positions, because at the moment, we just don't have them in there. So to do that, we need to add an additional bit of logic. We use this character to token method. So if we just take out one of these, let's take the first one.

OK, we have this. And what we can do is actually modify this to use the character to token method, remove the input IDs, because we just need to pass it the index of whichever encoding we are wanting to modify or get the start and end position of. And in here, all we're doing is converting from the character that we have found a position for to the token that we want to find a position for.

And what we need to add is train answers. We have our position again, because the answers and encodings, the context and question, need to match up to the answer, of course, that we're asking about. And we do answer start. So here, we're just feeding in the position of the character.

So we're feeding in the position of the character, and we're expecting it to return the position of the token, which is position 64. So all we need to do now is do this for both the start position and the end position. So here we should get a different value.

OK, but this is one of the limitations of this: sometimes it's going to return nothing, as you can see it's not returning anything here. And that is because sometimes it's actually landing on a space. And when it looks at the space, the tokenizer sees that and says, OK, that's nothing.

We're not concerned about spaces, and it returns this None value that you can see here. So this is something that we need to consider and build in some added logic for. So to do that, again, we're going to use a function to contain all of this, and call it add token positions.

Here we'll have our encodings and our answers, and then we just modify this code. So we have the encodings, we have the answers. And because we're collecting all of the token positions, we also need to initialize lists to contain those. So we do start positions, an empty list, and end positions. And now we just want to loop through every single answer and encoding that we have.

Like so. And here we have our start positions, so we need to append those to our start positions list. Then we just do the same for our end positions, which is here. Now, here we can deal with this problem that we had. So if we find that the end positions, the most recent one, so the negative one index, is None, that means it wasn't found.

And it means there is a space. So what we do is change it to instead look up the character one step back, the answer end minus one. And all this needs to do is update the end positions here. OK, that's great. But in some cases, this also happens with the start position, and that is for a different reason.

The reason it will occasionally happen with the start position is the passage of data that we're adding in here. You saw before we had the context, a separator token, and then the question; sometimes the context passage is truncated in order to fit in the question, so some of it will be cut off.

And in that case, we do have a bit of a problem, but we still need to allow our code to run without any errors. So what we do is modify the start positions again, just like we did with the end positions, obviously only if it's None, and we just set it equal to the maximum length that has been defined by the tokenizer.

And it's as simple as that. Now, the only final thing we need to do, which is because we're using the encodings, is actually update those encodings to include this data. Because as of yet, we haven't added that back in. So to do that, we can use this quite handy update method.

And just add in our data as a dictionary. So we have the start positions key and the start positions list, and then we also have our end positions. And then, again, we just need to apply this to our training and validation sets. And let's just modify that.

Let's add the train encodings here and the train answers, and do that again for the validation set. So now let's take a look at our encodings. And here we can see, great, now we have those start positions and end positions. We can even have a quick look at what they look like.

Ah, but what we've done is actually not included the index here, so we're just taking it from the very first item every single time. So let's just update that, because obviously that won't get us very far, and update that as well. And now this should look a little bit better.
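Here's the function as built up to this point, as a sketch. Note that the end-position handling gets revised later, when a NoneType error shows up during training:

```python
def add_token_positions(encodings, answers):
    start_positions = []
    end_positions = []
    for i in range(len(answers)):
        # convert character positions into token positions
        start_positions.append(encodings.char_to_token(i, answers[i]['answer_start']))
        end_positions.append(encodings.char_to_token(i, answers[i]['answer_end']))

        # a None start position means the context passage was truncated
        if start_positions[-1] is None:
            start_positions[-1] = tokenizer.model_max_length
        # a None end position means char_to_token landed on a space; step back one
        if end_positions[-1] is None:
            end_positions[-1] = encodings.char_to_token(i, answers[i]['answer_end'] - 1)

    # write the new fields back into the encodings
    encodings.update({'start_positions': start_positions,
                      'end_positions': end_positions})

add_token_positions(train_encodings, train_answers)
add_token_positions(val_encodings, val_answers)
```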

So it's lucky we checked. Okay, so our training and validation sets are now up to date, and our data is in the right format. We just need to use it to create a PyTorch dataset object. So to do that, obviously, we need to import PyTorch.

And we define that dataset using a class, and just pass in torch.utils.data.Dataset. We need to initialize that like so. And this is coming from the Hugging Face Transformers documentation; I don't take credit for this. And we essentially need to do this so that we can load in our data using the PyTorch data loader later on, which makes things incredibly easy.

And then we just have one more method here, the length, which we also need to return. That should be okay. So we apply this to our datasets to create dataset objects. We have our encodings, and then the same again for the validation set.
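This is essentially the pattern from the Hugging Face documentation that the video refers to; a sketch:

```python
import torch

class SquadDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings

    def __getitem__(self, idx):
        # return one sample, with every field converted to a tensor
        return {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}

    def __len__(self):
        return len(self.encodings.input_ids)

train_dataset = SquadDataset(train_encodings)
val_dataset = SquadDataset(val_encodings)
```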

Okay, so that is our data almost fully prepared. All we do now is load it into a data loader object. But this is everything on the data side done, which is great because I know it does take some time and I know it's not the most interesting part of it, but it's just something that we need to do and need to understand what we're doing as well.

So now we get to the more interesting bit. So we'll just add the imports in here. So we need our data loader. We're going to import the Adam optimizer with weight decay, which is pretty commonly used for transformer models when you are fine-tuning, because transformer models are generally very large models and they can overfit very easily.

So this Adam optimizer with weight decay essentially just reduces the chances of that happening, which is supposed to be very useful and quite important. So obviously we're going to use that. And then the final bit is tqdm. So tqdm is a progress bar that we are going to be using so that we can actually see the progress of our training.

Otherwise, we're just going to be sat there for probably quite a long time, not knowing what is actually happening. And trust me, it won't take long before you start questioning whether anything is happening, because it takes a long time to train these models. So those are our imports.

That's from, did that twice. Okay, so that's all good. So now we just need to do a few little bits for the setup. So we need to tell PyTorch whether we're using CPU or GPU. In my case, it will be a GPU. If you're using CPU, this is going to take you a very long time to train.

And it's still going to take you a long time on GPU. So just be aware of that. But what we're going to do here is say device. It's CUDA, if CUDA is available. Otherwise, we are going to use the CPU. And good luck if that is what you're doing.

So once we've defined the device, we want to move our model over to it. So we just do model.to(device). So this .to method is essentially a way of transferring data between different hardware components, so your CPU or GPU. It's quite useful. And then we want to activate our model for training.

So there are two modes we have here: .train() and .eval(). When we're in train mode, there are a lot of different layers and parts of the model that will behave differently depending on whether you are using the model for training or for inference, which is predictions.

So we just need to make sure our model is in the right mode for whatever we're doing, and later on we'll switch it to eval to make some predictions. So that's almost everything. So we just need to initialize the optimizer. And here, we're using the Adam optimizer with weight decay.

We need to pass in our model parameters and also give it a learning rate, and we're just going to use this value here. All of these are the recommended parameters for what we are doing here. So the one thing that I have somehow missed is actually initializing the model.

So let's just add that in. And all we're doing here is loading, again, a pre-trained one, like we did before when we were loading the Transformers tokenizer. This time, it's for question answering. So this DistilBERT for question answering is a DistilBERT model with a question answering head added onto the end of it.

So essentially with Transformers, you have all these different heads that you add on, and they will do different things depending on what head is on there. So let's initialize that from pre-trained. And we're using the same one we used up here, which is distilbert-base-uncased. And sometimes you will need to download that.
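Pulled together, the setup so far might look like this sketch. The learning rate isn't read out in the transcript, so 5e-5, a typical fine-tuning value, is assumed; AdamW is imported from torch.optim here, though it can also be imported from transformers:

```python
import torch
from transformers import DistilBertForQuestionAnswering
from torch.optim import AdamW

# DistilBERT with a question-answering head on top
model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased')

# use the GPU if one is available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)
model.train()  # training mode

# Adam with weight decay; the learning rate is an assumed typical value
optim = AdamW(model.parameters(), lr=5e-5)
```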

Fortunately, I don't need to download it as I've already done that. But this can take a little bit of time, not too long though, and you get a nice progress bar, hopefully, as well. Okay, so now that is all settled, we can initialize our data loader. So all we're doing here is using the PyTorch data loader object.

And we just pass in our training data set and the batch size, so how many samples we want to train on at once in parallel before updating the model weights, which will be 16. And we also would like to shuffle the data, because we don't want to train the model on a single batch where it just learns about Beyonce.

And then the next one is learning about Chopin, and it will keep switching from batch to batch, but never, within a single batch, having a good mix of different things to learn about. So it is datasets.

That seems a bit of a weird name to me, so I'm just going to change it. And I also can't spell. There we go. And that is everything; we can actually begin our training loop. So we're going to go for three epochs. And what we want to start with here is a loop object.

So we do this mainly because we're using tqdm as a progress bar; otherwise, we wouldn't need to do this, and there'd be no point in doing it. All this is doing is kind of pre-initializing the loop that we are going to go through. So we're going to obviously loop through every batch within the train loader.

So we just add that in here. And then there's this other parameter which, I don't know, let's leave it. But essentially, you can add leave equals true in order to keep your progress bar in the same place with every epoch, whereas at the moment, with every epoch, it will create a new progress bar.

We are going to let it create a new progress bar. But if you don't want to do that and you want it to just stay in the same place, you add leave equals true into this function here. So after that, we need to go through each batch within our loop. And the first thing that we need to do is set all of our calculated gradients to zero.

So with every iteration that we go through here or every batch, at the end of it, we are going to calculate gradients, which tells the model in which direction to change the weights within the model. And obviously, when we go into the next iteration, we don't want those gradients to still be there.

So all we're doing here is re-initializing those gradients at the start of every loop. So we have a fresh set of gradients to work with every time. And here, we just want to pull in our data. So this is everything that is relevant that we're going to be feeding into the training process.

So we have everything within our batch, and then in here we have all of our different items. So we can actually see, if we go here, we want to add in all of these. And we also want to move them across to the GPU, in my case, or whatever device you are working on.

And we'll do that for the attention mask, start positions, and end positions. So these start and end positions are essentially the labels; they're the targets that we want our model to optimize for. And the input IDs and attention masks are the inputs. So now we have those defined; we just need to feed them into our model for training.

And we will output the results of that training batch to the outputs variable. So our model needs the input IDs and the attention mask, and we also want our start positions and end positions. Now, from our training batch, we want to extract the loss, and then backpropagate it to calculate a gradient for every parameter.

And this is for our gradient update. And then we use the step method here to actually update the weights using those gradients. And then this final little bit here is purely for us to see; this is our progress bar.

So we call it a loop. We set the description, which is going to be our epoch. And then it would probably be quite useful to also see the loss in there as well. We will set that as a postfix. So it will appear after the progress bar. Okay, and that should be everything.
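Assembled, the data loader and training loop might look like this:

```python
from torch.utils.data import DataLoader
from tqdm import tqdm

# batches of 16, shuffled so each batch mixes different topics
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)

for epoch in range(3):
    loop = tqdm(train_loader)  # add leave=True to keep one bar per epoch
    for batch in loop:
        # clear gradients left over from the previous step
        optim.zero_grad()
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        start_positions = batch['start_positions'].to(device)
        end_positions = batch['end_positions'].to(device)
        # passing the target positions makes the model return a loss
        outputs = model(input_ids,
                        attention_mask=attention_mask,
                        start_positions=start_positions,
                        end_positions=end_positions)
        loss = outputs[0]
        loss.backward()   # compute gradients for every parameter
        optim.step()      # update the weights
        loop.set_description(f'Epoch {epoch}')
        loop.set_postfix(loss=loss.item())
```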

Okay, so that looks pretty good; we have our model training. And as I said, this will take a little bit of time, so I will let that run and we'll come back once it's done.

Okay, so we have this NoneType error here. And this is because within our end positions we would normally expect integers, but we're also getting some None values, because the code that we used earlier, where we're checking if the end position is None, essentially wasn't good enough. So as a fix for that, we'll just go back and we'll add a while loop, which will keep checking if it's None.

And every time it is None, reduce the character position we look up by one. So go back up here; this is where the problem is coming from. So we're just going to change this to be a while loop, and just initialize, essentially, a counter here, and we'll use this as our go-back value.

And every time the end position is still None, we'll just add one to go-back. And this should work. So we need to remember to rerun anything we need to rerun. Yeah. Okay. And that looks like it solved the issue. So, great; we can just leave that training for a little while and I will see you when it's done.
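For reference, the revised add_token_positions with the while-loop fix might look like this (remembering to rerun it on both encoding sets before restarting training):

```python
def add_token_positions(encodings, answers):
    start_positions, end_positions = [], []
    for i in range(len(answers)):
        start_positions.append(encodings.char_to_token(i, answers[i]['answer_start']))
        end_positions.append(encodings.char_to_token(i, answers[i]['answer_end']))

        # a None start position means the context passage was truncated
        if start_positions[-1] is None:
            start_positions[-1] = tokenizer.model_max_length

        # step back one character at a time until a token is found,
        # since a single minus-one shift wasn't always enough
        go_back = 1
        while end_positions[-1] is None:
            end_positions[-1] = encodings.char_to_token(
                i, answers[i]['answer_end'] - go_back)
            go_back += 1

    encodings.update({'start_positions': start_positions,
                      'end_positions': end_positions})

add_token_positions(train_encodings, train_answers)
add_token_positions(val_encodings, val_answers)
```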

Okay, so the model's finished and we'll go ahead and just save it, as obviously we'll need to do whenever actually doing this in any other project. So I'm just going to call it distilbert custom. And it's super easy to save: we just do save pre-trained and the model path.

Now, as well as this, we might also want to save the tokenizer, so we have everything in one place. To do that, we also just use tokenizer and save pre-trained again. Okay, so if we go back into our folder here, see models, we have this distilbert custom.
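A sketch of that save step (the models/distilbert-custom path follows the naming used in the video):

```python
model_path = 'models/distilbert-custom'

# save the fine-tuned model and the tokenizer side by side
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)
```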

And then in here we have all of the files we need to build our PyTorch model. It's a little bit different if we're using TensorFlow, but the actual saving process is practically the same. So now we've finished training, we want to switch the model out of training mode, so we use model.eval().

And we just get all this information about our model as well; we don't actually need any of that. And just like before, we want to create a data loader. For that, I'm just going to call it val loader. And it's exactly the same code as before; in fact, it's probably better if we just copy and paste some of this.

At least the loop. So what we're going to do here is take the same loop and apply it as a validation run with our validation data. Just paste up there. We'll initialize this data loader. This time, of course, with the validation set. We'll start with the same batch size.

Now, this time, we do want to keep a log of accuracy. So we will keep that there. And we also don't need to run multiple epochs because we're not training this time. We're just running through all of the batches within our loop of validation data. So this is now a validation loader.

And we just loop through each of those batches. So we don't need to do anything with the gradients here, and because we're not doing anything with the gradients, we add torch.no_grad() to stop PyTorch from calculating any gradients, because this will obviously save us a bit of time when we're processing all of this.

And we put those in there. The outputs, we do still want this. But of course, we don't need to be putting in the start and end positions. So we can remove those. And this time, we want to pull out the start prediction and end prediction. So if we have a look at what our outputs look like before, you see we have this model output.

And within here, we have a few different tensors, each with an accessible name. The ones that we care about are the start logits, and that will give us the logits for our start position, which is essentially a set of predictions where the index of the highest value within that vector represents the predicted token position.

So we can do that for both, and you'll see we get these tensors. Now, we only want the largest value in each one of these vectors here, because its index will give us the predicted position. So to get that, we use the argmax function. And if we just use it by itself, that will give us the maximum index across the whole thing.

But we don't want that. We want one for every single vector or every row. And to do that, we just set dim equal to 1. And there you go. We get a full batch of outputs. So these are our starting positions. And then we also want to do the same for our ending positions.

So we just change start to end. So it's pretty easy. Now obviously, we want to be doing this within our loop, because this is only doing one batch. And we need to do this for every single batch. So we're just going to assign them to a variable. And there we have our predictions.

And all we need to do now is check for an exact match. So what I mean by exact match is we want to see whether the start positions here, which we can rename to the true values, whether these are equal to the predicted values down here. And to calculate that, so let me just run this so we have one batch.

That shouldn't take too long to process, and we can just write the code out. So to check this, we just use the double equals syntax here, and this will just check for matches between the two arrays. So we have the start predictions and the start true values, and we'll check for those. So if we just have a look at what we have here, we get this array of true or false.

So these ones don't look particularly good. But that's fine. We just want to calculate the accuracy here. So let's take the sum, and we also want to divide that by the length. Okay so that will give us our accuracy within the tensor. And we just take it out using the item method.

But we also just need to include brackets around this, because at the moment we're trying to take item of the length value. Okay and then that gives us our very poor accuracy on this final batch. So we can take that, and within here we want to append that to our accuracy list.

And then we also want to do that for the end predictions as well. And we'll just let that run through, and then we can calculate our accuracy at the end of that. And then we can have a quick look at our accuracy here, and we can see fortunately it's not as bad as it first seemed.
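Assembled, the whole validation run might look like this sketch:

```python
model.eval()  # switch out of training mode

val_loader = DataLoader(val_dataset, batch_size=16)
acc = []

loop = tqdm(val_loader)
for batch in loop:
    # no gradient bookkeeping needed for validation
    with torch.no_grad():
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        start_true = batch['start_positions'].to(device)
        end_true = batch['end_positions'].to(device)

        outputs = model(input_ids, attention_mask=attention_mask)
        # argmax over each row of logits gives the predicted token positions
        start_pred = torch.argmax(outputs['start_logits'], dim=1)
        end_pred = torch.argmax(outputs['end_logits'], dim=1)

        # exact-match accuracy for this batch, start and end separately
        acc.append(((start_pred == start_true).sum() / len(start_pred)).item())
        acc.append(((end_pred == end_true).sum() / len(end_pred)).item())

# overall exact-match accuracy
acc = sum(acc) / len(acc)
print(acc)
```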

So we're getting a lot of 93%, 100%, 81%; that's generally pretty good. So of course, if we want to get the overall accuracy, all we do is sum that and divide by the length, and we get 63.6% for an exact match accuracy. So what I mean by exact match is, say we take a look at a few of these that do not match.

So we have a 75% match on the fourth batch. Although that won't be particularly useful, because we can't see that batch right now. So let's just take the last batch, because we have these values here. Now if we look at what startTrue is, we get these values. Then if we look at startPred, we get this.

So none of these match, but a couple of them do get pretty close. So these final four, all of these count as 0% on the exact match. But in reality, if you look at what we predicted, for every single one of them, it's predicting just one token before. So it's getting quite close, but it's not an exact match.

So it scores 0. So when you consider that with our 63.6% accuracy here, that means that this model is actually probably doing pretty well. It's not perfect, of course, but it's doing pretty well. So overall, that's everything for this video. We've gone all the way through this. If you do want the code for this, I'm going to make sure I keep a link to it in the description.

So check that out if you just want to copy this across. But for now, that's everything. So thank you very much for watching, and I will see you again soon.