The NEW Match-Case Statement in Python 3.10

Okay, so we're going to go through the new match case statement in Python 3.10. The new match statement is officially called structural pattern matching. And what it does is basically allow us to write something like a switch case statement in Python. Switch case statements are pretty common in most languages.

But it's something that Python just hasn't had. And we can actually see in this PEP here, PEP 3103, that adding it was considered back in 2006. But then it was rejected because basically no one wanted it. So since then, it just hasn't been added to Python.

But now with this new PEP, PEP 634, it looks like it's going to be added in 3.10. So what I've done is downloaded Python 3.10 alpha 6, and we're just going to have a play around and see how this works. So this is a first quick example of how it's going to look.

And it's just a super easy, simple example. I'm just going to comment these out because they'll cause an error. And you can see here that we have this HTTP code 418. And what we're doing here is we're matching the subject, which is this HTTP code. And then we go into our cases.

So in the case of that subject being equal to this, we do whatever is within this block. And we do the same if it is 404. Or if not, we go down to 418. And at the end of that, we can also add a catch all case statement. So in this case, we just say, OK, case, we put anything here.

And we just say, OK, code not found. So this is what will run if none of the other cases are executed. We can also remove this catch-all and it will just carry on with the code. So that's our first example. You can see here we get "I'm a teapot" for code 418.

Obviously, that's a super simple example. And that's essentially how it works. So I also have this graph here to try and explain more visually the actual flow of information here. So we have a subject which we set to match at the top. So in this case, we are using HTTP code.

And then we check, truthy or falsy, whether that subject matches the pattern that we've provided in our case. If it is truthy, we execute the block. If it is falsy, we continue to the next case. So, yes, that's pretty much how it works. But let's have a look at a few other examples, which I think demonstrate the actual use case and benefits of this a bit better.

So the first one I just want to quickly show you is this one, which is actually from this PEP here, PEP 635. And I thought this was a pretty cool example. So what I showed you is just a really simple, is this equal to this? If not, go on to the next one.

Whereas this, I think, better demonstrates that we can check the structure of the data we're feeding in. So here we're checking the host, port, and mode of a connection. So, for example, if we're connecting through HTTP, we might set our mode to HTTP. And then we also have a host and a port, and this will all be within a tuple.

However, in some cases, maybe I'll show you here. So we can have our host here. Then we have the port. And then here we may sometimes have the mode of the connection, so maybe it's FTP and sometimes we might not. And if we don't see anything here, we just assume it's HTTP.

And that's essentially what this code here is doing. So it's taking that tuple as x. And in this case, we can see we're using if and elif. And it is fine, like there's nothing wrong with this. But then if we look at how we write this using match case, it does look a lot cleaner.

So that's just one example. And then I have an actual use case example. I've been using the SQuAD 2.0 dataset, which is a very popular dataset for training question answering models in machine learning. And I figured that's actually a really cool example of how the match case statement might actually be pretty beneficial.

So let's just separate this out. And I'll walk you through how we can apply that. So first, I'm just going to import requests. And also import json, because we're going to be pulling a JSON file from the Internet, and we're going to be reading that. And the URL for that is this.

And the file that we're going to be reading is the training data from the SQuAD 2.0 dataset. OK, so the data that we're looking for is at this address. So if we want to just download that, we can with requests. And we'll just get the URL plus the file. And then after we've pulled that using requests, we're just going to save it to file.

So I'm going to open the file in write-binary mode as f. And I'm going to write it to file in chunks, because it's quite a big dataset. So we're going to use iter_content with a chunk size. I mean, we can kind of go for anything here, let's just go for a hundred.

And then just write the chunk. OK, so that's our dataset downloaded and we can open it over here. And if we just look through a few of these, we can see that there's quite a few layers to the dataset. So this is something we're going to have to consider when we're building out this function for parsing it, both with the if-else version and the match case version.
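The chunked write described above can be sketched on its own, without the network call. In the video the chunks come from a requests response (its `iter_content(chunk_size=100)` method); here a small list of byte strings stands in for that, and the `save_in_chunks` helper, the temporary path, and the file name are assumptions for illustration.

```python
import os
import tempfile


def save_in_chunks(chunks, path):
    """Write an iterable of byte chunks to `path` in binary mode."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


# Stand-in for response.iter_content(chunk_size=100) from requests.
fake_chunks = [b'{"data": ', b"[]}"]

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "train.json")
save_in_chunks(fake_chunks, path)
```

Writing in chunks means the whole file never has to be held as one bytes object while it is being saved, which is the reason for doing it this way with a large dataset.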

So what we're going to want to do here is loop through each one of these. And you can see there's quite a few of them. And we want to get the question and the text here, which is the answer to that question. And we just want to pull those out and nothing else.

We'll just pull them out as a tuple and create a big list. Now, the complexity of this comes from, so if I, sorry, just open that again. You can see here, OK, this is fine. We have a question and then we have answers, which contains our answer.

So just remember, we've got answers here, OK? This is the name of the key that contains our answer. If we go right down to the bottom, the actual format is different. And this is the case for quite a few of them. So if we go into here. OK, see straight away, we have the question and we have answers, but it's empty.

And instead, we have this plausible answers. And in here, we have our answer. So it's slightly different format. In that, rather than using answers, some of them use plausible answers. I'm not sure why, but some of them do. And for some of those as well, they also include this answers key, which is just an empty list.

So we just need to write some logic to actually deal with that. OK, so we've saved this file, so I'm just going to open it back up again. This time we open it to read, as f, and we just load the JSON into the squad variable.

Let's have a quick look at what we have in here. So you see over here, everything is contained within this data key. So if we close that, it closes everything. So first off, we want to access that. And then it's a list, and we'll loop through each one of those.

So let's just have a look at the first version or the first item in that list. And it's the Beyonce group. So we can just have a quick look. It's just quite messy. I'm not going to go into it. But essentially, to work through this data, we're going to have to write something like this.

And this is for both the if-else version and the match case version. So first we loop over squad data. This is going through each group, so Beyonce or Matters are the two that we saw before. And then I'm going to go through each paragraph in the group, and that is under paragraphs.

And then we're going to go through each question and answer within that, or each question and the information that sits next to that question. And that is in the paragraph, under qas. And I'm just going to pass for now, but that will loop through everything that we need. So we're going to use that for both the if-else version and the match case version.
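The three nested loops can be sketched as below. The `squad` dictionary here is a tiny made-up stand-in that only mimics the layers described in the transcript (data, paragraphs, qas); its titles, questions, and answers are invented, not taken from the real dataset.

```python
# Tiny invented sample that mimics the nesting of the dataset.
squad = {
    "data": [
        {
            "title": "Example group",
            "paragraphs": [
                {
                    "context": "Placeholder context.",
                    "qas": [
                        {"question": "First question?",
                         "answers": [{"text": "first answer"}]},
                        {"question": "Second question?",
                         "answers": [],
                         "plausible_answers": [{"text": "second answer"}]},
                    ],
                }
            ],
        }
    ]
}

questions = []
for group in squad["data"]:                 # each topic group
    for paragraph in group["paragraphs"]:   # each paragraph in the group
        for qa in paragraph["qas"]:         # each question/answer entry
            questions.append(qa["question"])

print(questions)
```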

So let's take this and we'll build out the if-else version first. So in this case, we can just get our questions straight away. So we are at the moment, we've gone through paragraphs and we're looping through each of these. So we're in a QAS at the moment and we want to get the question, which is this.

So that's qa, question. And then this is where we have our if-else logic. First, we want to check, okay, is there an answers key within the dictionary here? So we write that like, if answers in qa.keys(). But as we saw before, the answers key can be there, but it can also be empty.

And if that's the case, obviously we can't pull anything out from that answers list. So we also need to say, and the length of qa answers is greater than zero. So we have two conditions there, and if both of those are satisfied, then we want to pull the answers from that key.

So it's answers equals qa answers, and then with this dataset there's always only one item within the list of answers. So we just pull out the first item, at index zero, and then we pull out the text. So if I show you, we have index zero, and then we have the text.

So we're entering the list, sorry, not the dictionary, and then we're entering index zero because there's only ever an index zero, and then we're pulling out the text, which is "in the late 1990s" for this first one. Okay, so that is the first part of our if-else statement. For the second part, we want to say, okay, else if we have plausible answers.

So this is what I showed you at the end, plausible answers. We check that it's in qa.keys(), and we also want to say the same again, where qa plausible answers needs to have some values in it as well. So in this case, we pull our answers from plausible answers. And it's the same again there.

Okay, so that's our if and else if, and then we just want one final else statement at the end. So if for some reason we can't find the answers, I'm just going to put answers equal to none. And then after all of that, I'm going to just initialize a new squad list here.

And we're just going to append our question and answer pairs to that. So new squad dot append, let's have question and answers. Should really be answer rather than answers, but it's fine. Okay, so let me just put these into a tuple. Okay, great. So let's just have a quick look at what we get here.
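Put together, the if/elif/else version might look roughly like this. The `extract_pairs_if` function name and the invented sample data are assumptions, and the key is written `plausible_answers` with an underscore on the assumption that this is how the spoken "plausible answers" is spelled in the file.

```python
def extract_pairs_if(squad):
    """Collect (question, answer) tuples using if/elif/else."""
    new_squad = []
    for group in squad["data"]:
        for paragraph in group["paragraphs"]:
            for qa in paragraph["qas"]:
                question = qa["question"]
                # The key must exist AND the list must be non-empty.
                if "answers" in qa.keys() and len(qa["answers"]) > 0:
                    answer = qa["answers"][0]["text"]
                elif ("plausible_answers" in qa.keys()
                        and len(qa["plausible_answers"]) > 0):
                    answer = qa["plausible_answers"][0]["text"]
                else:
                    answer = None
                new_squad.append((question, answer))
    return new_squad


# Invented sample covering all three branches.
sample = {"data": [{"paragraphs": [{"qas": [
    {"question": "Q1?", "answers": [{"text": "A1"}]},
    {"question": "Q2?", "answers": [],
     "plausible_answers": [{"text": "A2"}]},
    {"question": "Q3?"},
]}]}]}

print(extract_pairs_if(sample))
```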

Make sure we're getting the right thing. So we're going to take a look at the first five and the last five as well. Okay, so this looks good. Yep, so we have tuples, question in the first index and the answer in the second index. Okay, so that's great. And it's worked for both the answers and the plausible answers.

Because all of these down here are using the plausible answers format. So this is what it looks like with if, elif and else. So let's rewrite this, but we're going to do it with the match case statement instead. So take this and what we can do is just remove all of that.

And maybe we can just keep in the question and what we'll do is say match QA. And then in here, what we are looking for is the case of having a dictionary that contains the answers key. And that answers key contains a list, which also contains a dictionary, which contains a key called text.

We want to pull out the answers from whatever that text key is pointing to. So the value there and this will pull whatever is within this part of the dictionary or the part of the data structure into this new variable answers, which is, I think, really cool about the new syntax.

So what we've done here is actually already assigned our variable answers. So we don't actually need to do anything within this code block. So that's pretty cool. And then we just write the same thing, but we do it for our plausible answers. So if we write that out and this, my opinion is it's pretty cool as well.

We just write it like that. And that's all there is to it. And we just write pass again. And in the final case of it not working, we just say, OK, just set answers equal to none. And let's run that. And let's just take what we have here. And we should get the same.

OK, so, yeah, we can see exact same output. So that's the comparison. I mean, to me, this does look way cleaner. And this is, I suppose, a little bit complicated. But generally, I think this is kind of easier to read than this, or at least at first glance, this does look cleaner, in my opinion.

But I'm not 100% sure on which one I would go for. At the moment, I'm kind of leaning towards this one. But let's see. We'll see how people start using this new syntax going forward. It'll be pretty interesting to see, at least. So that is all I wanted to cover on that.

So that's it for this video. Thank you very much for watching. I hope it's been useful. And I will see you again in the next one.
