The NEW Match-Case Statement in Python 3.10

00:00:00.000 | Okay, so we're going to go through the new match case statement in Python 3.10.

00:00:05.040 | So the new match statement, they're actually calling it structural pattern matching.

00:00:10.280 | And what it does is basically allows us to create a switch case statement in Python.

00:00:18.240 | So switch case statements are pretty common in most languages.

00:00:22.840 | But it's something that Python just hasn't had.

00:00:26.880 | And we can actually see in this pep here, pep 3.10.3,

00:00:32.560 | that they were going to add it back in 2006, or it's been considered at least.

00:00:37.440 | But then it was just rejected because basically no one wanted it.

00:00:43.400 | So since then, it just hasn't been added to Python.

00:00:47.760 | But now with this new pep, pep 6.3.4,

00:00:54.800 | it looks like it's going to be added in 3.10.

00:00:57.840 | So what I've done is downloaded Python 3.10, it's alpha 6.

00:01:04.480 | And we're just going to have a play around and see how this works.

00:01:08.960 | So this first quick example of how it's going to look.

00:01:13.960 | And it's just a super easy, simple example.

00:01:21.320 | I'm just going to comment these out because I'll cause an error.

00:01:25.680 | And you can see here that we have this HTTP code 418.

00:01:32.280 | And what we're doing here is we're matching the subject, which is this HTTP code.

00:01:39.880 | And then we go into our cases.

00:01:42.800 | So in the case of that subject being equal to this,

00:01:49.120 | we do whatever is within this block.

00:01:51.800 | And we do the same if it is 404.

00:01:54.880 | Or if not, we go down to 418.

00:01:58.520 | And at the end of that, we can also add a catch all case statement.

00:02:04.840 | So in this case, we just say, OK, case, we put anything here.

00:02:10.280 | And we just say, OK, code not found.

00:02:13.000 | So this is what we'll run if none of these are executed.

00:02:17.680 | So we can also remove this and it will just carry on with the code.

00:02:21.480 | So that's our first example.

00:02:25.560 | You can see here we get I'm a TPOP for code 418.

00:02:29.520 | Obviously, that's a super simple example.

00:02:33.920 | And that's essentially how it works.

00:02:36.960 | So I also have this graph here to try and explain more visually

00:02:41.600 | the actual flow of information here.

00:02:43.520 | So we have a subject which we set to match at the top.

00:02:46.960 | So in this case, we are using HTTP code.

00:02:50.400 | And then we check for truthy or falsy,

00:02:55.320 | whether that subject matches the pattern that we've provided in our case.

00:03:00.240 | If it is truthy, we execute the block.

00:03:04.400 | If it is falsy, we continue to the next case.

00:03:07.920 | So, yes, pretty much how it works.

00:03:11.640 | But let's have a look at a few other examples,

00:03:14.560 | which I think demonstrate the actual use case

00:03:18.200 | and benefits of this a bit better.

00:03:20.840 | So the first one I just want to quickly show you is this one,

00:03:26.960 | which is actually from this PEP here, PEP 635.

00:03:30.720 | And I thought this was a pretty cool example.

00:03:33.400 | So what I showed you is just a really simple,

00:03:37.560 | is this equal to this?

00:03:40.160 | If not, go on to the next one.

00:03:42.240 | Whereas this, I think, better demonstrates

00:03:44.600 | that we can check the structure of the data we're feeding in.

00:03:49.200 | So here we're checking the host port and mode of a connection.

00:03:55.040 | So, for example, if we're connecting through HTTP,

00:03:58.680 | we might set our mode to HTTP.

00:04:02.600 | And then we also have a host and a port, and this will all be within a tuple.

00:04:08.400 | However, in some cases, maybe I'll show you here.

00:04:12.960 | So we can have our host here.

00:04:17.520 | Then we have the port.

00:04:25.120 | And then here we may sometimes have the mode of the connection,

00:04:31.240 | so maybe it's FTP and sometimes we might not.

00:04:35.120 | And if we don't see anything here, we just assume it's HTTP.

00:04:38.640 | And that's essentially what this code here is doing.

00:04:41.240 | So it's taking that tuple as X.

00:04:44.360 | And in this case, we can see we're using if elif.

00:04:47.560 | And it is fine, like there's nothing wrong with this.

00:04:50.960 | But then if we look at how we write this using matching case,

00:04:55.440 | it does look a lot cleaner.

00:04:57.760 | So that's just one example.

00:04:59.600 | And then I have an actual use case example.

00:05:03.080 | I've been using the SQUAD2 dataset,

00:05:06.520 | which is a very popular dataset

00:05:09.640 | for training question answering models in machine learning.

00:05:12.960 | And I figured that's actually a really cool example

00:05:16.480 | of how the case match statement might actually be pretty beneficial.

00:05:21.800 | So let's just separate this out.

00:05:24.240 | And I'll walk you through how we can apply that.

00:05:28.600 | So first, I'm just going to import requests.

00:05:34.000 | And also import JSON, because we're going to be pulling a JSON file

00:05:38.160 | from the Internet, and we're going to be reading that.

00:05:40.280 | And the URL for that is this.

00:05:44.400 | And the file that we're going to be reading is the training data.

00:05:51.760 | From the SQUAD2 dataset.

00:05:56.920 | OK, so the data that we're looking for is at this address.

00:06:02.040 | So if we want to just download that, we can with requests.

00:06:05.720 | And we'll just get URL plus the file.

00:06:10.680 | And then after we've pulled that using requests,

00:06:15.440 | we're just going to save it to file.

00:06:17.800 | So I'm going to file, write binary as F.

00:06:24.200 | And I'm going to write it to file in chunks.

00:06:27.880 | It's quite a big dataset.

00:06:29.920 | So it's going to use, it's a content, it's a chunk size.

00:06:34.800 | I mean, we can kind of go for anything here, let's just go for a hundred.

00:06:40.000 | And then just write the chunk.

00:06:43.840 | OK, so that's our dataset downloaded and we can open it over here.

00:06:52.480 | And if we just look through a few of these,

00:06:55.920 | we can see that there's quite a few layers to the dataset.

00:06:57.960 | So this is something we're going to have to consider

00:06:59.720 | when we're building out this function for parsing it,

00:07:02.680 | both with the Eiffel's version and the match case version.

00:07:07.320 | So what we're going to want to do here is loop through each one of these.

00:07:12.280 | And you can see there's quite a few of them.

00:07:14.680 | And we want to get the question and the text here,

00:07:19.000 | which is the answer to that question.

00:07:20.880 | And we just want to pull those out and nothing else.

00:07:23.720 | We'll just put them out as a tuple and create a big list.

00:07:27.040 | Now, the complexity of this comes from, so if I, sorry, just open that again.

00:07:33.880 | You can see here we have, OK, this is fine.

00:07:37.680 | And we have a question and we have a question

00:07:41.120 | and then we have answers, which contains our answer.

00:07:43.920 | So just remember, we've got answers here, OK?

00:07:46.880 | This is the name of the key that contains our answer.

00:07:51.040 | If we go right down to the bottom, the actual format is different.

00:07:56.440 | And this is the case for quite a few of them.

00:08:00.480 | So if we go into here.

00:08:03.480 | OK, see straight away, we have the question and we have answers, but it's empty.

00:08:09.040 | And instead, we have this plausible answers.

00:08:12.120 | And in here, we have our answer.

00:08:15.320 | So it's slightly different format.

00:08:16.960 | In that, rather than using answers, some of them use plausible answers.

00:08:20.720 | I'm not sure why, but some of them do.

00:08:23.960 | And for some of those as well, they also include this answers key,

00:08:29.000 | which is just an empty list.

00:08:30.280 | So we just need to write some logic to actually deal with that.

00:08:34.480 | OK, so we've saved this file, so I'm just going to open it back up again.

00:08:43.400 | Open it back up again, and it is only just here.

00:08:51.120 | It's readBinary, it's F, and we just save it into the squad variable.

00:08:59.160 | Let's have a quick look at what we have in here.

00:09:07.040 | So you see over here, everything is contained within this data key.

00:09:13.480 | So if we close that, it closes everything.

00:09:15.480 | So first off, we want to access that.

00:09:18.720 | And then it's a list, and we'll loop through each one of those.

00:09:21.480 | So let's just have a look at the first version or the first item in that list.

00:09:27.280 | And it's the Beyonce group.

00:09:29.240 | So we can just have a quick look.

00:09:33.000 | It's just quite messy.

00:09:34.360 | I'm not going to go into it.

00:09:37.120 | But essentially, to work through this data,

00:09:40.720 | we're going to have to write something like this.

00:09:44.080 | And this is for both the if-else statement or if-else version

00:09:47.800 | and the match case version.

00:09:50.680 | And we write squad data.

00:09:55.480 | So this is going through each group.

00:09:58.200 | So Beyonce or Matters are the two that we saw before.

00:10:01.920 | And I'm going to go through each paragraph.

00:10:05.200 | In the group, that is paragraphs.

00:10:09.160 | And then we're going to go through each question and answer here within that

00:10:17.960 | or each question and the information that has next to that question.

00:10:22.080 | And that is in the paragraph.

00:10:25.320 | QAS.

00:10:33.000 | And I'm just going to pass for now,

00:10:34.560 | but that will loop through everything that we need.

00:10:37.360 | So we're going to use that for both the if-else version and the match case version.

00:10:41.760 | So let's take this and we'll build out the if-else version first.

00:10:48.000 | So in this case, we can just get our questions straight away.

00:10:53.440 | So we are at the moment, we've gone through paragraphs

00:10:57.360 | and we're looping through each of these.

00:11:00.560 | So we're in a QAS at the moment and we want to get the question, which is this.

00:11:05.480 | So QA, question.

00:11:09.760 | And then this is where we have our if-else logic.

00:11:15.960 | So answers in here.

00:11:18.200 | First, we want to check, okay, is there an answers key within the dictionary here?

00:11:27.440 | So we write that like, if answers in QA.keys.

00:11:36.880 | But as we saw before, the answers key can be there, but it can also be empty.

00:11:42.760 | And if that's the case, we also, obviously we can't pull anything out from that answers list.

00:11:49.200 | So we also need to say, and length of QA answers is greater than zero.

00:11:59.880 | So we have two conditions there, and if both of those satisfied,

00:12:04.880 | then we want to pull the answers from that key.

00:12:08.840 | So it's answers QA, answers, and then it's always with this data set,

00:12:17.120 | we are always entering the, or there's always only one item within the list of answers.

00:12:25.560 | So we just pull out the first item, that index zero, and then we pull out the text.

00:12:33.640 | So if I show you, we have index zero, and then we have the text.

00:12:39.240 | So entering the dictionary or list, sorry, and then we're entering index zero

00:12:43.840 | because all index zero, and then we're pulling out the text,

00:12:48.480 | which can be in the late 1990s for this first one.

00:12:53.480 | Okay, so that is our first part of the if else statement.

00:12:59.480 | Second part, we want to say, okay, else if we have plausible answers.

00:13:05.800 | So this is what I showed you at the end, plausible answers.

00:13:10.400 | In QA keys, and we also want to say the same again,

00:13:18.840 | where QA plausible answers needs to have some values in it as well.

00:13:27.760 | So in this case, we pull our answers from plausible answers.

00:13:36.400 | And it's the same again there.

00:13:40.640 | Okay, so that's our if and else if, and then we just want one final else statement at the end.

00:13:47.960 | So if for some reason we can't find the answers, I'm just going to put answers equal to none.

00:13:54.360 | And then after all of that, I'm going to just initialize a new squad list here.

00:14:04.440 | And we're just going to append our question and answer pairs to that.

00:14:09.000 | So new squad dot append, let's have question and answers.

00:14:16.160 | Should really be answer rather than answers, but it's fine.

00:14:20.840 | Okay, so let me just put these into a tuple.

00:14:31.000 | Okay, great. So let's just have a quick look at what we get here.

00:14:34.080 | Make sure we're getting the right thing.

00:14:37.280 | So we're going to take a look at the first five and the last five as well.

00:14:47.400 | Okay, so this looks good.

00:14:51.640 | Yep, so we have tuples, question in the first index and the answer in the second index.

00:14:58.680 | Okay, so that's great. And it's worked for both the answers and the plausible answers.

00:15:03.840 | Because all of these down here are using the plausible answers format.

00:15:08.600 | So this is what it looks like with if, elif and else.

00:15:14.120 | So let's rewrite this, but we're going to do it with the match case statement instead.

00:15:21.800 | So take this and what we can do is just remove all of that.

00:15:28.440 | And maybe we can just keep in the question and what we'll do is say match QA.

00:15:39.320 | And then in here, what we are looking for is the case of having a dictionary that contains

00:15:47.040 | the answers key.

00:15:50.000 | And that answers key contains a list, which also contains a dictionary, which contains

00:15:57.560 | a key called text.

00:16:00.520 | We want to pull out the answers from whatever that text key is pointing to.

00:16:06.680 | So the value there and this will pull whatever is within this part of the dictionary or the

00:16:12.800 | part of the data structure into this new variable answers, which is, I think, really cool about

00:16:20.280 | the new syntax.

00:16:22.440 | So what we've done here is actually already assigned our variable answers.

00:16:26.900 | So we don't actually need to do anything within this code block.

00:16:31.080 | So that's pretty cool.

00:16:33.600 | And then we just write the same thing, but we do it for our plausible answers.

00:16:41.000 | So if we write that out and this, my opinion is it's pretty cool as well.

00:16:49.240 | We just write it like that.

00:16:51.260 | And that's all there is to it.

00:16:53.320 | And we just write pass again.

00:16:56.200 | And in the final case of it not working, we just say, OK, just set answers equal to none.

00:17:08.160 | And let's run that.

00:17:10.280 | And let's just take what we have here.

00:17:14.640 | And we should get the same.

00:17:16.360 | OK, so, yeah, we can see exact same output.

00:17:22.080 | So that's the comparison.

00:17:24.080 | I mean, to me, this does look way cleaner.

00:17:28.200 | And this is, I suppose, a little bit complicated.

00:17:31.800 | But generally, I think this is kind of easier to read than this, or at least at first glance,

00:17:39.360 | this does look cleaner, in my opinion.

00:17:41.600 | But I'm not 100% sure on which one I would go for.

00:17:45.880 | At the moment, I'm kind of leaning towards this one.

00:17:48.720 | But let's see.

00:17:49.720 | We'll see how people start using this new syntax going forward.

00:17:54.160 | It'll be pretty interesting to see, at least.

00:17:56.880 | So that is all I wanted to cover on that.

00:18:00.740 | So that's it for this video.

00:18:02.640 | Thank you very much for watching.

00:18:04.000 | I hope it's been useful.

00:18:05.400 | And I will see you again in the next one.

The NEW Match-Case Statement in Python 3.10

Chapters