Stanford CS224N NLP with Deep Learning | 2023 | Python Tutorial, Manasi Sharma



00:00:05.300 | All right, hi everyone.
00:00:07.080 | Welcome to the 224N Python review session.
00:00:10.820 | The goal of the session really will be to sort of give you the basics of Python and
00:00:16.120 | NumPy in particular that you'll be using a lot in your second homework.
00:00:20.980 | And the homeworks that will come after that as well.
00:00:23.020 | We're pitching this tutorial at anyone who hasn't
00:00:28.540 | really touched programming languages before.
00:00:31.300 | But for people who have, we'll be going through a lot of that
00:00:34.020 | material very quickly and we'll be progressing to NumPy as well.
00:00:36.980 | And as I mentioned, first and foremost,
00:00:38.900 | the session is really meant for the people who are here in person.
00:00:41.380 | So if you'd like me to slow down, speed up at any point,
00:00:45.180 | need time for clarifications, feel free to ask.
00:00:47.100 | It's really meant for you first here.
00:00:49.860 | And I really would like it to be sort of an interactive session as well.
00:00:52.780 | All right, so these are the topics we'll be covering today.
00:00:57.260 | Going through first of all, why Python as a language?
00:00:59.900 | Why have we chosen it for sort of this course?
00:01:01.580 | And in general, why do people prefer it to some extent for
00:01:04.140 | machine learning and natural language processing?
00:01:06.900 | Some basics of the language itself, common data structures.
00:01:09.940 | And then getting to sort of the meat of it through NumPy,
00:01:13.460 | which as I mentioned you'll be extensively using in your homeworks going forward.
00:01:16.140 | And then some practical tips about how to use things in Python.
00:01:19.780 | All right, so first thing, why Python?
00:01:23.580 | So a lot of you who might have been first introduced to programming,
00:01:28.060 | might have done Java before.
00:01:29.700 | A lot of people use MATLAB in other fields as well.
00:01:34.260 | So why Python?
00:01:35.700 | Python is generally used, for one, because it's a very high-level language.
00:01:39.460 | It can look very, very English-like, and so it's really easy to work with for
00:01:42.860 | people, especially when they get started out.
00:01:44.700 | It has a lot of scientific computational functionality as well,
00:01:48.060 | similar to MATLAB.
00:01:48.900 | So when we talk about NumPy,
00:01:49.780 | you'll see that it has a lot of very,
00:01:51.580 | very quick and efficient operations involving math or matrices.
00:01:55.460 | And that's very, very useful in applications such as deep learning.
00:01:59.420 | And for deep learning in particular, a lot of frameworks that people use,
00:02:02.420 | particularly for example, PyTorch and TensorFlow, interface directly with Python.
00:02:06.620 | And so for those main reasons,
00:02:08.140 | people generally tend to use Python within deep learning.
00:02:10.700 | Okay, so the setup information is in the slides if you'd like to look at them
00:02:15.900 | offline.
00:02:16.860 | I will be sort of jumping over that for
00:02:18.860 | now because I wanna sort of get to the introduction to the language itself.
00:02:22.580 | And if we have time, come back to sort of the setup information.
00:02:25.300 | A lot of it's pretty direct.
00:02:26.460 | You can walk through it.
00:02:27.500 | It gives you steps for how to install packages,
00:02:30.780 | explains what a conda environment is, for example,
00:02:33.780 | and gets you set up with your first working Python environment, so
00:02:36.500 | you can sort of run simple and basic commands to get used to the language.
00:02:40.100 | But for now, I'm gonna be skipping over this and
00:02:41.620 | coming back to it if we have time.
00:02:42.780 | All right, language basics.
00:02:46.660 | So in Python, you have variables, and
00:02:49.860 | these variables can take on multiple values.
00:02:52.220 | The assignment operation, or the equal sign,
00:02:54.780 | will allow you to assign a particular value to a variable.
00:02:57.660 | A nice thing with Python is you don't have to declare the type of the variable to
00:03:01.260 | begin with, and then only assign values of that type.
00:03:05.460 | So for example, in certain languages,
00:03:07.340 | we first say that this variable, x, is only gonna be of type int.
00:03:11.420 | And any value aside from that assigned to it will throw an error.
00:03:14.820 | Python's pretty flexible.
00:03:15.660 | So if I want to, I can reassign, I can start with x is equal to 10.
00:03:19.020 | And then later on, like five lines later,
00:03:20.620 | I can say x is equal to "hi" as a string, and there would be no issue.
00:03:24.500 | You can do simple mathematical operations, such as the plus and division signs.
00:03:31.020 | You can do exponentiation, which is raising one value to another value.
00:03:35.540 | So x to the power of y, for example, using the double asterisk.
00:03:38.580 | You can do type casting for float division.
00:03:42.220 | So if you wanna ensure that dividing your values
00:03:44.820 | results in a float value and not just an integer division,
00:03:47.100 | you can cast to different types like float.
00:03:49.660 | If you want something to be specifically an int, you can also just put int
00:03:52.860 | instead of float, with parentheses around the result, and
00:03:56.020 | that'll give you an integer value.
00:03:57.980 | And then you can also do type casting to, for
00:04:01.060 | example, convert from integers to strings.
00:04:03.540 | So in this case, if I wanted to, instead of doing 10 plus 3 as
00:04:07.780 | a mathematical operation, I just wanted to write out 10 plus 3.
00:04:11.140 | Then I can convert the x and y values, for example, to strings, and
00:04:15.420 | then add the plus sign as a character as well to create a string.
00:04:20.260 | And so a lot of these common operations you can look online as well.
00:04:22.620 | People have lists for them, and just see how they're sort of done in Python.
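The operations described so far can be sketched as follows. This is a minimal illustration assuming Python 3; the variable names (`total`, `power`, `as_int`, `text`) are illustrative and not taken from the slides.

```python
# A variable needs no declared type and can be rebound to a different type later.
x = 10
x = "hi"          # no error: x now refers to a string

# Basic arithmetic
x, y = 10, 3
total = x + y     # addition
power = x ** y    # exponentiation: x to the power of y

# In Python 3, / is always true division and returns a float.
quotient = x / y
# An explicit cast also works (it was required in Python 2 to avoid integer division).
as_float = float(x) / y
# int() drops the fractional part.
as_int = int(x / y)

# Casting numbers to strings lets us write the expression out instead of evaluating it.
text = str(x) + " + " + str(y)

print(total, power, as_int, text)
```

Note that `x + " + "` without the `str()` casts would raise a `TypeError`, because Python does not implicitly convert integers to strings.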
00:04:26.340 | All right, some other quick things.
00:04:30.100 | So Boolean values, True and False, are always written with a capital first letter.
00:04:34.220 | In some other languages, they might be lowercase, so just one thing to know.
00:04:37.580 | Python also doesn't have a null value.
00:04:39.380 | The equivalent of a null value is None.
00:04:42.060 | So sometimes you want to return None,
00:04:45.780 | to say that I'm not really returning anything here.
00:04:47.620 | Or you wanna do checks, for example, in if statements,
00:04:50.980 | to say that this doesn't have a value; then you can assign it None.
00:04:55.700 | So None functions as a null equivalent:
00:04:59.060 | you're not really returning anything, it doesn't have a value.
00:05:01.980 | It's not the same as zero.
00:05:03.820 | And another nice thing about Python is lists, which are mutable,
00:05:09.100 | we'll come to that a little bit later, but mutable lists of objects.
00:05:13.260 | Mutable means that you can change them, and they can hold any type.
00:05:16.540 | So you can have a mixture of integers, None values, strings, etc.
00:05:22.580 | And yeah, functions can return the None value as well.
00:05:24.460 | And another quick thing, instead of using the double ampersand (&&)
00:05:29.460 | as people might do in some other languages, with Python,
00:05:33.100 | as I mentioned earlier, it's very English-like.
00:05:34.860 | So you can actually just write out if x is equal to 3 and, "and" in English,
00:05:40.460 | y is equal to 4, then return True or something.
00:05:42.940 | It's quite nice that way, so you can use and, or, and not.
00:05:47.460 | And then the comparison operators, double equals (==) and
00:05:50.860 | not equals (!=), will check for equality and inequality.
00:05:54.660 | This one's pretty standard, I feel, across many languages, and
00:05:57.020 | you can use them in Python as well.
00:05:58.620 | And yeah, remember, just a quick thing, the double equals sign is different from
00:06:02.500 | the assignment operator.
00:06:03.500 | This one checks for equality, that one is just assigning a value.
00:06:06.820 | So single equal sign versus two of them.
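A short sketch of the Boolean, None, and comparison behaviour just described. The helper `find_first_even` is a hypothetical function written for illustration only.

```python
def find_first_even(values):
    """Return the first even number in values, or None if there is none."""
    for v in values:
        if v % 2 == 0:
            return v
    return None  # explicit here; a function without a return statement also returns None


x, y = 3, 4                  # single = assigns
both = x == 3 and y == 4     # double == compares; `and` is written out in English
either = x == 0 or y == 4
negated = not (x != 3)

result = find_first_even([1, 3, 5])
is_missing = result is None  # the usual way to check for None

zero = 0
zero_is_none = zero is None  # None is not the same as zero

print(both, either, negated, is_missing, zero_is_none)
```

Checking with `is None` rather than `== None` is the conventional idiom, since None is a single shared object.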
00:06:08.580 | All right, and then also in Python, you don't use braces to mark blocks.
00:06:13.300 | In Python, you use indentation, basically spaces or tabs.
00:06:17.940 | So indents of either 2 or 4 spaces break up what is contained within
00:06:22.900 | a function or contained within, like, an if statement, a for statement, or
00:06:26.540 | any loops, for example.
00:06:28.140 | And so the main thing is you can choose whether to do 2 or 4.
00:06:31.020 | You just have to be consistent within each block, and ideally throughout your entire code base,
00:06:35.260 | otherwise Python will throw an error.
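To illustrate how indentation marks blocks, here is a small sketch using four-space indents. The function `sign` is a hypothetical example, not something from the slides.

```python
def sign(n):
    # Everything indented one level belongs to the function body.
    if n > 0:
        # One further level of indentation belongs to the if block.
        return "positive"
    elif n < 0:
        return "negative"
    # Back at the function's level: this runs only when neither branch returned.
    return "zero"


print(sign(5), sign(-2), sign(0))
```

If a line inside one of these blocks were indented by a different amount than its neighbours, Python would refuse to run the file and report an `IndentationError`.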
00:06:36.860 | Now we'll go to some common data structures, and for
00:06:39.900 | this we'll transition to the Colab.
00:06:41.700 | So this one will sort of show you in real time.
00:06:46.540 | This is, by the way, a Colab.
00:06:48.100 | A Colab is basically a Jupyter Notebook, for
00:06:50.700 | those of you who are familiar with those, that you can use and
00:06:53.820 | it's hosted on Google servers.
00:06:56.140 | The really nice thing about Jupyter Notebooks is you don't have to run an entire
00:06:59.500 | file all together, you can run it step by step in what are called cells.
00:07:04.500 | So if you want to see like an intermediate output,
00:07:06.020 | you can see that pretty easily.
00:07:08.140 | You can also write, for example, a lot of descriptions
00:07:12.340 | pertaining to cells, which is really, really nice to have as well.
00:07:15.820 | So a lot of people tend to use these when they're sort of starting off their project
00:07:18.660 | and want to debug things.
00:07:20.020 | And Colab allows you to use these Jupyter Notebook type applications,
00:07:25.980 | hosted on their servers for free, basically.
00:07:27.860 | So anyone can create one of these and run their code.
00:07:30.740 | All right, so lists are mutable arrays.
00:07:35.260 | Mutable means that you can change them, so that once you declare them,
00:07:38.340 | you can add to them, you can delete from them, and they're optimized for that purpose.
00:07:41.700 | So they expect to be changed very often.
00:07:44.220 | We'll come to what are called NumPy arrays later, and
00:07:46.020 | those tend to be pretty much fixed in size.
00:07:48.340 | When you add to one, you basically have to create a new array,
00:07:52.180 | which will have the additional information.
00:07:53.820 | So a list is highly optimized for changing things.
00:07:55.820 | So if you know, for example, and you're in a loop,
00:07:58.300 | you're adding different elements to, let's say, a bigger entity,
00:08:02.100 | you'd want to use something like a list, because you're going to be changing that
00:08:04.700 | very often.
00:08:05.980 | So let's see how they work.
00:08:07.500 | So we start off with a names list with Zack and Jay.
00:08:10.260 | You can index into the list.
00:08:15.220 | Indexing into the list by index means that you can pick out
00:08:19.460 | the elements in the list, depending on what's called the index.
00:08:22.380 | So it's what place that value is at within the list.
00:08:25.660 | So zero refers to the first element; Python is what's called zero-indexed,
00:08:29.420 | which means it starts with zero, and then it goes to one.
00:08:31.940 | So here, zero will be Zack.
00:08:33.500 | And then let's say I want to append something to the end.
00:08:37.260 | So to add something to the end of the list, the term is append, not add.
00:08:42.300 | And so if I append, the list is changed in place:
00:08:46.700 | it's the original list itself with the added last element.
00:08:50.340 | And what would currently be the length of this?
00:08:52.220 | It would be three, because you have three elements.
00:08:54.500 | And you can just quickly get that by using the len function, not length,
00:08:57.340 | just three letters, len.
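A minimal sketch of indexing, appending, and `len`. The appended name "Richard" is an assumption chosen for illustration, since the transcript does not say which name is added.

```python
names = ["Zack", "Jay"]

first = names[0]          # lists are zero-indexed, so index 0 is the first element
second = names[1]

names.append("Richard")   # append modifies the list in place and returns None
count = len(names)        # the built-in is len, not length

print(first, second, names, count)
```

Because `append` works in place, writing `names = names.append("Richard")` would be a mistake: it would rebind `names` to None.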
00:08:58.820 | All right, it's also really nice because Python has overloaded
00:09:04.540 | the plus operation to be able to concatenate lists.
00:09:09.380 | So here, I have a separate list, right?
00:09:12.060 | And all you need for a list definition is just brackets.
00:09:13.940 | So this is a separate list altogether,
00:09:15.700 | even though I haven't saved it in the variable, just Abhi and Kevin.
00:09:19.420 | And I can just do a plus equal to,
00:09:21.220 | which means that names is equal to names plus Abhi and Kevin.
00:09:24.100 | And this should output this full list.
00:09:26.940 | You can create lists by just putting the plain brackets or an existing list.
00:09:32.580 | And then as I mentioned earlier,
00:09:33.700 | your list can have a variety of types within them.
00:09:36.220 | So here, this list contains an integer value, a list value.
00:09:39.340 | So you can have a list of lists, as many sort of sublists as you'd like,
00:09:43.100 | a float value and a None value.
00:09:45.620 | And this is completely valid within Python.
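The concatenation and mixed-type points can be sketched like this. The starting contents of `names` and the values in `mixed` are assumptions for illustration.

```python
names = ["Zack", "Jay", "Richard"]

# + is overloaded for lists, so += extends names with another list.
names += ["Abhi", "Kevin"]

# An empty list is just a pair of square brackets.
empty = []

# A single list can hold values of different types, including other lists and None.
mixed = [1, [2, 3], 4.5, None]
element_types = [type(item).__name__ for item in mixed]

print(names)
print(element_types)
```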
00:09:47.340 | Slicing refers to how you can access only parts of the list.
00:09:52.500 | So if I only want, for example,
00:09:54.820 | in this numbers array, I only want 0, 1, 2.
00:09:59.140 | Slicing is a way that you can extract only those parts.
00:10:01.940 | So the way slicing works is,
00:10:03.480 | the start index is included and the end index is excluded.
00:10:06.980 | So here, I count 0, 1, 2, 3.
00:10:10.580 | So index 3 is not included and so 0, 1, 2 will be printed out.
00:10:14.420 | There are also shorthands.
00:10:15.980 | So if you know that you're going to be starting with the first element of the array,
00:10:19.660 | say I want 0, 1, 2 and it starts with 0,
00:10:22.780 | then you don't even need to include the first index.
00:10:25.140 | You can just leave that out and include the last index, which would be excluded.
00:10:28.700 | So that would be blank,
00:10:30.780 | colon 3, and same deal with the end.
00:10:34.060 | If you know that you want to take everything,
00:10:36.420 | let's say from 5 and 6 till the end of the array,
00:10:40.420 | you can start with the index you'd like.
00:10:42.140 | So count 0, 1, 2, 3, 4,
00:10:43.980 | 5, start at 5, and leave the end blank.
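A quick sketch of the slicing rules above. The contents of `numbers` are inferred from the values mentioned in the talk (0 through 6) and should be treated as an assumption.

```python
numbers = [0, 1, 2, 3, 4, 5, 6]

first_three = numbers[0:3]   # start index included, end index excluded
same_thing = numbers[:3]     # omitting the start means "from the beginning"
from_five = numbers[5:]      # omitting the end means "through the last element"

print(first_three, same_thing, from_five)
```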
00:10:46.540 | Fun fact, so this colon,
00:10:51.580 | when you take just the colon,
00:10:52.820 | it'll take everything in the list but it'll also create a duplicate in memory.
00:10:55.820 | That's a very subtle,
00:10:57.820 | very useful thing to know because sometimes when you pass lists around
00:11:03.700 | in Python, which is out of scope of this tutorial,
00:11:06.500 | you only pass a reference to the list.
00:11:08.140 | So if you change the list through that reference, the original gets changed.
00:11:10.380 | The colon will create an entirely separate copy in memory of the exact same list.
00:11:13.800 | So if you make any changes to it,
00:11:15.740 | it won't affect your original array.
00:11:17.300 | So this is a pretty neat way to do that.
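The difference between a reference and a slice copy can be sketched as follows. The names `original`, `alias`, and `copied` are illustrative.

```python
original = [0, 1, 2]

alias = original       # a second name for the SAME list object
copied = original[:]   # a bare colon slice builds a new list with the same elements

alias.append(3)        # visible through `original`, because both names share one list
copied.append(99)      # only the copy changes

print(original, alias, copied)
print(alias is original, copied is original)
```

One caveat worth knowing: the slice makes a shallow copy, so nested lists inside it would still be shared between the original and the copy.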
00:11:19.620 | Then another fun thing that Python has which is pretty unique,
00:11:22.780 | is you can index negatively.
00:11:24.500 | So negative indexing means you index from the back of the array.
00:11:28.060 | So minus 1 refers to the last element of the array,
00:11:31.460 | minus 3 will refer to the third last element.
00:11:35.180 | So what minus 1 will give you will be 6 in this case,
00:11:38.380 | and what minus 3 colon will give you will be everything from the third last element on,
00:11:40.540 | because you're starting with the minus 3 element.
00:11:42.460 | So minus 1, minus 2,
00:11:43.880 | minus 3 till the end.
00:11:46.240 | Then this one seems kind of confusing, right?
00:11:48.760 | 3 to minus 2.
00:11:49.820 | What this will do is: you count 0, 1, 2, 3,
00:11:52.500 | so you start at index 3, and then from the back, minus 1, minus 2.
00:11:55.740 | So you leave off the last two,
00:11:57.460 | because the end index is excluded within lists.
00:12:00.740 | You'd only get 3 and 4.
00:12:02.620 | That's what this is. Okay. That's about lists.
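Negative indexing can be sketched with the same list. As before, the contents of `numbers` are an assumption consistent with the results quoted in the talk.

```python
numbers = [0, 1, 2, 3, 4, 5, 6]

last = numbers[-1]            # -1 is the last element
third_last = numbers[-3]      # -3 is the third element from the end
last_three = numbers[-3:]     # from the third-last element through the end
middle = numbers[3:-2]        # from index 3 up to, but not including, index -2

print(last, third_last, last_three, middle)
```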
00:12:06.900 | Tuples are immutable arrays.
00:12:09.500 | So once you declare the values of these, they cannot be changed.
00:12:12.420 | So I start with, you know,
00:12:13.620 | we started with like the list of Zack and Jay.
00:12:16.100 | Tuples, you start with Zack and Jay.
00:12:18.620 | You can still access them.
00:12:20.700 | You know, I can still print out names 0,
00:12:22.940 | same as I did with lists.
00:12:24.180 | But if I try to change it,
00:12:25.940 | in this case, it'll throw an error.
00:12:27.780 | So tuples, once you've instantiated them,
00:12:29.580 | they cannot be changed.
00:12:31.300 | To create an empty tuple,
00:12:32.860 | you can either use the tuple function,
00:12:35.580 | or oftentimes you can just use parentheses.
00:12:38.700 | So you can just say, for example,
00:12:40.180 | as you did here, just parentheses to instantiate something.
00:12:44.500 | All right. And yeah, this one,
00:12:47.900 | we'll come to a little bit later in shapes.
00:12:49.980 | But you can also have a tuple of a single value.
00:12:52.420 | And all you have to do there is just put the value and put a comma.
00:12:55.620 | So that just shows that you have a tuple,
00:12:57.420 | which is like an immutable array.
00:12:59.580 | So you can't change it. It's a list,
00:13:01.220 | but only of one item. And that's here.
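A minimal sketch of tuple behaviour. The replacement value "Richard" and the number 10 are illustrative choices.

```python
names = ("Zack", "Jay")
first = names[0]              # reading by index works just like a list

try:
    names[0] = "Richard"      # tuples are immutable, so assignment fails
except TypeError as err:
    error_message = str(err)

empty_a = tuple()             # two equivalent ways to make an empty tuple
empty_b = ()

single = (10,)                # the trailing comma makes a one-element tuple
not_a_tuple = (10)            # without the comma, this is just the integer 10

print(first, error_message)
print(type(single).__name__, type(not_a_tuple).__name__)
```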
00:13:04.100 | Okay. I'll quickly move to dictionaries.
00:13:06.620 | For those of you who might be familiar with other languages,
00:13:09.620 | this is the equivalent of a hash map or hash table.
00:13:13.140 | What this is useful for essentially is mapping
00:13:15.340 | one value to another in a really, really quick way.
00:13:18.060 | So if I want to map, for example,
00:13:19.420 | a string to an index,
00:13:20.940 | which you will happen to do a lot of in your homeworks,
00:13:23.620 | this is a really, really useful way to do that.
00:13:25.940 | And so what you do is instantiate this dictionary.
00:13:29.340 | And it says that
00:13:31.060 | Zack is going to correspond to this string value, whatever it is.
00:13:34.340 | And so anytime I want to retrieve the string value,
00:13:37.580 | I just use this dictionary.
00:13:39.500 | I index by it,
00:13:40.940 | which is what I do here,
00:13:42.180 | and then it outputs the corresponding value.
00:13:44.300 | And it does that really, really quickly.
00:13:46.740 | And yeah, so it's really useful, very, very commonly used.
00:13:51.300 | Especially when you sort of, for example,
00:13:52.860 | you have like a list of strings or a list of items,
00:13:55.620 | and you want to have a corresponding index for them.
00:13:58.940 | Because as you'll see in NLP,
00:14:01.020 | oftentimes you're working with
00:14:02.740 | indices and numbers in particular.
00:14:04.620 | So it's a really great way to sort of move from like
00:14:08.100 | string formats to just like numerical index values.
00:14:11.860 | There's some other things you can do for dictionaries.
00:14:14.340 | You can check whether certain elements are in there.
00:14:16.300 | So if you, for example, try to index phone book with Monty,
00:14:20.100 | it'll throw an error because there's no key that
00:14:22.180 | says Monty in that phone book dictionary.
00:14:24.260 | And so sometimes you might be wanting to do checks before you extract a value.
00:14:28.100 | And so this will just check, for example,
00:14:29.740 | if I do print Monty in phone book,
00:14:31.380 | it should say False, or for example here Kevin in phone book,
00:14:33.820 | it should say False.
00:14:34.780 | While for something that's actually in that dictionary,
00:14:37.100 | Zach, it will be True.
00:14:39.140 | Okay. And then if you'd like to delete an entry
00:14:42.700 | from the dictionary, you can just do that using the del statement.
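The dictionary operations above can be sketched like this. The phone numbers and the vocabulary words are made-up placeholder values, and `word_to_index` is a hypothetical name for the string-to-index mapping the talk mentions.

```python
phone_book = {"Zack": "12-37", "Jay": "34-23"}   # placeholder numbers

zack_number = phone_book["Zack"]        # look up a value by its key

has_monty = "Monty" in phone_book       # membership test checks the keys
has_zack = "Zack" in phone_book

try:
    phone_book["Monty"]                 # indexing with a missing key raises KeyError
except KeyError:
    lookup_failed = True

del phone_book["Jay"]                   # remove an entry

# A common NLP pattern: map each string to an integer index.
vocabulary = ["the", "cat", "sat"]
word_to_index = {word: i for i, word in enumerate(vocabulary)}

print(zack_number, has_monty, has_zack, phone_book, word_to_index)
```

When a missing key should not be an error, `phone_book.get("Monty")` returns None instead of raising.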
00:14:46.820 | All right. Let's move to loops quickly.
00:14:51.500 | So loops are a really great way to repeat
00:14:56.260 | the same kind of operation.
00:14:58.220 | It's also a great way to
00:15:00.540 | sequentially go over
00:15:03.180 | those list-type or array-type objects we were talking about earlier.
00:15:06.020 | You know, you have like a list of names, right?
00:15:08.300 | How do you access all of them?
00:15:10.260 | So loops are really a great way to do that.
00:15:12.220 | In Python,
00:15:13.740 | they've abstracted away a lot of the parts
00:15:15.580 | that might be confusing in other languages.
00:15:18.020 | You can, for example,
00:15:19.540 | first loop over numbers.
00:15:21.420 | So what you do is you have a range function that you call.
00:15:24.420 | So here you say range,
00:15:26.340 | and you pass the number to stop at, which is itself excluded.
00:15:28.460 | So what this range function will return is 0, 1, 2, 3, 4,
00:15:31.860 | and each of those in turn will be stored in this i value.
00:15:33.860 | And here it's just printing out that i value.
00:15:36.140 | So if I want to, for example,
00:15:37.720 | loop over the indices of a list of size 10,
00:15:40.540 | I just have to do for i in range 10,
00:15:42.660 | and then index that corresponding part of the list.
00:15:45.720 | You technically don't even have to do that because in Python,
00:15:48.180 | you can just directly get the element of the list.
00:15:51.020 | So here I have a list of
00:15:52.900 | names where I have Zach,
00:15:55.060 | Jay, and Richard. Instead of first taking the length of the list,
00:15:58.900 | and then doing this range operation,
00:16:00.740 | I can just directly say for name in names,
00:16:03.460 | and then print out the name,
00:16:04.580 | and it will just directly get each element in the list.
00:16:07.540 | But sometimes you might want both.
00:16:09.780 | You might want both this element, Zach,
00:16:12.940 | as well as its position in the array.
00:16:14.760 | And for that, you can actually use
00:16:16.440 | this really helpful function called enumerate.
00:16:18.540 | And so enumerate will basically pair those two values,
00:16:21.820 | and it'll give you
00:16:23.020 | both the value, which is here in name for example,
00:16:25.460 | and its corresponding index within the array,
00:16:28.060 | both together. So that's really, really convenient.
00:16:30.860 | Versus for example, having to do this like
00:16:32.620 | a little bit more complicated range operation,
00:16:34.620 | where you first take the range and then you index the list.
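The three looping styles just described, sketched side by side. Results are collected into lists (`indices`, `by_index`, `direct`, `paired`) purely so they are easy to inspect; in the session they would simply be printed.

```python
# range(5) yields 0, 1, 2, 3, 4: the stop value itself is excluded.
indices = []
for i in range(5):
    indices.append(i)

names = ["Zach", "Jay", "Richard"]

# Style 1: loop over positions, then index into the list.
by_index = []
for i in range(len(names)):
    by_index.append(names[i])

# Style 2: loop over the elements directly.
direct = []
for name in names:
    direct.append(name)

# Style 3: enumerate gives the position and the element together.
paired = []
for i, name in enumerate(names):
    paired.append((i, name))

print(indices)
print(by_index, direct)
print(paired)
```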
00:16:37.980 | How do you iterate over a dictionary?
00:16:39.900 | So for dictionaries,
00:16:41.740 | if you want to
00:16:43.300 | iterate over what's called the keys,
00:16:45.140 | so all of these first items that you first,
00:16:47.300 | you know, put into the dictionary,
00:16:49.140 | you can just iterate the same way you would a list.
00:16:51.140 | You just say for name in, for example,
00:16:52.860 | phone book, and you can output the keys.
00:16:54.940 | If you want to iterate over what is stored in the dictionary,
00:16:58.020 | which is called a value,
00:16:59.460 | you'd have to use the dictionary dot values.
00:17:02.720 | And if you want both,
00:17:03.940 | you use the dot items function.
00:17:05.780 | And so that will print out both of these.
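Iterating over a dictionary can be sketched as follows, again with placeholder phone numbers. The ordering shown relies on dictionaries preserving insertion order, which holds in Python 3.7 and later.

```python
phone_book = {"Zach": "12-37", "Jay": "34-23"}   # placeholder numbers

keys = []
for name in phone_book:                 # iterating a dict directly yields its keys
    keys.append(name)

values = []
for number in phone_book.values():      # .values() yields the stored values
    values.append(number)

pairs = []
for name, number in phone_book.items(): # .items() yields (key, value) pairs
    pairs.append((name, number))

print(keys, values, pairs)
```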
00:17:08.340 | All right. So this is sort of covering
00:17:13.460 | the overarching, most commonly used structures,
00:17:17.300 | lists, dictionaries, and then loops,
00:17:20.140 | and how to efficiently use them within your code.
00:17:22.820 | We'll quickly be moving to the meat of what
00:17:26.540 | is really, really strong about Python,
00:17:28.500 | and what you'll be using a lot for your coming homework,
00:17:30.580 | especially homework two, which is NumPy.
00:17:34.020 | Okay. So for NumPy also I'm going to be going to the Colab,
00:17:38.140 | but just quickly wanted to mention what NumPy is.
00:17:41.100 | So NumPy is basically an optimized library
00:17:44.020 | for mathematical operations.
00:17:46.100 | You know, people tend to like MATLAB because it's very,
00:17:47.900 | very useful for these mathematical operations,
00:17:49.780 | which people use in their research.
00:17:51.500 | Python's solution to that is to have
00:17:54.420 | a separate library entirely where they make use of
00:17:58.420 | subroutines, which are sort of like
00:18:01.380 | sub-scripts that are written in a different language, C or C++,
00:18:05.100 | and that are highly optimized for efficiency.
00:18:08.300 | So the reason C and C++ are much faster than Python is
00:18:11.500 | because they're closer to what's called machine language,
00:18:13.340 | which is what the computer will read.
00:18:14.940 | I mentioned earlier, one of the nice things about Python is it's kind of high level.
00:18:18.260 | It looks like English, right?
00:18:19.540 | You know, we literally say,
00:18:21.020 | you know, if x is equal to one or x is equal to two, right?
00:18:24.780 | But that also means that there's a lot more translation required on
00:18:28.120 | the computer's part before it understands what you mean.
00:18:31.220 | And that's useful when we're writing out code that we want to understand,
00:18:34.900 | but it's a little bit less useful when you're sort of
00:18:36.660 | running a lot of operations on a lot of data.
00:18:39.620 | So the real benefit of something like NumPy is that if you have
00:18:43.260 | your memory and your data in a particular format,
00:18:45.660 | it'll call these C scripts, or what are called
00:18:48.380 | subroutines, in a different language, and that makes operations very, very fast.
00:18:51.620 | And so that's the real benefit of using NumPy.
00:18:53.300 | And almost everyone
00:18:55.260 | in NLP is very,
00:18:57.100 | very familiar with this because you'll be running a lot of operations on,
00:19:00.060 | for example, like co-occurrence matrices,
00:19:01.680 | which are really, really big, and
00:19:03.420 | it's very useful to have them optimized for time.
00:19:05.720 | So that's really the benefit of using NumPy.
00:19:08.220 | And NumPy is basically involved in all these math, matrix, and vector calculations.
00:19:13.300 | And a NumPy array is different from a list.
00:19:14.740 | Although you can easily translate between a list and a NumPy array,
00:19:17.740 | NumPy arrays are specifically, as I mentioned,
00:19:20.020 | designed to be used in these subroutines.
00:19:22.620 | So they have a specific format,
00:19:23.820 | they're instantiated differently, um,
00:19:25.820 | and you can translate between this and sort of your standard list easily.
00:19:29.420 | But know that NumPy operations are meant for NumPy arrays.
00:19:32.240 | You can't do array-style math on plain lists directly.
00:19:34.740 | You'd first have to like convert them, which is really simple.
00:19:37.140 | You just use this NumPy dot array function.
00:19:39.420 | But just know that the operations are designed to work on NumPy arrays.
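A small sketch of converting between a list and a NumPy array, assuming NumPy is installed and imported under its conventional alias `np`. The variable names are illustrative.

```python
import numpy as np

values = [1, 2, 3]

# On a plain list, * means repetition, not elementwise multiplication.
doubled_list = values * 2

# Converting to a NumPy array gives elementwise math.
arr = np.array(values)
doubled_arr = arr * 2

# And tolist() converts back to a standard Python list.
back_to_list = arr.tolist()

print(doubled_list)
print(doubled_arr)
print(back_to_list)
```

This is the practical reason for converting first: the same operator means something different on a list than on an array.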
00:19:42.460 | Okay. So for NumPy, we're gonna be going back to the Colab.
00:19:46.740 | And then, as I mentioned earlier,
00:19:48.960 | the real strength of NumPy is, you know,
00:19:50.500 | it supports these large multi-dimensional arrays and matrices
00:19:53.740 | for very, very optimized high-level mathematical functions.
00:19:57.420 | And just to step back for a quick second, what is a matrix?
00:20:01.540 | Matrices are basically rectangular
00:20:04.660 | structures of numbers, and you can treat them with
00:20:08.300 | specific rules for operations between different kinds of things.
00:20:12.220 | So if you have like a lot of data, instead of, you know,
00:20:15.220 | individually potentially multiplying things,
00:20:17.020 | if you can store them in this rectangular format,
00:20:19.500 | you have specific rules about how this matrix,
00:20:22.100 | for example, will interact with a different one.
00:20:23.940 | And by doing that, which is matrix multiplication or matrix math,
00:20:27.300 | you can do a wide variety of mathematical operations.
00:20:31.420 | A vector is generally, and this is conventional,
00:20:34.120 | none of these are hard and fast rules,
00:20:35.580 | but conventionally, a vector is
00:20:37.620 | a matrix in one dimension.
00:20:39.500 | So it's usually like a row vector or a column vector,
00:20:42.560 | which usually just means that it's a list
00:20:45.100 | of values in only one dimension.
00:20:47.320 | So it's like, for example,
00:20:48.420 | here, when I come down to x is equal to NumPy array of 1, 2, 3,
00:20:53.240 | that's a list in only one dimension versus, for example,
00:20:57.160 | z, this is z down here,
00:20:59.680 | that is what's called a two-dimensional array because you have both rows,
00:21:04.480 | for example, 6, 7,
00:21:06.440 | and then 8, 9,
00:21:08.900 | and columns, versus in this first one,
00:21:11.080 | you only have three values in one dimension.
00:21:13.940 | So that's sort of the conventional difference between the two.
00:21:16.560 | Another convention is matrices generally refer to two-dimensional objects.
00:21:19.800 | So this, as I mentioned, is like z, this is two-dimensional.
00:21:23.000 | You might have heard the word tensor also.
00:21:25.460 | Tensors by convention usually are higher-dimensional objects.
00:21:28.840 | So instead of having two dimensions,
00:21:30.160 | you know, a shape of 2, 2,
00:21:31.460 | you can have n dimensions.
00:21:33.240 | You can have a shape of 2, 2, 2, 2, 2, 2,
00:21:36.240 | for five or six dimensions.
00:21:38.080 | And those are very valid to do mathematical operations on,
00:21:41.360 | and those are often colloquially called tensors.
00:21:44.760 | In addition, and this will be covered in the next tutorial on PyTorch,
00:21:49.240 | those larger tensors are also optimized for efficiency
00:21:54.280 | to be used on GPUs.
00:21:55.960 | And so they're called tensors in a more concrete way because you're using
00:21:59.280 | these tensors with PyTorch and other sort of packages
00:22:02.160 | to directly do those quicker GPU operations for deep learning.
00:22:05.920 | So that's a quick terminology difference between the three.
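The vector, matrix, and tensor terminology can be made concrete with NumPy's `ndim` and `shape` attributes. This is a sketch assuming NumPy is available; the five-dimensional array of zeros is an arbitrary illustration.

```python
import numpy as np

vector = np.array([1, 2, 3])             # one dimension
matrix = np.array([[6, 7], [8, 9]])      # two dimensions: rows and columns
tensor = np.zeros((2, 2, 2, 2, 2))       # five dimensions, each of size 2

print(vector.ndim, vector.shape)
print(matrix.ndim, matrix.shape)
print(tensor.ndim, tensor.shape)
```

In NumPy all three are the same type, `ndarray`; the different names are only a convention about how many dimensions the array has.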
00:22:10.160 | Okay. So now,
00:22:12.400 | let's start off with just some quick
00:22:14.200 | representations of how these matrices and vectors are represented in NumPy.
00:22:17.920 | This goes back to your question about
00:22:20.960 | the difference between a shape of three comma, (3,), versus one comma three, (1, 3).
00:22:25.400 | So three comma in NumPy arrays usually just means that you have
00:22:29.840 | one list of, like, one, two, three, for example,
00:22:33.240 | there are three values, versus if you add another list on top of that,
00:22:37.560 | this one comma three essentially refers to the fact that there's a list of lists.
00:22:42.160 | So anytime you have two dimensions,
00:22:44.140 | it always means that there's a list of lists,
00:22:47.000 | each inner list being, for example, a row.
00:22:49.760 | So here, one comma three means that there's one row and then three columns.
00:22:54.200 | So it's saying there's one row of three comma four comma five essentially,
00:22:58.320 | and then each of those is like a column separately.
00:23:01.400 | You can easily reshape them.
00:23:03.280 | So these are basically the same format,
00:23:05.320 | but from NumPy's perspective,
00:23:07.360 | you'll see a little bit later for operations such as broadcasting,
00:23:10.400 | you need to have it for example sometimes in this one comma three format or three comma one format.
00:23:15.120 | Um, and also like what- like as I said,
00:23:18.120 | three is just like it represents three numbers.
00:23:20.280 | One comma three means like one row of three elements.
00:23:23.480 | Three comma one will mean you have essentially in each column,
00:23:27.280 | you'll have a separate array.
00:23:29.000 | So you'll see sort of boxes around each of them.
00:23:30.960 | I'll- there's an example that comes a little bit later in
00:23:32.880 | this colab which will make it a little bit clearer.
00:23:34.880 | So here, if you can see the difference between like x and y,
00:23:37.880 | one of them has only one bracket which just says it's one list,
00:23:41.560 | only one list of one comma two comma three.
00:23:44.720 | The second one is two brackets which says it's a list with only one list in it.
00:23:49.200 | It's a list of a list.
00:23:50.680 | That's really the main difference between like these sort of two representations.
00:23:54.380 | So I could have like, let's say like a separate one.
00:23:59.880 | I'm going to call this A, and I just do this.
00:24:03.480 | So it's the same sort of elements,
00:24:06.720 | but this will be one comma three because it's showing that there's
00:24:09.840 | one outer list which shows the rows,
00:24:12.680 | and then one inner list which will have each of those values.
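The bracket-counting point can be checked directly. This is a minimal sketch; the names `x`, `y`, and `col` are chosen for illustration and may not match the Colab exactly.

```python
import numpy as np

x = np.array([1, 2, 3])      # one pair of brackets: a single list
y = np.array([[1, 2, 3]])    # two pairs of brackets: a list holding one list

print(x.shape)  # (3,)
print(y.shape)  # (1, 3)

# The same three values reshaped into a column: three rows, one value each.
col = x.reshape((3, 1))
print(col.shape)  # (3, 1)
```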
00:24:16.000 | So the benefit will come when I get to broadcasting a little bit later.
00:24:20.240 | And so it essentially will help you determine what dimensions you want to match against.
00:24:24.560 | Because sometimes you'd want to have one comma three,
00:24:27.720 | like 1, 2, 3 applied only to rows in some other matrix.
00:24:32.740 | We'll, we'll come to that a little bit later.
00:24:33.840 | Uh, but sometimes you might want to have it only applied to columns.
00:24:37.020 | And so, like if I have a separate matrix for example of 0, 0, 0, 0, 0, 0, 0, 0,
00:24:42.340 | and I want the resulting matrix to be for example,
00:24:44.880 | 1, 2, 3, 1, 2, 3, 1, 2, 3 along the rows.
00:24:47.240 | Let me actually draw this out. It might be easier.
00:24:49.520 | So, let's say I have like the 0, 0,
00:24:53.520 | 0, 0, 0, 0, 0, 0.
00:24:56.860 | Um, and if I want to have a matrix that does 1, 2, 3, 1, 2, 3, 1, 2, 3,
00:25:03.980 | versus 1, 2, 3, 1, 2, 3, 1, 2, 3.
00:25:10.680 | The difference in how to generate these two,
00:25:13.400 | um, will be the difference in the shape,
00:25:15.160 | like how you represent their shape.
00:25:16.440 | It's the same 1, 2, 3,
00:25:18.380 | but the resulting array you're generating by repeating
00:25:21.200 | the 1, 2, 3 values, um,
00:25:23.320 | requires a difference in shape.
00:25:24.720 | And so, we'll come to that a little bit later because this process of how
00:25:26.800 | you generate these arrays is called broadcasting.
00:25:28.780 | But that's the real benefit of having an understanding of the shapes.
00:25:31.680 | The 1, 2, 3 values are the same.
00:25:33.340 | It's just how they're sort of used with regards to other arrays.
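The drawing described here is not in the transcript, so the following is only a sketch of the idea, assuming a 3 by 3 matrix of zeros: the same values 1, 2, 3 give two different results depending on whether they are shaped as a row, (1, 3), or as a column, (3, 1).

```python
import numpy as np

zeros = np.zeros((3, 3))         # assumed 3x3 matrix of zeros
values = np.array([1, 2, 3])

as_row = values.reshape((1, 3))  # one row, three columns
as_col = values.reshape((3, 1))  # three rows, one column

same_rows = zeros + as_row       # every row is [1, 2, 3]
same_cols = zeros + as_col       # every column is [1, 2, 3]

print(same_rows)
print(same_cols)
```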
00:25:36.460 | All right. So, yeah, vectors can be easily represented as sort of,
00:25:39.940 | and this is what I was talking about earlier as like n dimensions,
00:25:42.240 | n by 1 or 1 by n dimensions,
00:25:44.480 | and they can result in different behavior,
00:25:46.520 | like what I just talked about.
00:25:48.040 | Um, matrices are usually in two dimensions represented as m by n.
00:25:51.640 | Um, these are just two examples.
00:25:53.000 | If for example, I generate, let's say,
00:25:54.220 | and then you can also reshape.
00:25:55.240 | So, I start with, for example,
00:25:57.180 | this array which is a list of 10.
00:25:59.920 | Oh, sorry, I need to import them back quickly.
00:26:02.720 | So, I start off with this matrix A which is basically a one-dimensional list of 10 values.
00:26:09.700 | I can reshape it into a 5 by 2 matrix.
00:26:12.600 | So, you just have to make sure that your dimensions match which means that like,
00:26:15.240 | you can multiply them together and get the original size.
00:26:19.160 | So, if I start off with the 10 matrix,
00:26:20.520 | I can make a 2 by 5 matrix,
00:26:22.040 | I can make a 5 by 2 matrix,
00:26:23.560 | I can make a 10 by 1, 1 by 10.
00:26:25.520 | I can't make a, for example,
00:26:26.680 | 3 by 5 because that wouldn't fit into the original size.
00:26:29.640 | Um, and for that, this operation called reshape is really useful.
00:26:33.320 | Um, you might be wondering why there are two sets of parentheses.
00:26:35.960 | The way that reshape works is essentially it'll take in a tuple.
00:26:39.480 | So, remember that what I was talking about earlier with tuples is that these,
00:26:41.920 | they're immutable objects and they're defined by parentheses.
00:26:44.840 | So, the outer parentheses is representing what you're inputting to the function,
00:26:48.400 | and what you're inputting is a tuple.
00:26:50.000 | So, it uses a second set of parentheses.
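A small sketch of the reshape rule described above, assuming the 10-value array is built with `np.arange(10)` (the Colab may construct it differently). The product of the new dimensions has to equal the original size, so 3 by 5 fails.

```python
import numpy as np

a = np.arange(10)            # one-dimensional array of 10 values, shape (10,)

b = a.reshape((5, 2))        # the inner parentheses are the shape tuple
c = a.reshape((2, 5))
d = a.reshape((10, 1))
e = a.reshape((1, 10))

# 3 * 5 = 15 does not match the 10 original values, so NumPy refuses.
reshape_failed = False
try:
    a.reshape((3, 5))
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```

The array method also accepts the dimensions as separate arguments, so `a.reshape(5, 2)` gives the same result as `a.reshape((5, 2))`.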
00:26:52.640 | So, now, let's go to some array operations.
00:26:55.800 | Um, so I start off with, you know,
00:26:57.160 | this array X. Um, when you apply simple operations,
00:27:01.360 | for example, a max operation,
00:27:02.960 | sometimes you might want the max of the entire array.
00:27:05.120 | So, if I do the max of the entire array,
00:27:07.080 | what's the max value of the entire array by the way?
00:27:08.920 | Just the entire thing. Yes, six, right?
00:27:11.520 | So, if I just do np.max of X,
00:27:14.320 | it'll return one value, it'll return six.
00:27:16.360 | Well, let's say I want the max of every row, right?
00:27:19.400 | Like in every, in each of these rows,
00:27:21.360 | I say I want, let's say the max of each row.
00:27:22.920 | I want two and then four and then six. How do you do that?
00:27:25.720 | And so, NumPy usually has, in most of its functions, an axis argument.
00:27:31.120 | And what the axis argument will do is it'll tell NumPy
00:27:33.640 | which of these dimensions you want to take the max over.
00:27:37.160 | And the way to sort of think about it is,
00:27:39.000 | this is going to be a little bit tricky,
00:27:40.520 | um, but the way people describe it is,
00:27:42.560 | the axis is what you want to apply your function over,
00:27:46.240 | what you want to reduce over.
00:27:48.120 | And what that means is, if I print out the shape of the original array, it's three by two.
00:27:52.720 | I want to apply axis one,
00:27:54.720 | where, remember,
00:27:56.080 | NumPy is zero-indexed, so the axes are zero and one.
00:27:58.240 | So, I want to apply the max over the second dimension.
00:28:00.960 | The second dimension means that for each of these essentially,
00:28:05.600 | you know that like for,
00:28:07.040 | like the row dimension is the first dimension.
00:28:09.240 | So, it's not along, along the rows,
00:28:11.000 | I'm going to be comparing columns.
00:28:12.600 | And so, compare this entire column to this entire column.
00:28:16.360 | And so, just remember for axes,
00:28:18.840 | um, usually the axis zero refers to the row axis,
00:28:22.000 | and then the axis one refers to the column axis.
00:28:24.560 | Um, if you don't even want to remember that,
00:28:26.040 | you can just remember that from the original dimension,
00:28:28.240 | which of these it's referring to.
00:28:30.400 | Um, and that's the dimension you want to compare over or reduce over.
00:28:35.000 | So, it can be a little bit hard to wrap your head around.
00:28:38.320 | Usually the best way to get comfortable with it is to just play with a bunch
00:28:41.480 | of operations like min and max, um, and things like that.
00:28:44.760 | But just remember like the axis is what you want to compare over,
00:28:48.080 | not the resulting thing.
00:28:49.280 | So, axis one means here column,
00:28:51.160 | I want to compare between the columns.
00:28:52.700 | I want to get, for example,
00:28:53.800 | comparing one to two, three to four, five to six.
00:28:57.160 | Does that make sense? Okay.
00:29:01.240 | Um, and what this will do is if I just do, um,
00:29:04.400 | np.max with axis one, it'll just return- basically since I'm comparing these columns,
00:29:07.920 | it'll just return a resultant column.
00:29:10.000 | And so, as I mentioned, you know, um,
00:29:11.800 | for over the axis one,
00:29:12.920 | you get three values because you're comparing over these columns,
00:29:16.080 | and each column has three values.
00:29:17.760 | If I'm comparing over rows, as you mentioned,
00:29:19.400 | I get two values, right?
00:29:20.840 | Um, and so the shape will just be a tuple like three comma,
00:29:23.400 | which is just indicating that it's just a list.
00:29:25.440 | It's not a list of lists, it's just a list.
00:29:27.080 | But let's say I want a list of lists, you know,
00:29:29.040 | maybe I want to do those operations I talked about earlier.
00:29:31.440 | Um, instead of reshaping,
00:29:33.040 | which is always there, it's always an option,
00:29:34.720 | you can also use this, um,
00:29:36.600 | feature called keepdims.
00:29:38.360 | And what that'll do is it'll take the original dimensions,
00:29:41.340 | which is two dimensions, right?
00:29:42.940 | Because you have three comma two,
00:29:44.240 | there's two of them, and it'll keep that consistent.
00:29:47.040 | So it'll be three comma one.
00:29:49.280 | But it just means that instead of returning just the extracted column,
00:29:53.520 | which is just a list,
00:29:54.720 | it'll basically keep the column in the context of the original sort of x,
00:29:59.400 | and it'll be- it'll keep it as like a two-dimensional value.
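The axis behavior is easier to see by running it. This sketch assumes `x` is the 3 by 2 array with values 1 through 6, which matches the row maxima (2, 4, 6) mentioned above.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4],
              [5, 6]])                            # shape (3, 2)

overall = np.max(x)                               # 6, reduces over everything
per_row = np.max(x, axis=1)                       # [2 4 6], reduces over the column axis
per_col = np.max(x, axis=0)                       # [5 6], reduces over the row axis
per_row_2d = np.max(x, axis=1, keepdims=True)     # [[2], [4], [6]]

print(per_row.shape)      # (3,)
print(per_col.shape)      # (2,)
print(per_row_2d.shape)   # (3, 1)
```

The axis that is passed is the one that disappears from the shape (or becomes 1 with `keepdims=True`), which is one way to remember what "reduce over" means.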
00:30:03.400 | All right. Now, these are just some operations.
00:30:08.560 | So in NumPy, um,
00:30:10.220 | you can use an asterisk as,
00:30:11.900 | uh, an element-wise multiplication.
00:30:14.180 | So an asterisk means that I'm going to be multiplying every single value,
00:30:17.380 | um, with every single corresponding value in another matrix.
00:30:20.220 | And you need your matrices to also be the same size for this one.
00:30:23.380 | So this one is basically an element-wise multiplication.
00:30:25.580 | It's not a matrix multiplication,
00:30:27.040 | so you need to have them be the exact same size.
00:30:28.820 | So this will multiply, for example,
00:30:29.900 | one into three, two into three,
00:30:31.700 | three into three, and four into three.
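A minimal sketch of element-wise multiplication. The values of `x` and `y` are an assumption inferred from the spoken example (1, 2, 3, 4 each multiplied by 3); the Colab's actual arrays are not shown in the transcript.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
y = np.array([[3, 3],
              [3, 3]])    # assumed: same shape as x, every entry 3

elementwise = x * y       # multiplies matching positions only
print(elementwise)
# [[ 3  6]
#  [ 9 12]]
```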
00:30:33.500 | All right. Um, you can also do matrix multiplication,
00:30:37.460 | which is a different operation entirely.
00:30:39.440 | Um, for those of you unfamiliar with matrix multiplication,
00:30:43.360 | um, you would basically be multiplying a row of one matrix with a column of another matrix.
00:30:49.440 | And for that to be possible,
00:30:50.800 | you need to have the second dimension of the first array
00:30:53.600 | be equal to the first dimension of the second array.
00:30:56.000 | So for matrix multiplication,
00:30:57.380 | if I have an a by b
00:31:02.520 | and a c by d,
00:31:06.340 | shaped matrices, b and c have to be equal for matrix multiplication.
00:31:10.220 | Just something to keep in mind, um,
00:31:12.060 | because oftentimes if you're doing matrix multiplication, um,
00:31:15.580 | you need- you have to make sure that these dimensions are the same.
00:31:18.220 | Which means that, for example,
00:31:20.700 | this is a valid operation, um,
00:31:26.740 | but this can sometimes throw an error.
00:31:30.580 | Sometimes. So it's just important to make sure that sometimes you,
00:31:34.480 | you want to make sure that these are exactly equal.
00:31:36.280 | You can actually just print out the shapes and
00:31:38.000 | make sure that these are equal before doing matrix multiplication.
00:31:40.480 | And then for matrix multiplication,
00:31:42.600 | um, there's a couple of functions you can use.
00:31:46.520 | Um, the first one is just np.matmul,
00:31:48.680 | which is short for matrix multiplication.
00:31:50.720 | You can also just use the, um,
00:31:52.320 | the @ operator.
00:31:53.680 | And both of those are overloaded.
00:31:55.760 | You can choose whichever one. They'll result in the same exact operation.
00:31:58.800 | And just to show this quickly,
00:32:01.060 | you can- to show what this will do is it will multiply one into two.
00:32:04.380 | So it'll come like one,
00:32:05.940 | two versus three, four.
00:32:07.340 | So it'll do one into three,
00:32:08.860 | two into three, and add those two values.
00:32:11.060 | That's what matrix multiplication will do.
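The sketch below shows `np.matmul` and `@` giving the same result, plus the shape check that matrix multiplication requires. The arrays reuse the assumed values from the spoken example (a matrix of 1, 2, 3, 4 and a matrix of 3s); they are illustrative, not confirmed from the Colab.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
y = np.array([[3, 3],
              [3, 3]])

via_function = np.matmul(x, y)
via_operator = x @ y
print(via_operator)
# [[ 9  9]      first entry: 1*3 + 2*3
#  [21 21]]     second row:  3*3 + 4*3

# Shapes (2, 3) and (3, 4) line up: the inner dimensions are both 3.
ok = np.ones((2, 3)) @ np.ones((3, 4))
print(ok.shape)  # (2, 4)

# Shapes (2, 3) and (2, 3) do not line up (3 != 2), so this raises.
mismatch_raised = False
try:
    np.ones((2, 3)) @ np.ones((2, 3))
except ValueError as err:
    mismatch_raised = True
    print("matmul failed:", err)
```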
00:32:13.740 | Okay. Um, and then dot products will- what,
00:32:18.060 | what a dot product is that it takes two vectors.
00:32:20.460 | So usually it operates on vectors.
00:32:22.420 | Um, and a vector as I mentioned is just like a one-dimensional matrix.
00:32:26.140 | So it's just basically three cross one, for example,
00:32:27.940 | a four cross one.
00:32:29.060 | Um, it'll element-wise multiply between two different vectors and will sum up those values.
00:32:33.420 | And so here, what a dot product would do would be like one into one,
00:32:36.580 | plus two into 10, plus three into 100.
00:32:39.060 | And for NumPy, you can just do np.dot and then pass in both of those vectors.
00:32:44.100 | Um, this one is just an aside on how you would want the structure of the dot product to be.
00:32:50.260 | Um, for arrays that are more- so,
00:32:54.220 | okay, so the, the phrase is the best way.
00:32:56.620 | Um, for single-dimensional, um,
00:32:58.620 | vectors, this operation works directly.
00:33:01.440 | Anytime it's a multiple-dimensional matrix,
00:33:04.740 | um, then it treats it as a matrix multiplication,
00:33:07.900 | the np.dot function.
00:33:09.300 | So for a two by two matrix versus a two by two matrix dot product,
00:33:12.900 | it's not going to return the sum,
00:33:14.620 | it's going to return, um,
00:33:16.540 | the matrix multiplication.
00:33:17.580 | Now that's just something to keep in mind.
00:33:19.160 | If you want to make sure that your,
00:33:21.060 | um, your dot product is happening in the correct way,
00:33:24.820 | um, you would want to make sure that sort of similar to what I was talking about earlier,
00:33:30.180 | um, that here, I think this way to show it.
00:33:35.780 | Okay. So you would want the second,
00:33:39.940 | like the- what I mentioned like the last dimension of
00:33:42.780 | the first one to match with the first dimension of the next one,
00:33:45.260 | because it's treating it as like a matrix multiplication.
00:33:47.740 | Um, here, the error that it's throwing is this three comma two combined with three.
00:33:53.380 | And so the way to sort of like fix that would be to have this be like,
00:33:57.980 | for example, like, um,
00:33:59.780 | switch the two so you'd have two comma three and then three comma.
00:34:03.820 | It's really a dimension matching thing at this point.
00:34:06.820 | So the- the- it's- it can be a little bit confusing,
00:34:09.380 | but when you sort of- the main thing to keep in mind is like for single-dimensional vectors,
00:34:13.180 | you can just do np.dot directly and it'll give you the dot product value.
00:34:16.380 | For higher dimensional matrices,
00:34:17.780 | it treats it as a matrix multiplication.
00:34:20.020 | Um, and so for- if you still want to,
00:34:22.460 | like for those higher dimensional values to ensure that you're getting a dot product,
00:34:26.340 | um, you'd have to make sure that the dimensions are aligned similar to these.
00:34:29.980 | So anything that's two by two or bigger for both,
00:34:33.980 | um, any matrix that doesn't have a single dimension,
00:34:37.760 | yes, it would treat it as a matrix multiplication,
00:34:39.380 | uh, matmul, the same thing.
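To make the `np.dot` behavior concrete, here is a sketch covering the three cases discussed: two one-dimensional vectors, two 2 by 2 matrices, and the (3, 2) with (3,) shape mismatch. The vectors use the values from the spoken example (1, 2, 3 and 1, 10, 100); the other arrays are made up for illustration.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([1, 10, 100])
print(np.dot(u, v))        # 1*1 + 2*10 + 3*100 = 321

# For two 2-D arrays, np.dot performs a matrix multiplication, not a sum.
a = np.array([[1, 2], [3, 4]])
b = np.array([[3, 3], [3, 3]])
same_as_matmul = np.array_equal(np.dot(a, b), a @ b)

# A (3, 2) array with a (3,) vector: last dimension 2 does not match 3.
m = np.ones((3, 2))
w = np.ones(3)
dot_failed = False
try:
    np.dot(m, w)
except ValueError as err:
    dot_failed = True
    print("dot failed:", err)

# Transposing gives (2, 3) with (3,), which lines up.
fixed = np.dot(m.T, w)
print(fixed.shape)         # (2,)
```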
00:34:41.700 | Okay. All right.
00:34:46.380 | Okay. I'm going to move to indexing.
00:34:48.260 | So similar to what I was talking about earlier,
00:34:51.020 | remember with lists, I was saying if you just do the colon,
00:34:53.020 | it'll create like the same array.
00:34:54.940 | Same- same deal here. The- the colon just means that you take everything from the original array.
00:34:58.860 | One caveat: for a NumPy array, this kind of slice returns a view rather than a copy.
00:35:00.140 | So it shares memory with the original array,
00:35:01.860 | and you'd call .copy() if you want a completely separate copy in memory.
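It may be worth checking the copy behavior directly, because Python lists and NumPy arrays differ here: `lst[:]` builds a new list, while a basic slice such as `x[:]` on a NumPy array gives a view that shares memory with the original. The sketch below uses made-up variable names.

```python
import numpy as np

# Python list: slicing with [:] creates a new list object.
lst = [1, 2, 3]
lst_copy = lst[:]
lst_copy[0] = 99           # lst is unchanged

# NumPy array: basic slicing returns a view onto the same data.
x = np.array([1, 2, 3])
view = x[:]
view[0] = 99               # x[0] changes as well

# An explicit copy is independent of the original.
independent = x.copy()
independent[1] = -1        # x[1] is unchanged

print(lst, x)
print(np.shares_memory(x, view), np.shares_memory(x, independent))
```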
00:35:04.300 | Um, okay. Now, I'm going into sort of more details on how do you want to index quickly.
00:35:09.300 | So if I, for example,
00:35:11.020 | have, let's say this three by four matrix,
00:35:13.580 | and I only want to select the zero and the second rows, how would I do that?
00:35:17.780 | So what's useful is that you can sort of treat a numpy,
00:35:20.740 | you can treat different dimensions differently for indexing.
00:35:23.700 | So a colon means you select everything in that dimension,
00:35:27.100 | which for example, here there's a colon in the second dimension,
00:35:29.540 | which means I'm taking all of the column values.
00:35:32.860 | Um, versus what's in the first dimension here,
00:35:35.380 | it's saying a numpy array of zero and two.
00:35:38.140 | So it's saying only the zero index and only the two index,
00:35:41.220 | which means only the zeroth row and only the second row.
00:35:44.820 | So what this would look like would be something like,
00:35:49.140 | I have a matrix.
00:35:52.380 | Okay. I have a matrix and I only want to select the zeroth row and I only want to
00:35:59.580 | select the column- the second row,
00:36:01.900 | zero and second, and everything in the columns.
00:36:05.820 | All right. And then similarly, for example,
00:36:10.700 | if I want to select in the column dimension, um,
00:36:13.140 | I want to select the first and second columns,
00:36:15.820 | and only the first row, I can do that.
00:36:17.740 | So you can basically treat them separately.
00:36:19.060 | You can think how many columns do I want,
00:36:20.780 | how many rows do I want,
00:36:21.860 | and then index those separately.
00:36:23.620 | And that goes for as many dimensions as you want in your entire tensor.
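A sketch of indexing each dimension separately. The 3 by 4 array here is generated with `np.arange` so the results are easy to read (the session uses its own `x`), and the second selection is one plausible reading of the spoken example, taking "first" and "second" as indices 1 and 2.

```python
import numpy as np

x = np.arange(12).reshape((3, 4))
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Rows 0 and 2, and everything (:) in the column dimension.
rows_0_and_2 = x[np.array([0, 2]), :]
print(rows_0_and_2)
# [[ 0  1  2  3]
#  [ 8  9 10 11]]

# Row 1 only, columns 1 and 2 only.
row1_cols_1_2 = x[1, np.array([1, 2])]
print(row1_cols_1_2)   # [5 6]
```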
00:36:27.220 | Um, so nice things also,
00:36:29.160 | if I want to for example take- I have this like- let me print out actually x here.
00:36:34.140 | I'll just generate the x. Okay.
00:36:37.620 | So this is x, right?
00:36:38.780 | So if I want to take all the values of x that are above 0.5 for example,
00:36:43.220 | I can do that by using what's called Boolean indexing.
00:36:46.780 | So I just basically will say x indexed by everything in x that's bigger than 0.5.
00:36:53.140 | So it's pretty direct and it'll just output all the values
00:36:55.500 | in this entire array that are bigger than 0.5.
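Boolean indexing can be sketched as follows. The array is random, as in the session, but the seed and the use of `np.random.default_rng` are choices made for this example so that it is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only so the example is repeatable
x = rng.random((3, 4))           # values in [0, 1)

mask = x > 0.5                   # Boolean array with the same shape as x
selected = x[mask]               # 1-D array of the values where mask is True

print(mask)
print(selected)
```

The result is flattened to one dimension, since the selected values generally do not form a rectangular block.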
00:36:58.740 | All right. Um, this one is also another way to do reshaping.
00:37:04.780 | So I kind of mentioned earlier, you know,
00:37:05.900 | sometimes you want- have this like list of
00:37:07.500 | three elements and you want to reshape it to a three by one array for example.
00:37:12.380 | Um, you can also use what's called np.newaxis.
00:37:15.500 | This will essentially add another axis in whatever dimension you want.
00:37:20.380 | So if I want to change, go from like this three by four array to a three by,
00:37:24.780 | three by four to three by four by one,
00:37:29.260 | then I can just add an np.newaxis there.
00:37:31.900 | An even simpler way to think about it would be like a two comma to a two comma one.
00:37:38.260 | And so it's just- it's another way to do what essentially what would be the reshaping operation.
00:37:44.220 | Does that make sense? Also what this would look like for example,
00:37:47.980 | let me just do a little bit more concrete.
00:37:50.780 | So it's basically I have this list, right?
00:38:00.180 | I have like a singular list and in each- in that list I have a list of lists.
00:38:04.260 | So I have a list with element one and list of element two.
00:38:07.140 | So this is what that reshape operation will do.
00:38:10.060 | And what np.newaxis will enable you to do as well.
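Here is a brief sketch of `np.newaxis` next to the equivalent reshape, using small made-up arrays.

```python
import numpy as np

v = np.array([1, 2])               # shape (2,)

col = v[:, np.newaxis]             # shape (2, 1): [[1], [2]]
row = v[np.newaxis, :]             # shape (1, 2): [[1, 2]]
same_as_reshape = np.array_equal(col, v.reshape((2, 1)))

x = np.zeros((3, 4))
x3 = x[:, :, np.newaxis]           # shape (3, 4, 1)

print(col.shape, row.shape, x3.shape)
```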
00:38:13.940 | All right. I think we're good for time.
00:38:19.740 | So the last main topic we'll be covering is broadcasting.
00:38:24.020 | And what's really great about broadcasting is it'll allow you to operate with numpy arrays that are of
00:38:30.660 | different shapes but can be sort of- if many operations in them can be repeated,
00:38:35.820 | it allows for that in a very efficient manner.
00:38:37.860 | And this is actually one of the most I would say useful things
00:38:39.860 | about numpy and one of its defining features.
00:38:42.020 | And what that means is if for example in this case, right?
00:38:46.220 | If we go back to this example that I had with- I start off with the 0, 0, 0 array.
00:38:51.900 | How do I generate this array versus how do I generate this array, right?
00:38:57.260 | Instead of me saying, okay,
00:39:00.180 | element 0, 0 plus 1,
00:39:02.740 | element 0, 1 plus 2,
00:39:05.260 | all that stuff, right? Instead of doing that one by one,
00:39:07.740 | what broadcasting allows me to do is I can have
00:39:10.700 | only one vector of size 1, 2, 3.
00:39:14.300 | And it'll- depending on how I do the broadcasting which I'll come to in a second,
00:39:18.740 | I can duplicate it along the row dimension,
00:39:21.700 | or I can duplicate it along the column dimension.
00:39:24.140 | And numpy allows for that.
00:39:25.380 | It'll do that on its own in the back end.
00:39:27.300 | And so that's really what broadcasting means is I don't need to for example,
00:39:31.420 | create a new array saying I wanted like create a new array to begin with,
00:39:35.900 | which is already like this and then add those two together.
00:39:38.500 | I can just duplicate this and get this.
00:39:41.020 | All right. So now some rules for broadcasting.
00:39:43.620 | And let me also just show visually what broadcasting will do.
00:39:47.740 | Oh, sorry. So broadcasting,
00:39:51.580 | this is a pretty good visual analogy.
00:39:53.820 | I have this 1 by 3, 1, 2, 3 vector, right?
00:39:58.740 | And I want to basically add,
00:40:01.140 | let's say only the columns with this 1, 2, 3 vector.
00:40:06.340 | So what broadcasting allows you to do is you only pass these two values in,
00:40:10.380 | and on the back end it'll duplicate this along the column dimension.
00:40:13.700 | So let's say I have 1, 2, 3,
00:40:15.300 | 1, 2, 3, 1, 2, 3, 1, 2, 3,
00:40:16.540 | and then it'll do the addition.
00:40:18.100 | Similarly, if I pass it a vector 1, 2, 3, 4,
00:40:23.100 | and I want it to be added to each of the rows instead of each of the columns,
00:40:26.760 | it'll be able to do that by sort of duplicating it on the back end.
00:40:29.420 | So this is visually what's happening with broadcasting.
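The slide itself is not in the transcript, so this is only a sketch of the two cases described, assuming a 4 by 3 array of zeros as the larger operand. NumPy stretches the smaller array across the dimension where it has size 1 (or is missing) without the caller building the repeated copy.

```python
import numpy as np

grid = np.zeros((4, 3))                  # assumed 4x3 starting array

row = np.array([1, 2, 3])                # shape (3,)
col = np.array([[1], [2], [3], [4]])     # shape (4, 1)

plus_row = grid + row    # [1, 2, 3] is repeated for each of the 4 rows
plus_col = grid + col    # [1, 2, 3, 4] is repeated across the 3 columns

print(plus_row)
print(plus_col)
```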
00:40:32.180 | All right. Now some rules.
00:40:36.060 | So how does NumPy know when and how to do broadcasting?
00:40:40.020 | So the main two rules to keep in mind for broadcasting is one,
00:40:43.860 | it can only happen if all of the dimensions,
00:40:46.940 | every single dimension between two arrays are compatible.
00:40:49.900 | And when they say what is compatible,
00:40:52.220 | either the dimension values are equal or one of them is equal to one.
00:40:56.640 | And that is the only rule required.
00:40:58.580 | So for example, I start off with this X array, right?
00:41:02.100 | I have this like 3 by 4 X array.
00:41:04.780 | Will a Y of shape 3, 1 be compatible?
00:41:09.020 | Yes, it will be. Why? Because you have
00:41:11.220 | three in the first dimension between the two which is the same,
00:41:14.220 | and in the second dimension you have four and you have one.
00:41:16.880 | So those are compatible values.
00:41:18.260 | And so what this tells NumPy on the back end is I'm doing,
00:41:21.040 | for example, an addition operation X plus Y.
00:41:24.060 | It knows that, okay, three and three are the same,
00:41:26.700 | but four and one are not the same.
00:41:28.980 | You know, one of them has one dimension.
00:41:30.440 | So I need to duplicate this Y along the second dimension,
00:41:34.820 | which means I need to duplicate it along the column dimension.
00:41:37.400 | And once it does that, it duplicates it,
00:41:39.120 | it'll get a 3 by 4 array,
00:41:41.460 | and then it can do the addition.
00:41:42.680 | And it does that really fast.
00:41:44.080 | So it's better to use broadcasting in this way,
00:41:46.320 | than for you to create a separate array already duplicated and then add them.
00:41:50.880 | Similarly, I have this Z array which is 1, 4.
00:41:55.680 | What X into Z will do is, first,
00:41:58.520 | it'll check, okay, 3, 1.
00:42:00.560 | Okay, is that compatible?
00:42:01.640 | Yes, because you have three in one dimension and you have one in the second,
00:42:04.600 | and four and four are compatible.
00:42:06.420 | Okay, so say I know that these two are compatible in the second dimension,
00:42:09.160 | I don't need to change anything.
00:42:10.320 | In the first dimension, it'll know to duplicate them, basically.
00:42:13.340 | So you don't have to duplicate Z yourself, stacking it three times in the row dimension
00:42:18.880 | to create a separate array, and then multiply those two.
00:42:22.200 | So this is giving you an example of saying I started off with X,
00:42:25.360 | I have Y, then the final shape will be 3, 4.
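The X, Y, Z example can be reproduced with a short sketch. The concrete values are invented for illustration; only the shapes (3, 4), (3, 1), and (1, 4) come from the session. `np.tile` is used to build the manually duplicated arrays that broadcasting makes unnecessary.

```python
import numpy as np

x = np.arange(12).reshape((3, 4))        # shape (3, 4)
y = np.array([[10], [20], [30]])         # shape (3, 1)
z = np.array([[1, 2, 3, 4]])             # shape (1, 4)

added = x + y          # y is stretched along the second dimension
scaled = x * z         # z is stretched along the first dimension

# The manual equivalent: duplicate first, then operate.
added_manual = x + np.tile(y, (1, 4))
scaled_manual = x * np.tile(z, (3, 1))

print(added.shape, scaled.shape)   # (3, 4) (3, 4)
```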
00:42:28.360 | So a lot of times in deep learning,
00:42:30.720 | you will have the same- basically,
00:42:33.960 | you'll have different batches of different images coming in.
00:42:37.200 | But you want to apply, let's say,
00:42:38.720 | the same weight matrix to all of them.
00:42:41.760 | And instead of duplicating that weight matrix a hundred or even like
00:42:45.240 | potentially depending on the size of your batch size like a thousand times,
00:42:48.320 | and then adding those together,
00:42:50.080 | you use the same matrix and it'll know, okay,
00:42:52.240 | if I'm going to be duplicating over the batch dimension,
00:42:54.880 | it'll do that for you on the back end.
00:42:56.440 | So it's used a lot of times in deep learning because of this.
00:42:58.880 | And basically, in your second homework,
00:43:00.880 | that's basically what you'll be doing.
00:43:02.080 | We're implementing a feedforward network in NumPy.
00:43:05.360 | And it'll say you have like this W matrix,
00:43:07.760 | you have this like B matrix,
00:43:09.200 | which is a bias, we'll come to those in class.
00:43:11.560 | And it'll ask you to implement it in NumPy,
00:43:13.560 | because that's basically what you're doing.
00:43:14.840 | It's like you have this input image,
00:43:16.380 | you have a weight matrix which will somehow scale it to an output.
00:43:19.960 | And that weight matrix will be applied to multiple images in your batch.
00:43:23.440 | And those images can be different,
00:43:24.640 | but their sizes will be the same and it's optimized for that.
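As a loose sketch of the pattern described (not the homework's actual code or interface), the layer below applies one weight matrix and one bias to a whole batch. All names and sizes (`batch`, `W`, `b`, 100 inputs of 8 features, 4 outputs) are hypothetical. The matrix product shares `W` across every row of the batch, and the bias `b` of shape (4,) is broadcast over the batch dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = rng.random((100, 8))   # hypothetical: 100 inputs, 8 features each
W = rng.random((8, 4))         # hypothetical weight matrix: 8 features -> 4 outputs
b = rng.random(4)              # hypothetical bias, shape (4,)

# One expression handles the whole batch: (100, 8) @ (8, 4) -> (100, 4),
# then b is broadcast across the 100 rows.
out = batch @ W + b

# The same result for a single input, computed on its own.
first_alone = batch[0] @ W + b
print(out.shape)               # (100, 4)
```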
00:43:28.000 | Okay. So this is just more examples of sort of the same thing.
00:43:33.120 | Your final thing that you'll be coming to is the size of 3,4.
00:43:36.920 | Let's see. This one's sort of the example that I showed right here, right?
00:43:41.520 | Which is that I have this array of like say zeros.
00:43:44.680 | I have this NumPy,
00:43:46.000 | this B array of size,
00:43:47.560 | what size were they? What would this be?
00:43:48.860 | Yes. Good. Because you have one outer list,
00:43:52.040 | and inside this you have one inner list.
00:43:53.760 | So it's just basically one row and then three values inside.
00:43:56.880 | So yes. And so would this be compatible?
00:43:59.800 | Yes. And so it'll know basically to duplicate over the row dimension.
00:44:03.600 | And so you're going to get duplicates in the row dimensions.
00:44:05.600 | You're going to get 1, 2, 3, 1, 2, 3, 1, 2, 3.
00:44:07.800 | And that's what's happening here.
00:44:09.800 | So these are, for example, what it sometimes calls more complex behavior.
00:44:15.240 | What this basically just means is that like if I have this B vector,
00:44:18.520 | which is 3,1.
00:44:20.320 | If I'm doing this B plus B dot transpose,
00:44:23.760 | by the way transpose is just changing the dimensions and switching them.
00:44:26.400 | So if I have a two by three matrix,
00:44:28.200 | transpose will be a three by two matrix.
00:44:30.360 | What that means visually is something like your row and rows and like column dimensions will get switched.
00:44:38.000 | X goes to, I believe it's like 1, 2, 3, 4, 5, 6.
00:44:44.840 | So like three row- rows versus like three columns.
00:44:48.480 | And what this is just saying is that a three by one and a one by three,
00:44:54.320 | both of those vectors will be compatible because remember in
00:44:56.880 | each dimension it's either the same or one.
00:44:59.680 | And so it knows to duplicate over both of those dimensions.
00:45:03.760 | And that's what's happening here.
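A sketch of the transpose and of the (3, 1) plus (1, 3) case, with values chosen for this example. Both operands get stretched, one along each dimension, producing a 3 by 3 result.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
print(m.T.shape)                 # (3, 2): rows and columns are switched

b = np.array([[1], [2], [3]])    # shape (3, 1)
total = b + b.T                  # (3, 1) + (1, 3) -> (3, 3)
print(total)
# [[2 3 4]
#  [3 4 5]
#  [4 5 6]]
```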
00:45:06.280 | Okay. So I think we are right at time.
00:45:09.760 | And what I would recommend is basically playing with variations of this for broadcasting.
00:45:14.320 | And see, just remember the two rules for broadcasting is just,
00:45:17.760 | if it's compatible it's either the same value or it's one.
00:45:20.560 | And whatever is the one dimension is what's going to be duplicated over on the back end.
00:45:24.200 | So yeah, it's not going to be compatible if they're divisible for example, right?
00:45:27.360 | So if you have like let's say six and three, that's not compatible.
00:45:30.800 | You can reshape it and then see if you'd like to have one.
00:45:34.960 | There's tricks you can use where you're sort of thinking like on the back end,
00:45:38.480 | how do I want this data to be multiplied?
00:45:40.320 | You can maybe reshape everything into like an eight- one,
00:45:42.840 | like one by 18 matrix and then multiply everything and then reshape it back.
00:45:46.440 | That's what you can do but you can never just directly for example,
00:45:49.040 | six by three, make that compatible.
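The compatibility rule can be written down as a small check. The helper `is_broadcastable` is a hypothetical function written for this sketch, not a NumPy API; it compares dimensions from the last one backwards, which is how NumPy aligns shapes, and treats missing leading dimensions as compatible.

```python
import numpy as np

def is_broadcastable(shape_a, shape_b):
    """Hypothetical helper: True if every aligned pair of dimensions is equal or contains a 1."""
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(is_broadcastable((3, 4), (3, 1)))   # True
print(is_broadcastable((3, 4), (1, 4)))   # True
print(is_broadcastable((6, 3), (3, 3)))   # False: 6 and 3 are divisible but not compatible

# NumPy agrees: adding (6, 3) and (3, 3) arrays raises an error.
broadcast_failed = False
try:
    np.ones((6, 3)) + np.ones((3, 3))
except ValueError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```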
00:45:51.280 | Okay. So I think let's wrap up.
00:45:54.400 | This one's just a quick example of another use of efficient NumPy code.
00:45:58.560 | Quick note, never, preferably don't use loops whenever you're dealing with large data matrices.
00:46:05.120 | Mostly because Python loops are often something like 100 times slower.
00:46:09.720 | NumPy is usually very, very efficient.
00:46:11.840 | As this is just an example of what you can accomplish
00:46:14.360 | with NumPy and same thing using loops.
00:46:16.760 | So what this is saying is I have an x matrix of size 1000 by 1000.
00:46:21.160 | And I want to apply, you know,
00:46:22.760 | let's say I want to add everything from row 100 onwards with plus five.
00:46:28.040 | So visually what that will look like is something like I have
00:46:31.840 | this full matrix and I want everything here basically to be added with plus five.
00:46:40.520 | Then in the loop format,
00:46:42.880 | I can basically loop over the first dimension of 100 plus and do that.
00:46:46.920 | Or with NumPy, I can basically do what's called np.arange,
00:46:49.760 | which will generate integers, like 1, 2, 3, 4, 5,
00:46:53.240 | 6, all the way up to whatever value you give it.
00:46:55.440 | In this case, it's between 100 and 1000.
00:46:57.040 | So start with 100, 101, 102,
00:46:58.720 | all the way up to, but not including, 1000 in the first dimension and then just add that with five.
00:47:02.680 | So this is just an example of how you would switch from using loops to using NumPy.
00:47:06.480 | And it's a lot, lot faster.
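The comparison can be sketched as below, assuming a random 1000 by 1000 matrix (the Colab's exact code is not in the transcript). Both versions add 5 to every row from index 100 onwards. The timings are printed rather than asserted because the speedup depends on the machine.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1000, 1000))

# Loop version: visit every element from row 100 onwards.
x_loop = x.copy()
start = time.perf_counter()
for i in range(100, 1000):
    for j in range(x_loop.shape[1]):
        x_loop[i, j] += 5
loop_seconds = time.perf_counter() - start

# NumPy version: index rows 100..999 with np.arange and add 5 in one step.
x_vec = x.copy()
start = time.perf_counter()
x_vec[np.arange(100, 1000), :] += 5
vec_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f}s, vectorized: {vec_seconds:.4f}s")
```

A plain slice, `x_vec[100:] += 5`, selects the same rows and is another common way to write the vectorized version.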
00:47:08.680 | [BLANK_AUDIO]