Stanford CS224N NLP with Deep Learning | 2023 | Python Tutorial, Manasi Sharma
00:00:10.820 |
The goal of the session really will be to sort of give you the basics of Python and 00:00:16.120 |
NumPy in particular that you'll be using a lot in your second homework. 00:00:20.980 |
And the homeworks that will come after that as well. 00:00:23.020 |
We're sort of taking this tutorial from the background of anyone who hasn't 00:00:28.540 |
touched programming languages to some extent. 00:00:31.300 |
But also for people who have, we'll be sort of going through a lot of that 00:00:34.020 |
material very quickly and we'll be progressing to NumPy as well. 00:00:38.900 |
the session is really meant for the people who are here in person. 00:00:41.380 |
So if you'd like me to slow down, speed up at any point, 00:00:45.180 |
need time for clarifications, feel free to ask. 00:00:49.860 |
And I really would like it to be sort of an interactive session as well. 00:00:52.780 |
All right, so these are the topics we'll be covering today. 00:00:57.260 |
Going through first of all, why Python as a language? 00:00:59.900 |
Why have we chosen it for sort of this course? 00:01:01.580 |
And in general, why do people prefer it to some extent for 00:01:04.140 |
machine learning and natural language processing? 00:01:06.900 |
Some basics of the language itself, common data structures. 00:01:09.940 |
And then getting to sort of the meat of it through NumPy, 00:01:13.460 |
which as I mentioned you'll be extensively using in your homeworks going forward. 00:01:16.140 |
And then some practical tips about how to use things in Python. 00:01:23.580 |
So a lot of you who might have been first introduced to programming, 00:01:29.700 |
A lot of people use MATLAB in other fields as well. 00:01:35.700 |
Python is generally used, for one, because it's a very high-level language. 00:01:39.460 |
It can look very English-like, and so it's really easy to work with for 00:01:42.860 |
people, especially when they get started out. 00:01:44.700 |
It has a lot of scientific computational functionality as well, 00:01:49.780 |
you'll see that it has a lot of frameworks of very, 00:01:51.580 |
very quick and efficient operations involving math or matrices. 00:01:55.460 |
And that's very, very useful in applications such as deep learning. 00:01:59.420 |
And for deep learning in particular, a lot of frameworks that people use, 00:02:02.420 |
particularly for example, PyTorch and TensorFlow, interface directly with Python. 00:02:08.140 |
people generally tend to use Python within deep learning. 00:02:10.700 |
Okay, so the setup information is in the slides if you'd like to look at them 00:02:18.860 |
now because I wanna sort of get to the introduction to the language itself. 00:02:22.580 |
And if we have time, come back to sort of the setup information. 00:02:27.500 |
It gives you steps for sort of how to install packages. 00:02:33.780 |
And gets you set up with your first working Python environment, so 00:02:36.500 |
you can sort of run simple and basic commands to get used to the language. 00:02:40.100 |
But for now, I'm gonna be skipping over this and 00:02:54.780 |
will allow you to assign a particular value to a variable. 00:02:57.660 |
A nice thing with Python is you don't have to declare the type of the variable to 00:03:01.260 |
begin with, and then only assign values of that type. 00:03:07.340 |
we first say that this variable, x, is only gonna be of type int. 00:03:11.420 |
And any value aside from that assigned to it will throw an error. 00:03:15.660 |
So if I want to, I can reassign. I can start with x is equal to 10. 00:03:20.620 |
I can say x is equal to "hi" as a string, and there would be no issue. 00:03:24.500 |
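As a small sketch of the reassignment just described (the name `x` follows the talk; `first_type` and `second_type` are illustrative helper names):

```python
# A variable is just a name; it can be rebound to a value of a different type.
x = 10
first_type = type(x)     # int

x = "hi"                 # no error: x now refers to a string
second_type = type(x)    # str

print(first_type, second_type)
```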
You can do simple mathematical operations, such as the plus and division signs. 00:03:31.020 |
You can do exponentiation, which is raising one value to another value. 00:03:35.540 |
So x to the power of y, for example, using the double asterisk. 00:03:38.580 |
You can do type castings for float divisions. 00:03:42.220 |
So if you wanna ensure that your values are being divided, 00:03:44.820 |
resulting in a float value and not just dividing two integers, 00:03:49.660 |
If you want something to be specifically an int, you can also just put an int 00:03:52.860 |
instead of the float with brackets around the result, and 00:03:57.980 |
And then you can also do type casting to, for 00:04:03.540 |
So in this case, if I wanted to, instead of doing 10 plus 3 as 00:04:07.780 |
a mathematical operation, I just wanted to write out 10 plus 3. 00:04:11.140 |
Then I can convert the x and y values, for example, to strings, and 00:04:15.420 |
then add the plus sign as a character as well to create a string. 00:04:20.260 |
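The arithmetic and casting operations above could look like the following sketch. The values 10 and 3 come from the talk; the variable names on the left are only illustrative.

```python
x = 10
y = 3

total = x + y                     # addition
quotient = x / y                  # in Python 3, / always gives a float
power = x ** y                    # exponentiation: 10 to the power of 3
as_float = float(x) / y           # explicit cast to float before dividing
as_int = int(x / y)               # cast the result to int (truncates toward zero)
text = str(x) + " + " + str(y)    # builds the string "10 + 3" instead of adding

print(total, quotient, power, as_float, as_int, text)
```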
And so a lot of these common operations you can look online as well. 00:04:22.620 |
People have lists for them, and just see how they're sort of done in Python. 00:04:30.100 |
So Boolean values, True and False, are always written with a capital first letter. 00:04:34.220 |
In some other languages, they might be lowercase, so that's just one thing to know. 00:04:42.060 |
So sometimes when you wanna say that this value, you want to return None, 00:04:47.620 |
You wanna do checks, for example, in if statements, 00:04:50.980 |
to say that this doesn't have a value, then you can assign it to None. 00:04:55.700 |
So None sort of functions as a null equivalent, so 00:04:59.060 |
you're not really returning anything, it doesn't have a value. 00:05:03.820 |
And another nice thing about Python is lists, which are sort of mutable, 00:05:09.100 |
we'll come to that a little bit later, but sort of mutable lists of objects. 00:05:13.260 |
Mutable means that you can change them, and they can hold elements of any type. 00:05:16.540 |
So you can have a mixture of integers, None values, strings, etc. 00:05:22.580 |
And yeah, functions can return the None value as well. 00:05:24.460 |
And another quick thing, instead of using the double ampersand (&&) 00:05:29.460 |
for "and", as people might do in some other languages, with Python, 00:05:34.860 |
So you can actually just write out if x is equal to 3 and, with "and" spelled out in English, 00:05:40.460 |
y is equal to 4, then return true or something. 00:05:42.940 |
It's quite nice that way, so you can use and, or, and not. 00:05:47.460 |
And then the comparison operators, double equals (==) and 00:05:50.860 |
not equals (!=), check for equality and inequality. 00:05:54.660 |
This one's pretty standard, I feel, across many languages, and 00:05:58.620 |
And yeah, remember, just a quick thing, the double equals sign (==) is different from the single equals sign (=). 00:06:03.500 |
This one checks for equality, that one is just assigning a value. 00:06:08.580 |
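A short sketch of the Boolean operators, comparisons, and `None` discussed above. The values 3 and 4 follow the talk; the function `returns_nothing` is a hypothetical name used only for illustration.

```python
x, y = 3, 4                      # = assigns values

both = (x == 3) and (y == 4)     # == compares; "and" is written out as a word
either = (x == 0) or (y == 4)
negated = not (x == 3)
different = x != y               # inequality check

result = None                    # None stands for "no value"
if result is None:               # the usual way to check for None
    result = "unset"

def returns_nothing():
    pass                         # a function with no return statement returns None

print(both, either, negated, different, result, returns_nothing())
```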
All right, and then also in Python, you don't use curly brackets to mark blocks. 00:06:13.300 |
In Python, you use indentation, basically spaces or tabs. 00:06:17.940 |
So indents of either 2 or 4 spaces break up what is contained within 00:06:22.900 |
the function or contained within like an if statement, a for statement, or 00:06:28.140 |
And so the main thing is you can choose whether to do 2 or 4. 00:06:31.020 |
You just have to be consistent throughout your entire code base, 00:06:36.860 |
Now we'll go to some common data structures, and for 00:06:41.700 |
So this one will sort of show you in real time. 00:06:50.700 |
those of you who are familiar with those, that you can use and 00:06:56.140 |
The really nice thing about Jupyter Notebooks is you don't have to run an entire 00:06:59.500 |
file all together, you can run it step by step in what are called cells. 00:07:04.500 |
So if you want to see like an intermediate output, 00:07:08.140 |
You can also write, for example, a lot of descriptions 00:07:12.340 |
pertaining to cells, which is really, really nice to have as well. 00:07:15.820 |
So a lot of people tend to use these when they're sort of starting off their project 00:07:20.020 |
And Colab allows you to use these Jupyter Notebook type applications, 00:07:27.860 |
So anyone can create one of these and run their code. 00:07:35.260 |
Mutable means that you can change them, so that once you declare them, 00:07:38.340 |
you can add to them, you can delete them, and they're optimized for that purpose. 00:07:44.220 |
We'll come to what are called NumPy arrays later, and 00:07:48.340 |
When you change one, you basically have to create a new array, 00:07:53.820 |
So this is highly optimized for changing things. 00:07:55.820 |
So if you know, for example, and you're in a loop, 00:07:58.300 |
you're adding different elements to, let's say, a bigger entity, 00:08:02.100 |
you'd want to use something like a list, because you're going to be changing that 00:08:07.500 |
So we start off with a names array with Zack and Jay. 00:08:10.260 |
You can index into the list by index, 00:08:15.220 |
which means that you can pick out 00:08:19.460 |
the elements in the list depending on what's called the index. 00:08:22.380 |
So it's what place that value is at within the list. 00:08:25.660 |
So zero refers to the first element, so Python is what's called zero-indexed, 00:08:29.420 |
which means it starts with zero, and then it goes to one. 00:08:33.500 |
And then let's say I want to append something to the end. 00:08:37.260 |
So to add something to the end of the list, the term is append, not add. 00:08:42.300 |
And so if I append, the list is changed in place: 00:08:46.700 |
it's the original list itself with the added last element. 00:08:50.340 |
And what would currently be the length of this? 00:08:52.220 |
It would be three, because you have three elements. 00:08:54.500 |
And you can just quickly get that by using the len function, not length, 00:08:58.820 |
All right, it's also really nice because Python has overloaded 00:09:04.540 |
the plus operation to be able to concatenate lists. 00:09:12.060 |
And all you need for a list definition is just brackets. 00:09:15.700 |
even though I haven't saved it in the variable, just Abhi and Kevin. 00:09:21.220 |
which means that names is equal to names plus Abhi and Kevin. 00:09:26.940 |
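The list operations so far can be sketched as below. The names Zack, Jay, Abhi, and Kevin come from the talk; "Richard" as the appended element and the `mixed` list are assumptions made for illustration.

```python
names = ["Zack", "Jay"]

first = names[0]               # zero-indexed, so index 0 is the first element
names.append("Richard")        # append changes the list in place
length = len(names)            # len, not length

names += ["Abhi", "Kevin"]     # + concatenates lists; += extends names with the new list

mixed = [1, [2, 3], "four", None]   # lists can mix types and contain other lists

print(first, length, names, mixed)
```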
You can create lists by just putting the plain brackets or an existing list. 00:09:33.700 |
your list can have a variety of types within them. 00:09:36.220 |
So here, this list contains an integer value, a list value. 00:09:39.340 |
So you can have a list of lists, as many sort of sublists as you'd like, 00:09:47.340 |
Slicing refers to how you can access only parts of the list. 00:09:59.140 |
Slicing is a way that you can extract only those parts. 00:10:03.480 |
the first element is included and the last element is excluded. 00:10:10.580 |
So 3 is not included and so 0, 1, 2 will be printed out. 00:10:15.980 |
So if you know that you're going to be starting with the first element of the array. 00:10:19.660 |
So if you know I'm starting, I want 0, 1, 2 and it starts with 0, 00:10:22.780 |
then you don't need to even include the first index. 00:10:25.140 |
You can just leave that and include the last index that would be excluded. 00:10:34.060 |
If you know that you want to take everything, 00:10:36.420 |
let's say from like 5 and 6 till the end of the array, 00:10:52.820 |
it'll take everything in the list but it'll also create a duplicate in memory. 00:10:57.820 |
very useful thing to know because sometimes when you pass lists around 00:11:03.700 |
in Python, which is out of scope of this tutorial, 00:11:08.140 |
So if you change the array, that gets changed. 00:11:10.380 |
This will create an entirely separate copy in memory of the exact same array. 00:11:17.300 |
So this is a pretty neat way to do that. 00:11:19.620 |
Then another fun thing that Python has which is pretty unique, 00:11:24.500 |
So negative indexing means you index from the back of the array. 00:11:28.060 |
So minus 1 refers to the last element of the array, 00:11:31.460 |
minus 3 will refer to the third last element. 00:11:35.180 |
So what minus 1 will give you will be 6 in this case, 00:11:40.540 |
because you're starting with the minus 3 elements. 00:11:46.240 |
Then this one seems kind of confusing, right? 00:11:49.820 |
So what this will do is it will give you 0, 1, 2, 3. 00:11:52.500 |
So you start with 3 and then minus 1, minus 2. 00:12:02.620 |
That's what this is. Okay. That's about lists. 00:12:09.500 |
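A sketch of the slicing and negative-indexing rules above. It assumes the list holds the integers 0 through 6, which matches the values quoted in the talk; the exact slices shown in the notebook may differ.

```python
numbers = [0, 1, 2, 3, 4, 5, 6]

head = numbers[0:3]             # start included, end excluded: [0, 1, 2]
same_head = numbers[:3]         # the start index can be left out when it is 0
tail = numbers[5:]              # from index 5 to the end: [5, 6]
duplicate = numbers[:]          # everything, as a new list in memory

last = numbers[-1]              # last element: 6
last_three = numbers[-3:]       # the last three elements: [4, 5, 6]
without_last_three = numbers[:-3]   # [0, 1, 2, 3]
middle = numbers[3:-2]          # from index 3 up to, not including, the second-to-last: [3, 4]

duplicate[0] = 100              # changing the duplicate leaves numbers untouched
print(head, tail, last, last_three, without_last_three, middle, numbers[0])
```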
So once you declare the values of these, they cannot be changed. 00:12:13.620 |
we started with like the list of Zack and Jay. 00:12:32.860 |
you just create, you can either use just a tuple sign, 00:12:35.580 |
or oftentimes you can just use parentheses. 00:12:40.180 |
as you did here, just parentheses to instantiate something. 00:12:49.980 |
But you can also have a tuple of a single value. 00:12:52.420 |
And all you have to do there is just put the value and put a comma. 00:13:06.620 |
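A minimal sketch of tuples as described above; the contents are illustrative, and `changed` is a helper flag introduced only to show the outcome.

```python
names = ("Zack", "Jay")    # parentheses create a tuple
single = (10,)             # the trailing comma makes a one-element tuple
not_a_tuple = (10)         # without the comma this is just the integer 10
empty = tuple()            # the tuple constructor also works

try:
    names[0] = "Richard"   # tuples are immutable, so this raises TypeError
    changed = True
except TypeError:
    changed = False

print(names, single, not_a_tuple, empty, changed)
```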
For those of you who might be familiar with other languages, 00:13:09.620 |
this is the equivalent of a hash map or hash table. 00:13:13.140 |
What this is useful for essentially is mapping 00:13:15.340 |
one value to another in a really, really quick way. 00:13:20.940 |
which you will happen to do a lot of in your homeworks, 00:13:23.620 |
this is a really, really useful way to do that. 00:13:25.940 |
And so what it does is you can instantiate this dictionary. 00:13:31.060 |
Zack is going to correspond to this string value, whatever it is. 00:13:34.340 |
And so anytime I want to retrieve the string value, 00:13:46.740 |
And yeah, so it's really useful, very, very commonly used. 00:13:52.860 |
you have like a list of strings or a list of items, 00:13:55.620 |
and you want to have a corresponding index for them. 00:14:01.020 |
oftentimes you're using with- you're working with 00:14:04.620 |
So it's a really great way to sort of move from like 00:14:08.100 |
string formats to just like numerical index values. 00:14:11.860 |
There's some other things you can do for dictionaries. 00:14:14.340 |
You can check whether certain elements are in there. 00:14:16.300 |
So if you, for example, try to index phone book with the key Monty, 00:14:20.100 |
it'll throw an error because there's no string that 00:14:24.260 |
And so sometimes you might be wanting to do checks before you extract a value. 00:14:31.380 |
it should say false or for example here Kevin in phone book, 00:14:34.780 |
While something that's actually in that dictionary, 00:14:39.140 |
Okay. And then if you'd like to delete an entry from the, 00:14:42.700 |
um, from the dictionary, you can just do that using the del command. 00:14:51.500 |
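The dictionary operations above might look like this sketch. The phone numbers are placeholder values, and the word list used for `word_to_index` is a made-up example of the string-to-index mapping mentioned in the talk.

```python
phonebook = {"Zack": "555-0100"}     # placeholder values, not from the talk
phonebook["Jay"] = "555-0101"        # add a new key/value pair

zack_number = phonebook["Zack"]      # look up a value by its key
has_monty = "Monty" in phonebook     # membership check before indexing
has_jay = "Jay" in phonebook

try:
    phonebook["Monty"]               # a missing key raises KeyError
    raised = False
except KeyError:
    raised = True

del phonebook["Jay"]                 # delete an entry

# Mapping strings to numerical indices, a common pattern in NLP code
word_to_index = {word: i for i, word in enumerate(["the", "cat", "sat"])}

print(zack_number, has_monty, has_jay, raised, phonebook, word_to_index)
```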
So loops are a really great way to optimize for the same kind of op- 00:15:03.180 |
those list type or array type objects we were talking about earlier. 00:15:06.020 |
You know, you have like a list of names, right? 00:15:13.740 |
they've abstracted away a lot of the 00:15:15.580 |
parts that might be confusing in other languages. 00:15:21.420 |
So what you do is you have like a range function that you call. 00:15:28.460 |
So what this range function will return is 0, 1, 2, 3, 4, 00:15:31.860 |
and that's what will be stored in this i value. 00:15:33.860 |
And here it's just printing out that i value. 00:15:37.720 |
loop over the length of a list of size 10, 00:15:42.660 |
and then index that corresponding part of the list. 00:15:45.720 |
You technically don't even have to do that because in Python, 00:15:48.180 |
you can just directly get the element of the list. 00:15:55.060 |
Jay, and Richard. Instead of saying first the length of the list, 00:16:04.580 |
and it will just directly get the element in each list. 00:16:16.440 |
this really helpful function called enumerate. 00:16:18.540 |
And so enumerate will basically pair those two values, 00:16:23.020 |
both the value which is here in name for example, 00:16:25.460 |
and its corresponding index within the array, 00:16:28.060 |
um, both together. So that's really, really convenient. 00:16:32.620 |
a little bit more complicated range operation, 00:16:34.620 |
where you first take the range and then you index the list. 00:16:49.140 |
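The three looping styles described above can be sketched as follows. The list of names is assumed from the talk; `by_index`, `direct`, and `pairs` are illustrative names.

```python
names = ["Zack", "Jay", "Richard"]

for i in range(5):               # range(5) yields 0, 1, 2, 3, 4
    print(i)

by_index = []
for i in range(len(names)):      # loop over indices, then index into the list
    by_index.append(names[i])

direct = []
for name in names:               # loop over the elements directly
    direct.append(name)

pairs = []
for i, name in enumerate(names): # enumerate pairs each element with its index
    pairs.append((i, name))

print(by_index, direct, pairs)
```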
you can just iterate the same way you would a list. 00:16:54.940 |
If you want to iterate over what is stored in the list, 00:17:13.460 |
the overarching most commonly used sort of structures, um, 00:17:20.140 |
and how to sort of efficiently use them within your code. 00:17:22.820 |
We'll quickly be moving to the sort of meat of what, 00:17:28.500 |
and what you'll be using a lot for your coming homework, 00:17:34.020 |
Okay. So for NumPy also I'm going to be going to the Colab, 00:17:38.140 |
but just quickly wanted to mention, um, what NumPy is. 00:17:46.100 |
You know, people tend to like MATLAB because it's very, 00:17:47.900 |
very useful for these mathematical operations, 00:17:51.500 |
Um, Python's sort of solution to that is to have 00:17:54.420 |
a separate library entirely where they make use of, um, 00:17:58.420 |
subroutines, which are sort of like 00:18:01.380 |
scripts that are written in a different language, C or C++, 00:18:05.100 |
um, that are highly optimized for, um, efficiency. 00:18:08.300 |
So the reason C and C++ are much faster than Python is 00:18:11.500 |
because they're closer to what's called machine language, 00:18:14.940 |
Um, I mentioned earlier, one of the nice things about Python is it's kind of high level. 00:18:18.260 |
It looks like English, right? Just like I said. 00:18:21.020 |
you know, if x is equal to one or x is equal to two, right? 00:18:24.780 |
But, um, that also means that there's a lot more translation required on 00:18:28.120 |
the computer's part before it understands what you mean. 00:18:31.220 |
Um, and that's useful when you know we're writing out code where we want to understand it, 00:18:34.900 |
but it's a little bit less useful when you're sort of 00:18:36.660 |
running a lot of operations on a lot of data. 00:18:39.620 |
So the real benefit of something like NumPy is that if you have sort of 00:18:43.260 |
your memory and your data in a particular format, 00:18:45.660 |
it'll call these C scripts, or what are called 00:18:48.380 |
subroutines in a different language and it'll make them very, very fast. 00:18:51.620 |
And so that's the real benefit of using NumPy. 00:18:57.100 |
very familiar with this because you'll be running a lot of operations on, 00:19:03.420 |
um, it's very useful to have them optimized for time. 00:19:08.220 |
And NumPy is basically involved in all of these math, matrix, and vector calculations. 00:19:14.740 |
Although you can easily translate between a list and a NumPy array, 00:19:17.740 |
NumPy arrays are specifically, as I mentioned, 00:19:25.820 |
and you can translate between this and sort of your standard list easily. 00:19:29.420 |
But know that you can only do NumPy operations on NumPy arrays. 00:19:32.240 |
You can't do NumPy operations on lists directly. 00:19:34.740 |
You'd first have to like convert them, which is really simple. 00:19:39.420 |
Um, but just know that they'd operate only on NumPy arrays. 00:19:42.460 |
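A small sketch of converting between a list and a NumPy array, and of how the same operator behaves differently on each. The values are illustrative.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)        # list -> NumPy array

doubled_list = values * 2     # on a list, * repeats the list
doubled_arr = arr * 2         # on an array, * multiplies every element

back = arr.tolist()           # NumPy array -> plain Python list

print(doubled_list, doubled_arr, back)
```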
Okay. So for NumPy, we're gonna be going back to the Colab. 00:19:50.500 |
it supports these large multi-dimensional arrays and matrices 00:19:53.740 |
for very, very optimized high-level mathematical functions. 00:19:57.420 |
Um, and just to go back- step back for a quick second, what is a matrix? 00:20:04.660 |
structures of numbers that are used and you can treat them with 00:20:08.300 |
specific rules, um, for operations between different kinds of things. 00:20:12.220 |
So if you have like a lot of data, instead of, you know, 00:20:17.020 |
if you can store them in this rectangular format, 00:20:19.500 |
um, you have specific rules about how this matrix, 00:20:22.100 |
for example, will interact with a different one. 00:20:23.940 |
And by doing that, which is matrix multiplication or matrix math, 00:20:27.300 |
um, you can do a wide variety of mathematical operations. 00:20:39.500 |
So it's usually like a row vector or a column vector, 00:20:48.420 |
here, when I come down to x is equal to numpy array of 1, 2, 3, 00:20:53.240 |
that's a list in only one dimension versus, for example, 00:20:59.680 |
that is what's called like a two-dimensional array because you have both rows, 00:21:13.940 |
So that's sort of the conventional difference between the two. 00:21:16.560 |
Another convention is matrices generally refer to two-dimensional objects. 00:21:19.800 |
So this, as I mentioned, is like z, this is two-dimensional. 00:21:23.000 |
Um, you might have heard the word tensor also. 00:21:25.460 |
Tensors by convention usually are like higher dimensional objects. 00:21:38.080 |
Um, and those are very valid to do mathematical operations on, 00:21:41.360 |
um, and those are often colloquially sort of called tensors. 00:21:44.760 |
Um, in addition, and this will be covered in the next tutorial in PyTorch, 00:21:49.240 |
um, those larger sort of tensors are also optimized for efficiency, 00:21:55.960 |
And so they're called tensor in a more concrete way because you're using 00:21:59.280 |
these tensors with PyTorch and other sort of packages 00:22:02.160 |
to directly do those quicker GPU operations on for deep learning. 00:22:05.920 |
So those are sort of- that's a quick sort of terminology difference between the three. 00:22:14.200 |
representations of how are these matrices and vectors represented in NumPy. 00:22:17.920 |
Um, this sort of goes back to your question about like, 00:22:20.960 |
what is the difference between a shape of three comma, (3,), versus one comma three, (1, 3). 00:22:25.400 |
Um, so three comma in NumPy arrays usually just means that you have 00:22:29.840 |
one list of like one, two, three, for example, 00:22:33.240 |
there's like three values versus if you add another list on top of that, 00:22:37.560 |
this one comma three essentially refers to the fact that there's a list of lists. 00:22:44.140 |
it always means that there's a list of lists, 00:22:47.000 |
um, and that being like a list of lists of for example like a row. 00:22:49.760 |
So here, one comma three means that there's one row and then three columns. 00:22:54.200 |
So it's saying there's one row of three comma four comma five essentially, 00:22:58.320 |
and then each of those is like a column separately. 00:23:07.360 |
you'll see a little bit later for operations such as broadcasting, 00:23:10.400 |
you need to have it for example sometimes in this one comma three format or three comma one format. 00:23:18.120 |
three is just like it represents three numbers. 00:23:20.280 |
One comma three means like one row of three elements. 00:23:23.480 |
Three comma one will mean you have essentially in each column, 00:23:29.000 |
So you'll see sort of boxes around each of them. 00:23:30.960 |
I'll- there's an example that comes a little bit later in 00:23:32.880 |
this colab which will make it a little bit clearer. 00:23:34.880 |
So here, if you can see the difference between like x and y, 00:23:37.880 |
one of them has only one bracket which just says it's one list, 00:23:44.720 |
The second one is two brackets which says it's a list with only one list in it. 00:23:50.680 |
That's really the main difference between like these sort of two representations. 00:23:54.380 |
So I could have like, let's say like a separate one. 00:23:59.880 |
I'm going to call this A, and I just do this. 00:24:06.720 |
but this will be one comma three because it's showing that there's 00:24:12.680 |
and then one inner list which will have each of those values. 00:24:16.000 |
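To make the shape discussion concrete, here is a sketch that prints the shapes in question. The values in `x` and `y` follow the numbers mentioned in the talk; `z` and `col` are assumed examples.

```python
import numpy as np

x = np.array([1, 2, 3])            # one bracket: a flat list of three numbers
y = np.array([[3, 4, 5]])          # two brackets: one row with three columns
z = np.array([[6, 7], [8, 9]])     # two rows, two columns
col = np.array([[1], [2], [3]])    # three rows, one column each

print(x.shape)     # (3,)
print(y.shape)     # (1, 3)
print(z.shape)     # (2, 2)
print(col.shape)   # (3, 1)
```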
So the benefit will come when I get to broadcasting a little bit later. 00:24:20.240 |
And so it essentially will help you determine what dimensions you want to match against. 00:24:24.560 |
Because sometimes you'd want to have one comma three, 00:24:27.720 |
like 1, 2, 3 applied only to rows in some other matrix. 00:24:32.740 |
We'll, we'll come to that a little bit later. 00:24:33.840 |
Uh, but sometimes you might want to have it only applied to columns. 00:24:37.020 |
And so, like if I have a separate matrix for example of 0, 0, 0, 0, 0, 0, 0, 0, 00:24:42.340 |
and I want the resulting matrix to be for example, 00:24:47.240 |
Let me actually draw this out. It might be easier. 00:24:56.860 |
Um, and if I want to have a matrix that does 1, 2, 3, 1, 2, 3, 1, 2, 3, 00:25:18.380 |
but the resulting array you're generating by repeating 00:25:24.720 |
And so, we'll come to that a little bit later because this process of how 00:25:26.800 |
you generate these arrays is called broadcasting. 00:25:28.780 |
But that's the real benefit of having an understanding of the shapes. 00:25:33.340 |
It's just how they're sort of used with regards to other arrays. 00:25:36.460 |
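A sketch of why the (1, 3) versus (3, 1) distinction matters for broadcasting. The 3-by-3 matrix of zeros is an assumption, since the size drawn in the talk is not recoverable from the transcript.

```python
import numpy as np

zeros = np.zeros((3, 3))           # assumed size for illustration

row = np.array([[1, 2, 3]])        # shape (1, 3): repeated down the rows
col = np.array([[1], [2], [3]])    # shape (3, 1): repeated across the columns

rows_repeated = zeros + row        # every row becomes 1, 2, 3
cols_repeated = zeros + col        # every column becomes 1, 2, 3

print(rows_repeated)
print(cols_repeated)
```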
All right. So, yeah, vectors can be easily represented as sort of, 00:25:39.940 |
and this is what I was talking about earlier as like n dimensions, 00:25:44.480 |
and they can result in this different behavior kind of what, 00:25:48.040 |
Um, matrices are usually in two dimensions represented as m by n. 00:25:59.920 |
Oh, sorry, I need to import them back quickly. 00:26:02.720 |
So, I start off with this matrix A which is basically a one-dimensional list of 10 values. 00:26:12.600 |
So, you just have to make sure that your dimensions match which means that like, 00:26:15.240 |
you can multiply them together and get the original size. 00:26:26.680 |
3 and 5 because that wouldn't fit into the original size. 00:26:29.640 |
Um, and for that, this operation called reshape is really useful. 00:26:33.320 |
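A minimal sketch of `reshape`, assuming the 10 values are produced with `np.arange(10)` (the talk only says the array holds 10 values); `fits` is a helper flag for illustration.

```python
import numpy as np

a = np.arange(10)              # ten values: 0 through 9
b = a.reshape((5, 2))          # 5 * 2 == 10, so the sizes match

try:
    a.reshape((3, 5))          # 3 * 5 == 15 does not match 10 elements
    fits = True
except ValueError:
    fits = False

print(b.shape, fits)
```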
Um, you might be wondering why there are two parentheses. 00:26:35.960 |
The way that reshape works is essentially it'll take in a tuple. 00:26:39.480 |
So, remember that what I was talking about earlier with tuples is that these, 00:26:41.920 |
they're immutable objects and they're defined by parentheses. 00:26:44.840 |
So, the outer parentheses are representing what you're inputting to the function, 00:26:57.160 |
this array X. Um, when you apply simple operations, 00:27:02.960 |
sometimes you might want the max of the entire array. 00:27:07.080 |
what's the max value of the entire array by the way? 00:27:16.360 |
Well, let's say I want the max of every row, right? 00:27:22.920 |
I want two and then four and then six. How do you do that? 00:27:25.720 |
And so, NumPy usually has, in most of its functions, an axis argument. 00:27:31.120 |
And what the axis argument will do is it'll tell you 00:27:33.640 |
which of these dimensions do you want to take the max over. 00:27:42.560 |
the axis is what you want to apply your function over, 00:27:48.120 |
And what that means is I print out the shape of the original array, it's three by two. 00:27:58.240 |
So, I want to apply the max over the second dimension. 00:28:00.960 |
The second dimension means that for each of these essentially, 00:28:07.040 |
like the row dimension is the first dimension. 00:28:12.600 |
And so, compare this entire column to this entire column. 00:28:18.840 |
um, usually the axis zero refers to the row axis, 00:28:22.000 |
and then the axis one refers to the column axis. 00:28:26.040 |
you can just remember that from the original dimension, 00:28:30.400 |
Um, and that's the dimension you want to compare over or reduce over. 00:28:35.000 |
So, it can be a little bit hard to wrap your head around. 00:28:38.320 |
Usually the best way to get comfortable with it is to just play with a bunch of 00:28:41.480 |
operations like min and max, um, and things like that. 00:28:44.760 |
But just remember like the axis is what you want to compare over, 00:28:53.800 |
comparing one to two, three to four, five to six. 00:29:01.240 |
Um, and what this will do is if I just do, um, 00:29:04.400 |
np.max with axis equal to 1, it'll just return- basically since I'm comparing these columns, 00:29:12.920 |
you get three values because you're comparing over these columns, 00:29:20.840 |
Um, and so the shape will just be three comma, (3,), 00:29:23.400 |
which is just indicating that it's just a list. 00:29:27.080 |
But let's say I want a list of lists, you know, 00:29:29.040 |
maybe I want to do those operations I talked about earlier. 00:29:33.040 |
which is always there, it's always an option, 00:29:38.360 |
And what that'll do is it'll take the original dimensions, 00:29:44.240 |
there's two of them, and it'll keep that consistent. 00:29:49.280 |
But it just means that instead of returning just the extracted column, 00:29:54.720 |
it'll basically keep the column in the context of the original sort of x, 00:29:59.400 |
and it'll be- it'll keep it as like a two-dimensional value. 00:30:03.400 |
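The `axis` and `keepdims` behavior above can be sketched as below, assuming the 3-by-2 array holds the values 1 through 6, as the comparisons quoted in the talk suggest.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])     # shape (3, 2)

overall = np.max(x)                        # max of the whole array
per_row = np.max(x, axis=1)                # reduce over the second dimension
per_col = np.max(x, axis=0)                # reduce over the first dimension
kept = np.max(x, axis=1, keepdims=True)    # same values, but stays two-dimensional

print(overall)                  # 6
print(per_row, per_row.shape)   # [2 4 6] (3,)
print(per_col, per_col.shape)   # [5 6] (2,)
print(kept.shape)               # (3, 1)
```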
All right. Now, these are just some operations. 00:30:14.180 |
So an asterisk means that I'm going to be multiplying every single value 00:30:17.380 |
by every single corresponding value in another matrix. 00:30:20.220 |
And you need your matrices to also be the same size for this one. 00:30:23.380 |
So this one, it's basically an element-wise matrix operation. 00:30:27.040 |
so you need to have them be the exact same size. 00:30:33.500 |
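A short sketch of element-wise multiplication with `*`; the two 2-by-2 matrices are assumed values for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[3, 3], [3, 3]])

product = a * b      # each entry of a times the entry of b at the same position

print(product)       # [[ 3  6]
                     #  [ 9 12]]
```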
All right. Um, you can also do matrix multiplication, 00:30:39.440 |
Um, for those of you unfamiliar with matrix multiplication, 00:30:43.360 |
um, you would basically be multiplying a row of one matrix with a column of another matrix. 00:30:50.800 |
you need to have the second dimension of the first array 00:30:53.600 |
be equal to the first dimension of the second array. 00:31:06.340 |
shaped matrices, these two have to be equal for matrix multiplication. 00:31:12.060 |
because oftentimes if you're doing matrix multiplication, um, 00:31:15.580 |
you need- you have to make sure that these dimensions are the same. 00:31:30.580 |
Sometimes. So it's just important to make sure that sometimes you, 00:31:34.480 |
you want to make sure that these are exactly equal. 00:31:36.280 |
You can actually just print out the shapes and 00:31:38.000 |
make sure that these are equal to be doing matrix multiplication. 00:31:42.600 |
um, there's a couple of functions you can use. 00:31:55.760 |
You can choose whichever one. They'll result in the same exact operation. 00:32:01.060 |
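A sketch of matrix multiplication, assuming the two spellings referred to are `np.matmul` and the `@` operator. The matrices and the names `c`, `d`, and `tall` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])     # shape (2, 2)
b = np.array([[3, 3], [3, 3]])     # shape (2, 2)

via_function = np.matmul(a, b)     # rows of a times columns of b
via_operator = a @ b               # same operation, operator form

c = np.ones((2, 3))
d = np.ones((3, 4))
tall = c @ d                       # inner dimensions (3 and 3) match; result is (2, 4)

print(via_function)
print(tall.shape)
```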
you can- to show what this will do is it will multiply one into two. 00:32:18.060 |
what a dot product is that it takes two vectors. 00:32:22.420 |
Um, and a vector as I mentioned is just like a one-dimensional matrix. 00:32:26.140 |
So it's just basically three cross one, for example, 00:32:29.060 |
Um, it'll element-wise multiply between two different vectors and will sum up those values. 00:32:33.420 |
And so here, what a dot product would do would be like one times one, 00:32:39.060 |
And for NumPy, you can just do np.dot and then both of those vectors. 00:32:44.100 |
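A minimal sketch of a dot product between two one-dimensional vectors; the vector values are assumed.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([1, 2, 3])

d = np.dot(u, v)     # multiply element-wise, then sum: 1*1 + 2*2 + 3*3

print(d)             # 14
```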
Um, this one is just an aside on how you would want the structure of the dot product to be. 00:33:04.740 |
um, then it treats it as a matrix multiplication, 00:33:09.300 |
So for a two by two matrix versus a two by two matrix dot product, 00:33:21.060 |
um, your dot product is happening in the correct way, 00:33:24.820 |
um, you would want to make sure that sort of similar to what I was talking about earlier, 00:33:39.940 |
like the- what I mentioned like the last dimension of 00:33:42.780 |
the first one to match with the first dimension of the next one, 00:33:45.260 |
because it's treating it as like a matrix multiplication. 00:33:47.740 |
Um, here, the error that it's throwing is this three comma two combined with three. 00:33:53.380 |
And so the way to sort of like fix that would be to have this be like, 00:33:59.780 |
switch the two so you'd have two comma three and then three comma. 00:34:03.820 |
It's really a dimension matching thing at this point. 00:34:06.820 |
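The shape mismatch described above can be reproduced with a sketch like this one. The array values are assumed, and transposing is only one possible way to get the (2, 3) shape the talk mentions.

```python
import numpy as np

m = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
v = np.array([1, 2, 3])                  # shape (3,)

try:
    np.dot(m, v)                         # last dim of m (2) does not match v (3)
    failed = False
except ValueError:
    failed = True

fixed = np.dot(m.T, v)                   # m.T has shape (2, 3), which lines up with (3,)

print(failed, fixed, fixed.shape)        # True [22 28] (2,)
```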
So it can be a little bit confusing, 00:34:09.380 |
but the main thing to keep in mind is, for single-dimensional vectors, 00:34:13.180 |
you can just do np.dot directly and it'll give you the dot product value. 00:34:22.460 |
like for those higher dimensional values to ensure that you're getting a dot product, 00:34:26.340 |
um, you'd have to make sure that the dimensions are aligned similar to these. 00:34:33.980 |
um, any- any- you see any matrix that doesn't have a single dimension in any of them, 00:34:48.260 |
So similar to what I was talking about earlier, 00:34:51.020 |
remember with lists, I was saying if you just do the colon, 00:34:54.940 |
Same deal here. The colon just means that you take everything from the original array. 00:35:01.860 |
means that you have a completely separate copy in memory. 00:35:04.300 |
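One detail worth checking in code: for NumPy arrays, a plain `[:]` slice returns a view that shares memory with the original, whereas `.copy()` gives a separate copy. The transcript drops the phrase naming the copy operation, so using `.copy()` here is an assumption.

```python
import numpy as np

original = np.array([1, 2, 3])

view = original[:]          # selects everything, but shares memory with `original`
separate = original.copy()  # a completely separate copy in memory

original[0] = 99            # modify the original in place

print(view[0])      # follows the change, because it is a view
print(separate[0])  # unchanged, because it is an independent copy
```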
Okay. Now, I'm going into more detail on how to index quickly. 00:35:13.580 |
and I only want to select the zeroth and the second rows, how would I do that? 00:35:17.780 |
So what's useful is that with a NumPy array, 00:35:20.740 |
you can treat different dimensions differently for indexing. 00:35:23.700 |
So a colon means you select everything in that dimension, 00:35:27.100 |
For example, here there's a colon in the second dimension, 00:35:29.540 |
which means I'm taking all of the column values. 00:35:32.860 |
Versus what's in the first dimension here, 00:35:38.140 |
So it's saying only the zero index and only the two index, 00:35:41.220 |
which means only the zeroth row and only the second row. 00:35:44.820 |
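A minimal sketch of this row selection, using an assumed 3 by 4 array `x` (the lecture's array is not shown in the transcript):

```python
import numpy as np

# Assumed example: 3 rows, 4 columns, filled with 0..11.
x = np.arange(12).reshape(3, 4)

# A list of indices in the first dimension, `:` (everything) in the second.
rows = x[[0, 2], :]

print(rows)
```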
So what this would look like would be something like, 00:35:52.380 |
Okay. I have a matrix and I only want to select the zeroth row and I only want to 00:36:01.900 |
zero and second, and everything in the columns. 00:36:10.700 |
if I want to select in the column dimension, 00:36:23.620 |
And that goes for as many dimensions as you want in your entire tensor. 00:36:29.160 |
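The same pattern applied to the column dimension, and to a three-dimensional array, might look like the sketch below. The arrays `x` and `t` and the chosen indices are illustrative assumptions.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# Everything in the row dimension, only columns 1 and 3.
cols = x[:, [1, 3]]

# The same idea with three dimensions: pick indices 0 and 2 along the middle axis.
t = np.arange(24).reshape(2, 3, 4)
middle = t[:, [0, 2], :]

print(cols)
print(middle.shape)
```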
if I want to, for example... let me actually print out x here. 00:36:38.780 |
So if I want to take all the values of x that are above 0.5 for example, 00:36:43.220 |
I can do that by using what's called Boolean indexing. 00:36:46.780 |
So I just basically will say x indexed by everything in x that's bigger than 0.5. 00:36:53.140 |
So it's pretty direct and it'll just output all the values 00:36:55.500 |
in this entire array that are bigger than 0.5. 00:36:58.740 |
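Boolean indexing as described can be sketched as follows. The values in `x` are made up for illustration.

```python
import numpy as np

# Assumed example values between 0 and 1.
x = np.array([[0.1, 0.7],
              [0.5, 0.9]])

mask = x > 0.5      # boolean array with the same shape as x
selected = x[mask]  # 1-D array of the values where the mask is True

print(mask)
print(selected)
```

Note that the result is flattened to one dimension, and that 0.5 itself is excluded because the comparison is strict.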
All right. This one is another way to do reshaping. 00:37:07.500 |
three elements and you want to reshape it to a three by one array for example. 00:37:12.380 |
You can also use what's called np.newaxis. 00:37:15.500 |
This will essentially add another axis in whatever dimension you want. 00:37:20.380 |
So if I want to go from this three by four array to a three by, 00:37:31.900 |
An even simpler way to think about it would be going from shape (2,) to shape (2, 1). 00:37:38.260 |
And so it's just another way to do what would essentially be the reshaping operation. 00:37:44.220 |
Does that make sense? Also what this would look like for example, 00:38:00.180 |
I have a single outer list, and inside that list I have lists. 00:38:04.260 |
So I have a list with element one and a list with element two. 00:38:07.140 |
So this is what that reshape operation will do. 00:38:10.060 |
And it's what np.newaxis will enable you to do as well. 00:38:19.740 |
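A short sketch comparing the two approaches on a two-element vector (the values 1 and 2 follow the "element one" and "element two" description; the variable names are assumptions):

```python
import numpy as np

v = np.array([1, 2])              # shape (2,)

reshaped = v.reshape(2, 1)        # explicit reshape to (2, 1)
with_axis = v[:, np.newaxis]      # add a new axis after the existing one
row_form = v[np.newaxis, :]       # or before it, giving shape (1, 2)

print(reshaped)
print(with_axis.shape, row_form.shape)
```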
So the last main topic we'll be covering is broadcasting. 00:38:24.020 |
And what's really great about broadcasting is it'll allow you to operate with NumPy arrays that are of 00:38:30.660 |
different shapes: if the same operation on them can be repeated, 00:38:35.820 |
it allows for that in a very efficient manner. 00:38:37.860 |
And this is actually, I would say, one of the most useful things 00:38:39.860 |
about NumPy and one of its defining features. 00:38:42.020 |
And what that means is if for example in this case, right? 00:38:46.220 |
If we go back to this example that I had with- I start off with the 0, 0, 0 array. 00:38:51.900 |
How do I generate this array versus how do I generate this array, right? 00:39:05.260 |
all that stuff, right? Instead of doing that one by one, 00:39:07.740 |
what broadcasting allows me to do is I can have 00:39:14.300 |
And it'll- depending on how I do the broadcasting which I'll come to in a second, 00:39:21.700 |
or I can duplicate it along the column dimension. 00:39:27.300 |
And so what broadcasting really means is I don't need to, for example, 00:39:31.420 |
create a new array to begin with, 00:39:35.900 |
which is already duplicated like this, and then add those two together. 00:39:41.020 |
All right. So now some rules for broadcasting. 00:39:43.620 |
And let me also show visually what broadcasting will do. 00:40:01.140 |
let's say only the columns with this 1, 2, 3 vector. 00:40:06.340 |
So what broadcasting allows you to do is you only pass these two values in, 00:40:10.380 |
and on the back end it'll duplicate this along the column dimension. 00:40:23.100 |
and I want it to be added to each of the rows instead of each of the columns, 00:40:26.760 |
it'll be able to do that by sort of duplicating it on the back end. 00:40:29.420 |
So this is visually what's happening with broadcasting. 00:40:36.060 |
So how does NumPy know when and how to do broadcasting? 00:40:40.020 |
So the two main rules to keep in mind for broadcasting are: one, 00:40:46.940 |
every single dimension between the two arrays has to be compatible. 00:40:52.220 |
either the dimension values are equal or one of them is equal to one. 00:40:58.580 |
So for example, I start off with this X array, right? 00:41:11.220 |
three in the first dimension between the two which is the same, 00:41:14.220 |
and in the second dimension you have four and you have one. 00:41:18.260 |
And so what this tells NumPy on the back end is I'm doing, 00:41:24.060 |
It knows that, okay, three and three are the same, 00:41:30.440 |
So I need to duplicate this Y along the second dimension, 00:41:34.820 |
which means I need to duplicate it along the column dimension. 00:41:44.080 |
So it's better to use broadcasting in this way 00:41:46.320 |
than for you to create a separate array that's already duplicated and then add them. 00:41:50.880 |
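The (3, 4) with (3, 1) case can be checked against the manual duplication it replaces. The contents of `x` and `y` are assumed; only the shapes come from the lecture.

```python
import numpy as np

x = np.ones((3, 4))               # shape (3, 4)
y = np.array([[1], [2], [3]])     # shape (3, 1)

# Broadcasting: y is treated as if it were repeated along the column dimension.
broadcast_sum = x + y

# The manual alternative: build the duplicated (3, 4) array first, then add.
y_tiled = np.tile(y, (1, 4))
manual_sum = x + y_tiled

print(broadcast_sum)
```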
Similarly, I have this Z array which is 1, 4. 00:42:01.640 |
Yes, because you have three in one dimension and you have one in the second, 00:42:06.420 |
Okay, so say I know that these two are compatible in the second dimension, 00:42:10.320 |
In the first dimension, it'll know to duplicate them, basically. 00:42:13.340 |
So you don't have to duplicate Z three times in the row dimension, 00:42:18.880 |
create a separate array, and then multiply those two. 00:42:22.200 |
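The same check for a (1, 4) array multiplied with a (3, 4) array, again with assumed values:

```python
import numpy as np

x = np.full((3, 4), 2)            # shape (3, 4), every entry is 2
z = np.array([[1, 2, 3, 4]])      # shape (1, 4)

# Broadcasting: z is treated as if it were repeated three times along the rows.
broadcast_prod = x * z

# The manual alternative that broadcasting makes unnecessary.
z_repeated = np.repeat(z, 3, axis=0)
manual_prod = x * z_repeated

print(broadcast_prod)
```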
So this is giving you an example of saying I started off with X, 00:42:33.960 |
you'll have different batches of different images coming in. 00:42:41.760 |
And instead of duplicating that weight matrix a hundred or even, 00:42:45.240 |
potentially, depending on your batch size, a thousand times, 00:42:50.080 |
you use the same matrix and it'll know, okay, 00:42:52.240 |
if I'm going to be duplicating over the batch dimension, 00:42:56.440 |
So it's used a lot of times in deep learning because of this. 00:43:02.080 |
We're implementing a feed-forward network in NumPy. 00:43:09.200 |
which is a bias; we'll come to those in class. 00:43:16.380 |
you have a weight matrix which will somehow scale it to an output. 00:43:19.960 |
And that weight matrix will be applied to multiple images in your batch. 00:43:24.640 |
but their sizes will be the same and it's optimized for that. 00:43:28.000 |
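To make the batch idea concrete, here is a rough sketch of a single feed-forward layer. All sizes and names (`batch`, `W`, `b`, a batch of 100 inputs with 8 features each) are hypothetical, not taken from the lecture. The same weight matrix is reused for every row of the batch through the matrix product, and the bias vector is broadcast over the batch dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = rng.normal(size=(100, 8))   # hypothetical batch: 100 inputs, 8 features each
W = rng.normal(size=(8, 4))         # one weight matrix shared by the whole batch
b = rng.normal(size=(4,))           # one bias vector, shape (4,)

# (100, 8) @ (8, 4) -> (100, 4); then b with shape (4,) is broadcast over all 100 rows.
out = batch @ W + b

print(out.shape)
```

Neither `W` nor `b` is ever copied once per example; the whole batch is handled in one expression.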
Okay. So this is just more examples of sort of the same thing. 00:43:33.120 |
The final thing that you'll end up with has size (3, 4). 00:43:36.920 |
Let's see. This one's sort of the example that I showed right here, right? 00:43:41.520 |
Which is that I have this array of like say zeros. 00:43:53.760 |
So it's just basically one row and then three values inside. 00:43:59.800 |
Yes. And so it'll know basically to duplicate over the row dimension. 00:44:03.600 |
And so you're going to get duplicates in the row dimensions. 00:44:05.600 |
You're going to get 1, 2, 3, 1, 2, 3, 1, 2, 3. 00:44:09.800 |
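A sketch of this example, assuming the array of zeros has shape (3, 3) (the transcript does not state its size):

```python
import numpy as np

zeros = np.zeros((3, 3))      # assumed shape; the lecture's array size is not stated
v = np.array([1, 2, 3])       # shape (3,), treated like one row of three values

# v is duplicated over the row dimension and added to every row.
result = zeros + v

print(result)
```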
So these are examples of slightly more complex behavior. 00:44:15.240 |
What this basically just means is that like if I have this B vector, 00:44:23.760 |
by the way, transpose is just switching the dimensions. 00:44:30.360 |
What that means visually is that your row and column dimensions will get switched. 00:44:38.000 |
X goes to, I believe it's like 1, 2, 3, 4, 5, 6. 00:44:44.840 |
So three rows versus three columns. 00:44:48.480 |
And what this is just saying is that a three by one and a one by three, 00:44:54.320 |
both of those vectors will be compatible because remember in 00:44:59.680 |
And so it knows to duplicate over both of those dimensions. 00:45:09.760 |
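The three by one with one by three case can be sketched as below. The values are assumed; the point is that both arrays get duplicated, one along each dimension.

```python
import numpy as np

col = np.array([[1], [2], [3]])   # shape (3, 1)
row = col.T                       # transpose switches the dimensions: shape (1, 3)

# col is duplicated across columns, row is duplicated across rows.
grid = col + row

print(grid.shape)
print(grid)
```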
And what I would recommend is basically playing with variations of this for broadcasting. 00:45:14.320 |
And just remember the two rules for broadcasting: 00:45:17.760 |
if it's compatible, it's either the same value or it's one. 00:45:20.560 |
And whichever dimension is one is what's going to be duplicated over on the back end. 00:45:24.200 |
So yeah, it's not going to be compatible just because they're divisible, for example, right? 00:45:27.360 |
So if you have like let's say six and three, that's not compatible. 00:45:30.800 |
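A quick way to see this: the sketch below tries to add a length-6 array to a length-3 array, which fails, and then shows one possible reshape (an assumption about the kind of workaround meant) that makes the shapes compatible.

```python
import numpy as np

a = np.ones(6)     # shape (6,)
b = np.ones(3)     # shape (3,)

# 6 and 3 are neither equal nor is either of them 1, so this fails.
try:
    a + b
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("broadcast error:", err)

# One possible workaround: reshape a to (2, 3) so its last dimension matches b.
reshaped_sum = a.reshape(2, 3) + b
print(reshaped_sum.shape)
```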
You can reshape it and see if you can get a dimension of one. 00:45:34.960 |
There's tricks you can use where you're sort of thinking like on the back end, 00:45:40.320 |
You can maybe reshape everything into a 00:45:42.840 |
one by 18 matrix, then multiply everything, and then reshape it back. 00:45:46.440 |
That's what you can do but you can never just directly for example, 00:45:54.400 |
This one's just a quick example of another use of efficient NumPy code. 00:45:58.560 |
Quick note: preferably don't use loops whenever you're dealing with large data matrices, 00:46:05.120 |
mostly because loops are almost always about 100 times slower. 00:46:11.840 |
This is just an example of what you can accomplish 00:46:16.760 |
So what this is saying is I have an x matrix of size 1000 by 1000. 00:46:22.760 |
let's say I want to add five to everything from row 100 onwards. 00:46:28.040 |
So visually what that will look like is something like I have 00:46:31.840 |
this full matrix and I want five to be added to everything here, basically. 00:46:42.880 |
I can basically loop over the first dimension from 100 onwards and do that. 00:46:46.920 |
Or in NumPy, I can basically use what's called np.arange, 00:46:49.760 |
which will generate integers like 1, 2, 3, 4, 5, 00:46:58.720 |
all the way to 1000 in the first dimension, and then just add five to that. 00:47:02.680 |
So this is just an example of how you would switch from using loops to using NumPy.
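The two versions can be compared side by side. This is a sketch under the assumption that the matrix starts as zeros; the lecture's notebook code is not in the transcript, so the variable names are illustrative.

```python
import numpy as np

x = np.zeros((1000, 1000))   # assumed starting values

# Loop version: visit each row from 100 onwards and add five.
x_loop = x.copy()
for i in range(100, 1000):
    x_loop[i, :] += 5

# NumPy version: np.arange generates the row indices 100..999 in one go.
x_vec = x.copy()
x_vec[np.arange(100, 1000), :] += 5

print(np.array_equal(x_loop, x_vec))
```

A plain slice, `x_vec[100:, :] += 5`, would select the same rows without building an index array.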