fastai v2 walk-thru #10
Chapters
0:00 Introduction
1:25 Core
8:10 mk_class
11:40 Multiple inheritance
21:01 Stop
23:10 Trace
25:55 Composition
26:33 LS
28:47 Data augmentation
29:55 RandTransform
32:20 Flip LR
37:00 Filt
43:15 CropPad
00:00:22.040 |
Okay, let me know if anybody has any requests. 00:00:43.480 |
This will be the last walkthrough, at least for a while, because we've covered most of 00:00:48.640 |
the main stuff other than what's in learner and optimizer and stuff, which is largely 00:00:57.400 |
the same as what was in the course in Part 2, Version 3 of the course. 00:01:04.880 |
We'll probably do another walkthrough in a couple of weeks or so to cover the little 00:01:14.040 |
bits that remain. We can have a quick look at the Imagenette tutorial today as well. 00:01:18.760 |
Mainly I was going to look at the data augmentation and finish off a little bit more of the stuff 00:01:25.800 |
But yeah, just ask any questions if there's stuff either we haven't covered or you didn't 00:01:32.800 |
quite follow or you don't quite know how it fits together. 00:01:44.600 |
And so you can see in Core, other than metaclasses, which we're probably going to be largely getting 00:01:51.280 |
rid of, so we'll ignore that, there's the kind of decorators and stuff we talked about, 00:01:57.480 |
there's GetAttr, there's L, and then there's a bunch of stuff. 00:02:02.640 |
And the bunch of stuff is, broadly speaking, stuff for collections, types, and functions. 00:02:17.880 |
So in terms of stuff that's interesting, one thing that's kind of interesting is mk_class. 00:02:31.120 |
mk_class is a replacement for typing out a class. 00:02:36.140 |
So basically, I'll just make sure I've done a pull here. 00:02:59.360 |
So basically this one here, mk_class('_t', a=1, sup=GetAttr), is the same 00:03:08.200 |
as typing this: class _t(GetAttr): a=1. 00:03:24.580 |
So those are basically two ways of doing the same thing, except that this version has a few extras added for you. 00:04:02.240 |
So the class that we get back works kind of a bit more like a data class from the Python standard library. 00:04:23.360 |
So you can kind of see, and behind the scenes it calls this thing called get_class, which actually builds the class. 00:04:35.480 |
So you can see that we can pass in, for instance, 'a', and we will get something called a. 00:04:56.440 |
So we could go t = _t(). There we go, okay: t.a. And so you can see that 00:05:07.760 |
if we just pass in something saying this is a field (so we first pass in a list 00:05:11.560 |
of fields), then we get that as an empty field, defaulting to None. 00:05:22.040 |
And so that will mean that the first field listed, a, will get that value, as you can see. 00:05:37.000 |
So in this case, b here, you can initialize like this. 00:05:49.260 |
You can also pass in functions that you want to be included. 00:06:08.880 |
And so then we could say funcs=f. And so now we'll have an f on _t. 00:06:20.160 |
Oops, what did I do wrong there? Functions... now funcs=f, and what did I do wrong? 00:06:40.400 |
Maybe I should get rid of the underscore, that might be confusing for it, because that's 00:06:49.640 |
normally hidden. Oh, sorry, I'm being silly. 00:07:20.760 |
So we have to create the class: funcs=f. And we have a. And we had b=2. 00:08:11.680 |
So you can see I can just type t. You'll see that the __repr__ isn't added if you put in 00:08:18.280 |
a superclass, because the superclass might have its own __repr__. 00:08:22.760 |
So basically, by using this mk_class, we can quickly create a little class. 00:08:29.340 |
We can create a quick class which has some key values that we want, some fields that we care about. 00:08:39.200 |
So that's a pretty handy little thing that we use quite often in the tests. 00:08:49.200 |
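(A minimal sketch of the mk_class usage being shown, assuming the modern fastcore packaging, where mk_class lives in fastcore.basics; the field names here are illustrative.)

```python
from fastcore.basics import mk_class

def f(self): return self.b * 2

# Positional strings become fields defaulting to None, keyword args become
# fields with defaults, and funcs get added as methods; mk_class creates the
# class in the calling module's namespace.
mk_class('_T', 'a', b=2, funcs=f)

t = _T()
print(t.a, t.b)  # None 2
print(t.f())     # 4
```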
Let's see if it's also used in the actual main code base. 00:09:32.440 |
So this is actually a really good example of something that we do occasionally. 00:09:40.440 |
So let's see what PoolType looks like now. 00:09:45.920 |
So we're going through Avg, Max, Cat, and creating a dictionary. 00:09:50.640 |
So if we go to-- well, let's just type PoolType. and hit Tab, and there you can see Avg, Cat, and Max. 00:10:03.720 |
This is actually a really nice little way of creating kind of enum strings or kind of 00:10:07.200 |
strings that have nice tab completion, so it's a good use of mk_class. 00:10:14.720 |
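(A hedged sketch of that trick; the exact member names are assumptions, but the pattern is just a dict comprehension handed to mk_class.)

```python
from fastcore.basics import mk_class

# The class attributes are plain strings, but you get tab completion for free.
mk_class('PoolType', **{o: o.lower() for o in ['Avg', 'Max', 'Cat']})

print(PoolType.Avg, PoolType.Max, PoolType.Cat)  # avg max cat
```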
Here you go, doing exactly the same here for PadMode. 00:10:21.600 |
It's actually the same here for ResizeMethod. 00:10:28.760 |
So other than that-- oh, here's one again that's getting all the events for callbacks. 00:10:43.040 |
And then in the notebooks, yeah, 00:11:11.760 |
you can see we use it sometimes to create little quick and dirty things for testing. 00:11:19.200 |
So we tend to use it instead of data classes, because we find it a bit more convenient sometimes. 00:11:41.960 |
So multiple inheritance is pretty widely covered in Python tutorials, so I won't go into detail, 00:11:49.360 |
but if you go to Python multiple inheritance, you will find lots of examples. 00:11:58.360 |
But the answer is, yeah, basically you have multiple base classes. 00:12:03.140 |
So what will happen is that when you look in super for something, it will first look in the first base class. 00:12:11.400 |
If it doesn't find it there, then it will look in the next one. 00:12:15.800 |
That's basically what multiple inheritance does. 00:12:20.120 |
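(A tiny standalone illustration of that lookup order.)

```python
class A:
    def greet(self): return "from A"

class B:
    def greet(self): return "from B"
    def other(self): return "only in B"

class C(A, B): pass

c = C()
print(c.greet())  # "from A": found in the first base class
print(c.other())  # "only in B": falls through to the next base
```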
So in this case, we wanted this to contain all of the basic collection stuff. 00:13:00.200 |
wrap_class is something that just takes a function and wraps it into a class by calling it. 00:13:07.160 |
So sometimes that's handy for tests, where you want to quickly create something class-like. 00:13:23.240 |
The collection functions are worth knowing about, but they're pretty simple. 00:13:26.200 |
So like, tuplify to turn something into a tuple, uniqueify to get the unique values of 00:13:33.160 |
something, setify to turn something into a set; groupby is super handy. 00:13:45.720 |
It takes a list like this and groups it by some function. 00:13:56.280 |
In this case, a's and b's being grouped by itemgetter(0) is going to group by the first element. 00:14:10.120 |
There's something called itertools.groupby in the Python standard library. 00:14:15.680 |
I find this one more convenient and less complicated to understand. 00:14:25.040 |
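(A quick sketch of that behavior, assuming the modern fastcore packaging; the sample data is made up.)

```python
from operator import itemgetter
from fastcore.basics import groupby

# Groups by first character; unlike itertools.groupby, no pre-sorting is
# needed and you get a plain dict back.
print(groupby(['aa', 'ab', 'ba', 'bb'], itemgetter(0)))
# {'a': ['aa', 'ab'], 'b': ['ba', 'bb']}
```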
So merge just does what it suggests: merges the dictionaries together. 00:14:55.640 |
We probably will do a walkthrough with the text stuff. 00:15:04.160 |
But we haven't really gone back over it to see whether the ideas from tabular could be used there too. 00:15:11.360 |
Because we wrote the text stuff first before we did tabular, but we might be able to refactor it. 00:15:20.240 |
So if we do end up with two different approaches, we'll explain why. 00:15:24.680 |
Hopefully we won't, though, because it's nice not to have to have two different approaches. 00:15:32.920 |
I don't use this very much, but I possibly could use it more as more of an exploration tool. 00:15:41.000 |
This is basically something where I'm creating all of these things as functions: less than, 00:15:49.000 |
greater than, less than or equal, greater than or equal, et cetera. 00:15:52.440 |
They are all functions that exist within the operator module in the Python standard library. 00:16:04.980 |
And these functions work the same way: lt(3,5) is True, gt(3,5) is False. 00:16:12.440 |
But they also allow you to do what's called currying, as you can see here. 00:16:19.640 |
And specifically, this is helpful, where you can kind of go f = lt(3). 00:16:29.760 |
And then I could do f(5), so just splitting that into two lines. 00:16:38.080 |
And the reason that's interesting is so I could do stuff like L.range(8). 00:16:52.760 |
If we wanted to grab those which are less than 3, I could go filtered(lt(3)), like so. 00:17:03.160 |
And so that's just a lot more convenient than writing the normal way, which would be lambda o: o < 3. 00:17:16.440 |
So this kind of thing is pretty common; pretty much all functional languages allow partial application like this. 00:17:24.800 |
Unfortunately, Python doesn't, but you can kind of create versions that do work that way. 00:17:35.320 |
And as you can see, the way I'm doing it is I've created this little _oper thing, which, 00:17:39.040 |
if you pass in only one thing (so b is None), returns a lambda; otherwise it just applies the operator. 00:17:48.480 |
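(A minimal self-contained sketch of that helper; fastcore's internal version is spelled a little differently, so treat the names as illustrative.)

```python
import operator

def _oper(op, a, b=None):
    # One arg: return a function awaiting the left operand; two args: just apply.
    if b is None: return lambda o: op(o, a)
    return op(a, b)

def lt(a, b=None): return _oper(operator.lt, a, b)

print(lt(3, 5))                       # True: plain comparison
f = lt(3)                             # partial: "is it less than 3?"
print(f(5))                           # False
print(list(filter(lt(3), range(8))))  # [0, 1, 2]
```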
And so that is handy with some of this extra stuff that I'm going to show here. 00:17:53.720 |
For example, I've created an infinite lists class so that you can do things like an infinite count. 00:18:10.920 |
So I could do, for instance, list(filter(Inf.count, lt(10))). 00:18:24.720 |
Oops, I've got those the wrong way around; filter in Python first takes a function, and then the sequence. 00:18:45.960 |
Now that's interesting, why isn't that working? 00:18:55.880 |
Never mind, we can do it this way: zip it with range(5, 15). 00:19:20.720 |
So you can see that the numbers coming from Inf.count are just the numbers counting up. 00:19:30.560 |
So we could do things like list(itertools.islice(Inf.count, 10)), say, or we could map 00:19:54.240 |
lambda x: x*2 over it, like so, and so forth. 00:20:06.200 |
You could replace Inf.count with Inf.zeros, as you can see, or Inf.ones. 00:20:16.840 |
So it's often really handy to be able to have quick infinite lists available to you. 00:20:20.440 |
We use that, for example, in the data loader. 00:20:27.240 |
It's actually very challenging in-- I mean, not challenging. 00:20:33.400 |
It's awkward in Python to be able to create a property like this that behaves this way. 00:20:39.160 |
The only way I could get it to work was to make a metaclass and put the properties 00:20:44.560 |
in the metaclass, and then make that the metaclass of the class I actually want. 00:20:54.760 |
So it's not too bad, but it's a little more awkward than would be ideal. 00:21:03.420 |
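(A sketch of that metaclass trick: the properties have to live on the metaclass for Inf.count, with no parentheses, to work on the class itself.)

```python
import itertools

class _InfMeta(type):
    @property
    def count(cls): return itertools.count()
    @property
    def zeros(cls): return itertools.cycle([0])
    @property
    def ones(cls): return itertools.cycle([1])

class Inf(metaclass=_InfMeta):
    "Infinite lists"

print(list(itertools.islice(Inf.count, 5)))  # [0, 1, 2, 3, 4]
```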
It's often useful to be able to have an expression that raises an exception. 00:21:14.080 |
So something like: a = 3, then a > 5 or stop(). 00:21:27.140 |
And so you can raise an exception in this way. 00:21:35.080 |
So when you see stop-- I particularly use that-- by default, it raises a StopIteration, which 00:21:41.680 |
is what Python uses when you finish iterating through a list. 00:21:45.600 |
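(A sketch of stop: because it's a function call rather than a raise statement, it can sit inside an expression.)

```python
def stop(e=StopIteration):
    raise e

a = 3
try:
    a > 5 or stop()   # 3 > 5 is False, so stop() runs and raises
except StopIteration:
    print("stopped")
```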
But you can do anything you like, and as you can see-- oh, this is a good example of how it's used. 00:22:01.920 |
So I've created a generator function, gen, that is a lot like a map, but allows us to stop early. 00:22:15.200 |
So we can generate from some function over some sequence (so it's basically like a map), while some condition holds. 00:22:26.360 |
So for example, gen noop over Inf.count while lt(5), and that's going to return 00:22:34.200 |
the same as range(5); or operator.neg over an infinite count while greater than -5. 00:22:45.640 |
Well, here's an example which does not have a condition, but instead the actual mapping function raises the stop. 00:22:55.340 |
So map, over an infinite list, a lambda that returns o itself if o is less than 5, otherwise stop(). 00:23:02.960 |
So this is actually a super nice way of doing functional stuff in Python. 00:23:09.560 |
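(A hedged, standalone sketch of gen; fastcore's real version differs in details, but the examples below mirror the ones described.)

```python
import itertools, operator

def stop(e=StopIteration): raise e

def gen(func, seq, cond=lambda o: True):
    # Map func over seq, stopping as soon as cond fails.
    return itertools.takewhile(cond, map(func, seq))

print(list(gen(lambda o: o, itertools.count(), lambda o: o < 5)))      # [0, 1, 2, 3, 4]
print(list(gen(operator.neg, itertools.count(), lambda o: o > -5)))    # [0, -1, -2, -3, -4]
# No condition: the mapping function itself raises StopIteration to end it.
print(list(gen(lambda o: o if o < 5 else stop(), itertools.count())))  # [0, 1, 2, 3, 4]
```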
chunked we briefly saw, because it was in the data loader; and the behavior you can see: you start 00:23:16.360 |
with a range, for example, and chunk it into groups of three, and this is how we do batching. 00:23:24.440 |
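(Batching in miniature, assuming the modern fastcore packaging.)

```python
from fastcore.basics import chunked

print(list(chunked(range(10), 3)))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```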
As for the types, I think most of these we've seen: show_title and so on, so that's fine. 00:23:41.120 |
Trace is super handy if I have-- let me give you an example. 00:23:50.480 |
So let's say I'm creating, I don't know, some_list.mapped(lambda o: o*2). 00:24:12.160 |
And so I'm passing a function or lambda into something, and I want to debug something. 00:24:19.200 |
Actually, so a good example would be, what if I was doing this? 00:24:34.880 |
And maybe it's not working the way I expected, so I want to debug it. 00:24:39.040 |
You can use trace to turn any function into a traced version of that function, like so. 00:24:50.900 |
And so now, if I step, I will be stepping into operator.neg, which it 00:25:02.280 |
looks like I can't step into, because I guess it's not written in Python. 00:25:10.000 |
Let's create our own version then: def neg(x): return -x, just for this example. There we go. 00:25:34.280 |
So this is really handy for debugging stuff, particularly where you're doing map or something 00:25:40.080 |
like that, passing in some function that might be from fast.ai or the Python standard library 00:25:49.160 |
You just stick a set_trace in and then return the same function. 00:25:57.240 |
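(A minimal sketch of trace along those lines.)

```python
from pdb import set_trace

def trace(f):
    "Wrap f so the debugger opens every time it's called."
    def _inner(*args, **kwargs):
        set_trace()               # drop into pdb at the call site
        return f(*args, **kwargs)
    return _inner

def neg(x): return -x

# e.g. some_list.mapped(trace(neg)) would now break into pdb on each element
```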
compose does function composition; maps does the same as map, but you can pass in multiple functions. 00:26:08.160 |
Yeah, they're all pretty self-explanatory, and then these are the ones we're not really 00:26:18.080 |
using at the moment, so don't worry about that. 00:26:21.760 |
If you look at any of those other things and decide you're interested, feel free to ask 00:26:36.920 |
It's interesting to note that the Python standard library has a bz2 module, 00:26:45.160 |
but it doesn't do simple things like unzipping a file at a path. 00:26:50.320 |
So here's a little function, bunzip, that just does that. 00:26:54.520 |
It just takes a path and decompresses it with bz2 using the standard library. 00:26:59.440 |
So you don't have to call out to an external process. 00:27:04.720 |
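(A sketch of the idea using only the standard library; fastai's helper along these lines is called bunzip, and the details here are assumptions.)

```python
import bz2, shutil
from pathlib import Path

def bunzip(fn):
    "Decompress a .bz2 file next to itself, dropping the suffix."
    fn = Path(fn)
    out = fn.with_suffix('')  # e.g. data.csv.bz2 -> data.csv
    with bz2.open(fn, 'rb') as src, open(out, 'wb') as dst:
        shutil.copyfileobj(src, dst)
```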
So this kind of thing is very useful to create cross-platform compatible code. 00:27:18.480 |
So as you can see, it's a bit of a mishmash of stuff that we've thrown in there as we've needed it. 00:27:24.240 |
The main thing I wanted to show you then was augmentation functionality, and the data augmentation 00:27:34.200 |
functionality is basically grouped into two phases. 00:27:38.600 |
You can either do data augmentation on individual items, like an individual image, or you can do it on a whole batch at a time. 00:27:48.640 |
And so obviously we would rather do data augmentation on a whole batch, so we can do it on the GPU. 00:27:53.000 |
But it's pretty difficult to create data augmentation functions that operate on a batch where the 00:28:00.840 |
things in a batch are different sizes, because you can't really create a proper tensor out of differently sized things. 00:28:07.560 |
So to deal with that, we suggest you first of all do a data augmentation that resizes 00:28:14.360 |
things to a consistent size, and then do the rest of your data augmentation on the GPU 00:28:22.000 |
By the way, as most of you probably know, you need to make sure that if you're doing 00:28:27.080 |
something like segmentation or object detection, your independent variables and your dependent 00:28:32.960 |
variables get augmented using the same random state; 00:28:40.920 |
your mask and your image both need to be rotated by the same amount, for example. 00:28:47.920 |
So to let that happen, we have a subclass of Transform called RandTransform. 00:28:53.900 |
And RandTransform overrides __call__ from Transform to just add an extra callback called before_call. 00:29:07.360 |
So remember how our transforms will, by default, get called separately on each element of a tuple. 00:29:18.740 |
So we need to make sure that we do any randomization before that happens. 00:29:27.760 |
And so by default, our RandTransform has a p, a probability that the transform is applied. 00:29:35.440 |
And by default, our before_call will set something called do (do you want to do it or not?), 00:29:42.000 |
which is whether some random number is less than that p or not. 00:29:55.880 |
So for example, we can create a RandTransform where the encoder adds one. 00:30:03.840 |
And so p=0.5 means it will be applied half the time. 00:30:09.660 |
So let's-- I mean, we don't really need to create it there. 00:31:13.300 |
So you can see that the do attribute will be set to true about half the time. 00:31:20.940 |
So if it's set to true, we'll set this thing to say, yep, it was set to true at least once. 00:31:26.900 |
Otherwise, we'll set this thing saying it was false at least once. 00:31:29.420 |
So we'll make sure that both of them got called. 00:31:31.520 |
That way, we know it is actually randomizing properly. 00:31:36.300 |
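(A framework-free sketch of the p/do/before_call mechanics and the randomization test just described; fastai's RandTransform has more to it than this.)

```python
import random

class RandTransformSketch:
    def __init__(self, enc, p=0.5): self.enc, self.p = enc, p
    def before_call(self, x):
        self.do = random.random() < self.p  # the "do we apply it?" flag
    def __call__(self, x):
        self.before_call(x)
        return self.enc(x) if self.do else x

t = RandTransformSketch(enc=lambda x: x + 1, p=0.5)
saw_true = saw_false = False
for _ in range(100):
    t(0)
    if t.do: saw_true = True
    else:    saw_false = True
assert saw_true and saw_false  # both branches hit, so it's randomizing
```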
Now, most of the time, you're not going to create a RandTransform by passing an encoder in directly. 00:31:41.320 |
Most of the time, you will create a RandTransform by-- let's see if I can find an example-- by 00:31:53.460 |
inheriting from RandTransform and defining before_call and encodes. 00:32:02.740 |
So the encodes is just the usual fast.ai transform encodes. 00:32:08.900 |
And this is the bit where you get to set things up. 00:32:25.620 |
So before we do, it would be nice to have a flip_lr method which we can call 00:32:31.100 |
on pretty much anything; that isn't a random transform, just something where we can 00:32:37.060 |
say, like this: show_image(img.flip_lr()). 00:32:43.420 |
So if I'm going to be able to say img.flip_lr(), then the easiest way to do that is to patch it in. 00:32:53.540 |
And I mean, there's no reason for that just to be a PIL image. 00:33:01.380 |
May as well make it as convenient as possible. 00:33:06.540 |
So now we have something called flip_lr, which is a method of Image, TensorImage, TensorPoint, 00:33:15.380 |
and TensorBBox. And so we can test that by creating a tensor from a PIL image. 00:33:30.940 |
We can test flip_lr, we can create a tensor point, we can create a tensor bbox. 00:33:41.340 |
As you can see, we're just checking that our flip_lr works correctly. 00:33:48.080 |
For the PyTorch case, tensors already have a .flip, so we can use that. 00:34:04.260 |
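(A hedged sketch of the patching pattern, assuming fastcore's @patch decorator; TensorImageSketch stands in for fastai's TensorImage.)

```python
from fastcore.basics import patch
from PIL import Image
import torch

class TensorImageSketch(torch.Tensor): pass  # stand-in for fastai's TensorImage

@patch
def flip_lr(x: Image.Image):
    "Horizontal flip for PIL images"
    return x.transpose(Image.FLIP_LEFT_RIGHT)

@patch
def flip_lr(x: TensorImageSketch):
    "Horizontal flip for image tensors: flip the last (width) axis"
    return x.flip(-1)
```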
So when you want to now turn that into a random data augmentation, you inherit from RandTransform. 00:34:13.580 |
And actually, in this case, everything you want to use it on is going to use exactly 00:34:20.780 |
the same line of code; it's going to have the same name. 00:34:23.340 |
So there's a couple of ways we could do this, right? 00:34:25.220 |
One would be to say def encodes(self, x:TensorImage): return x.flip_lr(); that'd be one way. 00:34:49.220 |
We'd like to do it for each different type, but they're all going to have the same code. 00:34:58.540 |
So another thing we could do would be to actually use a tuple as our dispatch type. 00:35:09.460 |
And in fastai, if you use a tuple, it means any of these. But we actually have an even 00:35:16.220 |
easier way with RandTransform: the default encodes will basically 00:35:32.820 |
call the method whose name matches self.name. And so in this case, if I set self.name to 'flip_lr', it's just going to call that 00:35:40.460 |
function, which is the function I want to call. 00:35:44.120 |
But we wouldn't want that to be unconditional: you might have some types that just so happen to have this 00:35:48.920 |
method name that we don't want to do data augmentation on. 00:35:52.220 |
So the other thing you do is you add your type to this supports list, to say this type is supported. 00:36:00.240 |
So if you later on have something else that has a flip_lr method and you want it to be 00:36:06.100 |
added to the things that get flipped in data augmentation-- 00:36:10.380 |
so maybe your class is, I don't know, a 3D image; that might be something called Image3D. 00:36:16.620 |
Then you could just say FlipItem.supports.append(Image3D), and that's it. 00:36:26.100 |
Now that's going to get random data augmentation as well. Or you can do it in the usual way, 00:36:35.060 |
which is def encodes(self, x:Image3D), and then you can do it there. 00:36:51.700 |
So that's the usual way of adding stuff to a transform. 00:37:02.060 |
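(A sketch of both routes, following the dev-era API described here; Image3D is hypothetical, and the supports list may be spelled differently in current fastai.)

```python
from fastai.vision.augment import FlipItem  # modern module path (assumption)

class Image3D:
    "Hypothetical type that happens to have its own flip_lr"
    def flip_lr(self): return self  # stub, just for illustration

# Route 1: the type already has flip_lr, so just register it.
FlipItem.supports.append(Image3D)

# Route 2: the usual transform route, dispatching on the type explicitly.
class FlipItem3D(FlipItem):
    def encodes(self, x: Image3D): return x.flip_lr()
```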
Another interesting point about RandTransforms is that filt is set to zero. To remind 00:37:11.220 |
you, in transforms we use this to decide whether or not to call a transform, based on what subset we're processing. 00:37:23.020 |
So this says: because filt is zero for this transform, it will by default only be called 00:37:30.340 |
on your training set, and will not be called on your validation or test sets. 00:37:36.840 |
You can obviously change that by setting filt to something else, but that's the default. 00:37:41.500 |
And so when we then create our FlipItem transform to test it out, when we call it, we have to 00:37:48.660 |
say filt=0, because we're not using a data source or anything here to say, hey, this is the training set. 00:37:59.620 |
So dihedral, for those of you that remember, is basically the same thing, except it flips 00:38:05.500 |
also vertically or with transposes, so the eight possible dihedral symmetries. 00:38:14.500 |
And as you can see, it's doing the same thing. 00:38:17.980 |
So now we've patched dihedral into all these types. 00:38:25.180 |
And this time, we don't only have a p, but we also need to set our random number between 00:38:36.640 |
0 and 7, saying which of these types of flip we'll be doing. 00:38:46.980 |
Presumably random.randint is inclusive, is it? 00:39:01.060 |
Yeah, so by doing this in before_call, we make sure that that k is available. 00:39:26.420 |
So we're testing it only with this image.dihedral approach, but we're not testing it with the transform. 00:39:36.380 |
Ah, that was a mistake, because this k is going to need to be passed in here. 00:39:48.500 |
All right, so let's-- we have to call that ourselves, which is no problem. 00:40:00.700 |
All right, so I think we're going to have to go def encodes(self, x: ...), and then 00:40:27.860 |
we'll list all the types we support, just all of those. 00:40:37.260 |
And so now there's no point passing this name along, because we're not using the automatic name-based dispatch anymore. 00:40:46.820 |
And so now we will return x.dihedral(self.k). 00:41:00.180 |
And so now we need to make sure that we have a test of that. 00:41:05.580 |
So what we could do is we could create a transform here, called a DihedralItem transform. 00:41:33.300 |
So we're going to go, and then we will go show-- we'll go f image. 00:41:56.820 |
There's no need to use the i, because this is random this time. 00:42:03.100 |
So hopefully we'll see a nice mix of random transforms. 00:42:40.780 |
So we've got lots of different random versions. 00:42:46.380 |
So let's get rid of this one, since we don't really need it twice. 00:42:55.180 |
And this one we can probably hide, because it's just testing it on TensorPoint as well. 00:43:17.660 |
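(A framework-free sketch of the DihedralItem pattern: pick k once per item in before_call, so image, mask, points and boxes all get the same symmetry; the .dihedral method on x is an assumption here.)

```python
import random

class DihedralItemSketch:
    def __init__(self, p=0.5): self.p = p
    def before_call(self):
        # Runs once per item, BEFORE the encoder is applied to each element,
        # so every element of the tuple shares the same k.
        self.do = random.random() < self.p
        self.k = random.randint(0, 7)  # randint IS inclusive, so k is 0..7
    def encodes(self, x):
        return x.dihedral(self.k) if self.do else x  # assumes a patched .dihedral

t = DihedralItemSketch()
t.before_call()
# img2, mask2 = t.encodes(img), t.encodes(mask)  # same symmetry for both
```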
So those are transforms that work on a single item at a time. 00:43:23.500 |
So here is something called CropPad, which will either crop or pad, depending on what 00:43:34.420 |
is necessary to create the size that you ask for, which we've seen in other versions of fastai. 00:43:43.020 |
There's a CropPad, which calls something called do_crop_pad, and then do_crop_pad is defined-- 00:43:54.500 |
Let's move everything around so it's easier to see what's going on. 00:44:05.940 |
for TensorBBox, for TensorPoint, and for any kind of image. 00:44:13.220 |
So since we now have that working, we can, in our CropPad transform, simply call that, as you can see. 00:44:43.260 |
Still some room to refactor this a little bit, but it's on the right track. 00:44:58.260 |
And as you can see, you can choose what padding mode you want, reflection, border, zeros. 00:45:05.820 |
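(A framework-free sketch of the crop-or-pad idea for a PIL image: centre-crop any dimension that's too big, pad any that's too small; the real version also handles points, boxes, and the padding modes just mentioned.)

```python
from PIL import Image

def crop_pad_center(im, size):
    w, h = im.size; tw, th = size
    # Centre-crop dimensions that are too large...
    left, top = max((w - tw) // 2, 0), max((h - th) // 2, 0)
    im = im.crop((left, top, left + min(w, tw), top + min(h, th)))
    if im.size == (tw, th): return im
    # ...then pad out anything still too small, keeping the image centred.
    out = Image.new(im.mode, (tw, th))  # zero padding only, for simplicity
    out.paste(im, ((tw - im.size[0]) // 2, (th - im.size[1]) // 2))
    return out
```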
So then you can inherit from that to create RandomCrop, just adding the randomization. 00:45:26.820 |
And one of the nice things here is it'll automatically take the-- oh, we should check this actually works. 00:45:37.900 |
We want it to take the center crop automatically on the validation set, although I don't know if it does yet. 00:45:48.660 |
OK, we can do resizing, similar idea, again, just inheriting from CropPad. 00:45:59.860 |
The famous RandomResizedCrop used in pretty much all ImageNet solutions is just another one of these. 00:46:08.580 |
And then we start the random transforms that'll work on the GPU. 00:46:17.540 |
And there's nothing particularly different about them. 00:46:22.460 |
These do need to be refactored a little bit. 00:46:27.900 |
There's a before call, there's encodes for the different types you want. 00:46:33.340 |
And they're just written so that the matrix math works out with the extra batch dimension 00:46:40.780 |
So for example, dihedral is done on the GPU by using an affine transform. 00:46:55.980 |
And most of this stuff, the affine transforms and warps and stuff, we did touch on in Part 2 of the course. 00:47:02.620 |
And the lighting transforms were also done there, if you've forgotten. 00:47:22.220 |
So this is Imagenette, and it looks pretty familiar: untar_data, get_image_files. 00:47:37.100 |
Obviously, you could use data blocks as well, but this is doing it fairly manually. 00:47:46.580 |
These were mentioned in the previous code walkthroughs, so check those out. 00:47:51.340 |
I was just looking at the script starting with the number 9 to see how they're defined. 00:48:00.100 |
So the transforms for the independent variable will just be: create an image. For the dependent, 00:48:04.580 |
it will be the parent_label function, and then Categorize. 00:48:09.580 |
And then for the tuples, we go to tensor, optionally flip, RandomResizedCrop; create a data 00:48:18.140 |
source; and then on the batch, I should say, put it on the GPU, turn it into a float tensor, and normalize. 00:48:41.940 |
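(A heavily hedged sketch of that pipeline using today's fastai names for the same steps; the video predates some renames, e.g. DataSource became Datasets, so treat the details as assumptions.)

```python
from fastai.vision.all import *

path = untar_data(URLs.IMAGENETTE_160)
items = get_image_files(path)
splits = GrandparentSplitter(valid_name='val')(items)

# Independent: open the image; dependent: folder name -> category.
tfms = [[PILImage.create], [parent_label, Categorize()]]
dsets = Datasets(items, tfms, splits=splits)

dls = dsets.dataloaders(
    after_item=[FlipItem(), RandomResizedCrop(128), ToTensor()],
    after_batch=[IntToFloatTensor(), Normalize.from_stats(*imagenet_stats)])
```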
And here's the data block version of the same thing. 00:48:55.020 |
So then some of these still need to be exported, but we can create a cnn_learner to wrap our 00:49:03.940 |
learner a bit more conveniently, like we did in version 1, add label smoothing, and fit. 00:49:51.100 |
Actually, we tend not to add much augmentation, because we tend to use mixup nowadays if we need more regularization. 00:50:01.620 |
So we tested this on more epochs, and it was getting slightly better results than we were before. 00:50:13.300 |
All right, I think-- oh, one more question: can torchvision models be used? 00:50:33.280 |
So if you look at xresnet, it's just a normal nn.Sequential. 00:50:41.700 |
So yeah, there shouldn't be any special requirements. 00:50:47.380 |
If you try using a model and it doesn't work, let us know. 00:50:53.900 |
I guess for stuff like transfer learning, maybe that's something we can do in a future 00:50:58.980 |
Yeah, we should probably do that in a future walkthrough, talk about how that stuff works. 00:51:07.100 |
See you on the forums, and I'll let you know if we're going to do more of these in the future.