
fastai v2 walk-thru #10


Chapters

0:00 Introduction
1:25 Core
8:10 mk_class
11:40 Multiple inheritance
21:01 stop
23:10 trace
25:55 Composition
26:33 ls
28:47 Data augmentation
29:55 RandTransform
32:20 Flip LR
37:00 filt
43:15 CropPad


00:00:00.000 | >> Okay, hi there, everybody.
00:00:09.100 | Can you see in here okay?
00:00:22.040 | Okay, let me know if anybody has any requests.
00:00:43.480 | This will be the last walkthrough, at least for a while, because we've covered most of
00:00:48.640 | the main stuff other than what's in learner and optimizer and stuff, which is largely
00:00:57.400 | the same as what was in the course in Part 2, Version 3 of the course.
00:01:04.880 | We'll probably do another walkthrough in a couple of weeks or so to cover the little
00:01:09.000 | differences, I suspect.
00:01:14.040 | We can have a quick look at the Imagenette tutorial today as well.
00:01:18.760 | Mainly I was going to look at the data augmentation and finish off a little bit more of the stuff
00:01:24.800 | that's in Core.
00:01:25.800 | But yeah, just ask any questions if there's stuff either we haven't covered or you didn't
00:01:32.800 | quite follow or you don't quite know how it fits together.
00:01:36.600 | In the meantime, we'll get back to Core.
00:01:44.600 | And so you can see in Core, other than metaclasses, which we're probably going to be largely getting
00:01:51.280 | rid of, so we'll ignore that, there's the kind of decorators and stuff we talked about,
00:01:57.480 | there's GetAttr, there's L, and then there's a bunch of stuff.
00:02:02.640 | And a bunch of stuff is, broadly speaking, stuff for collections, types, functions about
00:02:11.400 | functions, file and network.
00:02:17.880 | So in terms of stuff that's interesting, one thing that's kind of interesting is mk_class.
00:02:31.120 | mk_class is a replacement for typing a class.
00:02:36.140 | So basically, I'll just make sure I've done a git pull here.
00:02:46.280 | There we go.
00:02:59.360 | So basically this one here, mk_class('_T', a=1, sup=GetAttr), is the same
00:03:08.200 | as typing this: class _T(GetAttr): a=1.
00:03:24.580 | So those are basically two ways of doing the same thing, except that this version has a
00:03:33.360 | bit of extra functionality.
00:03:34.880 | Let me think, what does it do?
00:03:44.880 | Been a while since I've looked at this one.
00:03:58.040 | Ah, yes, okay.
00:04:02.240 | So the class that we get back works kind of a bit more like a data class from the Python
00:04:18.100 | dataclasses standard library module.
00:04:23.360 | So you can kind of see, and behind the scenes it calls this thing called get_class, which
00:04:30.080 | is actually kind of what does the work.
00:04:35.480 | So you can see that we can pass in, for instance, 'a'. It says that we will get something called,
00:04:45.680 | as you see here, t.a. Let's try that.
00:04:56.440 | So we could go t = _T(). There we go, okay, t.a. And so you can see that
00:05:07.760 | if we just pass in something saying this is a field -- so we first pass in a list
00:05:11.560 | of field names -- then we get that as an empty field.
00:05:16.800 | You can pass in a value for it.
00:05:22.040 | And so that will mean that the first field listed, a, will get that value, as you can see.
00:05:30.700 | You can pass in keyword argument fields.
00:05:37.000 | So in this case, b here, you can initialize like this.
00:05:43.360 | There it is.
00:05:49.260 | You can also pass in functions that you want to be included.
00:05:55.320 | So for example, def _f(self): print('hi').
00:06:08.880 | And so then we could say funcs=_f. And so now we'll have an _f.
00:06:20.160 | Oops, what did I do wrong there? funcs... _f... and what did I do wrong?
00:06:39.200 | Let's have a look.
00:06:40.400 | Maybe I should get rid of the underscore -- that might be confusing for it, because that's
00:06:49.640 | normally hidden. Oh, sorry, I'm being silly.
00:07:11.720 | We need to recreate it.
00:07:13.600 | So we need to go -- I'm going crazy.
00:07:20.760 | So we have to create the class, funcs=f. And we have a. And we had b=2.
00:07:41.760 | Right.
00:07:42.760 | And so now there we go, that's our function.
00:07:53.320 | Let's see.
00:08:03.160 | We get a repr for free.
00:08:05.680 | Let's try that one.
00:08:08.520 | Yep, there we go.
00:08:11.680 | So you can see I can just type t. You'll see that the repr isn't added if you put in
00:08:18.280 | a superclass, because the superclass might have its own repr.
00:08:22.760 | So basically, by using this mk_class, we can quickly create a little -- it's kind of
00:08:27.280 | good for tests and stuff.
00:08:29.340 | We can create a quick class, which has some key values that we want, some fields that
00:08:34.480 | we want.
00:08:35.480 | You can initialize it.
00:08:36.960 | It's got a representation.
00:08:39.200 | So that's pretty handy little thing that we use quite often in the tests.
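To make that concrete, here's a minimal sketch of the idea in plain Python -- my own simplified stand-in, not fastai's actual mk_class (which also does things like injecting the class into the calling module):

    def mk_class(name, *fld_names, sup=object, funcs=None, **flds):
        # Build a small class with the given fields and a repr, like a quick dataclass
        def __init__(self, *args, **kwargs):
            attrs = dict.fromkeys(fld_names)        # positional fields default to None
            attrs.update(flds)                      # keyword fields carry their defaults
            attrs.update(zip(fld_names, args))      # positional args fill fields in order
            attrs.update(kwargs)
            for k, v in attrs.items(): setattr(self, k, v)
        def __repr__(self):
            return f"{name}({', '.join(f'{k}={v!r}' for k, v in vars(self).items())})"
        members = {'__init__': __init__, '__repr__': __repr__}
        for f in (funcs or []): members[f.__name__] = f
        return type(name, (sup,), members)

    _T = mk_class('_T', 'a', b=2)
    t = _T(1)
    print(t)    # _T(a=1, b=2)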
00:08:48.200 | Let's have a look, actually.
00:08:49.200 | See if it's also used in the actual main code base.
00:08:52.220 | So let's look for mk_class.
00:08:55.520 | OK, it's used occasionally.
00:09:00.580 | So here's an example.
00:09:01.580 | One usage is inside layers.
00:09:04.880 | So let's take a look at that.
00:09:11.280 | OK, so here's the example in layers.
00:09:20.160 | It's creating a class called PoolType.
00:09:25.060 | And it has a bunch of fields in it.
00:09:32.440 | So this is actually a really good example of something that we do occasionally.
00:09:36.220 | So let's use that to see how it works.
00:09:40.440 | So let's see what PoolType now looks like.
00:09:45.920 | So we're going through Avg, Max, Cat, and creating a dictionary.
00:09:50.640 | So if we go to -- well, let's just go PoolType.<tab>, and there you can see Avg, Cat, and Max.
00:10:03.720 | This is actually a really nice little way of creating kind of enum strings, or kind of
00:10:07.200 | strings that have nice tab completion, so it's a good use of mk_class.
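The trick itself is tiny; stripped of mk_class, it's essentially this (a sketch of the idea, not fastai's exact definition):

    class PoolType:
        # the attributes are just strings, but they tab-complete nicely
        Avg, Max, Cat = 'avg', 'max', 'cat'

    print(PoolType.Avg)   # 'avg'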
00:10:14.720 | Here you go, doing exactly the same here for PadMode.
00:10:21.600 | It's actually the same here for ResizeMethod.
00:10:28.760 | So other than that-- oh, here's one again that's getting all the events for callbacks.
00:10:43.040 | And then in the notebooks -- oh, I'm in the wrong place -- in the notebooks, yeah,
00:11:11.760 | you can see we use it sometimes to create little quick and dirty things for testing.
00:11:19.200 | So we tend to use it instead of data classes, because we find it a bit more convenient sometimes.
00:11:29.920 | So Kevin asked about multiple inheritance.
00:11:41.960 | So multiple inheritance is pretty widely covered in Python tutorials, so I won't go into detail,
00:11:49.360 | but if you go to Python multiple inheritance, you will find lots of examples.
00:11:58.360 | But the answer is, yeah, basically you have multiple base classes.
00:12:03.140 | So what will happen is that when you look in super for something, it will first look
00:12:10.400 | in here.
00:12:11.400 | If it doesn't find it, then it will look in here.
00:12:15.800 | That's basically what multiple inheritance does.
00:12:20.120 | So in this case, we wanted this to contain all of the basic collection stuff, and we
00:12:28.640 | also wanted it to have the GetAttr functionality.
00:12:31.320 | So this means we now have both.
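For example, here's a minimal illustration of that lookup order:

    class A:
        def greet(self): return "A"
    class B:
        def greet(self): return "B"
        def other(self): return "B.other"
    class C(A, B): pass      # attribute lookup tries A first, then B

    c = C()
    print(c.greet())   # "A"       -- found on the first base
    print(c.other())   # "B.other" -- not on A, so Python falls back to B
    print(C.__mro__)   # the full method resolution order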
00:12:37.040 | Right.
00:12:40.720 | What else have we got here in functions?
00:13:00.200 | wrap_class is something that just takes a function and wraps it into a class by calling
00:13:05.160 | mk_class.
00:13:07.160 | So sometimes that's handy for tests where you want to quickly create something that
00:13:11.280 | you can test methods on.
00:13:14.280 | No worries, Kevin.
00:13:23.240 | The collection functions are worth knowing about, but they're pretty simple.
00:13:26.200 | So like, tuplify to turn something into a tuple, uniqueify to get the unique values of
00:13:33.160 | something, setify to turn something into a set; groupby is super handy.
00:13:45.720 | It takes a list like this and groups it by some function.
00:13:56.280 | In this case, ['aa','ab','bb'] being grouped by itemgetter(0) is going to group by the
00:14:02.560 | first letter.
00:14:03.560 | So 'a' is ['aa','ab'], and 'b' is ['bb'].
00:14:09.000 | That's a very handy function.
00:14:10.120 | There's something called itertools.groupby in the Python standard library.
00:14:15.680 | I find this one more convenient and less complicated to understand.
00:14:23.360 | Merging dictionaries is often handy.
00:14:25.040 | So it just does what the name suggests: merges the dictionaries together.
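To make the behavior concrete, here are simplified stand-ins for groupby and merge (sketches of the idea, not fastai's exact implementations):

    from collections import defaultdict
    from operator import itemgetter

    def groupby(xs, key):
        # group the items of xs into a dict keyed by key(x)
        res = defaultdict(list)
        for x in xs: res[key(x)].append(x)
        return dict(res)

    def merge(*ds):
        # merge dictionaries left to right; later keys win
        res = {}
        for d in ds: res.update(d)
        return res

    print(groupby(['aa', 'ab', 'bb'], itemgetter(0)))   # {'a': ['aa', 'ab'], 'b': ['bb']}
    print(merge({'a': 1}, {'b': 2}, {'a': 3}))          # {'a': 3, 'b': 2}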
00:14:37.400 | Okay, yeah.
00:14:55.640 | We probably will do a walkthrough with the text stuff.
00:15:04.160 | But we haven't really gone back over it to see whether the ideas from tabular could be
00:15:09.600 | used there.
00:15:11.360 | Because we wrote the text stuff first, before we did tabular, but we might be able to refactor
00:15:16.600 | text using in-place transforms.
00:15:18.920 | I'm not sure.
00:15:20.240 | So if we do end up with two different approaches, we'll explain why.
00:15:24.680 | Hopefully we won't, though, because it's nice not to have to have two different approaches.
00:15:29.840 | Oh, this is fun.
00:15:32.920 | I don't use this very much, but I possibly could use it more; it's more of an exploration
00:15:37.600 | of some functional programming ideas.
00:15:41.000 | This is basically something where I'm creating all of these things as functions: less than,
00:15:49.000 | greater than, less than or equal, greater than or equal, et cetera.
00:15:52.440 | They are all functions that exist within the operator module in the Python standard library.
00:16:04.980 | And these functions work the same way: lt(3,5) is True, gt(3,5) is False.
00:16:12.440 | But they also allow you to do what's called currying, as you can see here.
00:16:19.640 | And specifically, this is helpful, where you can kind of go f = lt(3).
00:16:29.760 | And then I could do f(5), so just putting that into two lines.
00:16:38.080 | And the reason that's interesting is so I could do stuff like L.range(8).
00:16:49.880 | So let's pick those ones.
00:16:51.760 | Let's say we had some list.
00:16:52.760 | We wanted to grab those which are less than 3: I could go .filtered(lt(3)), like so.
00:17:03.160 | And so that's just a lot more convenient than writing it the normal way, which would be lambda
00:17:11.520 | x: x < 3.
00:17:16.440 | So this kind of thing is pretty common, like pretty much all functional languages allow
00:17:22.720 | you to do currying like this.
00:17:24.800 | Unfortunately, Python doesn't, but you can kind of create versions that do work that way.
00:17:35.320 | And as you can see, the way I'm doing it is I've created this little _oper thing, which,
00:17:39.040 | if you pass in only one thing, then b is None, then it returns a lambda; otherwise it does
00:17:44.960 | the actual operation.
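Here's a sketch of that pattern, with _mk_op standing in for the little wrapper helper (the real fastai code differs in details):

    import operator

    def _mk_op(op):
        # two args: apply the operator; one arg: return a curried version
        def _inner(a, b=None):
            return (lambda o: op(o, a)) if b is None else op(a, b)
        return _inner

    lt, gt = _mk_op(operator.lt), _mk_op(operator.gt)
    print(lt(3, 5))                         # True  -- the plain two-argument call
    f = lt(3)                               # curried: "is it less than 3?"
    print(f(5))                             # False
    print(list(filter(lt(3), range(8))))    # [0, 1, 2]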
00:17:48.480 | And so that is handy with some of this extra stuff that I'm going to show here.
00:17:53.720 | For example, I've created an infinite-lists class so that you can do things like an infinite
00:18:06.320 | count, which is a range from 0 to infinity.
00:18:10.920 | So I could do, for instance, list(filter(Inf.count, lt(10))).
00:18:24.720 | Oops, I've got those the wrong way around; filter in Python first takes a function:
00:18:40.720 | lt(10), then Inf.count.
00:18:45.960 | Now that's interesting, why isn't that working?
00:18:55.880 | Never mind, we can do it this way: zip(range(5, 15),
00:19:09.560 | Inf.count), and then list that.
00:19:16.680 | OK, there we go.
00:19:20.720 | So you can see that the numbers coming from Inf.count are just the numbers counting up.
00:19:30.560 | So we could do things like list(itertools.islice(Inf.count, 10)), say, or we could map
00:19:54.240 | lambda x: x * 2 over it, like so, and so forth.
00:20:06.200 | You could replace Inf.count with Inf.zeros, as you can see, or Inf.ones.
00:20:16.840 | So it's often really handy to be able to have quick infinite lists available to you.
00:20:20.440 | We use that, for example, in the data loader.
00:20:25.200 | So that's basically why this is here.
00:20:27.240 | It's actually very challenging in-- I mean, not challenging.
00:20:33.400 | It's awkward in Python to be able to create a property like this that behaves this way.
00:20:39.160 | The only way I could find to do it was to make a metaclass and put the properties
00:20:44.560 | in the metaclass, and then give that metaclass to the class I actually want.
00:20:54.760 | So it's not too bad, but it's a little more awkward than would be ideal.
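Here's roughly what that metaclass trick looks like (a simplified sketch, not fastai's actual Inf class):

    import itertools

    class _InfMeta(type):
        # properties must live on the metaclass for Inf.count to work on the class itself
        @property
        def count(cls): return itertools.count()
        @property
        def zeros(cls): return itertools.cycle([0])
        @property
        def ones(cls): return itertools.cycle([1])

    class Inf(metaclass=_InfMeta):
        "Access point for handy infinite iterators."

    print(list(zip(range(5, 10), Inf.count)))   # [(5, 0), (6, 1), (7, 2), (8, 3), (9, 4)]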
00:21:03.420 | It's often useful to be able to have an expression that raises an exception.
00:21:14.080 | So something like: a = 3, then a > 5 or stop().
00:21:27.140 | And so you can raise an exception in this way.
00:21:33.040 | So I do that quite often.
00:21:35.080 | So when you see stop -- I particularly use that -- by default, it raises a StopIteration, which
00:21:41.680 | is what Python uses when you finish iterating through a list.
00:21:45.600 | But you can do anything you like, and as you can see-- Oh, this is a good example of how
00:22:00.920 | to use that.
00:22:01.920 | So I've created a generator function that is a lot like a map, but it allows us to
00:22:09.480 | handle StopIteration nicely.
00:22:15.200 | So we can generate from some function over some sequence-- so it's basically like a map--
00:22:23.680 | as long as some condition is true.
00:22:26.360 | So for example, doing noop over Inf.count while less than 5 is going to return
00:22:34.200 | the same as range(5); for operator.neg over an infinite count, while it's greater
00:22:42.000 | than negative 5,
00:22:43.360 | we'll return this.
00:22:45.640 | Well, here's an example which does not have a condition, but instead the actual mapping
00:22:53.240 | function has a stop in it.
00:22:55.340 | So map over an infinite list, returning o itself if o is less than 5, otherwise stop.
00:23:02.960 | So this is actually a super nice way of doing functional stuff in Python.
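Putting stop and gen together, here's a minimal sketch of both (simplified stand-ins, not fastai's exact definitions):

    import itertools, operator

    def stop(e=StopIteration):
        # raise an exception from an expression position, e.g. `x > 5 or stop()`
        raise e

    def gen(func, seq, cond=lambda o: True):
        # map func over seq for as long as cond holds on the results
        return itertools.takewhile(cond, map(func, seq))

    print(list(gen(lambda o: o, itertools.count(), lambda o: o < 5)))      # [0, 1, 2, 3, 4]
    print(list(gen(operator.neg, itertools.count(), lambda o: o > -5)))    # [0, -1, -2, -3, -4]
    # the mapping function itself can end the sequence with stop():
    print(list(gen(lambda o: o if o < 3 else stop(), itertools.count())))  # [0, 1, 2]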
00:23:09.560 | chunked, we briefly saw, because it was in the data loader, and the behavior you can see: you start
00:23:16.360 | with a range, for example, and chunk it into groups of three, and this is how we do batching
00:23:21.800 | in the data loader.
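A sketch of the idea behind chunked (simplified, not fastai's exact implementation):

    import itertools

    def chunked(it, cs):
        # yield successive chunks of size cs from the iterable it
        it = iter(it)
        while True:
            chunk = list(itertools.islice(it, cs))
            if not chunk: return
            yield chunk

    print(list(chunked(range(10), 3)))   # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]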
00:23:24.440 | In types, we've seen -- I think most of these types we've seen; show_title, so that's fine.
00:23:41.120 | trace is super handy. If I have -- let me give you an example.
00:23:50.480 | So let's say I'm creating, I don't know, some list mapped with lambda o: o * 2.
00:24:10.120 | So I've got something like this, right?
00:24:12.160 | And so I'm passing a function or lambda into something, and I want to debug something,
00:24:18.040 | like something like this.
00:24:19.200 | Actually, so a good example would be, what if I was doing this?
00:24:22.480 | Whoops, I should say operator.neg.
00:24:34.880 | And maybe it's not working the way I expected, so I want to debug it.
00:24:39.040 | You can use trace to turn any function into a traced version of that function, like so.
00:24:50.900 | And so now I can step into -- if I step, I will be stepping into operator.neg, which it
00:25:02.280 | looks like I can't step into, because I guess that's not written in Python.
00:25:06.560 | That's annoying.
00:25:10.000 | Let's create our own version then: def neg(x): return -x, just for this example. There we go.
00:25:28.760 | Step.
00:25:30.760 | And there you go.
00:25:31.760 | Here we are.
00:25:32.760 | We've stepped into neg.
00:25:34.280 | So this is really handy for debugging stuff, particularly where you're doing map or something
00:25:40.080 | like that, passing in some function that might be from fast.ai or the Python standard library
00:25:44.920 | or PyTorch or whatever.
00:25:47.920 | And as you can see, it's very simple.
00:25:49.160 | You just stick in a set_trace and then return the same function.
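That really is the whole trick; a sketch:

    import pdb

    def trace(f):
        # wrap f so the debugger opens just before every call to it
        def _inner(*args, **kwargs):
            pdb.set_trace()   # from here, `step` walks into f itself
            return f(*args, **kwargs)
        return _inner

    def neg(x): return -x
    # list(map(trace(neg), [1, 2, 3]))   # uncomment to drop into the debugger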
00:25:57.240 | compose does function composition; maps does the same as map, but you can pass in multiple functions.
00:26:08.160 | Yeah, they're all pretty self-explanatory, and then these are the ones we're not really
00:26:18.080 | using at the moment, so don't worry about that.
00:26:21.760 | If you look at any of those other things and decide you're interested, feel free to ask
00:26:25.400 | in the forums.
00:26:27.480 | ls we've looked at.
00:26:34.600 | This is interesting.
00:26:35.720 | Something like bunzip.
00:26:36.920 | It's interesting to note that the Python standard library has a bz2 module,
00:26:45.160 | but it doesn't do simple things like unzip a file at some path.
00:26:50.320 | So here's a little function that just does that.
00:26:54.520 | It just takes a path and unzips it with bz2 using the standard library.
00:26:59.440 | So you don't have to call out to an external process.
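Here's a sketch of such a function using only the standard library (my own stand-in, not fastai's exact code):

    import bz2, shutil
    from pathlib import Path

    def bunzip(fn):
        # decompress a .bz2 file next to itself, without shelling out to an external tool
        fn = Path(fn)
        out = fn.with_suffix('')    # strip the trailing .bz2
        with bz2.open(fn, 'rb') as src, open(out, 'wb') as dst:
            shutil.copyfileobj(src, dst)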
00:27:04.720 | So this kind of thing is very useful to create cross-platform compatible code.
00:27:18.480 | So as you can see, it's a bit of a mishmash of stuff that we've thrown in there as we've
00:27:23.240 | needed it.
00:27:24.240 | The main thing I wanted to show you then was augmentation functionality, and the data augmentation
00:27:34.200 | functionality is basically grouped into two phases.
00:27:38.600 | You can either do data augmentation on individual items, like an individual image, or you can
00:27:44.760 | do data augmentation on a whole batch.
00:27:48.640 | And so obviously we would rather do data augmentation on a whole batch, so we can do it on the GPU.
00:27:53.000 | But it's pretty difficult to create data augmentation functions that operate on a batch where the
00:28:00.840 | things in a batch are different sizes, because you can't really create a proper tensor of
00:28:04.120 | it unless you do padding and stuff.
00:28:07.560 | So to deal with that, we suggest you first of all do a data augmentation that resizes
00:28:14.360 | things to a consistent size, and then do the rest of your data augmentation on the GPU
00:28:19.240 | as a batch.
00:28:22.000 | By the way, as most of you probably know, you need to make sure that if you're doing
00:28:27.080 | something like segmentation or object detection, your independent variables and your dependent
00:28:32.960 | variables get augmented using the same random state -- you know, your
00:28:40.920 | mask and your image both need to be rotated by the same amount,
00:28:45.720 | for example.
00:28:47.920 | So to let that happen, we have a subclass of Transform called RandTransform.
00:28:53.900 | And RandTransform overrides __call__ from Transform to just add an extra callback called
00:29:05.880 | before_call.
00:29:07.360 | So remember how our transforms will, by default, get called
00:29:14.920 | on each part of your tuple independently.
00:29:18.740 | So we need to make sure that we do any randomization before that happens.
00:29:24.360 | So this is our opportunity to do that.
00:29:27.760 | And so by default, our RandTransform has a p, a probability that the transform is applied.
00:29:35.440 | And by default, our before_call will set something called do -- so, do you want to do it or not --
00:29:42.000 | which is: is some random number less than that p or not.
00:29:50.000 | So that's how we do data augmentation.
00:29:55.880 | So for example, we can create a RandTransform where the encoder is add-one.
00:30:03.840 | And so p=0.5 means that it will be applied half the time.
00:30:09.660 | So let's-- I mean, we don't really need to create it there.
00:30:18.800 | We can actually probably create it there.
00:30:20.600 | Oh, looks like these got renamed somehow.
00:30:45.460 | Oh, I see, it doesn't need to be inside.
00:31:07.460 | Ah, yes, I see.
00:31:13.300 | So you can see that the do attribute will be set to true about half the time.
00:31:20.940 | So if it's set to true, we'll set this thing to say, yep, it was set to true at least once.
00:31:26.900 | Otherwise, we'll set this thing saying it was false at least once.
00:31:29.420 | So we'll make sure that both of them got called.
00:31:31.520 | That way, we know it is actually randomizing properly.
00:31:33.780 | Yeah, that's the basic idea.
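Here's the pattern in miniature (a simplified stand-in for RandTransform, not the real class):

    import random

    class SimpleRandTransform:
        # the key move: randomize once per tuple in before_call, then apply to every item
        def __init__(self, enc, p=0.5): self.enc, self.p = enc, p
        def before_call(self):
            self.do = random.random() < self.p   # one coin flip per tuple
        def __call__(self, xs):
            self.before_call()
            return tuple(self.enc(x) for x in xs) if self.do else xs

    add1 = SimpleRandTransform(lambda x: x + 1, p=0.5)
    print([add1((0, 0)) for _ in range(5)])   # each pair is (1, 1) or (0, 0), never mixed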
00:31:36.300 | Now, most of the time, you're not going to create a RandTransform by passing an encoder
00:31:39.900 | in like this.
00:31:41.320 | Most of the time, you will create a RandTransform by -- let's see if I can find an example -- by
00:31:53.460 | inheriting from RandTransform and defining the before_call and encodes.
00:32:02.740 | So the encodes is just the usual fast.ai transform encodes.
00:32:08.900 | And this is the bit where you get to set things up.
00:32:12.420 | So let's look at some examples.
00:32:21.860 | So let's do a flip left right.
00:32:25.620 | So before we do, it would be nice if we have a flip_lr method, which we can call
00:32:31.100 | on pretty much anything -- which isn't a random transform, just something where we can just
00:32:37.060 | say, like this: show_image(img.flip_lr()).
00:32:43.420 | So if I'm going to be able to say img.flip_lr(), then the easiest way to do
00:32:46.720 | that is with our @patch.
00:32:53.540 | And I mean, there's no reason for that just to be a PILImage.
00:32:55.980 | It could actually be an Image.Image.
00:33:01.380 | May as well make it as convenient as possible.
00:33:03.740 | There we go.
00:33:06.540 | So now we have something called flip_lr, which is a method of Image, TensorImage, TensorPoint,
00:33:12.380 | and TensorBBox.
00:33:15.380 | And so we can test that by creating a tensor from a PIL image.
00:33:30.940 | We can test flip_lr, we can create a TensorPoint, we can create a TensorBBox.
00:33:41.340 | As you can see, we're just checking that our flip_lr works correctly.
00:33:48.080 | So for the PyTorch case, it already has a .flip.
00:33:53.380 | And you just say which axis to flip on.
00:33:55.180 | So that made that one super easy.
00:33:58.180 | PIL has something else called transpose.
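For flavor, here's roughly what those patches look like, assuming fastcore's @patch decorator and the released library's names (TensorPoint and TensorBBox need their own coordinate math, which I've left out):

    from fastcore.basics import patch
    from PIL import Image
    from torch import Tensor

    @patch
    def flip_lr(x: Image.Image):
        return x.transpose(Image.FLIP_LEFT_RIGHT)   # PIL's transpose does the flip

    @patch
    def flip_lr(x: Tensor):
        return x.flip(-1)   # flip the last (width) axis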
00:34:04.260 | So when you want to now turn that into a random data augmentation, then you inherit from RandTransform.
00:34:13.580 | And actually in the case where everything you want to use it on is going to just be exactly
00:34:20.780 | the same line of code, it's going to have the same name.
00:34:23.340 | So there's a couple of ways we could do this, right?
00:34:25.220 | One would be to say def encodes(self, x: TensorImage): return x.flip_lr(); that'd be one
00:34:48.220 | way to do it.
00:34:49.220 | We'd like to do it for each different type, but they're all going to have the same code.
00:34:58.540 | So another thing we could do would be to actually use a tuple as our dispatch type.
00:35:09.460 | And in fastai, if you use a tuple, it means any of these; but we actually have an even
00:35:16.220 | easier way with RandTransform, which is that the default encodes is something which will basically
00:35:28.060 | just call a method named self.name.
00:35:32.820 | And so in this case, if I set self.name to flip_lr, then it's just going to call that
00:35:40.460 | function, which is the function I want to call.
00:35:44.120 | But we wouldn't want to, like -- you might have some types that just so happen to have this
00:35:48.920 | function name that we don't want to do data augmentation on.
00:35:52.220 | So the other thing you do is you would add your type to this supports list to say this
00:35:57.020 | is something which supports flipping.
00:36:00.240 | So if you later on have something else that has a flip_lr method and you want it to be
00:36:06.100 | added to the things that get flipped in data augmentation --
00:36:10.380 | so maybe your class is, I don't know, a 3D image; that might be something called Image3D --
00:36:16.620 | then you could say FlipItem.supports.append(Image3D), and that's it.
00:36:26.100 | Now that's going to get random data augmentation as well; or you can do it in the usual way,
00:36:35.060 | which is def encodes(self, x: ...), and then you can do it there.
00:36:51.700 | So that's the usual way of adding stuff to a transform.
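A sketch of that dispatch-by-name trick (a simplified stand-in, not the real FlipItem):

    class FlipItemSketch:
        # call x.flip_lr() only for types on the supports list
        supports = []          # extendable, e.g. FlipItemSketch.supports.append(Image3D)
        name = 'flip_lr'
        def encodes(self, x):
            if isinstance(x, tuple(self.supports)):
                return getattr(x, self.name)()
            return x           # leave unsupported types untouched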
00:36:56.980 | All right.
00:37:02.060 | Another interesting point about RandTransforms is that filt is set to zero, and to remind
00:37:11.220 | you, in transforms we use this to decide whether or not to call a transform based on what subset
00:37:22.020 | it's in.
00:37:23.020 | So this says: because filt is zero for this transform, this will by default only be called
00:37:30.340 | on your training set and will not be called on your validation or test sets.
00:37:36.840 | You can obviously change that by setting filt to something else, but that's the default.
00:37:41.500 | And so when we then create our FlipItem transform to test it out, when we call it, we have to
00:37:48.660 | say filt=0 because we're not using a data source or anything here to say, hey,
00:37:53.580 | we're in the training set.
00:37:59.620 | So dihedral, for those of you that remember, is basically the same thing, except it flips
00:38:05.500 | also vertically or with transposes, so the eight possible dihedral symmetries.
00:38:14.500 | And as you can see, it's doing the same thing.
00:38:17.980 | So now we've patched dihedral into all these types.
00:38:22.000 | So we can just say name='dihedral'.
00:38:25.180 | And this time, we don't only have a p, but we also need to set our random number between
00:38:36.640 | 0 and 7, saying which of these types of flip we'll be doing.
00:38:46.980 | Presumably random.randint is inclusive, is it?
00:38:51.980 | And it is, including both endpoints, OK?
00:38:58.320 | That's not what I expected.
00:39:01.060 | Yeah, so by doing this with before_call, we make sure that that k is available, although
00:39:15.460 | how is that going to work?
00:39:16.700 | It's not.
00:39:18.700 | That's a bug.
00:39:26.420 | So we're testing it only with this image.dihedral approach, but we're not testing it with the
00:39:34.380 | function.
00:39:36.380 | Ah, that was a mistake, because this k is going to need to be passed in here.
00:39:48.500 | All right, so let's-- we have to call that ourselves, which is no problem.
00:40:00.700 | All right, so I think we're going to have to go def encodes(self, x: ...), and then
00:40:27.860 | we'll list all the types we support, just all of those.
00:40:34.180 | There we go.
00:40:37.260 | And so now there's no point passing this name along, because we're not doing the automatic
00:40:41.780 | version.
00:40:46.820 | And so now we will return x.dihedral(self.k).
00:41:00.180 | And so now we need to make sure that we have a test of that.
00:41:05.580 | So what we could do is we could create a transform here, called a DihedralItem transform.
00:41:25.180 | And let's do it with p=1, OK?
00:41:33.300 | So we're going to go, and then we will go show -- we'll go f(img).
00:41:51.500 | And we would say filt=0.
00:41:56.820 | There's no need to use the i, because this is random this time.
00:42:03.100 | So hopefully we'll see a nice mix of random transforms.
00:42:09.100 | Let's see if that works.
00:42:16.780 | And that needs a filt.
00:42:26.900 | And super needs a filt.
00:42:37.580 | There we go.
00:42:40.780 | So we've got lots of different random versions.
00:42:46.380 | So let's get rid of this one, since we don't really need it twice.
00:42:55.180 | And this one we can probably hide, because it's just testing it on TensorPoint as well.
00:43:17.660 | So those are transforms that work on a single item at a time.
00:43:23.500 | So here is something called CropPad, which will either crop or pad, depending on whatever
00:43:34.420 | is necessary to create the size that you ask for, which we've seen in other versions of
00:43:38.820 | fast.ai.
00:43:39.820 | So we've done the same idea.
00:43:43.020 | There's a CropPad, which calls something called do_crop_pad, and then do_crop_pad is defined.
00:43:54.500 | Let's move everything around so it's easier to see what's going on.
00:44:01.940 | There we go.
00:44:05.940 | For TensorBBox, for TensorPoint, and for any kind of image.
00:44:13.220 | So since we now have that working, we can, in our CropPad transform, simply call, as
00:44:24.180 | you see, do_crop_pad.
00:44:43.260 | Still some room to refactor this a little bit, but it's on the right track.
00:44:49.460 | All right.
00:44:58.260 | And as you can see, you can choose what padding mode you want, reflection, border, zeros.
00:45:05.820 | So then you can inherit from that to create RandomCrop, just
00:45:15.580 | by changing before_call.
00:45:17.140 | So it does a random crop, as you can see.
00:45:19.620 | So here's some random crops of this doggy.
00:45:26.820 | And one of the nice things here is it'll automatically take the-- oh, we should check this is actually
00:45:36.900 | working.
00:45:37.900 | We want it to take the center crop automatically on the validation set, although I don't know
00:45:42.380 | if we actually have that set up.
00:45:48.660 | OK, we can do resizing, similar idea, again, just inheriting from CropPad.
00:45:59.860 | The famous random resize crop used in pretty much all ImageNet solutions is just another
00:46:05.140 | kind of CropPad.
00:46:06.140 | So we don't need to go through all that.
00:46:08.580 | And then we start the random transforms that'll work on the GPU.
00:46:17.540 | And there's nothing particularly different about them.
00:46:22.460 | These do need to be refactored a little bit.
00:46:25.420 | But yeah, same basic idea.
00:46:27.900 | There's a before_call, there's encodes for the different types you want.
00:46:33.340 | And they're just written so that the matrix math works out with the extra batch dimension
00:46:39.780 | automatically.
00:46:40.780 | So for example, dihedral is done on the GPU by using an affine transform.
00:46:50.620 | Great.
00:46:55.980 | And most of this stuff, the affine transforms and warps and stuff, we did touch on in the
00:47:00.540 | last part, too.
00:47:01.540 | So go check that out.
00:47:02.620 | And the lighting transforms also were done there, if you've forgotten.
00:47:09.580 | Great.
00:47:12.420 | So then finally, you can check out notebook 21.
00:47:22.220 | So this is Imagenette, and it looks pretty familiar: untar_data, get_image_files.
00:47:35.380 | So we're not going to use data blocks here.
00:47:37.100 | Obviously, you could use data blocks as well, but this is doing it fairly manually.
00:47:46.580 | A lot of these transforms are mentioned in the previous code walkthroughs, so check
00:47:50.340 | them out.
00:47:51.340 | I was just looking at the scripts starting with the number 9 to see how they're defined.
00:47:57.140 | All right.
00:48:00.100 | So the transforms for the independent variable will just be "create an image"; for the dependent,
00:48:04.580 | it will be the parent label function, and then Categorize.
00:48:09.580 | And then for the tuples, we go to tensor, optionally flip, or random resize crop; create a data
00:48:18.140 | source; and then on the batch, I should say, put it on the GPU, turn it into a float tensor,
00:48:31.420 | and normalize it.
00:48:37.420 | And so then we can create a data bunch.
00:48:40.940 | There it is.
00:48:41.940 | And here's the data block version of the same thing.
00:48:45.820 | So you can compare and again, data bunch.
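For reference, in the released fastai v2 API the data block version looks roughly like this (the names below are from the final library, and differ a little from the dev notebooks on screen):

    from fastai.vision.all import *

    path = untar_data(URLs.IMAGENETTE_160)
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,                 # label from the parent folder
                       item_tfms=RandomResizedCrop(128),   # per-item, so batch sizes match
                       batch_tfms=Normalize.from_stats(*imagenet_stats))
    dls = dblock.dataloaders(path, bs=64)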
00:48:55.020 | So then some of these still need to be exported, but we can create a cnn_learner to wrap our
00:49:03.940 | learner a bit more conveniently like we did in version 1, with label smoothing, and fit.
00:49:18.060 | And we can see if we have any augmentation.
00:49:35.060 | Oh, there it is listed.
00:49:48.340 | Not sure we might even need to add that in.
00:49:51.100 | Actually, we tend not to add much augmentation because we tend to use mixup nowadays if we
00:49:55.780 | want to use more epochs.
00:50:01.620 | So we tested this on more epochs, and he was getting slightly better results than we were
00:50:08.700 | with version 1.
00:50:10.120 | So I thought that was a good sign.
00:50:13.300 | All right, I think -- oh, one more question: can torchvision models be used?
00:50:24.060 | Yeah, anything should be usable.
00:50:27.780 | They're just models.
00:50:33.280 | So if you look at XResNet, it's just a normal nn.Sequential.
00:50:41.700 | So yeah, there shouldn't be any special requirements.
00:50:47.380 | If you try using a model and it doesn't work, let us know.
00:50:53.900 | I guess for stuff like transfer learning, maybe that's something we can do in a future
00:50:57.980 | walkthrough.
00:50:58.980 | Yeah, we should probably do that in a future walkthrough, talk about how that stuff works.
00:51:04.340 | All right, thanks for joining, everybody.
00:51:07.100 | See you on the forums, and I'll let you know if we're going to do more of these in the
00:51:10.860 | future.
00:51:11.860 | Thanks for coming along.