Live coding 13
Chapters
0:00 Introduction and Questions
5:15 Multi-task Classification Notebook
7:40 Good fastai tutorials
8:30 DataBlock API
12:35 How ImageBlock and get_image_files work
15:15 How aug_transforms works
17:30 Converting ImageDataLoaders to DataBlock
22:08 In PyTorch DataLoaders, what happens at the last batch?
23:23 Step 2: Make the DataBlock spit out three things
27:30 Modifying get_y to send as two inputs
32:00 Looking into Dataset objects
33:50 Can we have multiple get_items?
35:20 Hacky notebook using data frames for creating a DataBlock
39:40 How TransformBlock and ImageBlock work
49:30 Looking at the source code of TransformBlock and DataBlock
54:10 Dataset and DataLoaders discussion
58:30 Defining the DataBlock for the multi-task classification notebook
65:05 Sneak peek into how the multi-task model is trained
And Nick is back from his tour of tropical islands. 00:00:31.600 |
Well, when you're in Queensland, tropical islands are easy. 00:00:48.680 |
Yeah, I'm at the Children's Health Queensland Hospital 00:00:58.240 |
Anybody got anything they wanted to talk about or ask about 00:01:03.840 |
is I've made things much easier in the horrible direction 00:01:08.600 |
All right, sounds like I should dive in then. 00:01:14.000 |
I know there was, like, as I rather expected, 00:01:22.960 |
and quite fair enough, some concern on the forum 00:01:26.000 |
about how complicated and weird this all is, which is partly 00:01:31.880 |
because we're kind of, like, jumping into stuff we'd normally 00:01:35.000 |
And so we haven't quite got all the necessary background here. 00:01:41.160 |
And so, yeah, don't worry if you're feeling a little at sea. 00:01:51.040 |
start to see, like, some stuff you could look into 00:02:15.120 |
Yes, I would absolutely like to do a part two this year. 00:02:18.720 |
And the fact that you're asking that question 00:02:20.600 |
means that you are not up to date on our Discord. 00:02:23.560 |
So you should definitely join our Discord channel 00:02:27.200 |
about doing a conference, or an unconference, 00:02:32.240 |
to go with part two at the end of the year in Queensland. 00:02:38.400 |
And a lot of folks from places that are not in Australia 00:02:43.000 |
are saying they would be up for coming here for that. 00:02:52.280 |
trying to do it all in a very COVID-safe way of outdoor, 00:02:56.080 |
and masked, and people tested ahead of time, and stuff. 00:03:06.520 |
We don't have dates, or a syllabus, or anything 00:03:11.440 |
to have something awesome maybe towards the end of the year 00:03:19.920 |
where we can get to know each other a bit better, 00:03:23.320 |
and get to know fastai and deep learning a bit better. 00:03:42.160 |
The fact that I'm doing it this week is interesting 00:03:45.640 |
because, yeah, it has meant I've got less time to actually work 00:03:51.920 |
on the course, but I also feel like the stuff we're doing, 00:03:55.440 |
perhaps, is stuff I can use in the next lesson, 00:04:07.760 |
But then I certainly plan to continue doing them 00:04:17.240 |
We all know everything, I guess, and then we'll stop. 00:04:21.040 |
At which point, there'll be more things to know, so yeah. 00:04:37.000 |
My issue is, what about when we get to the point 00:04:48.120 |
But then we just start doing it all in a different language. 00:04:57.760 |
I think this is my fifth year of doing fastai courses, 00:05:01.280 |
and I'm still trying to complete the part one. 00:05:21.400 |
I am just so happy about how this worked out, to be honest. 00:05:26.240 |
Although, spoiler alert, it didn't turn out to help our score. 00:05:34.680 |
But I was so happy at how the whole process turned out. 00:05:38.920 |
But I kind of want to show you how I got there, as well as 00:05:45.880 |
And yeah, as soon as I kind of turned off Zoom last time, 00:05:50.680 |
and I went for a walk, and then as soon as I did that, 00:05:53.520 |
I was like, oh, no, of course I know how we should do this. 00:06:02.160 |
So let me explain what we're going to try to do. 00:06:05.680 |
We are going to try to predict two things, the disease 00:06:23.600 |
And the first thing will be to create a pair of data loaders 00:06:29.920 |
For each image, they will have connected two things to them, 00:06:39.320 |
So let me kind of show you how I got to that step. 00:06:44.400 |
So the step one that I took was to, first of all, 00:06:48.120 |
try to replicate what we've already had before. 00:06:58.600 |
which is the highest level least flexible function. 00:07:10.200 |
And trying to predict two things is not standard enough for it. 00:07:18.040 |
And there's a lot of really good tutorials on docs.fast.ai. 00:07:27.600 |
And because it's been ages since I've done any of this, 00:07:37.960 |
But for example, there is a data block tutorial. 00:07:45.560 |
It goes through all the layers of different ways 00:07:53.760 |
This Siamese tutorial is another really good one. 00:07:58.560 |
And the other thing that I looked at was the actual API docs. 00:08:06.120 |
this is actually probably what I found the most useful 00:08:11.640 |
There's lots of great examples in the documentation. 00:08:17.040 |
you know how it is, you come back to something a couple 00:08:24.200 |
And so my experience as a customer of my documentation 00:08:28.800 |
So I can definitely suggest checking all that out. 00:08:37.000 |
before, we were using ImageDataLoaders.from_folder. 00:08:40.080 |
So if we just do the double question mark trick, 00:09:04.240 |
step one that I did was to replicate exactly what we had 00:09:13.040 |
And for this, there's actually so many good examples 00:09:25.480 |
We can basically say, OK, for the data block, 00:09:35.080 |
The labeling function will be the parent folder. 00:10:06.760 |
So the data loaders are then DataBlock.dataloaders. 00:10:24.880 |
into to grab the things that will be passed to these blocks 00:10:50.400 |
to pass that path into the get_image_files function, 00:11:09.680 |
OK, so the block, you have an image block, category block. 00:11:21.400 |
is going to be able to feed both those blocks? 00:11:32.000 |
be to read the documentation about those blocks 00:11:42.560 |
As you can see, they're used all over the place, right? 00:11:44.720 |
So you can start with this tutorial, or this tutorial, 00:11:55.080 |
Yeah, the actual-- sorry, this is not good documentation. 00:12:05.240 |
because it's basically all it does, because we should fix 00:12:12.880 |
that, I guess, because there's so many tutorials. 00:12:16.840 |
I mean, as you can see, I guess the reason I never really 00:12:19.680 |
wrote drops, right, is that it's literally a single line of code. 00:12:36.120 |
calls cls.create, where cls is PILImage. 00:12:44.920 |
So to find out what's actually going to be called by it, 00:12:49.600 |
you can go PILImage.create, and you can see 00:13:01.640 |
can be a path, or a string, or various other things. 00:13:09.680 |
can either run it to see what it comes out with, 00:13:11.880 |
or let's do that so we could just run get image files, 00:13:16.000 |
passing in the thing it's going to be given, which is the path. 00:13:20.760 |
And so as you can see, it's a bunch of paths. 00:13:28.080 |
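Since get_image_files just returns a list of paths, a pure-Python stand-in is easy to sketch (the folder and file names here are made up, and fastai's real implementation supports many more extensions and options):

```python
from pathlib import Path
import tempfile

# Pure-Python sketch of what get_image_files does: recursively walk a
# folder and keep only paths with image extensions.
IMG_EXTS = {".jpg", ".jpeg", ".png"}

def get_image_files_sketch(path):
    return [p for p in Path(path).rglob("*") if p.suffix.lower() in IMG_EXTS]

# demo on a throwaway directory tree
tmp = Path(tempfile.mkdtemp())
(tmp / "bacterial_leaf_streak").mkdir()
(tmp / "bacterial_leaf_streak" / "100330.jpg").touch()
(tmp / "notes.txt").touch()  # non-image files are skipped

files = get_image_files_sketch(tmp)
print([p.name for p in files])  # ['100330.jpg']
```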
copy that, and it's going to be passed into this function. 00:13:43.400 |
look at the tutorials, which tell you what to do. 00:14:05.160 |
like it gets transformed later after reading, right? 00:14:08.880 |
Yeah, here we've got the batch transform, yeah. 00:14:12.440 |
But in the image block, because right now, we 00:14:22.200 |
It gets changed from an int tensor to a float tensor 00:14:27.520 |
Yeah, that's a fairly subtle thing, but that's right. 00:14:39.720 |
Image equals-- and we look at, like, np.array image. 00:14:47.480 |
It's actually stored as bytes, or uint8, as they call it, 00:14:54.440 |
So yes, this is going to add a batch transform that's 00:15:00.200 |
going to turn that into a float tensor, which 00:15:06.040 |
We could run it here, I expect -- IntToFloatTensor. 00:15:17.840 |
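The int-to-float conversion being described is simple to sketch in pure Python (fastai's actual transform operates on torch tensors, not lists):

```python
# Sketch of the idea behind fastai's IntToFloatTensor batch transform:
# images load as unsigned 8-bit ints (0-255), but models want floats
# in [0, 1], so divide by 255.
def int_to_float(pixels):
    return [p / 255.0 for p in pixels]

batch = [0, 128, 255]       # uint8-style pixel values
print(int_to_float(batch))  # [0.0, ~0.502, 1.0]
```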
is 224, which is like a square image, correct? 00:15:52.720 |
can see examples of the augmentations it does. 00:15:55.760 |
So it tells you a list of all the augmentations 00:15:58.760 |
And here's some examples of what they look like. 00:16:02.960 |
These augmentations would happen after the integer float 00:16:13.080 |
operate on the entire batch, and some operate 00:16:18.240 |
So the ones in batch transforms operate on the whole batch, 00:16:27.880 |
because they operate on a batch, before you get there, 00:16:34.800 |
So this resizes everything to the same shape. 00:16:38.680 |
And then this does these various different types of data 00:16:44.160 |
And one of the key pieces of data augmentation it does 00:16:47.440 |
is to randomly zoom into a subset of the image, 00:16:51.760 |
as you can see in these various examples here. 00:17:05.920 |
would be reading your images from a data frame? 00:17:14.400 |
So yeah, I'm kind of skipping over quite a bit of this, 00:17:17.360 |
because it's super well covered in the tutorials. 00:17:22.040 |
So I don't want to say stuff that you can very easily read. 00:17:26.040 |
isn't as well covered in the tutorials, and it's kind of new. 00:17:39.840 |
is this is just the same thing that we have in lesson one. 00:17:43.880 |
And it's doing exactly the same thing as my 00:17:51.760 |
ImageDataLoaders.from_folder, but just turned into a DataBlock. 00:18:05.200 |
And so let's go copy, and I want this all to run as fast as 00:18:17.560 |
Do you-- when you make this data loader thing, 00:18:27.520 |
do you try to make sure that the shape that it's outputting 00:18:31.560 |
is what you need for your model, or that's later? 00:18:50.160 |
to bring this back down to 128, so it's super fast. 00:18:53.920 |
And so I just want to get the maximum iteration speed here. 00:19:09.400 |
OK, so this is going to run in under 20 seconds, which 00:19:22.320 |
You want something that you can test in under about 20 seconds, 00:19:45.680 |
You know that this is what you need for that model. 00:20:05.080 |
because maybe different models are kind of trying 00:20:07.600 |
to predict different types of stuff, potentially? 00:20:13.360 |
Like, some might have the shape of the target. 00:20:24.320 |
is the thing that we're going to be covering in a moment. 00:20:37.120 |
On the data block, you randomly select the amount of records 00:20:44.560 |
or the amount of the batch size that you're going to process. 00:20:51.360 |
The batch size is actually selected in the .dataloaders call 00:20:59.400 |
So what is the guarantee that every single one 00:21:09.400 |
Is there any way to know that every single one will be-- 00:21:13.720 |
I mean, well, yes, except that we're randomly 00:21:21.880 |
But every single one will go through the learner. 00:21:29.080 |
will go through our learner because we randomly shuffle them 00:21:38.760 |
guaranteed to see every example that is trained just once. 00:21:47.480 |
That's what one epoch means, is look at everything once. 00:21:50.280 |
And so if we put two there, it would look at everything twice. 00:22:03.080 |
but what actually happens for the last batch? 00:22:08.360 |
And this is actually not the PyTorch data loader, 00:22:12.680 |
So we have our own data loader, although in the next version, 00:22:16.120 |
we're likely to replace it with the fast.ai one. 00:22:20.760 |
If drop last is true, then it deletes the last batch. 00:22:24.920 |
And if it's false, then it includes the last batch. 00:22:27.560 |
And the reason that's interesting is that the last batch may not 00:22:34.920 |
For the validation set, it always keeps the last batch. 00:22:39.080 |
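The drop_last behaviour being described can be sketched with a few lines of pure Python (a stand-in for the real data loader, which works on tensors):

```python
# With 10 items and a batch size of 4, the last batch has only 2 items;
# drop_last=True discards that partial batch, drop_last=False keeps it.
def batches(items, bs, drop_last):
    out = [items[i:i + bs] for i in range(0, len(items), bs)]
    if drop_last and out and len(out[-1]) < bs:
        out.pop()
    return out

items = list(range(10))
print(batches(items, 4, drop_last=True))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(batches(items, 4, drop_last=False))  # ...plus the partial [8, 9]
```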
It's super important to shuffle the training set. 00:22:44.560 |
will mess around with the data loaders or do something 00:22:49.960 |
you might get very poor training performance. 00:22:57.960 |
Yeah, trying to get all those details, right? 00:23:03.600 |
going to replace it with the PyTorch data loaders? 00:23:17.360 |
Is to just get it working exactly like before. 00:23:27.320 |
to make sure I got about the same error rate, and I did. 00:23:29.760 |
So then I was happy that, OK, I'm matching what we had before. 00:23:38.440 |
so that the data block spits out three things, which 00:23:44.480 |
The category of disease and the category of rice type. 00:23:50.000 |
So to get it to spit out an image and two categories, 00:23:57.920 |
We say we want three blocks, an image, and two categories. 00:24:13.160 |
And actually, the way I did it was a bit ugly. 00:24:16.680 |
And since then, I thought of a better way of doing it, 00:24:20.560 |
is we should create a dict that maps from image ID to variety. 00:24:25.520 |
And then our function will just be to look that up, right? 00:24:36.800 |
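The mapping being built here can be sketched as a dict comprehension over DataFrame-like rows (the image IDs and varieties below are made up; the real ones come from train.csv):

```python
# Build image_id -> variety, so get_y can just look a file name up.
rows = [
    {"image_id": "100330.jpg", "label": "bacterial_leaf_streak", "variety": "ADT45"},
    {"image_id": "100365.jpg", "label": "blast", "variety": "KarnatakaPonni"},
]

img_to_variety = {r["image_id"]: r["variety"] for r in rows}
print(img_to_variety["100330.jpg"])  # ADT45
```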
OK, and it's going to be a dict comprehension. 00:24:40.800 |
So we're going to loop through the rows in df.iteritems. 00:25:00.400 |
Now, I always forget what these differences are. 00:25:04.760 |
A (column name, Series) pair, returning a tuple 00:25:22.520 |
I think itertuples is the fastest one. 00:25:28.480 |
But this is not very big, so let's keep it simple. 00:25:42.120 |
Another thing we could do is make the image ID the index, 00:25:50.840 |
and then you could actually jump straight into it. 00:25:54.400 |
But I think I'd rather not use pandas features. 00:25:59.160 |
because I think that'll make the explanation a little clearer. 00:26:08.720 |
And so what we want is the key will be the row's image ID, 00:26:36.000 |
And I'm just going to show you a little neat trick, which 00:26:53.480 |
calling a special magic method in Python called __getitem__. 00:27:05.200 |
It's so dynamic and flexible, like all the syntax sugar 00:27:09.160 |
is like behind the scenes just calling functions, basically. 00:27:21.040 |
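The trick being described works because indexing is just sugar for a method call, so a dict's bound __getitem__ can be passed anywhere a one-argument function is expected (such as get_y):

```python
# d[x] is sugar for d.__getitem__(x), so the bound method is a
# plain callable we can hand to any API expecting a function.
d = {"100330.jpg": "ADT45"}
fn = d.__getitem__
print(fn("100330.jpg"))  # ADT45
```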
a function that converts a file name into a variety. 00:27:35.200 |
and it's going to call each of those functions, which 00:27:56.680 |
And it says, oh, get_y contains two functions, 00:27:59.440 |
but it should contain one, one for each target. 00:28:05.320 |
Well, if you think about it, we've said there's three blocks, 00:28:07.840 |
but we haven't told it how many of those blocks 00:28:10.200 |
are for the independent variable and how many 00:28:15.600 |
And the way we do that is to say the number of inputs equals-- 00:28:20.840 |
We have one input, and then the rest will be outputs. 00:28:31.120 |
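The n_inp split being described can be sketched in a couple of lines (a stand-in for what fastai does internally with the blocks' outputs):

```python
# With three blocks and n_inp=1, the first item of each sample is the
# input and the remaining items are the targets.
def split_by_n_inp(sample, n_inp=1):
    return sample[:n_inp], sample[n_inp:]

sample = ("image_tensor", "disease", "variety")  # what the datasets emit
x, y = split_by_n_inp(sample, n_inp=1)
print(x)  # ('image_tensor',)
print(y)  # ('disease', 'variety')
```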
And personally, before I jump to data loaders, 00:28:33.880 |
I first create data sets just to make sure they work. 00:28:44.000 |
So data sets are easier to debug than data loaders. 00:29:22.920 |
This path does not exist as a key in this dictionary, 00:29:36.360 |
so that we've got train images, bacterial leaf streak, 00:29:46.200 |
is being passed to the get items is being passed to get y. 00:29:54.000 |
Yeah, so we haven't kind of gone into the details of exactly 00:30:04.200 |
to jump into the nitty gritty, but it's a little bit-- 00:30:20.000 |
I always like to see people when they're, you know, seeable. 00:30:25.520 |
OK, so we're not going to use this trick after all. 00:30:29.320 |
We're going to create a function called get_variety. 00:30:37.760 |
Yeah, let's create a function called get_variety. 00:30:47.560 |
And so we're going to return image to variety. 00:31:08.600 |
Wait, we need image to variety, the dunder thing? 00:31:12.880 |
Oh, yeah, I'll just grab a bracket, actually, yes. 00:31:43.640 |
OK, and it also contains a dot valid data set. 00:31:48.400 |
OK, and so we can look at the zeroth thing in the training 00:32:02.160 |
And so then we can look at the image, for example. 00:32:09.000 |
OK, so what's happened here is that get image files returned 00:32:15.400 |
The first one got passed to image block, which, 00:32:19.680 |
as we saw earlier, got passed to PILImage.create. 00:32:26.720 |
And that path name also got passed to a function 00:32:33.480 |
So let's say file name equals get image files. 00:32:39.680 |
And then the thing that we passed in, training path, 00:32:50.040 |
So it ended up calling PILImage.create with that file name. 00:33:00.040 |
OK, it also called parent label with that file name. 00:33:10.080 |
And it also called get_variety with that file name. 00:33:12.760 |
Jeremy, can we look at get_variety one more time? 00:33:32.240 |
I did it the other way around, of building back up the path, 00:33:35.280 |
and then realized that that was kind of stupid. 00:33:44.200 |
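A sketch of the get_variety function being discussed: take the file's Path, and look its name up in the dict (the dict contents here are hypothetical):

```python
from pathlib import Path

# image_id -> variety lookup built earlier from the CSV
img_to_variety = {"100330.jpg": "ADT45"}

def get_variety(p):
    # p is a full path like train_images/<disease>/<image_id>;
    # p.name strips the folders, leaving just the image ID.
    return img_to_variety[p.name]

print(get_variety(Path("train_images/bacterial_leaf_streak/100330.jpg")))  # ADT45
```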
OK, this could be too low level, but just let me know. 00:33:55.640 |
So it wouldn't make sense to have multiple get items, right? 00:34:04.360 |
It could be it could return a tuple, or a list, or an object, 00:34:19.400 |
Now, we don't need a get_x, because ImageBlock just 00:34:26.520 |
If I needed something a bit more like I wanted to put more things 00:34:31.040 |
and get image files, like have it emit a tuple, 00:34:33.680 |
then would I have to make my own image block to ignore? 00:34:53.000 |
because nearly always there's a row of a database 00:34:56.400 |
table, or a path, or something has all the information 00:35:01.240 |
you need to go out and get the stuff with your get x's 00:35:07.920 |
That's like the central piece of information for each row. 00:35:11.280 |
And based on this information, you can read in text, 00:35:14.000 |
you can read in images, but specific to that one row. 00:35:21.360 |
because this is the version that uses a data frame. 00:35:24.400 |
So this is-- so the version that uses data frame-- 00:35:41.520 |
Let me just do this, and let me come to your question. 00:35:43.680 |
OK, so in this data block, I started out with a data frame, 00:35:57.720 |
as I passed in the data frame, it's going to get each row. 00:36:24.280 |
can do things like add in a prefix path and a suffix path, 00:36:27.680 |
and you can split it with a label delimiter and whatever. 00:36:36.000 |
and it checks what kind of thing you're passing in. 00:36:38.680 |
Basically, all it does is it calls getattr to grab 00:36:54.480 |
ColReader for context, like reading data frames. 00:37:03.120 |
Is this ColReader function specifically for data frames? 00:37:08.000 |
I mean, it can work with anything, basically, 00:37:11.120 |
that you're-- so what it's doing here is it's saying, 00:37:24.600 |
But yeah, so basically here, get_y is saying, OK, well, 00:37:27.280 |
let's return the index 1 field and the index 2 field. 00:37:42.680 |
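The core of what ColReader does here is easy to sketch (fastai's real ColReader also handles prefixes, suffixes, and label delimiters; this is just the indexing idea):

```python
# A ColReader-style factory: given a column index, return a function
# that reads that field from a row.
def col_reader(idx):
    return lambda row: row[idx]

row = ["100330.jpg", "bacterial_leaf_streak", "ADT45"]
get_y1, get_y2 = col_reader(1), col_reader(2)
print(get_y1(row), get_y2(row))  # bacterial_leaf_streak ADT45
```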
so you can't pass a row of a database table to PILImage.create. 00:37:48.840 |
So get_x is this function, which basically is going, 00:37:55.360 |
oh, it's going to be in the training path slash disease 00:38:04.120 |
So that's-- and then there's a special case for the test set, 00:38:09.240 |
because the test set things are not stored in subfolders 00:38:13.400 |
according to label, because we don't know the label. 00:38:18.440 |
So that's the-- as I said, this was more hacky. 00:38:30.120 |
Yeah, you can have-- yeah, it's totally flexible. 00:38:32.240 |
And I mean, seriously, Hamel, we have so many examples 00:38:37.200 |
of all of these patterns in the docs in the tutorials. 00:38:45.000 |
let's take a look at one, right, docs.fast.ai. 00:39:05.840 |
splitting based on columns in the database table. 00:39:11.480 |
And here's a ColReader using a label delimiter. 00:39:23.560 |
Yeah, so I think I'm at a point now where I actually 00:39:32.240 |
totally free to ask any super-weedy questions. 00:39:54.600 |
it's going to store a bunch of things you pass in. 00:39:56.880 |
It's going to store things called type transforms. 00:39:58.920 |
It's going to store things called item transforms. 00:40:00.800 |
It's going to store things called batch transforms. 00:40:06.680 |
which is ToTensor, because PyTorch needs tensors. 00:40:06.680 |
saw that that's defined as a transform block where 00:40:19.200 |
the type transforms is this and the batch transforms is this. 00:40:33.720 |
and don't pass any transforms, it won't do anything. 00:40:56.320 |
Let's do that separate cell so it gets a little bit easier 00:41:26.160 |
all it does is it takes the output of get image file 0 00:41:31.960 |
and turns it into a tuple containing one thing, which 00:41:51.480 |
When we train, we have batches containing inputs and outputs, 00:41:59.480 |
potentially multiple inputs and potentially multiple outputs. 00:42:02.200 |
So that's why indexing into this gives you back a tuple. 00:42:08.520 |
My question, the blocks can either be a list or a tuple? 00:42:26.340 |
So then we can do stuff to the first thing in the tuple. 00:42:44.160 |
So get_x equals -- say, let's do a lambda o: o.name. 00:42:44.160 |
I wonder if something happened to my GPU server. 00:43:53.780 |
I mean, something has happened to my GPU server, clearly. 00:44:20.380 |
I'm very-- oh, I don't know what just happened. 00:44:42.900 |
I'm just looking at the log, see if anything just happened. 00:45:09.140 |
So you see what happened here is we got the first thing 00:45:18.500 |
from image files, which was this, and get_x got its name. 00:45:26.540 |
So we could also do get_y equals lambda o: o.parent, say. 00:45:42.500 |
It first went-- first, the thing went to the transform block, 00:45:52.820 |
So whatever get items got went to transform blocks. 00:46:00.060 |
Well, transform block doesn't do anything, right, at all, 00:46:07.140 |
So it's basically-- but the number of them you have 00:46:11.500 |
is the number of pipelines it's going to create. 00:46:32.700 |
And it's not quite the mental model you've got, I think. 00:47:04.140 |
Unless we say number of inputs equals 1, in which case 00:47:11.660 |
now get_x is just going to have to return one thing. 00:47:11.660 |
So you could even put it here instead, right? 00:47:43.620 |
So you could say, oh, well, this is actually-- 00:48:39.060 |
So it's now converted to the type it's meant to be. 00:48:49.180 |
I'm just curious how all the pieces interact. 00:49:11.140 |
Yeah, why don't we figure out what's going on here, 00:49:14.700 |
OK, so now we've got three transform blocks, two of them 00:49:34.500 |
if you look at the code of them, transform blocks 00:49:59.140 |
The data block is the thing that's then going to go through 00:50:01.980 |
and say, OK, for each thing, call its type transforms, 00:50:05.700 |
and then call to tensor, and then call its item transforms, 00:50:08.580 |
and then data load of time, call its batch transforms. 00:50:12.980 |
So does that help answer your question, Hamel? 00:50:16.140 |
It's not that a transform block doesn't get called. 00:50:25.300 |
The first thing that gets called is type transforms. 00:50:37.100 |
The first thing that gets called is get_x and get_y, 00:50:40.100 |
and then the result of that is passed into type transforms. 00:50:47.020 |
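The order of operations just described can be condensed into a pure-Python sketch: one pipeline per block, each applying its getter and then its block's type transform (all names below are illustrative stand-ins for the fastai machinery):

```python
# get_items produces one item per sample; each pipeline applies its
# get_x/get_y, then the block's type transform, yielding a tuple.
def make_datasets(items, getters, type_tfms):
    return [
        tuple(tfm(get(item)) for get, tfm in zip(getters, type_tfms))
        for item in items
    ]

items = ["a.jpg", "b.jpg"]                                # from get_items
getters = [lambda o: o, lambda o: o.split(".")[0]]        # get_x, get_y
type_tfms = [lambda o: f"PILImage({o})", lambda o: f"Category({o})"]

dsets = make_datasets(items, getters, type_tfms)
print(dsets[0])  # ('PILImage(a.jpg)', 'Category(a)')
```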
so get_x would be responsible for making sure 00:50:49.580 |
that you have a path that you can pass to PILImage.create. 00:50:55.460 |
So this whole path of what happens in a sequence 00:51:05.380 |
and it could do with some simplifying and documenting 00:51:21.060 |
But basically, when you call .datasets, really, 00:51:31.380 |
object passing in all of the type transforms to it. 00:51:43.740 |
get done by the data loader, not by the data sets. 00:51:51.780 |
And basically, the only reason there's quite a bit of code 00:51:55.180 |
in here is we try to make sure that if two different things 00:52:00.820 |
have the same type transforms, we merge them together 00:52:06.580 |
So there's some stuff to try to make sure this all just works. 00:52:16.820 |
because of some optimization you can do with the type transforms? 00:52:21.300 |
Because the type transforms, they're happening earlier. 00:52:34.460 |
are going to take tensors, or at least things that 00:52:48.780 |
that are going to create your data sets for you. 00:53:01.220 |
which are things like reshaping everything to the same size. 00:53:04.180 |
And batch transforms, which are things like data augmentation. 00:53:09.580 |
But you can have an item transform run on the GPU 00:53:23.260 |
going to run on the GPU because it's not a batch yet. 00:53:29.100 |
but that would be pretty weird because you really 00:53:32.060 |
don't need things to be in a batch before the GPU can 00:53:43.420 |
Assuming that you're using a GPU, I mean, this is OK. 00:53:54.420 |
I think this might be a callback which sticks things on the GPU. 00:53:57.820 |
So it just depends on whether things are before or after 00:54:02.820 |
Yeah, that's probably a bit of a distraction. 00:54:08.780 |
To kind of revise the difference between data set and data loader, 00:54:13.140 |
is it best to revisit the PyTorch documentation and kind of-- 00:54:24.380 |
And PyTorches data set is like literally it's an abstract class. 00:54:31.740 |
So a data set is something that you can index into. 00:54:35.300 |
And it returns a single tuple of your independent and dependent 00:54:40.140 |
That's what a data set is defined as by PyTorch. 00:54:51.860 |
The only thing you can do is iterate through it. 00:54:55.660 |
And it gives you a mini-batch, which is a tensor. 00:55:04.140 |
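The contract just described can be shown with a minimal pure-Python sketch (no torch; lists stand in for tensors):

```python
# A Dataset is something you can index into, returning one (x, y) pair.
class TinyDataset:
    def __init__(self, xs, ys): self.xs, self.ys = xs, ys
    def __len__(self): return len(self.xs)
    def __getitem__(self, i): return self.xs[i], self.ys[i]

# A DataLoader is something you iterate; it yields mini-batches.
def tiny_loader(ds, bs):
    for i in range(0, len(ds), bs):
        batch = [ds[j] for j in range(i, min(i + bs, len(ds)))]
        xs, ys = zip(*batch)
        yield list(xs), list(ys)

ds = TinyDataset([1, 2, 3, 4], ["a", "b", "c", "d"])
print(ds[0])                           # (1, 'a')
print(next(iter(tiny_loader(ds, 2))))  # ([1, 2], ['a', 'b'])
```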
I guess I'm trying to understand the type transform thing, 00:55:09.340 |
why it has to be done in the data set before the data loader. 00:55:19.780 |
to have to have something you can go into and grab items, 00:55:26.940 |
That's the basic foundation of the PyTorch data model, 00:55:57.140 |
This is not the only way you could do this, right? 00:56:02.540 |
because we now have this thing that you can say like, 00:56:05.220 |
oh, Hamel, can you show me the 14th image and its label? 00:56:05.220 |
So yes, that's just a convenient thing, basically. 00:56:29.340 |
then it would just be one more step in the item transforms, 00:56:38.180 |
Yeah, your data sets would always just return a single thing, 00:56:42.300 |
or maybe two things, the get_x and get_y results. 00:56:46.540 |
And then your data loader would have to do more work, basically. 00:56:52.260 |
Which would be a perfectly OK way to do things as far as I 00:56:56.220 |
can tell that I think would be a little harder to debug 00:57:07.860 |
Is it like anything you want to do up front that 00:57:11.300 |
is like kind of uniform across your whole data set, 00:57:15.460 |
that you don't need to change at training time? 00:57:23.660 |
want to be able to index into it and look at that thing, really. 00:57:33.660 |
I'd say just chuck it somewhere and don't worry about it. 00:57:52.100 |
So you need to make sure that your type transform, when 00:57:55.860 |
you're working with fast AI, returns something 00:57:57.780 |
that is a tensor or going to be turned into a tensor. 00:58:12.500 |
it's a convenient thing that you understand to look at. 00:58:29.100 |
So let's replace it with the word image block. 00:59:01.340 |
Here's kind of something we want as our label, right? 00:59:24.260 |
This can't be turned into a tensor because it's a string. 00:59:35.780 |
we learned that what we do is we replace strings with integers 00:59:39.580 |
where that integer is a lookup into a vocabulary. 00:59:55.860 |
that is exactly what CategoryBlock will do, right? 01:00:04.380 |
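The string-to-integer mapping being described can be sketched with a tiny class (a pure-Python stand-in for fastai's Categorize transform):

```python
# Build a vocab of the unique labels, then map each string to its
# integer index; decode goes the other way.
class TinyCategorize:
    def __init__(self, labels):
        self.vocab = sorted(set(labels))
        self.o2i = {v: i for i, v in enumerate(self.vocab)}
    def __call__(self, label): return self.o2i[label]
    def decode(self, i): return self.vocab[i]

cat = TinyCategorize(["blast", "bacterial_leaf_streak", "blast"])
print(cat.vocab)               # ['bacterial_leaf_streak', 'blast']
print(cat("blast"), cat.decode(0))
```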
And so CategoryBlock, it's got a type transform 01:00:15.580 |
Categorize, which I'm not going to go into because it's not 01:00:21.620 |
But if you look up the documentation for Categorize, 01:00:29.500 |
find that the vocab is stored for these things. 01:00:35.380 |
So if we look at this at a high level, get items, get-- 01:00:39.900 |
By the way, just a moment, here's the vocab, right? 01:00:49.540 |
So get items gets us the rows or the examples 01:00:59.140 |
And then from get_items, we use get_y or get_x 01:01:03.820 |
to transform it somehow so that we can pass it 01:01:13.700 |
And type transforms are things that can get triggered, right? 01:01:19.700 |
So they're doing a little bit something similar to get y, 01:01:26.980 |
Because these are very general things, right? 01:01:38.860 |
I will work if you can pass me a path to an image. 01:01:42.540 |
And this says, I will work if you pass me a string. 01:01:45.620 |
And so get_x and get_y then are responsible for ensuring 01:01:48.980 |
that you pass them a path and pass this one a string. 01:01:53.500 |
And get_image_files is already returning paths, 01:02:14.900 |
But let's have a look at-- so this is exactly the same. 01:02:26.060 |
OK, so this is exactly the same as what we just had, right? 01:02:30.620 |
And so then we can also then add the two things, which 01:02:33.020 |
is the item transforms and the batch transforms. 01:02:35.020 |
Some other time, we will talk about how it is that-- 01:02:38.100 |
how come this is not being applied to the categories? 01:02:43.380 |
For those of you interested in skipping ahead, 01:02:45.500 |
the secret is using fastcore's type dispatch functionality. 01:02:55.460 |
these three different things-- image, we've got y1. 01:02:58.140 |
So Jeremy, if we had an image, if we had an image block 01:03:04.900 |
for our y's, for our targets, then item transform 01:03:13.260 |
And there's a-- have a look at the Siamese tutorial 01:03:17.220 |
on the fast.ai docs, because that has two images. 01:03:22.900 |
And if you think about it, any time we do segmentation, 01:03:27.340 |
The data augmentation is happening to x and y. 01:03:33.980 |
that have this kind of totally transparent ability 01:03:36.700 |
to do bounding boxes, segmentation, point clouds, 01:03:42.380 |
whatever as dependent variables, and have it all 01:03:51.940 |
OK, so now I can create data loaders from that. 01:04:04.940 |
And thanks to the magic of fast.ai, this is so cool. 01:04:12.700 |
It's actually auto labeling it with each of our categories. 01:04:16.620 |
So thanks to stuff we'll discuss later, basically this stuff 01:04:20.860 |
called type dispatch, fast.ai does a lot of things 01:04:26.340 |
automatically, even though I don't think I've ever explicitly 01:04:31.060 |
It just does because of how the API is designed. 01:04:40.700 |
different dependent variables, each one of which 01:05:05.580 |
All that stuff I did last time about messing around 01:05:12.180 |
All we need to do when we create our vision learner 01:05:21.860 |
how many outputs you want by how many levels are 01:05:27.380 |
But in this case, we've got something custom, right, 01:05:34.380 |
That's going to make the final matrix that it multiplies by 01:05:41.060 |
Now, then you basically need to tell it what loss function 01:05:50.540 |
we used to use a loss function for this called 01:05:54.300 |
So we're going to call that exact loss function 01:06:00.740 |
And we're going to compare that to the disease probabilities. 01:06:11.660 |
And we'll do the same thing for having an error rate, which 01:06:14.940 |
just looks at the first 10, the error rate for disease, 01:06:25.700 |
it's going to print out the disease and the variety error. 01:06:29.300 |
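The idea being described is that the model emits a single vector of 20 activations, and the loss and metrics just slice it: the first 10 are disease scores, the last 10 variety scores. A pure-Python sketch of the slicing (torch slicing works the same way on tensors; the logit values below are made up):

```python
# Split the 20 activations into disease and variety halves, then argmax each.
def split_preds(preds, n_disease=10):
    return preds[:n_disease], preds[n_disease:]

def argmax(xs):
    return max(range(len(xs)), key=lambda i: xs[i])

preds = [0.1] * 20
preds[3], preds[17] = 5.0, 4.0       # pretend logits
disease, variety = split_preds(preds)
print(argmax(disease), argmax(variety))  # 3 7
```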
And the loss function will be the loss function 01:07:18.060 |
No, firstly, I handled all that for you automatically. 01:07:34.900 |
as to whether to decode it and whether to put 01:07:37.140 |
the final activation function on it and stuff like that. 01:07:42.740 |
In this case, because we used a custom loss function, 01:07:51.540 |
you would need to add a softmax if you wanted to. 01:08:03.260 |
I just needed which disease had the highest prediction. 01:08:11.420 |
going to be the same because that's a monotonic function. 01:08:26.540 |
Yeah, but you would only look at the first 10 01:08:36.740 |
So I was using TTA to do test-time augmentation. 01:08:36.740 |
And I stacked up and I did an ensemble of TTA. 01:08:44.460 |
And then I just did an argmax on the first 10. 01:08:55.380 |
In the architecture, you selected for ResNet 18, 128. 01:09:32.620 |
that might take a whole lecture to understand.