
Live coding 13


Chapters

0:00 Introduction and Questions
5:15 Multi-task Classification Notebook
7:40 Good fastai tutorials
8:30 DataBlock API
12:35 How ImageBlock and get_image_files work
15:15 How aug_transforms works
17:30 Converting ImageDataLoaders to DataBlock
22:08 In PyTorch DataLoaders, what happens at the last batch?
23:23 Step 2: Make DataBlock spit out three things
27:30 Modifying get_y to send two inputs
32:00 Looking into Dataset objects
33:50 Can we have multiple get_items?
35:20 Hacky notebook using data frames for creating DataBlock
39:40 How TransformBlock and ImageBlock work
49:30 Looking at the source code of TransformBlock and DataBlock
54:10 Dataset and DataLoaders discussion
58:30 Defining the DataBlock for the multi-task classification notebook
65:05 Sneak peek into how the multi-task model is trained

Whisper Transcript

00:00:00.000 | Just continuing with this.
00:00:02.120 | OK, great.
00:00:03.320 | Yeah.
00:00:05.720 | Any other new faces today?
00:00:07.400 | Ali, have you been here before?
00:00:11.960 | I don't remember seeing you.
00:00:13.400 | Maybe you've been quiet.
00:00:14.280 | I've been here.
00:00:15.160 | I don't think I've had the camera on.
00:00:17.560 | OK, yeah.
00:00:19.400 | Nice to meet you.
00:00:21.280 | And Nick is back from his tour of Tropical Islands.
00:00:26.040 | Hi, Nick.
00:00:26.680 | Yes, yep.
00:00:28.720 | Not so far away, though. Still pretty close.
00:00:31.000 | Yeah.
00:00:31.600 | Well, when you're in Queensland, Tropical Islands are easy.
00:00:34.920 | By default.
00:00:37.120 | Hi, Marie.
00:00:37.800 | Nice to see you.
00:00:39.200 | Where are you joining us from?
00:00:42.320 | Hello, can you hear me?
00:00:43.280 | Hi, absolutely.
00:00:45.360 | I'm Joni from South Bank of Brisbane.
00:00:47.720 | Uh-huh.
00:00:48.680 | Yeah, I'm at the Children's Health Queensland Hospital
00:00:52.040 | building.
00:00:52.760 | Ah, yes.
00:00:53.760 | OK, great.
00:00:58.240 | Anybody got anything they wanted to talk about or ask about
00:01:00.920 | before we dive in?
00:01:02.000 | Because I've got some good news, which
00:01:03.840 | is I've made things much easier in the horrible direction
00:01:07.720 | we went last time.
00:01:08.600 | All right, sounds like I should dive in then.
00:01:14.000 | I know there was, like, as I rather expected,
00:01:22.960 | and quite fair enough, some concern on the forum
00:01:26.000 | about how complicated and weird this all is, which is partly
00:01:31.880 | because we're kind of, like, jumping into stuff we'd normally
00:01:34.040 | do in part two.
00:01:35.000 | And so we haven't quite got all the necessary background here.
00:01:41.160 | And so, yeah, don't worry if you're feeling a little at sea.
00:01:49.480 | But it's a good way to, like, I don't know,
00:01:51.040 | start to see, like, some stuff you could look into
00:01:54.440 | after the course is finished.
00:01:55.760 | All right, so let me just--
00:02:02.760 | Jeremy.
00:02:03.760 | Sorry, just not mentioning part two.
00:02:05.720 | Are you planning a part two this year?
00:02:09.480 | Planning would be too strong a word.
00:02:11.360 | But as much as I ever plan anything--
00:02:14.000 | Thinking about?
00:02:15.120 | Yes, I would absolutely like to do a part two this year.
00:02:18.120 | Awesome.
00:02:18.720 | And the fact that you're asking that question
00:02:20.600 | means that you are not up to date on our Discord.
00:02:23.560 | So you should definitely join our Discord channel
00:02:25.720 | because we've actually been talking
00:02:27.200 | about doing a conference, or an unconference,
00:02:32.240 | to go with part two at the end of the year in Queensland.
00:02:38.400 | And a lot of folks from places that are not in Australia
00:02:43.000 | are saying they would be up for coming here for that.
00:02:48.080 | Partly a kind of a social thing, and also
00:02:52.280 | trying to do it all in a very COVID-safe way of outdoor,
00:02:56.080 | and masked, and people tested ahead of time, and stuff.
00:03:02.080 | Yeah, so that's--
00:03:04.920 | yeah, so nothing's planned.
00:03:06.520 | We don't have dates, or a syllabus, or anything
00:03:09.200 | like that, but we absolutely hope
00:03:11.440 | to have something awesome maybe towards the end of the year
00:03:19.920 | where we can get to know each other a bit better,
00:03:23.320 | and get to know fast AI and deep learning a bit better.
00:03:27.560 | Jeremy, can I ask you, are you going
00:03:31.520 | to continue this work for next week
00:03:34.440 | because I know that the class is going
00:03:36.320 | to be shown next Tuesday?
00:03:38.280 | Yeah, I think so.
00:03:42.160 | The fact that I'm doing it this week is interesting
00:03:45.640 | because, yeah, it has meant I've got less time to actually work
00:03:51.920 | on the course, but I also feel like the stuff we're doing,
00:03:55.440 | perhaps, is stuff I can use in the next lesson,
00:03:57.840 | depending on where we get to.
00:03:59.080 | So yeah, so I think we'll do it next week.
00:04:03.520 | We'll see how things go.
00:04:04.720 | If I get really behind, then maybe not.
00:04:07.760 | But then I certainly plan to continue doing them
00:04:11.560 | after the last class until--
00:04:16.560 | I don't know.
00:04:17.240 | We all know everything, I guess, and then we'll stop.
00:04:21.040 | At which point, there'll be more things to know, so yeah.
00:04:25.200 | We don't have to stop, necessarily.
00:04:27.160 | There's just so much to learn.
00:04:30.120 | The only problem is that it's obviously
00:04:32.120 | a burden on your time, but it's--
00:04:34.960 | I enjoy it.
00:04:35.920 | I enjoy it.
00:04:37.000 | My issue is, what about when we get to the point
00:04:39.120 | where there's nothing left to learn, Radek?
00:04:40.880 | What do we do then?
00:04:42.680 | Well, is there such a point?
00:04:45.800 | Sure, there must--
00:04:48.120 | But then we just start doing it all in a different language.
00:04:50.520 | We start doing it all in R or Julia.
00:04:52.560 | Yeah, correct.
00:04:55.120 | C++, that would keep us busy.
00:04:57.760 | I think this is my fifth year of doing fast AI courses,
00:05:01.280 | and I'm still trying to complete the part one.
00:05:06.000 | Are you too monostatic?
00:05:08.640 | That's true.
00:05:11.280 | All right.
00:05:12.040 | Let me see.
00:05:12.720 | So multitask.
00:05:15.560 | All right.
00:05:21.400 | I am just so happy about how this worked out, to be honest.
00:05:26.240 | Although, spoiler alert, it didn't turn out to help our score.
00:05:31.080 | The score was about the same.
00:05:34.680 | But I was so happy at how the whole process turned out.
00:05:38.920 | But I kind of want to show you how I got there, as well as
00:05:42.760 | where we ended up.
00:05:45.880 | And yeah, as soon as I kind of turned off Zoom last time,
00:05:50.680 | and I went for a walk, and then as soon as I did that,
00:05:53.520 | I was like, oh, no, of course I know how we should do this.
00:05:56.200 | And really, so there's quite a few--
00:06:00.280 | we can make this much, much simpler.
00:06:02.160 | So let me explain what we're going to try to do.
00:06:05.680 | We are going to try to predict two things, the disease
00:06:20.120 | and the variety for each image.
00:06:23.600 | And the first thing will be to create a pair of data loaders
00:06:29.040 | that look like this.
00:06:29.920 | For each image, they will have connected two things to them,
00:06:33.800 | the disease and the type of rice.
00:06:37.760 | So this is going to be step one.
00:06:39.320 | So let me kind of show you how I got to that step.
00:06:44.400 | So the step one that I took was to, first of all,
00:06:48.120 | try to replicate what we've already had before.
00:06:53.400 | So, the paddy notebook.
00:06:56.320 | But before, I used image data loaders,
00:06:58.600 | which is the highest level least flexible function.
00:07:03.640 | We can do all the data processing
00:07:05.800 | of a single line of code, but only
00:07:08.080 | if we want to do something really standard.
00:07:10.200 | And trying to predict two things is not standard enough for it.
00:07:14.840 | So we now need to go one layer down.
00:07:18.040 | And there's a lot of really good tutorials on docs.fast.ai.
00:07:27.600 | And because it's been ages since I've done any of this,
00:07:30.120 | I'd forgotten entirely how fast.ai worked,
00:07:32.920 | so I used them very heavily to remind myself
00:07:36.080 | of what's going on.
00:07:37.960 | But for example, there is a data block tutorial.
00:07:41.560 | This pat's tutorial is great.
00:07:45.560 | It goes through all the layers of different ways
00:07:47.960 | of doing things with fast.ai preprocessing.
00:07:53.760 | This Siamese tutorial is another really good one.
00:07:56.440 | So these are some of the things I looked at.
00:07:58.560 | And the other thing that I looked at was the actual API docs.
00:08:02.760 | So if I click here on data block,
00:08:06.120 | this is actually probably what I found the most useful
00:08:09.360 | in the end.
00:08:11.640 | There's lots of great examples in the documentation.
00:08:14.120 | So yeah, as I kind of like--
00:08:17.040 | you know how it is, you come back to something a couple
00:08:19.560 | of years after you built it, and you're now
00:08:22.240 | kind of the customer of your documentation.
00:08:24.200 | And so my experience as a customer of my documentation
00:08:27.000 | was I was really delighted by it.
00:08:28.800 | So I can definitely suggest checking all that out.
00:08:33.280 | So what you can do--
00:08:37.000 | before, we were using image data loaders.from folder.
00:08:40.080 | So if we just do the double question mark trick,
00:08:42.080 | we can see the source code for it.
00:08:45.040 | And it's the normal size of fast.ai things.
00:08:47.560 | It's very small.
00:08:49.320 | And you can see that actually all the work's
00:08:51.840 | being done by data block.
00:08:53.800 | So data block is still a high-level API,
00:08:59.000 | but not as high level.
00:09:01.320 | It's actually very flexible.
00:09:02.760 | And so we're going to--
00:09:04.240 | step one that I did was to replicate exactly what we had
00:09:08.680 | before about using data blocks.
00:09:13.040 | And for this, there's actually so many good examples
00:09:19.360 | in the tutorials and in the book.
00:09:21.120 | You've seen them all before.
00:09:22.280 | We don't need to talk about it too much.
00:09:25.480 | We can basically say, OK, for the data block,
00:09:29.000 | the input will be an image.
00:09:30.200 | The output will be a category.
00:09:32.640 | This is just to do disease prediction.
00:09:35.080 | The labeling function will be the parent folder.
00:09:50.920 | Parent label.
00:09:53.440 | Do a random split.
00:09:55.360 | The item and batch transforms, we
00:09:57.200 | can copy and paste from what we had before.
00:10:00.480 | And that creates a data block.
00:10:06.760 | So data loaders is then a data block.data loaders.
00:10:11.000 | And you then have to pass in a source.
00:10:15.160 | So the source is basically anything
00:10:20.080 | that you can iterate through or index
00:10:24.880 | into to grab the things that will be passed to these blocks
00:10:30.080 | and to this function.
00:10:30.960 | So for this, it's going to be a path.
00:10:35.800 | And then we also need to get items.
00:10:48.000 | And so when we get a path, we're going
00:10:50.400 | to pass that path into the getImageFiles function,
00:10:54.200 | because that's the thing that returns
00:10:55.800 | a list of all of the images in a path.
00:10:57.520 | And let's see if that works.
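
For reference, here is a minimal sketch of the DataBlock being described, assuming the paddy images live under a `trn_path` folder with one subfolder per disease label (the exact transform arguments are assumptions based on the description):

```python
from fastai.vision.all import *

trn_path = Path('train_images')  # assumption: per-disease subfolders of images

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),      # one image in, one category out
    get_items=get_image_files,               # path -> list of image file Paths
    get_y=parent_label,                      # label = name of the parent folder
    splitter=RandomSplitter(0.2, seed=42),   # random 20% validation split
    item_tfms=Resize(480),                   # per-item: make all sizes uniform
    batch_tfms=aug_transforms(size=128, min_scale=0.75))  # per-batch augmentation

dls = dblock.dataloaders(trn_path)           # the source here is just a path
```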
00:11:03.920 | How do you know that--
00:11:09.680 | OK, so the block, you have an image block, category block.
00:11:14.200 | How do you know that the getImageFiles
00:11:21.400 | is going to be able to feed both those blocks?
00:11:26.680 | So I guess the short answer would
00:11:32.000 | be to read the documentation about those blocks
00:11:35.680 | to see what they take and what they do,
00:11:40.480 | or any of the tutorials that use them.
00:11:42.560 | As you can see, they're used all over the place, right?
00:11:44.720 | So you can start with this tutorial, or this tutorial,
00:11:48.360 | or this tutorial.
00:11:49.040 | So any of those would show you what it does.
00:11:55.080 | Yeah, the actual-- sorry, this is not good documentation.
00:12:03.280 | I really never bothered to look at this,
00:12:05.240 | because it's basically all it does, because we should fix
00:12:12.880 | that, I guess, because there's so many tutorials.
00:12:16.840 | I mean, as you can see, I guess the reason I never really
00:12:19.680 | wrote drops, right, is that it's literally a single line of code.
00:12:22.240 | So that's like--
00:12:23.120 | yeah, so maybe looking at the code
00:12:30.040 | is actually interesting in this case.
00:12:31.560 | So an image block is something which
00:12:36.120 | calls cls.create, where cls is PILImage.
00:12:42.440 | So it's going to call PILImage.create.
00:12:44.920 | So to find out what's actually going to be called by it,
00:12:49.600 | you can go PILImage.create, and you can see
00:12:59.520 | it's going to get passed a file name, which
00:13:01.640 | can be a path, or a string, or various other things.
00:13:05.280 | So get image files, then, obviously, you
00:13:09.680 | can either run it to see what it comes out with,
00:13:11.880 | or let's do that so we could just run get image files,
00:13:16.000 | passing in the thing it's going to be given, which is the path.
00:13:20.760 | And so as you can see, it's a bunch of paths.
00:13:23.640 | And so we could pass one of those--
00:13:28.080 | copy that, and it's going to be passed into this function.
00:13:31.240 | So we've now just replicated exactly
00:13:35.800 | what's happening in the code.
00:13:37.240 | Yeah, but for this, I generally just
00:13:43.400 | look at the tutorials, which tell you what to do.
00:13:47.240 | Could you have two different get items
00:13:50.200 | that feed different blocks?
00:13:52.720 | We're going to come to that.
00:13:54.680 | OK, so park that.
00:13:57.720 | So yeah, so--
00:14:02.920 | There was also a batch transform,
00:14:05.160 | like it gets transformed later after reading, right?
00:14:08.880 | Yeah, here we've got the batch transform, yeah.
00:14:12.440 | But in the image block, because right now, we
00:14:15.920 | have a PIL image, or something,
00:14:18.440 | and it needs to become a tensor, right?
00:14:21.120 | Yeah, that's right.
00:14:22.200 | It gets changed from an int tensor to a float tensor
00:14:25.600 | later on.
00:14:26.520 | That's right.
00:14:27.520 | Yeah, that's a fairly subtle thing, but that's right.
00:14:32.760 | We stick that in something.
00:14:39.720 | Image equals-- and we look at, like, np.array image.
00:14:47.480 | It's actually stored as bytes, or uint8, as they call it,
00:14:51.480 | in PyTorch.
00:14:54.440 | So yes, this is going to add a batch transform that's
00:15:00.200 | going to turn that into a float tensor, which
00:15:02.320 | we can see what that's going to look like.
00:15:06.040 | We could run it here, I expect, to float tensor.
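
A rough illustration of that int-to-float step, as a sketch (not the library internals; `trn_path` as above):

```python
img = PILImage.create(get_image_files(trn_path)[0])
print(np.array(img).dtype)         # uint8 -- images are stored as bytes

t  = TensorImage(image2tensor(img))  # still uint8, shape (3, H, W)
ft = IntToFloatTensor()(t)           # the batch transform ImageBlock adds:
print(ft.dtype, ft.max())            # float32, scaled into [0, 1]
```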
00:15:10.760 | The transformation that we're applying
00:15:17.840 | is 224, which is like a square image, correct?
00:15:21.720 | A 224 by 224?
00:15:24.200 | This one here?
00:15:26.920 | Yeah, so this is doing data augmentation
00:15:29.680 | of lots of different kinds.
00:15:31.360 | So let's copy doc paste.
00:15:43.560 | And if we show--
00:15:48.760 | for data augmentation, looking at the docs
00:15:50.800 | is particularly important, because you
00:15:52.720 | can see examples of the augmentations it does.
00:15:55.760 | So it tells you a list of all the augmentations
00:15:57.680 | and how you can change them.
00:15:58.760 | And here's some examples of what they look like.
00:16:02.480 | And--
00:16:02.960 | These augmentations would happen after the integer float
00:16:06.520 | tensor, right?
00:16:08.800 | Yes, that's correct.
00:16:10.040 | And some data augmentations, they
00:16:13.080 | operate on the entire batch, and some operate
00:16:15.600 | with a single example.
00:16:17.320 | Yeah, that's right.
00:16:18.240 | So the ones in batch transforms operate on the whole batch,
00:16:20.720 | and the ones in item transforms operate
00:16:22.520 | on a single item.
00:16:23.960 | And so batch transforms operate--
00:16:27.880 | because they operate on a batch, before you get there,
00:16:31.160 | everything has to be the same shape.
00:16:33.000 | So it can be turned into a batch.
00:16:34.800 | So this resizes everything to the same shape.
00:16:38.680 | And then this does these various different types of data
00:16:42.480 | augmentation.
00:16:44.160 | And one of the key pieces of data augmentation it does
00:16:47.440 | is to randomly zoom into a subset of the image,
00:16:51.760 | as you can see in these various examples here.
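
To see what those augmentations look like, the pattern from the docs is roughly this (a sketch; `unique=True` repeats one image so the random augmentations are visible):

```python
dblock2 = dblock.new(item_tfms=Resize(480),
                     batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls2 = dblock2.dataloaders(trn_path)
dls2.show_batch(max_n=8, unique=True)  # same image, different random augmentations
```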
00:16:55.360 | And this data block API, can you also
00:17:03.840 | use it with data frames, where you
00:17:05.920 | would be reading your images from a data frame?
00:17:08.320 | We are going to do that.
00:17:08.960 | We're going to do that in a moment, yes.
00:17:10.480 | OK, cool.
00:17:11.280 | Absolutely.
00:17:14.400 | So yeah, I'm kind of skipping over quite a bit of this,
00:17:17.360 | because it's super well covered in the tutorials.
00:17:22.040 | So I don't want to say stuff that you can very easily read.
00:17:24.680 | Whereas the stuff I'm about to show you
00:17:26.040 | isn't as well covered in the tutorials, and it's kind of new.
00:17:28.880 | But yeah, feel free to keep asking questions
00:17:33.880 | about anything you see.
00:17:36.360 | So basically, yeah, so all we've done
00:17:39.840 | is this is just the same thing that we have in lesson one.
00:17:43.880 | And it's doing exactly the same thing as my image data
00:17:51.760 | loaders.from_folder, but just turned into a data block.
00:17:54.240 | And so this is what I did, just to show you
00:17:56.920 | through my process.
00:17:57.880 | But step one was to get this working,
00:17:59.760 | and then I passed that into a learner.
00:18:05.200 | And so let's go copy, and I want this all to run as fast as
00:18:14.040 | possible.
00:18:15.520 | So I would use the fastest--
00:18:17.560 | Do you-- when you make this data loader thing,
00:18:27.520 | do you try to make sure that the shape that it's outputting
00:18:31.560 | is what you need for your model, or that's later?
00:18:34.720 | Well, I generally use models which
00:18:37.960 | don't care what size they get.
00:18:41.720 | So yeah, that's one of my tricks.
00:18:44.840 | So ResNet18 is happy with any size.
00:18:48.640 | So actually, for my testing, I'm going
00:18:50.160 | to bring this back down to 128, so it's super fast.
00:18:53.920 | And so I just want to get the maximum iteration speed here.
00:19:00.720 | And so now, I can call learn.fit_one_cycle.
00:19:07.400 | And let's do one epoch.
00:19:09.400 | OK, so this is going to run in under 20 seconds, which
00:19:21.120 | is kind of what you want, right?
00:19:22.320 | You want something that you can test in under about 20 seconds,
00:19:24.920 | so that way, you can just quickly try things
00:19:27.200 | and make sure that end-to-end, it's working.
00:19:29.760 | Yep, the error rate is down to 30%.
00:19:33.880 | So that's a good sign.
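
The quick-iteration training step being described, as a sketch (resnet18 for speed; the learning rate is an assumption):

```python
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fit_one_cycle(1, 0.01)   # one quick epoch as an end-to-end smoke test
```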
00:19:38.600 | I guess one correlated question is, OK,
00:19:40.920 | I understand the input size, but what
00:19:43.200 | about the output size of your data block?
00:19:45.680 | You know that this is what you need for that model.
00:19:48.120 | Let's not say the model doesn't care.
00:19:49.920 | The model's happy with any size.
00:19:52.360 | I mean, the targets, or whatever.
00:19:55.640 | You're talking about the labels?
00:19:57.120 | Yeah, the labels.
00:19:58.000 | I mean, labels don't have sizes.
00:19:59.800 | The labels are strings.
00:20:01.120 | Or just the shape of that.
00:20:02.680 | Like, hey, is it--
00:20:05.080 | because maybe different models are kind of trying
00:20:07.600 | to predict different types of stuff, potentially?
00:20:10.600 | I don't know.
00:20:13.360 | Like, some might have the shape of the target.
00:20:16.360 | A vision learner is--
00:20:18.760 | I suspect the thing you're kind of asking
00:20:24.320 | is the thing that we're going to be covering in a moment.
00:20:26.560 | So maybe put that on hold, and then tell me
00:20:28.640 | if it doesn't make sense.
00:20:32.760 | Before-- I have a question.
00:20:37.120 | On the data block, you randomly select the amount of records
00:20:44.560 | or the amount of the batch size that you're going to process.
00:20:49.120 | I don't randomly pick the batch size, no.
00:20:51.360 | The batch size is actually selected in the .dataloaders
00:20:55.160 | call and it defaults to 64.
00:20:59.400 | So what is the guarantee that every single one
00:21:02.760 | of the images in this particular case
00:21:05.000 | will be selected, or there's no way to know?
00:21:09.400 | Is there any way to know that every single one will be--
00:21:13.720 | I mean, well, yes, except that we're randomly
00:21:16.760 | selecting 20% to be a validation set.
00:21:21.880 | But every single one will go through the learner.
00:21:26.200 | Of the 80% that aren't in there, everyone
00:21:29.080 | will go through our learner because we randomly shuffle them
00:21:31.960 | and then we iterate through the whole lot.
00:21:35.400 | So in a single epoch, the model is
00:21:38.760 | guaranteed to see every example that is trained just once.
00:21:44.040 | Yeah.
00:21:46.040 | Yeah, and that's what this one means.
00:21:47.480 | That's what one epoch means, is look at everything once.
00:21:50.280 | And so if we put two there, it would look at everything twice.
00:21:52.680 | But each time it randomly shuffles it,
00:21:54.360 | so it does it in a different random order.
00:21:56.920 | I have a quick question.
00:21:59.600 | I guess this is PyTorch data loader stuff,
00:22:03.080 | but what actually happens for the last batch?
00:22:06.280 | The last batch, it depends.
00:22:08.360 | And this is actually not the PyTorch data loader,
00:22:10.840 | it's actually fast.ai's data loader.
00:22:12.680 | So we have our own data loader, although in the next version,
00:22:16.120 | we're likely to replace it with the fast.ai one.
00:22:19.080 | So it depends what drop last is.
00:22:20.760 | If drop last is true, then it deletes the last batch.
00:22:24.920 | And if it's false, then it includes the last batch.
00:22:27.560 | And the reason that's interesting is that the last batch may not
00:22:30.040 | be of size 64.
00:22:32.920 | Yeah.
00:22:34.920 | For the validation set, it always keeps the last batch.
00:22:39.080 | It's super important to shuffle the train set.
00:22:42.040 | fast.ai does it for you, but if you
00:22:44.560 | mess around with the data loaders or do something
00:22:47.520 | yourself, if you don't shuffle the train set,
00:22:49.960 | you might get very poor training performance.
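
The same knobs exist on the plain PyTorch DataLoader, which is a handy way to see the drop-last behaviour (a self-contained sketch):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.arange(10).float())
# drop_last=True discards the final short batch (here, the last 2 items)
for (xb,) in DataLoader(ds, batch_size=4, shuffle=True, drop_last=True):
    print(xb.shape)   # torch.Size([4]) twice; no batch of size 2
```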
00:22:53.560 | When we used to use Keras, I used
00:22:55.160 | to mess all this stuff up all the time.
00:22:57.960 | Yeah, trying to get all those details, right?
00:22:59.560 | It's really annoying.
00:23:00.400 | Just to make sure on something you said,
00:23:02.080 | you said in the next iteration, you're
00:23:03.600 | going to replace it with the PyTorch data loaders?
00:23:05.760 | Yeah, probably.
00:23:08.400 | You said fast.ai, so I got confused.
00:23:10.440 | Oh, did I?
00:23:11.680 | That is confusing.
00:23:13.880 | Thanks.
00:23:15.720 | OK, so that was my step one.
00:23:17.360 | Is to just get it working exactly like before.
00:23:20.120 | And then I ran it in the background
00:23:25.000 | on the same architecture for the same epochs
00:23:27.320 | to make sure I got about the same error rate, and I did.
00:23:29.760 | So then I was happy that, OK, I'm matching what we had before.
00:23:33.960 | So then step two was to try to make it
00:23:38.440 | so that the data block spits out three things, which
00:23:41.440 | would be one image and two categories.
00:23:44.480 | The category of disease and the category of rice type.
00:23:50.000 | So to get it to spit out an image and two categories,
00:23:53.160 | hopefully you wouldn't be surprised to hear
00:23:54.840 | that we just do that.
00:23:57.920 | We say we want three blocks, an image, and two categories.
00:24:01.000 | Now, this variety, we did some way
00:24:07.280 | of getting that given an image ID.
00:24:13.160 | And actually, the way I did it was a bit ugly.
00:24:16.680 | And since then, I thought of a better way of doing it,
00:24:19.160 | which is what I think we should do
00:24:20.560 | is we should create a dict that maps from image ID to variety.
00:24:25.520 | And then our function will just be to look that up, right?
00:24:29.560 | So let's call this image to variety equals--
00:24:36.800 | OK, and it's going to be a dict comprehension.
00:24:40.800 | So we're going to loop through the rows in df dot iter items.
00:25:00.400 | Now, I always forget what these differences are.
00:25:04.760 | Column name comma series pair, returning a tuple
00:25:08.360 | with the column name.
00:25:11.160 | OK, that's not what I want.
00:25:13.000 | Yeah, like iterrows, yeah.
00:25:14.560 | Yeah, iterrows, index comma series.
00:25:21.000 | OK, cool.
00:25:22.520 | I think like this iter tuples is the fastest one.
00:25:28.480 | But this is not very big, so let's keep it simple.
00:25:32.720 | OK, so this is going to iterate over rows
00:25:35.320 | and return index comma series.
00:25:39.360 | OK, so we don't really care about the index.
00:25:42.120 | Another thing we could do is make the image ID the index,
00:25:50.840 | and then you could actually jump straight into it.
00:25:54.400 | But I think I'd rather not use pandas features.
00:25:56.760 | I'd rather use more pure Python-y things,
00:25:59.160 | because I think that'll make the explanation a little clearer.
00:26:03.000 | So we're going to loop through.
00:26:04.200 | It's going to give us the index and the row.
00:26:08.720 | And so what we want is the key will be the row's image ID,
00:26:16.440 | and the value will be the row's variety.
00:26:20.080 | OK, that looks good.
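
Roughly what that cell ends up as, assuming `df` is the competition's train.csv with `image_id` and `variety` columns (names are assumptions from the description):

```python
import pandas as pd

df = pd.read_csv('train.csv')  # assumption: columns include image_id, variety
img_to_variety = {row.image_id: row.variety for _, row in df.iterrows()}
```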
00:26:27.920 | So then there's a couple of ways we
00:26:32.800 | could turn this into a function.
00:26:36.000 | And I'm just going to show you a little neat trick, which
00:26:39.240 | is when you go like the--
00:26:40.760 | let's pick out something.
00:26:42.000 | Let's say we're going to grab this one.
00:26:45.440 | When you go like this, behind the scenes,
00:26:51.080 | that square bracket thing is actually
00:26:53.480 | calling a special magic method in Python called
00:26:58.600 | __getitem__ ("dunder getitem"), which is a function.
00:27:03.600 | This is the cool thing about Python.
00:27:05.200 | It's so dynamic and flexible, like all the syntax sugar
00:27:09.160 | is like behind the scenes just calling functions, basically.
00:27:12.240 | That's exactly the same thing.
00:27:15.200 | And so that means that this function here,
00:27:18.880 | img_to_variety.__getitem__, is
00:27:21.040 | a function that converts a file name into a variety.
00:27:27.760 | So here's the cool thing.
00:27:29.320 | For get_y, you can pass it an array,
00:27:35.200 | and it's going to call each of those functions, which
00:27:41.840 | I think is rather nice.
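
The trick in question: square-bracket indexing is sugar for `__getitem__`, so the bound method itself is an ordinary function you can hand to `get_y` (a sketch with a hypothetical entry):

```python
d = {'100330.jpg': 'ADT45'}  # hypothetical filename -> variety entry
f = d.__getitem__            # the bound method is a plain function: str -> str
print(f('100330.jpg'))       # 'ADT45' -- same as d['100330.jpg']
```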
00:27:42.880 | So another thing I find helpful--
00:27:52.360 | OK, cool.
00:27:53.560 | So when I call that, it complains.
00:27:56.680 | And it says, oh, getY contains two functions,
00:27:59.440 | but it should contain one, one for each target.
00:28:01.640 | It thinks that there's only one target.
00:28:04.160 | Why is that?
00:28:05.320 | Well, if you think about it, we've said there's three blocks,
00:28:07.840 | but we haven't told it how many of those blocks
00:28:10.200 | are for the independent variable and how many
00:28:12.040 | are for the dependent variable.
00:28:14.000 | And so we have to tell it.
00:28:15.600 | And the way we do that is to say the number of inputs equals--
00:28:19.200 | and so it's one.
00:28:20.840 | We have one input, and then the rest will be outputs.
00:28:25.120 | So when we do that, it's now happy, OK?
00:28:31.120 | And personally, before I jump to data loaders,
00:28:33.880 | I first create data sets just to make sure they work.
00:28:37.360 | So data sets are things where you just
00:28:41.040 | grab one thing at a time.
00:28:42.160 | There's no mini-batches to worry about.
00:28:44.000 | So data sets are easier to debug than data loaders.
00:28:47.080 | So you can create data sets using
00:28:49.440 | exactly the same approach, OK?
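
A sketch of where the notebook is at this point; building the datasets here reproduces the KeyError discussed next, because the dict is keyed by bare file names, not full Paths:

```python
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock, CategoryBlock),  # 1 input, 2 targets
    n_inp=1,                       # first block is the input; the rest are targets
    get_items=get_image_files,
    get_y=[parent_label, img_to_variety.__getitem__])   # one get_y per target

dss = dblock.datasets(trn_path)  # raises KeyError: get_y receives full Paths,
                                 # but img_to_variety is keyed by file name
```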
00:28:54.600 | And so-- all right, so we've got an error.
00:28:57.560 | OK, so here's the problem.
00:29:04.360 | It tried to look up our function.
00:29:09.040 | And in fact, it's not indexed.
00:29:12.640 | It's not passing in the string of the name.
00:29:17.440 | It's actually passing in the path.
00:29:21.160 | And so that's why we've got a key error.
00:29:22.920 | This path does not exist as a key in this dictionary,
00:29:27.640 | which is quite true, right?
00:29:29.240 | It doesn't.
00:29:30.480 | So what I think we should do is fix this up
00:29:36.360 | so that we've got train images, bacterial leaf streak,
00:29:40.520 | blah, blah, blah, OK.
00:29:42.640 | The get files function, the output of that
00:29:46.200 | is being passed to the get items is being passed to get y.
00:29:51.040 | So get image files.
00:29:54.000 | Yeah, so we haven't kind of gone into the details of exactly
00:29:56.560 | what's going on behind the scenes, Hamill.
00:29:59.240 | Let's do that in a moment.
00:30:02.040 | I kind of like the way you're wanting
00:30:04.200 | to jump into the nitty gritty, but it's a little bit--
00:30:08.360 | It's a problem that I have.
00:30:09.200 | I'm trying to do more top down, right?
00:30:11.000 | So I've got to get to your bottom up.
00:30:13.120 | We'll meet in the middle, OK?
00:30:15.280 | OK, fair enough, fair enough.
00:30:16.640 | By the way, your video is not on.
00:30:18.080 | That's fine.
00:30:18.800 | I just don't know if it's intentional.
00:30:20.000 | I always like to see people when they're, you know, seeable.
00:30:23.640 | Hello.
00:30:25.520 | OK, so we're not going to use this trick after all.
00:30:29.320 | We're going to create a function called get variety.
00:30:36.920 | Actually, no.
00:30:37.760 | Yeah, let's create a function called get variety.
00:30:41.160 | And so it's going to get passed a path, OK?
00:30:47.560 | And so we're going to return image to variety.
00:30:55.160 | And we're going to return image to variety
00:30:59.400 | with the name of the file.
00:31:03.840 | So the name of the file is the string.
00:31:08.600 | Wait, we need image to variety, the dunder thing?
00:31:12.880 | Oh, yeah, I'll just grab a bracket, actually, yes.
00:31:16.600 | Oh, and then we need to use that.
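
So the fixed-up labelling function and the rebuilt block, as a sketch:

```python
def get_variety(p):
    "Look up the rice variety for an image Path via its file name."
    return img_to_variety[p.name]   # p.name drops the directory part

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock, CategoryBlock), n_inp=1,
    get_items=get_image_files,
    get_y=[parent_label, get_variety])
dss = dblock.datasets(trn_path)     # now works; dss.train[0] is a 3-tuple
```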
00:31:22.520 | OK, so dss contains a .train data set.
00:31:43.640 | OK, and it also contains a .valid data set.
00:31:48.400 | OK, and so we can look at the zeroth thing in the training
00:31:54.360 | data set, which is a single thing, right?
00:31:56.720 | So we can have a look now.
00:31:58.240 | There's image and Y1 and Y2.
00:32:02.160 | And so then we can look at the image, for example.
00:32:09.000 | OK, so what's happened here is that get image files returned
00:32:13.240 | to a list of paths.
00:32:15.400 | The first one got passed to image block, which,
00:32:19.680 | as we saw earlier, got passed to pyo image dot create.
00:32:23.320 | And here it is.
00:32:26.720 | And that path name also got passed to a function
00:32:30.800 | called parent label.
00:32:31.880 | In fact, let's do it, right?
00:32:33.480 | So let's say file name equals get image files.
00:32:39.680 | And then the thing that we passed in, training path,
00:32:43.400 | and let's just get the zeroth one.
00:32:45.600 | OK, and so there it is, right?
00:32:50.040 | So it ended up calling PILImage.create with that file name.
00:33:00.040 | OK, it also called parent label with that file name.
00:33:05.080 | Parent label, OK.
00:33:10.080 | And it also called get variety with that file name.
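
That manual trace, in one place (a sketch of the steps just described):

```python
fn  = get_image_files(trn_path)[0]  # a single Path
img = PILImage.create(fn)           # what ImageBlock's type transform does
dis = parent_label(fn)              # first target: the disease folder name
var = get_variety(fn)               # second target: the variety lookup
```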
00:33:12.760 | Jeremy, can we look at get variety one more time?
00:33:18.960 | I'm just curious how you build the path.
00:33:22.960 | I didn't.
00:33:23.640 | I removed the path.
00:33:24.680 | I called .name.
00:33:25.600 | OK, OK.
00:33:28.720 | I see.
00:33:29.920 | Yeah, in my original version of this,
00:33:32.240 | I did it the other way around, of building back up the path,
00:33:35.280 | and then realized that that was kind of stupid.
00:33:38.280 | Yeah, it's unique, so that works.
00:33:42.320 | One question.
00:33:44.200 | OK, this could be too low level, but just let me know.
00:33:51.720 | Can you have multiple get items?
00:33:54.240 | Is this the right place to ask that?
00:33:55.640 | So it wouldn't make sense to have multiple get items, right?
00:33:59.720 | Like, get item returns a single thing,
00:34:02.960 | but it could be anything you like, right?
00:34:04.360 | It could be it could return a tuple, or a list, or an object,
00:34:07.200 | or whatever, right?
00:34:09.200 | And so, or dict.
00:34:11.040 | And then get y and get x, then the thing's
00:34:14.920 | responsible for pulling out the bit that you
00:34:16.960 | need to pass to your blocks.
00:34:19.400 | Now, we don't need to get x, because image blocks just
00:34:23.960 | take paths directly.
00:34:26.520 | If I needed something a bit more like I wanted to put more things
00:34:31.040 | and get image files, like have it emit a tuple,
00:34:33.680 | then would I have to make my own image block to ignore?
00:34:36.840 | No, not your own image block.
00:34:38.000 | You would write your own function,
00:34:40.000 | just like get image files, that returns
00:34:43.280 | the list of all of the objects you want,
00:34:45.800 | which have all the information you need.
00:34:48.520 | OK, and then--
00:34:50.160 | It almost never happens.
00:34:51.320 | I don't think that's ever happened to me,
00:34:53.000 | because nearly always there's a row of a database
00:34:56.400 | table, or a path, or something has all the information
00:35:01.240 | you need to go out and get the stuff with your get x's
00:35:05.480 | and get y's.
00:35:07.920 | That's like the central piece of information for each row.
00:35:11.280 | And based on this information, you can read in text,
00:35:14.000 | you can read in images, but specific to that one row.
00:35:18.640 | Actually, let me show you my hacky version,
00:35:21.360 | because this is the version that uses a data frame.
00:35:24.400 | So this is-- so the version that uses data frame--
00:35:27.480 | let's see.
00:35:35.760 | Is it right to think, get x's?
00:35:39.880 | Yeah, and that's interesting.
00:35:41.520 | Let me just do this, and let me come to your question.
00:35:43.680 | OK, so in this data block, I started out with a data frame,
00:35:52.160 | like so, right?
00:35:53.680 | And so by passing into data block.dataload,
00:35:57.720 | as I passed in the data frame, it's going to get each row.
00:36:03.360 | And so then get y becomes colreader 1,
00:36:08.920 | which is just a function which--
00:36:11.320 | I mean, let's look at it.
00:36:14.680 | That doesn't do much.
00:36:15.760 | Let's see what it does.
00:36:21.560 | So it's done as an object, because you
00:36:24.280 | can do things like add in a prefix path and a suffix path,
00:36:27.680 | and you can split it with a label delimiter and whatever.
00:36:30.240 | But basically, all it's basically doing, OK,
00:36:36.000 | and it checks what kind of thing you're passing in.
00:36:38.680 | Basically, all it does is it calls getattr to grab
00:36:49.760 | the column and we--
00:36:54.480 | ColReader, for context-- like, for reading data frames?
00:37:01.880 | Sorry?
00:37:03.120 | Is this colreader function specifically for data frames?
00:37:08.000 | I mean, it can work with anything, basically,
00:37:11.120 | that you're-- so what it's doing here is it's saying,
00:37:15.920 | grab that column.
00:37:19.680 | But it's really--
00:37:20.760 | I've only really used it for data frames,
00:37:22.560 | but you could use it for anything.
00:37:24.600 | But yeah, so basically here, getY is saying, OK, well,
00:37:27.280 | let's return the index 1 field and the index 2 field.
00:37:31.120 | What's up with getX?
00:37:40.680 | So because now we're being passed--
00:37:42.680 | so you can't pass a row of a database table to pailimage.create.
00:37:48.840 | So getX is this function, which basically is going,
00:37:55.360 | oh, it's going to be in the training path slash disease
00:37:59.400 | name slash image name.
00:38:04.120 | So that's-- and then there's a special case for the test set,
00:38:09.240 | because the test set things are not stored in subfolders
00:38:13.400 | according to label, because we don't know the label.
00:38:15.640 | So it's just directly in the test path.
00:38:18.440 | So that's the-- as I said, this was more hacky.
00:38:21.760 | I don't--
00:38:22.360 | [INTERPOSING VOICES]
00:38:23.840 | This really helps.
00:38:25.080 | So getX is kind of like getY.
00:38:27.120 | You can have a list in there.
00:38:28.760 | It's like from--
00:38:30.120 | Yeah, you can have-- yeah, it's totally flexible.
00:38:32.240 | And I mean, seriously, Hamel, we have so many examples
00:38:37.200 | of all of these patterns in the docs in the tutorials.
00:38:42.920 | So like this exact pattern--
00:38:45.000 | let's take a look at one, right, docs.fast.ai.
00:38:48.600 | So tutorials-- let's do data block tutorial.
00:38:58.160 | Right here, look, multi-label.
00:39:00.240 | So here's one.
00:39:03.120 | And yeah, you can see here, this is even
00:39:05.840 | splitting based on columns in the database table.
00:39:08.680 | And here's the ColReader using prefix.
00:39:11.480 | And here's a ColReader using a label delimiter.
00:39:14.240 | And here's the examples coming out.
00:39:16.040 | Yeah, so there's--
00:39:18.000 | yeah, lots of examples you can check out
00:39:22.320 | to see how to do all this.
00:39:23.560 | Yeah, so I think I'm at a point now where I actually
00:39:28.840 | do want to go into the weeds.
00:39:30.440 | So Hamel, you're now, after this,
00:39:32.240 | totally free to ask any super-weedy questions.
00:39:37.120 | The most basic kind of data block
00:39:40.320 | is called the transform block.
00:39:41.840 | And the transform block, basically,
00:39:54.600 | it's going to store a bunch of things you pass in.
00:39:56.880 | It's going to store things called type transforms.
00:39:58.920 | It's going to store things called item transforms.
00:40:00.800 | It's going to store things called batch transforms.
00:40:04.320 | And also, it always adds one thing,
00:40:06.680 | which is ToTensor, because PyTorch needs tensors.
00:40:10.480 | If you look at the image block, we
00:40:15.280 | saw that that's defined as a transform block where
00:40:19.200 | the type transforms is this and the batch transforms is this.
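
For reference, this is essentially the definition being read here (paraphrased from fastai's source):

```python
def ImageBlock(cls=PILImage):
    "A TransformBlock whose type transform opens an image file."
    return TransformBlock(type_tfms=cls.create, batch_tfms=IntToFloatTensor)
```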
00:40:23.800 | So now's a good time to talk about how
00:40:25.360 | this all works, what this does.
00:40:27.720 | So if I pass in here transform block
00:40:33.720 | and don't pass any transforms, it won't do anything.
00:40:37.600 | So let's get rid of pretty much everything.
00:40:56.320 | Let's do that separate cell so it gets a little bit easier
00:41:00.680 | to read.
00:41:08.480 | Here is the world's simplest data block.
00:41:14.040 | So if we call that, as you can see,
00:41:26.160 | all it does is it takes the output of get image file 0
00:41:31.960 | and turns it into a tuple containing one thing, which
00:41:35.440 | is the thing itself.
00:41:37.920 | If we have two transform blocks, it
00:41:43.360 | returns a tuple with two things in it.
00:41:46.880 | So and the reason it's returning tuples
00:41:49.200 | is because this is what we want.
00:41:51.480 | When we train, we have batches containing inputs and outputs,
00:41:59.480 | potentially multiple inputs and potentially multiple outputs.
00:42:02.200 | So that's why indexing into this gives you back a tuple.
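
The minimal experiment being run here, as a sketch:

```python
dblock = DataBlock(blocks=(TransformBlock,), get_items=get_image_files)
print(dblock.datasets(trn_path).train[0])   # a 1-tuple: (Path,), untouched

dblock = DataBlock(blocks=(TransformBlock, TransformBlock),
                   get_items=get_image_files)
print(dblock.datasets(trn_path).train[0])   # a 2-tuple: (Path, Path)
```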
00:42:08.520 | My question, the blocks can either be a list or a tuple?
00:42:14.160 | I don't know, probably.
00:42:19.440 | I have no idea.
00:42:26.340 | So then we can do stuff to the first thing in the tuple.
00:42:44.160 | So get x equals-- say let's get a lambda o dot name.
00:42:56.560 | Hello.
00:43:09.540 | Hey, what are you doing?
00:43:17.480 | Something to do with lambda, right?
00:43:26.060 | Does name have to be call?
00:43:37.220 | Maybe it's notebook restart time.
00:43:39.380 | It's my notebook restart time.
00:43:42.820 | Oh, that's-- oh.
00:43:48.260 | I wonder if something happened to my GPU server.
00:43:53.780 | I mean, something has happened to my GPU server, clearly.
00:43:59.220 | Never happened before.
00:44:05.500 | Oh, it looks like it's back.
00:44:07.700 | Oh, OK.
00:44:09.060 | It just recognized that it disappeared.
00:44:10.980 | That's wild.
00:44:19.380 | Oh, OK.
00:44:20.380 | I'm very-- oh, I don't know what just happened.
00:44:28.700 | I guess it doesn't really matter.
00:44:41.300 | What are you doing right now?
00:44:42.900 | I'm just looking at the log, see if anything just happened.
00:44:50.620 | Who knows?
00:44:59.480 | All right.
00:45:09.140 | So you see what happened here is we got the first thing
00:45:18.500 | from image files, which was this, and get x got its name.
00:45:26.540 | So we could also do get y equals lambda o: o.parent, say.
00:45:42.500 | It first went-- first, the thing went to the transform block,
00:45:51.460 | get items.
00:45:52.820 | So whatever get items got went to transform blocks.
00:45:56.660 | And then it went to get x and get y.
00:46:00.060 | Well, transform block doesn't do anything, right, at all,
00:46:04.020 | unless you pass the transforms.
00:46:05.980 | So yeah.
00:46:07.140 | So it's basically-- but the number of them you have
00:46:11.500 | is the number of pipelines it's going to create.
00:46:15.420 | So if we created another one--
00:46:18.660 | But generally, if you have an image block,
00:46:21.020 | it would do something.
00:46:22.220 | So the order is--
00:46:23.580 | We're going to get to that, yeah.
00:46:24.860 | So here, look, we've now got--
00:46:26.700 | --the order.
00:46:27.740 | We're not quite there yet, right?
00:46:30.300 | So let's get to that.
00:46:32.700 | And it's not quite the mental model you've got, I think.
00:46:35.780 | Now that I've got three transform blocks,
00:46:38.340 | I only have things to create two of them.
00:46:42.660 | So it's sad, right?
00:46:45.860 | And so we could put them here, for instance.
00:46:59.580 | And that would be--
00:47:00.420 | [INAUDIBLE]
00:47:01.020 | Last one is the y in the first two or the x.
00:47:03.340 | Correct.
00:47:04.140 | Unless we say number of inputs equals 1, in which case
00:47:11.660 | now we get x is just going to have to return one thing.
00:47:16.740 | There's going to be one function.
00:47:18.820 | And get y will be two things.
00:47:28.180 | So you could even put it here instead, right?
00:47:43.620 | So you could say, oh, well, this is actually--
00:47:45.460 | let's move it.
00:47:56.540 | We could put it here.
00:47:59.140 | Item transforms equals--
00:48:03.300 | And so the transform block is stuff
00:48:06.340 | that is applied for that transform.
00:48:09.700 | How is that not working?
00:48:14.780 | That's slightly surprising to me.
00:48:37.900 | Oh, it needs to be a type transform.
00:48:37.900 | OK, type transform.
00:48:39.060 | So it's now converted to the type it's meant to be.
00:48:44.260 | So Radek, you were asking about image block.
00:48:49.180 | I'm just curious how all the pieces interact.
00:48:53.820 | [INTERPOSING VOICES]
00:48:55.580 | Let me show you.
00:48:57.260 | Let me show you.
00:48:58.340 | So let's do it manually.
00:49:01.260 | So image block is just this, OK?
00:49:04.220 | So let's not use image block.
00:49:05.820 | Let's instead--
00:49:07.220 | Why didn't the item transform work?
00:49:09.540 | Let's figure that out later.
00:49:11.140 | Yeah, why don't we figure out what's going on here,
00:49:13.380 | and then we'll debug it.
00:49:14.700 | OK, so now we've got three transform blocks, two of them
00:49:19.700 | which do nothing, and the first one of which
00:49:22.380 | is going to call something.create.
00:49:25.740 | That was period image.create.
00:49:27.500 | So transform blocks don't--
00:49:34.500 | if you look at the code of them, transform blocks
00:49:39.420 | don't do anything at all.
00:49:43.100 | They actually-- they only store things.
00:49:47.940 | There's no done to call.
00:49:51.540 | There's no forward.
00:49:53.060 | There's nothing.
00:49:55.020 | Transform blocks don't do anything.
00:49:56.620 | They just store stuff.
00:49:59.140 | The data block is the thing that then going to go through
00:50:01.980 | and say, OK, for each thing, call its type transforms,
00:50:05.700 | and then call to tensor, and then call its item transforms,
00:50:08.580 | and then data load of time, call its batch transforms.
00:50:12.980 | So does that help answer your question, Hamill?
00:50:16.140 | It's not that a transform block doesn't get called.
00:50:20.460 | It just stores the list of things
00:50:22.460 | that will get called at each of these times.
00:50:25.300 | The first thing that gets called is type transforms.
00:50:27.740 | Wait, is that right?
00:50:33.580 | Let me think.
00:50:35.660 | No, that's not correct.
00:50:37.100 | The first thing that gets called is get x and get y,
00:50:40.100 | and then the result of that is passed into type transforms.
00:50:44.580 | And so get x and get y--
00:50:47.020 | so get x would be responsible for making sure
00:50:49.580 | that you have a path that you can pass to PILImage.create.
00:50:54.780 | That's the order.
00:50:55.460 | So this whole path of what happens in a sequence
00:50:58.740 | that lives in a data block.
00:51:00.380 | That lives in data block, exactly.
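
Summarising the order just described, per pipeline the DataBlock builds (a comment-only sketch):

```python
# For each item, conceptually:
#   1. get_items(source)       -> list of items (e.g. image Paths)
#   2. get_x / get_y           -> pull out what each block needs
#   3. each block's type_tfms  -> e.g. PILImage.create, Categorize  (Datasets)
#   4. ToTensor + item_tfms    -> e.g. Resize                 (DataLoader, per item)
#   5. batch_tfms              -> e.g. aug_transforms        (DataLoader, per batch)
```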
00:51:01.940 | Now, the data block code is, frankly, hairy,
00:51:05.380 | and it could do with some simplifying and documenting
00:51:09.420 | and refactoring.
00:51:11.260 | It's not long.
00:51:12.380 | It's about 50 or 60 lines of code.
00:51:16.580 | In fact, it's almost all here.
00:51:21.060 | But basically, when you call .datasets, really,
00:51:27.980 | all it's doing is it creates a data sets
00:51:31.380 | object passing in all of the type transforms to it.
00:51:38.340 | And the answer to your question, Hamill,
00:51:39.980 | why didn't the item transforms get done,
00:51:42.100 | is because item transforms actually
00:51:43.740 | get done by the data loader, not by the data sets.
00:51:47.860 | So data sets only use the type transforms.
00:51:51.780 | And basically, the only reason there's quite a bit of code
00:51:55.180 | in here is we try to make sure that if two different things
00:52:00.820 | have the same type transforms, we merge them together
00:52:05.340 | in a sensible way.
00:52:06.580 | So there's some stuff to try to make sure this all just works.
00:52:10.900 | I was going to assume the type transforms
00:52:15.260 | are separate from the item transforms
00:52:16.820 | because of some optimization you can do with the type transforms?
00:52:21.300 | Because the type transforms, they're happening earlier.
00:52:26.100 | They're happening before data loaders time.
00:52:31.020 | So data loaders are the things that
00:52:34.460 | are going to take tensors, or at least things that
00:52:42.540 | can be converted into tensors.
00:52:45.340 | So yeah, so type transforms are the things
00:52:48.780 | that are going to create your data sets for you.
00:52:51.420 | And they're going to spit out things
00:52:53.060 | which need to be convertible into tensors.
00:52:56.380 | And then data loaders has item transforms,
00:53:01.220 | which are things like reshaping everything to the same size.
00:53:04.180 | And batch transforms, which are things like data augmentation.
00:53:09.580 | But you can have an item transform run on the GPU
00:53:13.420 | or not on the GPU, right?
00:53:14.820 | It depends on the ordering.
00:53:16.100 | I don't think an item transform is generally
00:53:23.260 | going to run on the GPU because it's not a batch yet.
00:53:27.020 | I mean, maybe it's theoretically possible,
00:53:29.100 | but that would be pretty weird because you really
00:53:32.060 | don't need things to be in a batch before the GPU can
00:53:34.380 | be optimizing it effectively.
00:53:37.820 | And everything in batch transforms
00:53:39.700 | will run on the GPU.
00:53:43.420 | Assuming that you're using a GPU, I mean, this is OK.
00:53:48.500 | This is some part of the code base
00:53:50.060 | we're not looking at today.
00:53:51.060 | But I can't remember.
00:53:54.420 | I think this might be a callback which sticks things on the GPU.
00:53:57.820 | So it just depends on whether things are before or after
00:54:00.340 | that callback.
00:54:02.820 | Yeah, that's probably a bit of a distraction.
00:54:05.380 | So let's skip that bit for now.
00:54:08.780 | To kind of revise the difference between data set and data loader,
00:54:13.140 | is it best to revisit the PyTorch documentation and kind of--
00:54:16.900 | Yeah, pretty much.
00:54:17.860 | We have our own implementation of them.
00:54:19.500 | But our implementation of data loader
00:54:21.180 | is a superset of PyTorch's.
00:54:24.380 | And PyTorch's data set is literally an abstract class.
00:54:29.380 | It doesn't do anything at all.
00:54:31.740 | So a data set is something that you can index into.
00:54:35.300 | And it returns a single tuple of your independent and dependent
00:54:39.540 | variables.
00:54:40.140 | That's what a data set is defined as by PyTorch.
00:54:44.660 | And therefore, that's what we do as well.
00:54:48.260 | A data loader, you can't index into it.
00:54:51.860 | The only thing you can do is iterate through it.
00:54:54.220 | You can grab the next one.
00:54:55.660 | And it gives you a mini-batch, which is a tensor.
00:55:00.180 | So that's the difference.
00:55:01.220 | But yeah, that's a PyTorch concept.
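
In code, the distinction looks like this (a sketch, using the `dss` and `dls` built from the same DataBlock):

```python
item  = dss.train[0]           # Dataset: index in, get one tuple back
batch = dls.train.one_batch()  # DataLoader: iterate, get mini-batch tensors
print(batch[0].shape)          # e.g. torch.Size([64, 3, 128, 128])
```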
00:55:04.140 | I guess I'm trying to understand the type transform thing,
00:55:09.340 | why it has to be done in the data set before the data loader.
00:55:13.420 | Well, it doesn't have to be.
00:55:14.540 | But it's like we want data sets.
00:55:17.220 | Data sets are a very convenient thing
00:55:19.780 | to have to have something you can go into and grab items,
00:55:24.700 | numbered x, y, or z.
00:55:26.940 | That's the basic foundation of the PyTorch data model,
00:55:30.660 | is that there's things you can index into.
00:55:34.020 | The type transform aspect of it.
00:55:36.460 | Yeah, so you need something that converts
00:55:39.700 | the output of get image files into what
00:55:43.580 | you want in your data set.
00:55:45.780 | And that thing needs a name.
00:55:47.180 | And the name we gave it was type transforms.
00:55:52.740 | OK, I think I understand.
00:55:57.140 | This is not the only way you could do this, right?
00:55:59.860 | But it's our way that's really nice
00:56:02.540 | because we now have this thing that you can say like,
00:56:05.220 | oh, Hamill, can you show me the 14th image and its label?
00:56:09.260 | And you can say, yes, no problem, Jeremy.
00:56:10.940 | You can type dss.train[13].
00:56:14.820 | And there it is, right?
00:56:17.020 | So yes, that's just a convenient thing, basically.
00:56:23.900 | So I guess a question around that
00:56:26.100 | is that if we did not have type transforms,
00:56:29.340 | then it would just be one more step in the item transforms,
00:56:33.580 | right?
00:56:34.300 | Yeah, I think so.
00:56:36.020 | So it is just separating those things out.
00:56:38.180 | Yeah, your data sets would always just return a single thing,
00:56:42.300 | or maybe two things, the get x and get y results.
00:56:46.540 | And then your data loader would have to do more work, basically.
00:56:50.860 | Exactly.
00:56:51.380 | Yeah, yeah.
00:56:52.260 | Which would be a perfectly OK way to do things as far as I
00:56:56.220 | can tell that I think would be a little harder to debug
00:56:59.740 | and work with and keep things decoupled.
00:57:05.140 | Yeah, I think that's a reasonable comment.
00:57:07.860 | Is it like anything you want to do up front that
00:57:11.300 | is like kind of uniform across your whole data set,
00:57:13.540 | maybe put it in the type transform
00:57:15.460 | that you don't need to change at training time?
00:57:18.860 | Basically, like anything that you
00:57:23.660 | want to be able to index into it and look at that thing, really.
00:57:28.340 | If you're not sure where to put it,
00:57:33.660 | I'd say just chuck it somewhere and don't worry about it.
00:57:35.860 | You know, we kind of put--
00:57:41.300 | the rule is that you need something that
00:57:47.180 | can be turned into a tensor.
00:57:48.900 | Like that's the way fast AI does it.
00:57:52.100 | So you need to make sure that your type transform, when
00:57:55.860 | you're working with fast AI, returns something
00:57:57.780 | that is a tensor or going to be turned into a tensor.
00:58:02.100 | Which PILImage can be, for example?
00:58:08.820 | I think I understand.
00:58:09.700 | It's kind of like you want to make sure
00:58:12.500 | it's a convenient thing that you understand to look at.
00:58:16.300 | Yeah.
00:58:18.900 | Yeah.
00:58:20.700 | OK, so then like--
00:58:22.660 | OK, so I can remove all that.
00:58:26.500 | This is the definition of image block.
00:58:29.100 | So let's replace it with the word image block.
00:58:33.420 | And then let's change-- OK, let me think.
00:58:47.420 | OK, so let's put a dot name here.
00:59:01.340 | Here's kind of something we want as our label, right?
00:59:04.180 | That's one of our labels.
00:59:05.940 | And then the other label we wanted
00:59:09.900 | was the function called get variety, right?
00:59:21.740 | Now we can't-- this breaks our rule.
00:59:24.260 | This can't be turned into a tensor because it's a string.
00:59:29.540 | So what do we do about that?
00:59:34.060 | You might remember from a previous lesson
00:59:35.780 | we learned that what we do is we replace strings with integers
00:59:39.580 | where that integer is a lookup into a vocabulary.
00:59:42.660 | It's a list of all of the possible options.
00:59:45.500 | So if we change this to category block,
00:59:55.860 | that is exactly what category block will do, right?
01:00:04.380 | And so category block, it's got a type transform
01:00:15.580 | categorize, which I'm not going to go into because it's not
01:00:19.380 | particularly exciting.
01:00:21.620 | But if you look up the documentation for categories,
01:00:25.180 | you can see how it does that.
01:00:27.140 | So basically, internally now, you'll
01:00:29.500 | find that the vocab is stored for these things.
01:00:35.380 | So if we look at this at a high level, get items, get--
01:00:39.900 | By the way, just a moment, here's the vocab, right?
01:00:42.140 | It's got two things.
01:00:43.420 | It's got the vocab for the diseases
01:00:45.260 | and the vocab for the varieties.
01:00:47.260 | Yeah, sorry, Radek.
01:00:48.540 | No worries.
01:00:49.540 | So get items gets us the rows or the examples
01:00:53.940 | or whatever allows us to--
01:00:55.660 | and then the core for a single example.
01:00:59.140 | And then from get items, we use get y or get x
01:01:03.820 | to transform it somehow so that we can pass it
01:01:07.220 | into those blocks.
01:01:08.340 | Correct, specifically pass it into the type
01:01:10.820 | transforms of those blocks.
01:01:12.500 | Into type transforms.
01:01:13.700 | And type transforms are things that can get triggered, right?
01:01:19.700 | So they're doing a little bit something similar to get y,
01:01:23.460 | but are building on what get y does.
01:01:25.780 | Correct, exactly.
01:01:26.980 | Because these are very general things, right?
01:01:31.340 | And so I didn't want you guys to have
01:01:32.980 | to write your own every time.
01:01:34.940 | So these basically say, this says,
01:01:38.860 | I will work if you can pass me a path to an image.
01:01:42.540 | And this says, I will work if you pass me a string.
01:01:45.620 | And so get x and get y then are responsible for ensuring
01:01:48.980 | that we pass this one a path and pass that one a string.
01:01:53.500 | And get image files is already returning paths,
01:01:56.060 | so we don't need to get x for this guy.
01:02:00.380 | But it's not returning strings, so we
01:02:02.340 | do need to get y for these guys.
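Putting those pieces together, a sketch of the multi-task DataBlock being built here, assuming the disease label is the parent folder's name and a hypothetical get_variety function that maps an image path to its variety string:

```python
from fastai.vision.all import *

# Hypothetical lookup from an image's file name to its variety string,
# e.g. via a dataframe indexed by image id (as earlier in the session).
def get_variety(p): return df.loc[p.name, 'variety']

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock, CategoryBlock),  # one input, two targets
    n_inp=1,                       # the first block is the independent variable
    get_items=get_image_files,     # already returns paths, so no get_x needed
    get_y=[parent_label, get_variety])  # one get_y per target block
```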
01:02:04.060 | OK, so I'm going to finish--
01:02:11.060 | I'm going to run it slightly over time.
01:02:14.900 | But let's have a look at-- so this is exactly the same.
01:02:26.060 | OK, so this is exactly the same as what we just had, right?
01:02:30.620 | And so then we can also then add the two things, which
01:02:33.020 | is the item transforms and the batch transforms.
01:02:35.020 | Some other time, we will talk about how it is that--
01:02:38.100 | how come this is not being applied to the categories?
01:02:40.820 | It's only being applied to the images.
01:02:43.380 | For those of you interested in skipping ahead,
01:02:45.500 | the secret is using fastcore's type dispatch functionality.
01:02:49.220 | Anyway, so that's why we're getting
01:02:55.460 | these three different things-- the image, y1, and y2.
01:02:58.140 | So Jeremy, if we had an image, if we had an image block
01:03:01.580 | in our--
01:03:04.900 | for our y's, for our targets, then item transform
01:03:08.860 | would get applied.
01:03:10.820 | Correct.
01:03:11.940 | Oh, wow.
01:03:13.260 | And there's a-- have a look at the Siamese tutorial
01:03:17.220 | on the fast.ai docs, because that has two images.
01:03:22.380 | Yeah.
01:03:22.900 | And if you think about it, any time we do segmentation,
01:03:25.780 | that's exactly what's happening, right?
01:03:27.340 | The data augmentation is happening to x and y.
01:03:29.580 | And this is really unusual.
01:03:32.300 | I don't know of any other libraries
01:03:33.980 | that have this kind of totally transparent ability
01:03:36.700 | to do bounding boxes, segmentation, point clouds,
01:03:42.380 | whatever as dependent variables, and have it all
01:03:45.220 | happen in unison very, very automatically.
01:03:49.420 | Well, at least there didn't use to be.
01:03:51.060 | Maybe there is now.
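Extending the sketch above with the two kinds of transforms just mentioned (the sizes are illustrative); type dispatch is what lets these image augmentations skip the two category targets:

```python
# item_tfms run per item on the CPU; batch_tfms run per batch on the GPU.
dblock = dblock.new(
    item_tfms=Resize(192, method='squish'),
    batch_tfms=aug_transforms(size=128, min_scale=0.75))
```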
01:03:51.940 | OK, so now I can create data loaders from that.
01:04:04.940 | And thanks to the magic of fast.ai, this is so cool.
01:04:12.100 | Check this out.
01:04:12.700 | It's actually auto labeling it with each of our categories.
01:04:16.620 | So thanks to stuff we'll discuss later, basically this stuff
01:04:20.860 | called type dispatch, fast.ai does a lot of things
01:04:26.340 | automatically, even though I don't think I've ever explicitly
01:04:29.140 | coded this to work.
01:04:31.060 | It just does because of how the API is designed.
01:04:35.780 | So we now have something which can
01:04:37.340 | create batches of pictures and two
01:04:40.700 | different dependent variables, each one of which
01:04:43.540 | has a category.
01:04:44.020 | And so what we will get to next time
01:04:58.900 | is it actually turns out--
01:05:02.780 | well, I'll briefly mention it now, actually.
01:05:05.580 | All that stuff I did last time about messing around
01:05:07.660 | with multiple different heads and all that
01:05:09.940 | is actually totally unnecessary.
01:05:12.180 | All we need to do when we create our vision learner
01:05:15.140 | is tell it we don't want 10 outputs--
01:05:17.860 | we want 20 outputs.
01:05:20.020 | So normally it automatically figures out
01:05:21.860 | how many outputs you want by how many levels are
01:05:24.940 | in your categorical dependent variable.
01:05:27.380 | But in this case, we've got something custom, right,
01:05:29.420 | which is we've got a tuple of outputs.
01:05:32.060 | So we have to tell it.
01:05:32.940 | We want 20 outputs.
01:05:34.380 | That's going to make the final matrix that it multiplies by
01:05:38.540 | have 20 outputs.
01:05:41.060 | Now, then you basically need to tell it what loss function
01:05:46.740 | to use.
01:05:48.540 | And so if you look it up, it turns out
01:05:50.540 | we used to use a loss function for this called
01:05:52.460 | CrossEntropyLossFlat.
01:05:54.300 | So we're going to call that exact loss function
01:05:56.620 | on the first 10 items.
01:06:00.740 | And we're going to compare that to the disease probabilities.
01:06:05.860 | And then the second 10, we're going
01:06:08.500 | to compare to the variety probabilities.
01:06:11.660 | And we'll do the same thing for the error rate: one that
01:06:14.940 | just looks at the first 10, the error rate for disease,
01:06:18.340 | and the same thing for variety,
01:06:19.980 | looking at the second 10.
01:06:22.460 | And so basically then, if you train that,
01:06:25.700 | it's going to print out the disease and the variety error.
01:06:29.300 | And the loss function will be the loss function
01:06:31.500 | on both of the two halves.
01:06:35.500 | And interestingly, for this single model,
01:06:42.100 | this 2.3% disease error is the best I'd ever
01:06:44.820 | got for this architecture.
01:06:47.580 | So at least for this single model case,
01:06:50.100 | this was better than training something that
01:06:56.580 | only predicts disease.
01:06:57.940 | Anyway, we can talk about that more later,
01:06:59.620 | because we've kind of run out of time.
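A sketch of the training setup just described, assuming the `dls` above with 10 diseases and 10 varieties; the helper names are illustrative, but the slicing is exactly the idea being explained:

```python
from fastai.vision.all import *
import torch.nn.functional as F

# The model emits 20 numbers per image: the first 10 are disease logits,
# the last 10 are variety logits.
def disease_loss(preds, disease, variety): return F.cross_entropy(preds[:, :10], disease)
def variety_loss(preds, disease, variety): return F.cross_entropy(preds[:, 10:], variety)
def combine_loss(preds, disease, variety):
    return disease_loss(preds, disease, variety) + variety_loss(preds, disease, variety)

# One error rate per target, each looking at its own half of the outputs.
def disease_err(preds, disease, variety): return error_rate(preds[:, :10], disease)
def variety_err(preds, disease, variety): return error_rate(preds[:, 10:], variety)

# n_out=20 overrides fastai's usual "one output per vocab item" inference.
learn = vision_learner(dls, resnet18, n_out=20, loss_func=combine_loss,
                       metrics=[disease_err, variety_err])
learn.fine_tune(5)
```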
01:07:01.460 | I have a quick question.
01:07:02.580 | Yeah.
01:07:03.580 | The last layer, it's a flat 20 output layer.
01:07:09.540 | Does this mean at inference time that we
01:07:12.780 | would have to do the softmax plus--
01:07:16.460 | what would it be?
01:07:17.340 | I can't remember.
01:07:18.060 | No, firstly, fastai handles all that for you automatically.
01:07:20.380 | All right.
01:07:21.180 | Yeah.
01:07:21.660 | Great.
01:07:28.260 | All right.
01:07:28.740 | And by the way, in the inference functions,
01:07:31.620 | you'll see there are always options
01:07:34.900 | as to whether to decode it and whether to put
01:07:37.140 | the final activation function on it and stuff like that.
01:07:39.740 | So actually, now I think about it.
01:07:42.740 | In this case, because we used a custom loss function,
01:07:46.180 | I think that would have broken its ability
01:07:48.700 | to do it automatically.
01:07:49.780 | So yeah, OK, I'm going to say actually,
01:07:51.540 | you would need to add a softmax if you wanted to.
01:07:55.820 | Although you actually don't need to,
01:07:59.620 | because at least for the Kaggle competition,
01:08:03.260 | I just needed which disease had the highest prediction.
01:08:09.260 | And whether it's softmax or not, it's
01:08:11.420 | going to be the same because that's a monotonic function.
01:08:18.460 | So it depends whether you actually
01:08:20.380 | need probabilities or not.
01:08:22.860 | In my case, I didn't have to do this.
01:08:26.540 | Yeah, but you would only look at the first 10
01:08:29.300 | or the second, I guess, the first ones.
01:08:31.620 | Yeah, so you just--
01:08:33.180 | You can see it.
01:08:33.780 | Because otherwise, yeah.
01:08:36.740 | So I was using TTA to do test time augmentation.
01:08:41.540 | And I stacked up and I did an ensemble of TTA.
01:08:44.460 | And then I just did an argmax on the first 10.
01:08:49.380 | Yeah, all right.
01:08:51.420 | All right.
01:08:52.500 | Just hold up.
01:08:54.100 | OK, sure.
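A sketch of the inference approach being described, assuming `learn` from above and a hypothetical `tst_path` folder of test images; because softmax is monotonic, taking the argmax of the raw logits gives the same answer as taking it after softmax:

```python
# Test-time augmentation: average predictions over augmented variants.
tst_dl = learn.dls.test_dl(get_image_files(tst_path))
preds, _ = learn.tta(dl=tst_dl)

# Only the first 10 columns (the disease logits) matter here.
disease_idx = preds[:, :10].argmax(dim=1)
```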
01:08:55.380 | In the architecture, you selected ResNet-18 at size 128.
01:09:03.300 | Is there any programmatic way to find out
01:09:06.220 | the size or the input size of the models
01:09:10.100 | that you are trying to use?
01:09:12.260 | These models handle any input size.
01:09:15.660 | All right.
01:09:17.500 | Yeah.
01:09:17.980 | All right.
01:09:18.620 | All right.
01:09:20.180 | All the ResNets and all the ConfNecks
01:09:22.220 | handle any input size.
01:09:24.580 | All right.
01:09:25.540 | Thank you.
01:09:26.060 | It's only the transformer models that need a fixed input size.
01:09:28.740 | That also tripped me up in the beginning.
01:09:30.620 | But there's a lot of interesting stuff there
01:09:32.620 | that might take a whole lecture to understand.
01:09:36.180 | All right, again, all that stuff, yeah.
01:09:38.500 | Thanks, gang.
01:09:39.460 | See you.
01:09:40.340 | Thank you.
01:09:40.860 | Thank you.