
Live coding 13


Chapters

0:00 Introduction and Questions
5:15 MultiTask Classification Notebook
7:40 Good fastai tutorials
8:30 DataBlock API
12:35 How do ImageBlock and get_image_files work
15:15 How aug_transforms works
17:30 Converting ImageDataLoaders to DataBlock
22:08 In PyTorch DataLoaders, what happens at the last batch?
23:23 Step 2: Making the DataBlock spit out three things
27:30 Modifying get_y to return two targets
32:00 Looking into Dataset objects
33:50 Can we have multiple get_items?
35:20 Hacky notebook using data frames to create a DataBlock
39:40 How do TransformBlock and ImageBlock work
49:30 Looking at the source code of TransformBlock and DataBlock
54:10 Dataset and DataLoaders discussion
58:30 Defining the DataBlock for the multi-task classification notebook
65:05 Sneak peek into how the multi-task model is trained

Transcript

Just continuing with this. OK, great. Yeah. Any other new faces today? Ali, have you been here before? I don't remember seeing you. Maybe you've been quiet. I've been here. I don't think I've had the camera on. OK, yeah. Nice to meet you. And Nick is back from his tour of tropical islands.

Hi, Nick. Yes, yep. Not so far away, though. Still pretty close. Yeah. Well, when you're in Queensland, tropical islands are easy. By default. Hi, Marie. Nice to see you. Where are you joining us from? Hello, can you hear me? Hi, absolutely. I'm Joni from South Bank of Brisbane. Uh-huh.

Yeah, I'm at the Children's Health Queensland Hospital building. Ah, yes. OK, great. Anybody got anything they wanted to talk about or ask about before we dive in? Because I've got some good news, which is I've made things much easier in the horrible direction we went last time. All right, sounds like I should dive in then.

I know there was, like, as I rather expected, and quite fair enough, some concern on the forum about how complicated and weird this all is, which is partly because we're kind of, like, jumping into stuff we'd normally do in part two. And so we haven't quite got all the necessary background here.

And so, yeah, don't worry if you're feeling a little at sea. But it's a good way to, like, I don't know, start to see, like, some stuff you could look into after the course is finished. All right, so let me just-- Jeremy. Yes. Sorry, you just mentioned part two.

Are you planning a part two this year? Planning would be too strong a word. But as much as I ever plan anything-- Thinking about? Yes, I would absolutely like to do a part two this year. Awesome. And the fact that you're asking that question means that you are not up to date on our Discord.

So you should definitely join our Discord channel because we've actually been talking about doing a conference, or an unconference, to go with part two at the end of the year in Queensland. And a lot of folks from places that are not in Australia are saying they would be up for coming here for that.

Partly a kind of a social thing, and also trying to do it all in a very COVID-safe way of outdoor, and masked, and people tested ahead of time, and stuff. Yeah, so that's-- yeah, so nothing's planned. We don't have dates, or a syllabus, or anything like that, but we absolutely hope to have something awesome maybe towards the end of the year where we can get to know each other a bit better, and get to know fast AI and deep learning a bit better.

Jeremy, can I ask you, are you going to continue this work for next week because I know that the class is going to be shown next Tuesday? Yeah, I think so. The fact that I'm doing it this week is interesting because, yeah, it has meant I've got less time to actually work on the course, but I also feel like the stuff we're doing, perhaps, is stuff I can use in the next lesson, depending on where we get to.

So yeah, so I think we'll do it next week. We'll see how things go. If I get really behind, then maybe not. But then I certainly plan to continue doing them after the last class until-- I don't know. We all know everything, I guess, and then we'll stop. At which point, there'll be more things to know, so yeah.

OK. We don't have to stop, necessarily. There's just so much to learn. The only problem is that it's obviously a burden on your time, but it's-- I enjoy it. I enjoy it. My issue is, what about when we get to the point where there's nothing left to learn, Radek?

What do we do then? Well, is there such a point? Sure, there must-- But then we just start doing it all in a different language. We start doing it all in R or Julia. Yeah, correct. C++, that would keep us busy. I think this is my fifth year of doing fast.ai courses, and I'm still trying to complete part one.

You're too modest, Radek. That's true. All right. Let me see. So multitask. All right. I am just so happy about how this worked out, to be honest. Although, spoiler alert, it didn't turn out to help our score. The score was about the same. But I was so happy at how the whole process turned out.

But I kind of want to show you how I got there, as well as where we ended up. And yeah, as soon as I kind of turned off Zoom last time, and I went for a walk, and then as soon as I did that, I was like, oh, no, of course I know how we should do this.

And really, so there's quite a few-- we can make this much, much simpler. So let me explain what we're going to try to do. We are going to try to predict two things, the disease and the variety, for each image. And the first thing will be to create a pair of data loaders that look like this.

For each image, they will have two things connected to them: the disease and the variety of rice. So this is going to be step one. So let me kind of show you how I got to that step. So the step one that I took was to, first of all, try to replicate what we've already had before.

So, paddy disease. But before, I used ImageDataLoaders, which is the highest-level, least flexible approach. It can do all the data processing in a single line of code, but only if we want to do something really standard. And trying to predict two things is not standard enough for it.

So we now need to go one layer down. And there's a lot of really good tutorials on docs.fast.ai. And because it's been ages since I've done any of this, I'd forgotten entirely how fast.ai worked, so I used them very heavily to remind myself of what's going on. But for example, there is a data block tutorial.

This pets tutorial is great. It goes through all the layers of different ways of doing things with fast.ai preprocessing. This Siamese tutorial is another really good one. So these are some of the things I looked at. And the other thing that I looked at was the actual API docs.

So if I click here on data block, this is actually probably what I found the most useful in the end. There's lots of great examples in the documentation. So yeah, as I kind of like-- you know how it is, you come back to something a couple of years after you built it, and you're now kind of the customer of your documentation.

And so my experience as a customer of my documentation was I was really delighted by it. So I can definitely suggest checking all that out. So what you can do-- before, we were using ImageDataLoaders.from_folder. So if we just do the double question mark trick, we can see the source code for it.

And it's the normal size for fast.ai things. It's very small. And you can see that actually all the work's being done by DataBlock. So DataBlock is still a high-level API, but not as high-level. It's actually very flexible. And so we're going to-- step one that I did was to replicate exactly what we had before, but using DataBlock.

And for this, there's actually so many good examples in the tutorials and in the book. You've seen them all before. We don't need to talk about it too much. We can basically say, OK, for the data block, the input will be an image. The output will be a category.

This is just to do disease prediction. The labelling function will be the parent folder: parent_label. Do a random split. The item and batch transforms, we can copy and paste from what we had before. And that creates a DataBlock. So the DataLoaders is then DataBlock.dataloaders.

And you then have to pass in a source. So the source is basically anything that you can iterate through or index into to grab the things that will be passed to these blocks and to this function. So for this, it's going to be a path. And then we also need get_items.

And so when we get a path, we're going to pass that path into the get_image_files function, because that's the thing that returns a list of all of the images in a path. And let's see if that works.
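Roughly, the DataBlock being described looks like this (a sketch: trn_path and the transform arguments are assumptions standing in for the values copied from before):

```python
from fastai.vision.all import *

# Sketch of the DataBlock described above. trn_path and the transform
# arguments are assumptions standing in for "what we had before".
trn_path = Path('train_images')
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),  # input: an image; output: one category
    get_items=get_image_files,           # source path -> list of image paths
    get_y=parent_label,                  # the label is the parent folder's name
    splitter=RandomSplitter(0.2, seed=42),
    item_tfms=Resize(480, method='squish'),
    batch_tfms=aug_transforms(size=128, min_scale=0.75))
dls = dblock.dataloaders(trn_path)
```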

How do you know that-- OK, so the blocks: you have an ImageBlock and a CategoryBlock. How do you know that get_image_files is going to be able to feed both of those blocks? So I guess the short answer would be to read the documentation about those blocks to see what they take and what they do, or any of the tutorials that use them. As you can see, they're used all over the place, right?

So you can start with this tutorial, or this tutorial, or this tutorial. So any of those would show you what it does. Yeah, the actual-- sorry, this is not good documentation. I never really bothered to document this properly, because of how little it does. We should fix that, I guess, although there are so many tutorials.

I mean, as you can see, I guess the reason I never really wrote docs, right, is that it's literally a single line of code. So that's like-- yeah, so maybe looking at the code is actually interesting in this case. So an ImageBlock is something which calls cls.create, where cls is PILImage.

So it's going to call PILImage.create. So to find out what's actually going to be called by it, you can go PILImage.create, and you can see it's going to get passed a file name, which can be a path, or a string, or various other things. So for get_image_files, then, obviously, you can either run it to see what it comes out with-- let's do that, so we can just run get_image_files, passing in the thing it's going to be given, which is the path.

And so as you can see, it's a bunch of paths. And so we could pass one of those-- copy that, and it's going to be passed into this function. So we've now just replicated exactly what's happening in the code. Yeah, but for this, I generally just look at the tutorials, which tell you what to do.
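That manual trace, in code, is roughly:

```python
# get_image_files returns a list of paths; ImageBlock's type transform
# passes each one to PILImage.create -- the same call made here by hand.
fnames = get_image_files(trn_path)
img = PILImage.create(fnames[0])
```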

Could you have two different get_items that feed different blocks? We're going to come to that. OK, so park that. So yeah, so-- There was also a batch transform, like it gets transformed later after reading, right? Yeah, here we've got the batch transform, yeah. But in the image block, because right now, we have a PIL image, or something, and it needs to become a tensor, right?

Yeah, that's right. It gets changed from an int tensor to a float tensor later on. That's right. Yeah, that's a fairly subtle thing, but that's right. We stick that in something-- image equals-- and we look at, like, np.array(image). It's actually stored as bytes, or uint8, as they call it, in PyTorch.
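A sketch of that check:

```python
import numpy as np

# The image starts out as bytes: uint8, not floats.
arr = np.array(img)
print(arr.dtype)  # uint8
# IntToFloatTensor, the batch transform that ImageBlock adds, later
# converts the collated uint8 batch to float32, dividing by 255.
```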

So yes, this is going to add a batch transform that's going to turn that into a float tensor, which-- we can see what that's going to look like. We could run it here, I expect: IntToFloatTensor. The transformation that we're applying is 224, which is like a square image, correct?

A 224 by 224? This one here? Yes. Yeah, so this is doing data augmentation of lots of different kinds. So let's copy doc paste. And if we show-- for data augmentation, looking at the docs is particularly important, because you can see examples of the augmentations it does. So it tells you a list of all the augmentations and how you can change them.

And here's some examples of what they look like. And-- These augmentations would happen after the int-to-float tensor, right? Yes, that's correct. And some data augmentations operate on the entire batch, and some operate on a single example. Yeah, that's right. So the ones in batch transforms operate on the whole batch, and the ones in item transforms operate on a single item.

And so batch transforms operate-- because they operate on a batch, before you get there, everything has to be the same shape. So it can be turned into a batch. So this resizes everything to the same shape. And then this does these various different types of data augmentation. And one of the key pieces of data augmentation it does is to randomly zoom into a subset of the image, as you can see in these various examples here.
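In short, the split being described is something like this (the sizes are just the ones mentioned):

```python
# item_tfms run one image at a time, so everything ends up the same
# shape and can be collated into a batch; batch_tfms then run on the
# whole batch (the data augmentation, e.g. random zooms into the image).
item_tfms = Resize(480, method='squish')
batch_tfms = aug_transforms(size=224, min_scale=0.75)
```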

And this data block API, can you also use it with data frames, where you would be reading your images from a data frame? We are going to do that. We're going to do that in a moment, yes. OK, cool. Absolutely. So yeah, I'm kind of skipping over quite a bit of this, because it's super well covered in the tutorials.

So I don't want to say stuff that you can very easily read. Whereas the stuff I'm about to show you isn't as well covered in the tutorials, and it's kind of new. But yeah, feel free to keep asking questions about anything you see. So basically, yeah, so all we've done is this is just the same thing that we have in lesson one.

And it's doing exactly the same thing as my ImageDataLoaders.from_folder, but just turned into a DataBlock. And so this is what I did, just to show you my process. But step one was to get this working, and then I passed that into a learner. And so let's go copy, and I want this all to run as fast as possible.

So I would use the fastest-- Do you-- when you make this data loader thing, do you try to make sure that the shape that it's outputting is what you need for your model, or that's later? Well, I generally use models which don't care what size they get. So yeah, that's one of my tricks.

So ResNet18 is happy with any size. So actually, for my testing, I'm going to bring this back down to 128, so it's super fast. And so I just want to get the maximum iteration speed here. And so now, I can call learn.fit_one_cycle. And let's do one epoch. OK, so this is going to run in under 20 seconds, which is kind of what you want, right?
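A sketch of that quick-iteration setup:

```python
# Small architecture, small images: the goal is an end-to-end test
# that runs in about 20 seconds.
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fit_one_cycle(1)
```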

You want something that you can test in under about 20 seconds, so that way, you can just quickly try things and make sure that, end to end, it's working. Yep, the error rate is down to 30%. So that's a good sign. I guess one related question is, OK, I understand the input size, but what about the output size of your data block?

You know that this is what you need for that model. Let's not say the model doesn't care. The model's happy with any size. I mean, the targets, or whatever. You're talking about the labels? Yeah, the labels. I mean, labels don't have sizes. The labels are strings. Or just the shape of that.

Like, hey, is it-- because maybe different models are kind of trying to predict different types of stuff, potentially? I don't know. Like, some might have the shape of the target. A vision learner is-- I suspect the thing you're kind of asking is the thing that we're going to be covering in a moment.

So maybe put that on hold, and then tell me if it doesn't make sense. OK. OK. Before-- I have a question. On the data block, you randomly select the amount of records or the amount of the batch size that you're going to process. I don't randomly pick the batch size, no.

The batch size is actually selected in the .dataloaders call, and it defaults to 64. 64. So what is the guarantee that every single one of the images in this particular case will be selected, or there's no way to know? Is there any way to know that every single one will be-- Yes.

I mean, well, yes, except that we're randomly selecting 20% to be a validation set. But every single one will go through the learner. Of the 80% that aren't in the validation set, every one will go through our learner, because we randomly shuffle them and then we iterate through the whole lot. So in a single epoch, the model is guaranteed to see every training example exactly once.

Yeah. Yeah, and that's what this one means. That's what one epoch means, is look at everything once. And so if we put two there, it would look at everything twice. But each time it randomly shuffles it, so it does it in a different random order. I have a quick question.

I guess this is PyTorch data loader stuff, but what actually happens for the last batch? The last batch, it depends. And this is actually not the PyTorch data loader, it's actually fast.ai's data loader. So we have our own data loader, although in the next version, we're likely to replace it with the fast.ai one.

So it depends what drop_last is. If drop_last is true, then it drops the last batch. And if it's false, then it includes the last batch. And the reason that's interesting is that the last batch may not be of size 64. Yeah. For the validation set, it always keeps the last batch.
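A sketch of those defaults, as described (assuming a standard fastai DataLoaders):

```python
# The training loader shuffles and drops a final partial batch by
# default; the validation loader keeps everything, in order.
print(dls.train.shuffle, dls.train.drop_last)  # True True
print(dls.valid.shuffle, dls.valid.drop_last)  # False False
```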

It's super important to shuffle the training set. fastai does it for you, but if you mess around with the data loaders or do something yourself and you don't shuffle the training set, you might get very poor training performance. When we used to use Keras, I used to mess all this stuff up all the time.

Yeah, trying to get all those details, right? It's really annoying. Just to make sure on something you said, you said in the next iteration, you're going to replace it with the PyTorch data loaders? Yeah, probably. OK. You said fast.ai, so I got confused. Oh, did I? That is confusing.

Thanks. OK, so that was my step one. Is to just get it working exactly like before. And then I ran it in the background on the same architecture for the same epochs to make sure I got about the same error rate, and I did. So then I was happy that, OK, I'm matching what we had before.

So then step two was to try to make it so that the data block spits out three things, which would be one image and two categories: the category of disease and the category of rice variety. So to get it to spit out an image and two categories, hopefully you wouldn't be surprised to hear that we just do that.

We say we want three blocks: an image, and two categories. Now, this variety-- we need some way of getting that given an image ID. And actually, the way I did it was a bit ugly. And since then, I thought of a better way of doing it, which is what I think we should do is we should create a dict that maps from image ID to variety.

And then our function will just be to look that up, right? So let's call this image_to_variety equals-- OK, and it's going to be a dict comprehension. So we're going to loop through the rows in df.iteritems. Now, I always forget what the differences are. Column name comma series pairs, returning a tuple with the column name.

OK, that's not what I want. Yeah, like iterrows, yeah. Yeah, iterrows: index comma series. OK, cool. I think this itertuples is the fastest one. But this is not very big, so let's keep it simple. OK, so this is going to iterate over rows and return index comma series.

OK, so we don't really care about the index. Another thing we could do is make the image ID the index, and then you could actually jump straight into it. But I think I'd rather not use pandas features. I'd rather use more pure Python-y things, because I think that'll make the explanation a little clearer.

So we're going to loop through. It's going to give us the index and the row. And so what we want is the key will be the row's image ID, and the value will be the row's variety. OK, that looks good. So then there's a couple of ways we could turn this into a function.
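As a sketch (the CSV name and its image_id/variety columns are assumptions from the discussion):

```python
import pandas as pd

# Build a dict mapping each image's file name to its rice variety.
df = pd.read_csv('train.csv')
image_to_variety = {row['image_id']: row['variety'] for _, row in df.iterrows()}
```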

And I'm just going to show you a little neat trick, which is when you go like the-- let's pick out something. Let's say we're going to grab this one. When you go like this, behind the scenes, that square bracket thing is actually calling a special magic method in Python called __getitem__, which is a function.

This is the cool thing about Python. It's so dynamic and flexible; all the syntax sugar is, behind the scenes, just calling functions, basically. That's exactly the same thing. And so that means that this function here, image_to_variety.__getitem__, is a function that converts a file name into a variety.
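A sketch of the trick (the file name here is made up):

```python
# Square brackets are sugar for __getitem__, so the bound method itself
# is a function from file name to variety ('100330.jpg' is a made-up key).
image_to_variety['100330.jpg']               # normal indexing
image_to_variety.__getitem__('100330.jpg')   # the exact same call
# ...which is why we can pass it wherever a labelling function is expected.
```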

So here's the cool thing. For get_y, you can pass it an array, and it's going to call each of those functions, which I think is rather nice. So another thing I find helpful-- OK, cool. So when I call that, it complains. And it says, oh, get_y contains two functions, but it should contain one, one for each target.

It thinks that there's only one target. Why is that? Well, if you think about it, we've said there's three blocks, but we haven't told it how many of those blocks are for the independent variable and how many are for the dependent variable. And so we have to tell it.

And the way we do that is to say the number of inputs-- n_inp-- equals 1. We have one input, and then the rest will be outputs. So when we do that, it's now happy, OK? And personally, before I jump to data loaders, I first create data sets just to make sure they work.

So data sets are things where you just grab one thing at a time. There's no mini-batches to worry about. So data sets are easier to debug than data loaders. So you can create data sets using exactly the same approach, OK? And so-- all right, so we've got an error.

OK, so here's the problem. It tried to look up our function. And in fact, it's not indexed. It's not passing in the string of the name. It's actually passing in the path. And so that's why we've got a key error. This path does not exist as a key in this dictionary, which is quite true, right?

It doesn't. So what I think we should do is fix this up so that we've got train images, bacterial leaf streak, blah, blah, blah, OK. The get files function, the output of that is being passed to the get items is being passed to get y. So get image files.

Yeah, so we haven't kind of gone into the details of exactly what's going on behind the scenes, Hamel. Let's do that in a moment. I kind of like the way you're wanting to jump into the nitty-gritty, but it's a little bit-- It's a problem that I have. I'm trying to do more top-down, right?

And you've got your bottom-up. We'll meet in the middle, OK? OK, fair enough, fair enough. By the way, your video is not on. That's fine. I just don't know if it's intentional. I always like to see people when they're, you know, seeable. Hello. OK, so we're not going to use this trick after all.

We're going to create a function called get_variety. Actually, no. Yeah, let's create a function called get_variety. And so it's going to get passed a path, OK? And so we're going to return image_to_variety. And we're going to return image_to_variety with the name of the file.

So the name of the file is the string. Wait, we need image_to_variety, the dunder thing? Oh, yeah, I'll just grab a bracket, actually, yes. OK. Oh, and then we need to use that. OK, so dss contains a .train data set. OK, and it also contains a .valid data set.

OK, and so we can look at the zeroth thing in the training data set, which is a single thing, right? So we can have a look now. There's the image and y1 and y2. And so then we can look at the image, for example. OK, so what's happened here is that get_image_files returned a list of paths.

The first one got passed to ImageBlock, which, as we saw earlier, means it got passed to PILImage.create. And here it is. And that path name also got passed to a function called parent_label. In fact, let's do it, right? So let's say file name equals get_image_files.

And then the thing that we passed in, the training path, and let's just get the zeroth one. OK, and so there it is, right? So it ended up calling PILImage.create with that file name. OK, it also called parent_label with that file name. parent_label, OK. And it also called get_variety with that file name.
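Putting those pieces together, the datasets-level version might look like this:

```python
# get_variety looks up the file's *name* in the dict, since
# get_image_files hands us full paths but the dict is keyed by file name.
def get_variety(p): return image_to_variety[p.name]

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock, CategoryBlock),
    n_inp=1,                            # one input; the other blocks are targets
    get_items=get_image_files,
    get_y=[parent_label, get_variety],  # one labelling function per target
    splitter=RandomSplitter(0.2, seed=42))
dss = dblock.datasets(trn_path)
dss.train[0]   # (PILImage, TensorCategory(disease), TensorCategory(variety))
```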

Jeremy, can we look at get_variety one more time? I'm just curious how you build the path. I didn't. I removed the path. I called .name. OK, OK. I see. Yeah, in my original version of this, I did it the other way around, of building back up the path, and then realized that that was kind of stupid.

Yeah, it's unique, so that works. One question. OK, this could be too low level, but just let me know. Can you have multiple get_items? Is this the right place to ask that? So it wouldn't make sense to have multiple get_items, right? Like, get_items returns a single thing per item, but it could be anything you like, right?

It could return a tuple, or a list, or an object, or whatever, right? Or a dict. And then get_y and get_x are the things responsible for pulling out the bit that you need to pass to your blocks. Now, we don't need a get_x, because ImageBlock just takes paths directly.

If I needed something a bit more-- like I wanted to put more things in get_image_files, like have it emit a tuple-- then would I have to make my own image block to ignore? No, not your own image block. You would write your own function, just like get_image_files, that returns the list of all of the objects you want, which have all the information you need.

OK, and then-- It almost never happens. I don't think that's ever happened to me, because nearly always there's a row of a database table, or a path, or something that has all the information you need to go out and get the stuff with your get_x's and get_y's. That's like the central piece of information for each row.

And based on this information, you can read in text, you can read in images, but specific to that one row. Actually, let me show you my hacky version, because this is the version that uses a data frame. So this is-- so the version that uses data frame-- let's see.

Is it right to think get_x's-- Yeah, and that's interesting. Let me just do this, and let me come to your question. OK, so in this data block, I started out with a data frame, like so, right? And so by passing it into DataBlock.dataloaders-- I passed in the data frame-- it's going to get each row.

And so then get_y becomes ColReader(1), which is just a function which-- I mean, let's look at it. That doesn't do much. Let's see what it does. So it's done as an object, because you can do things like add in a prefix path and a suffix path, and you can split it with a label delimiter and whatever.

But basically, all it's basically doing, OK-- and it checks what kind of thing you're passing in-- basically, all it does is it calls getattr to grab the column. And we-- ColReader, for context, is that like for reading data frames? Sorry? Is this ColReader function specifically for data frames? I mean, it can work with anything, basically, that you're-- so what it's doing here is it's saying, grab that column.

But it's really-- I've only really used it for data frames, but you could use it for anything. But yeah, so basically here, get_y is saying, OK, well, let's return the index 1 field and the index 2 field. What's up with get_x? So because now we're being passed-- so you can't pass a row of a database table to PILImage.create.

So get_x is this function, which basically is going, oh, it's going to be in the training path slash disease name slash image name. So that's-- and then there's a special case for the test set, because the test set things are not stored in subfolders according to label, because we don't know the label.

So it's just directly in the test path. So that's the-- as I said, this was more hacky. I don't-- This really helps. So get_x is kind of like get_y. You can have a list in there. It's like from-- Yeah, you can have-- yeah, it's totally flexible. And I mean, seriously, Hamel, we have so many examples of all of these patterns in the docs and the tutorials.
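A rough reconstruction of that hacky dataframe version (the column layout and paths are assumptions from the description, and the test-set special case is left out):

```python
# Each row of the dataframe is now the item, so get_x has to rebuild the
# path, and get_y pulls the label columns out with ColReader.
def get_x(row): return trn_path/row['label']/row['image_id']

dblock_df = DataBlock(
    blocks=(ImageBlock, CategoryBlock, CategoryBlock),
    n_inp=1,
    get_x=get_x,
    get_y=[ColReader(1), ColReader(2)],  # "the index 1 field and the index 2 field"
    splitter=RandomSplitter(0.2, seed=42))
dls_df = dblock_df.dataloaders(df)
```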

So like this exact pattern-- let's take a look at one, right? docs.fast.ai. So tutorials-- let's do the data block tutorial. Right here, look, multi-label. So here's one. And yeah, you can see here, this is even splitting based on columns in the database table. And here's the ColReader using a prefix. And here's a ColReader using a label delimiter. And here's the examples coming out.

And here's the examples coming out. Yeah, so there's-- yeah, lots of examples you can check out to see how to do all this. Yeah, so I think I'm at a point now where I actually do want to go into the weeds. So Hamel, you're now, after this, totally free to ask any super-weedy questions.

The most basic kind of block is called the TransformBlock. And the TransformBlock, basically, is going to store a bunch of things you pass in. It's going to store things called type transforms. It's going to store things called item transforms. It's going to store things called batch transforms.

And also, it always adds one thing, which is ToTensor, because PyTorch needs tensors. If you look at the ImageBlock, we saw that that's defined as a TransformBlock where the type transforms is this and the batch transforms is this. So now's a good time to talk about how this all works, what this does.

So if I pass in here a TransformBlock and don't pass any transforms, it won't do anything. So let's get rid of pretty much everything. Let's do that in a separate cell so it gets a little bit easier to read. OK. Here is the world's simplest data block. OK. So if we call that, as you can see, all it does is it takes the output of get_image_files, item 0, and turns it into a tuple containing one thing, which is the thing itself.

If we have two transform blocks, it returns a tuple with two things in it. And the reason it's returning tuples is because this is what we want. When we train, we have batches containing inputs and outputs, potentially multiple inputs and potentially multiple outputs. So that's why indexing into this gives you back a tuple.
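A sketch of that world's-simplest data block:

```python
# TransformBlocks with no transforms pass items straight through; the
# result is always a tuple, one element per block.
dblock_min = DataBlock(blocks=(TransformBlock, TransformBlock),
                       get_items=get_image_files)
dss_min = dblock_min.datasets(trn_path)
dss_min.train[0]   # (Path(...), Path(...)) -- the same path, twice
```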

My question: the blocks can either be a list or a tuple? I don't know, probably. Yes. OK. I have no idea. OK. OK. So then we can do stuff to the first thing in the tuple. So get_x equals-- say, let's do a lambda: o.name. Hello. Hey, what are you doing?

Oh. Something to do with lambda, right? Does name have to be call? No. Maybe it's notebook restart time. It's my notebook restart time. Oh, that's-- oh. I wonder if something happened to my GPU server. I mean, something has happened to my GPU server, clearly. Never happened before. Oh, it looks like it's back.

Oh, OK. It just recognized that it disappeared. That's wild. Oh, OK. I'm very-- oh, I don't know what just happened. I guess it doesn't really matter. What are you doing right now? I'm just looking at the log, see if anything just happened. Who knows? OK. All right. OK. So you see what happened here is we got the first thing from get_image_files, which was this, and get_x got its name.

So we could also do get_y equals lambda o: o.parent, say. OK. So-- It first went-- first, the thing went to the transform block, get_items. Yes. So whatever get_items got went to the transform blocks. And then it went to get_x and get_y. Well, a transform block doesn't do anything, right, at all, unless you pass the transforms.

So yeah. So it's basically-- but the number of them you have is the number of pipelines it's going to create. So if we created another one-- But generally, if you have an ImageBlock, it would do something. So the order is-- We're going to get to that, yeah. So here, look, we've now got--

--the order? We're not quite there yet, right? So let's get to that. And it's not quite the mental model you've got, I think. Now that I've got three transform blocks, I only have things to create two of them. So it's sad, right? And so we could put them here, for instance.

And that would be-- The last one is the y and the first two are the x? Correct. Unless we say the number of inputs equals 1, in which case now get_x is just going to have to return one thing. There's going to be one function. And get_y will be two things.

So you could even put it here instead, right? So you could say, oh, well, this is actually-- let's move it. We could put it here. Item transforms equals-- And so the transform block is stuff that is applied for that transform. How is that not working? That's slightly surprising to me.

Ah, it needs to be a type transform. OK, type transform. So it's now converted to the type it's meant to be. So Radek, you were asking about ImageBlock. I'm just curious how all the pieces interact. Let me show you. Let me show you. So let's do it manually.

So ImageBlock is just this, OK? So let's not use ImageBlock. Let's instead-- Why didn't the item transform work? Let's figure that out later. Yeah, why don't we figure out what's going on here, and then we'll debug it. OK, so now we've got three transform blocks, two of them which do nothing, and the first one of which is going to call something.create.

That was PILImage.create. So transform blocks don't-- if you look at the code of them, transform blocks don't do anything at all. They actually-- they only store things. There's no __call__. There's no forward. There's nothing. Transform blocks don't do anything. They just store stuff. The DataBlock is the thing that's then going to go through and say, OK, for each thing, call its type transforms, and then call ToTensor, and then call its item transforms, and then, at data loader time, call its batch transforms.

So does that help answer your question, Hamel? It's not that a transform block doesn't get called. It just stores the list of things that will get called at each of these times. The first thing that gets called is type transforms. Wait, is that right? Let me think. No, that's not correct.

The first thing that gets called is get_x and get_y, and then the result of that is passed into the type transforms. And so get_x and get_y-- so get_x would be responsible for making sure that you have a path that you can pass to PILImage.create. That's the order.
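So, written out by hand, ImageBlock is just a TransformBlock storing those two transforms, and the DataBlock runs them in the order just described:

```python
# Order of operations, per the discussion:
#   get_items -> get_x/get_y -> type_tfms -> ToTensor -> item_tfms
#   -> (at data loader time, on the whole batch) batch_tfms
manual_image_block = TransformBlock(
    type_tfms=PILImage.create,    # datasets time: path -> PILImage
    batch_tfms=IntToFloatTensor)  # data loader time: uint8 batch -> float batch
```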

So this whole path of what happens, in sequence-- that lives in a data block? That lives in DataBlock, exactly. Now, the DataBlock code is, frankly, hairy, and it could do with some simplifying and documenting and refactoring. It's not long. It's about 50 or 60 lines of code.

In fact, it's almost all here. But basically, when you call .datasets, really, all it's doing is it creates a Datasets object, passing in all of the type transforms to it. And the answer to your question, Hamel, why didn't the item transforms get done, is because item transforms actually get done by the data loader, not by the data sets.

So data sets only use the type transforms. And basically, the only reason there's quite a bit of code in here is we try to make sure that if two different things have the same type transforms, we merge them together in a sensible way. So there's some stuff to try to make sure this all just works.

I was going to assume the type transforms are separate from the item transforms because of some optimization you can do with the type transforms? Because the type transforms, they're happening earlier. They're happening before data loaders time. So data loaders are the things that are going to take tensors, or at least things that can be converted into tensors.

So yeah, so type transforms are the things that are going to create your data sets for you. And they're going to spit out things which need to be convertible into tensors. And then data loaders has item transforms, which are things like reshaping everything to the same size. And batch transforms, which are things like data augmentation.

But you can have an item transform run on the GPU or not on the GPU, right? It depends on the ordering. I don't think an item transform is generally going to run on the GPU, because it's not a batch yet. I mean, maybe it's theoretically possible, but that would be pretty weird, because you really need things to be in a batch for the GPU to be working effectively.

And everything in batch transforms will run on the GPU. Assuming that you're using a GPU, I mean, this is OK. This is some part of the code base we're not looking at today. But I can't remember. I think this might be a callback which sticks things on the GPU.

So it just depends on whether things are before or after that callback. Yeah, that's probably a bit of a distraction. So let's skip that bit for now. To kind of revise the difference between data set and data loader, is it best to revisit the PyTorch documentation and kind of-- Yeah, pretty much.

We have our own implementation of them. But our implementation of data loader is a superset of PyTorch's. And PyTorch's data set is literally an abstract class. It doesn't do anything at all. So a data set is something that you can index into. And it returns a single tuple of your independent and dependent variables.

That's what a data set is defined as by PyTorch. And therefore, that's what we do as well. A data loader, you can't index into it. The only thing you can do is iterate through it. You can grab the next one. And it gives you a mini-batch, which is a tensor.
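In miniature, using the single-label dls from earlier:

```python
# A Dataset: index in, get back one (x, y) tuple.
x, y = dls.train_ds[0]
# A DataLoader: you can only iterate it; each step yields a collated mini-batch.
xb, yb = next(iter(dls.train))
xb.shape   # e.g. torch.Size([64, 3, 128, 128])
```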

So that's the difference. But yeah, that's a PyTorch concept. I guess I'm trying to understand the type transform thing, why it has to be done in the data set before the data loader. Well, it doesn't have to be. But it's like we want data sets. Data sets are a very convenient thing to have to have something you can go into and grab items, numbered x, y, or z.

That's the basic foundation of the PyTorch data model: that there are things you can index into. The type transform aspect of it? Yeah, so you need something that converts the output of get_image_files into what you want in your data set. And that thing needs a name. And the name we gave it was type transforms.

OK, I think I understand. This is not the only way you could do this, right? But it's our way, and it's really nice, because we now have this thing where you can say, like, oh, Hamel, can you show me the 14th image and its label? And you can say, yes, no problem, Jeremy.

You can type dss.train[13]. And there it is, right? So yes, that's just a convenient thing, basically. So I guess a question around that is that if we did not have type transforms, then it would just be one more step in the item transforms, right? Yeah, I think so.

So it is just separating those things out. Yeah, your data sets would always just return a single thing, or maybe two things, the get_x and get_y results. And then your data loader would have to do more work, basically. Exactly. Yeah, yeah. Which would be a perfectly OK way to do things, as far as I can tell, though I think it would be a little harder to debug and work with and keep things decoupled.

Yeah, I think that's a reasonable comment. Is it like anything you want to do up front that is like kind of uniform across your whole data set, maybe put it in the type transform that you don't need to change at training time? Basically, like anything that you want to be able to index into it and look at that thing, really.

If you're not sure where to put it, I'd say just chuck it somewhere and don't worry about it. You know, we kind of put-- the rule is that you need something that can be turned into a tensor. Like that's the way fast AI does it. So you need to make sure that your type transform, when you're working with fast AI, returns something that is a tensor or going to be turned into a tensor.

Which PIL image can be, for example? OK. I think I understand. It's kind of like you want to make sure it's a convenient thing that you understand to look at. Yeah. OK. Yeah. OK, so then like-- OK, so I can remove all that. This is the definition of image block.

So let's replace it with the word image block. OK. And then let's change-- OK, let me think. OK, so let's put a dot name here. Here's kind of something we want as our label, right? That's one of our labels. And then the other label we wanted was the function called get variety, right?

Now we can't-- this breaks our rule. This can't be turned into a tensor because it's a string. So what do we do about that? You might remember from a previous lesson we learned that what we do is we replace strings with integers where that integer is a lookup into a vocabulary.

It's a list of all of the possible options. So if we change this to CategoryBlock, that is exactly what CategoryBlock will do, right? And so CategoryBlock, it's got a type transform, Categorize, which I'm not going to go into because it's not particularly exciting. But if you look up the documentation for Categorize, you can see how it does that.

So basically, internally now, you'll find that the vocab is stored for these things. So if we look at this at a high level: get_items, get-- By the way, just a moment, here's the vocab, right? It's got two things. It's got the vocab for the diseases and the vocab for the varieties.
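For instance:

```python
# Two CategoryBlocks means two vocabs, one per target; each TensorCategory
# is just an integer index into its own vocab.
dss.vocab   # [[...disease labels...], [...variety labels...]]
```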

Yeah, sorry, Radek. No worries. So get_items gets us the rows or the examples or whatever-- the core thing for a single example. And then from get_items, we use get_y or get_x to transform it somehow so that we can pass it into those blocks.

Correct-- specifically, pass it into the type transforms of those blocks. Into the type transforms. And the type transforms are things that then get triggered, right? So they're doing something a little bit similar to get_y, but are building on what get_y does. Correct, exactly. Because these are very general things, right?

And so I didn't want you guys to have to write your own every time. So these basically say-- this says, I will work if you can pass me a path to an image. And this says, I will work if you pass me a string. And so get_x and get_y, then, are responsible for ensuring that you pass this one a path and pass this one a string.

And get_image_files is already returning paths, so we don't need a get_x for this guy. But it's not returning strings, so we do need a get_y for these guys. OK, so I'm going to finish-- I'm going to run it slightly over time. But let's have a look at-- so this is exactly the same.

OK, so this is exactly the same as what we just had, right? And so then we can also then add the two things, which is the item transforms and the batch transforms. Some other time, we will talk about how it is that-- how come this is not being applied to the categories?

It's only being applied to the images. For those of you interested in skipping ahead, the secret is using fastcore's type dispatch functionality. Anyway, so that's why we're getting these three different things-- image, we've got y1. So Jeremy, if we had an image, if we had an image block in our-- for our y's, for our targets, then the item transform would get applied.

Correct. Oh, wow. And there's a-- have a look at the Siamese tutorial on the fast.ai docs, because that has two images. Yeah. And if you think about it, any time we do segmentation, that's exactly what's happening, right? The data augmentation is happening to x and y. And this is really unusual.

I don't know of any other libraries that have this kind of totally transparent ability to do bounding boxes, segmentation, point clouds, whatever as dependent variables, and have it all happen in unison very, very automatically. Well, at least it didn't used to be. Maybe there is now. OK, so now I can create data loaders from that.

And thanks to the magic of fast.ai, this is so cool. Check this out. It's actually auto labeling it with each of our categories. So thanks to stuff we'll discuss later, basically this stuff called type dispatch, fast.ai does a lot of things automatically, even though I don't think I've ever explicitly coded this to work.
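A sketch of the finished pipeline, with the transforms added back in:

```python
# Same blocks as before, plus the item and batch transforms, then
# DataLoaders; show_batch labels each image with both categories.
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock, CategoryBlock),
    n_inp=1,
    get_items=get_image_files,
    get_y=[parent_label, get_variety],
    splitter=RandomSplitter(0.2, seed=42),
    item_tfms=Resize(480, method='squish'),
    batch_tfms=aug_transforms(size=128, min_scale=0.75))
dls = dblock.dataloaders(trn_path)
dls.show_batch()
```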

It just does because of how the API is designed. So we now have something which can create batches of pictures and two different dependent variables, each one of which has a category. And so what we will get to next time is it actually turns out-- well, I briefly mentioned it now, actually.

All that stuff I did last time about messing around with multiple different heads and all that is actually totally unnecessary. All we need to do when we create our vision learner is tell it we don't want 10 outputs; we want 20 outputs. So normally it automatically figures out how many outputs you want by how many levels are in your categorical dependent variable.

But in this case, we've got something custom, right, which is we've got a tuple of outputs. So we have to tell it. We want 20 outputs. That's going to make the final matrix that it multiplies by have 20 outputs. Now, then you basically need to tell it what loss function to use.

And so if you look it up, it turns out we had been using a loss function for this called CrossEntropyLossFlat. So we're going to call that exact loss function on the first 10 items. And we're going to compare that to the disease probabilities. And then the second 10, we're going to compare to the variety probabilities.

And we'll do the same thing for having an error rate, which just looks at the first 10, the error rate for disease, and the same thing for variety. Look at the second 10, the variety. And so basically then, if you train that, it's going to print out the disease and the variety error.

And the loss function will be the loss function on both of the two halves. And interestingly, for this single model, this 2.3% disease error is the best I'd ever got for this architecture. So at least for this single model case, this was better than training something that only predicts disease.
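A sketch of that multi-task head (assuming 10 diseases and 10 varieties, with F.cross_entropy standing in for the flat cross-entropy loss mentioned):

```python
import torch.nn.functional as F

# 20 outputs: the first 10 are disease logits, the last 10 variety logits.
def disease_err(inp, disease, variety): return error_rate(inp[:, :10], disease)
def variety_err(inp, disease, variety): return error_rate(inp[:, 10:], variety)

def combine_loss(inp, disease, variety):
    return F.cross_entropy(inp[:, :10], disease) + F.cross_entropy(inp[:, 10:], variety)

learn = vision_learner(dls, resnet18, n_out=20,
                       loss_func=combine_loss,
                       metrics=(disease_err, variety_err))
```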

Anyway, we can talk about that more later, because we kind of spent more time on it. I have a quick question. Yeah. The last layer, it's a flat 20 output layer. Does this mean at inference time that we would have to do the softmax plus-- what would it be?

I can't remember. No, firstly, fastai handles all that for you automatically. All right. Yeah. Great. All right. And by the way, in the inference functions, you'll see there are always options as to whether to decode it and whether to put the final activation function on it and stuff like that.

So actually, now I think about it. In this case, because we used a custom loss function, I think that would have broken its ability to do it automatically. So yeah, OK, I'm going to say actually, you would need to add a softmax if you wanted to. Although you actually don't need to, because at least for the Kaggle competition, I just needed which disease had the highest prediction.

And whether it's softmax or not, it's going to be the same because that's a monotonic function. So it depends whether you actually need probabilities or not. In my case, I didn't have to do this. Yeah, but you would only look at the first 10 or the second, I guess, the first ones.

Yeah, so you just-- You can see it. Because otherwise, yeah. So I was using TTA to do test time augmentation. And I stacked up and I did an ensemble of TTA. And then I just did an argmax on the first 10. Yeah, all right. All right. Just hold up.

OK, sure. In the architecture, you selected ResNet18, 128. Is there any programmatic way to find out the size, or the input size, of the models that you are trying to use? These models handle any input size. All right. Yeah. All right. All right. All the ResNets and all the ConvNeXts handle any input size.

All right. Thank you. It's only the transformer models. That also tripped me up in the beginning. But there's a lot of interesting stuff there that might take a whole lecture to understand. All right, again, all that stuff, yeah. Thanks, gang. See you. Thank you. Thank you.