Can you folks hear me and see me okay? Thanks. So, yeah, so we're going to share basically all the tools and processes and try to document everything. So that, yeah, so that nothing's mysterious and certainly everything is open source as usual. The code walkthrough I'm going to kind of start in the middle and work down.
We might work back up to the top again, but we'll see. Depends a bit on your questions and stuff as well. Okay, so step one is to find the code. And currently, it's in the fastai/fastai_dev repo. And so you can git clone this. And once you git clone it you will find that there is a dev folder.
And in the dev folder there is a bunch of notebooks. So, step one would be to git clone that. And then you'll find, thanks to kind forum participants, that in the root of fastai_dev there is an environment.yml, which you can use to create a conda environment. So I expect if I search for "conda environment.yml"...
There you go, you can find out how to use those. "Creating an environment from an environment.yml file" — there you go. By default, it will create an environment called fastai_dev. You don't have to create an environment; the alternative is you could just follow the readme with conda install and pip install.
That will work fine too. Don't run PyTorch from master; run the version that you see here, so it has to be at least 1.2. And it should work fine with Python 3.6 or 3.7. Sylvain and I are using 3.7 most of the time so that might be safest, but I have recently tried to make sure it's working correctly with 3.6 as well.
After you clone — so here we are — after you clone, there's one more thing I want you to do, which is you'll see there's a tools directory. And what I want you to do is run tools/run-after-git-clone. So just run that command, and you just run it once after you clone.
And basically what that does is it sets things up so that when you, say, send in a pull request or something, it will clean up the notebook to remove the kind of extraneous stuff which tends to cause conflicts. Yeah, so if you forget to do that
and then you send us a PR, then we might complain and say, oh, you forgot to run that. So just go back and run it — you only have to run it once. It installs something called a git hook, as you can see, a post-merge hook, which every time you push or pull will run a little Python script that removes that extraneous stuff.
Okay, so there's a question about how usable the library is at this stage. I mean, I don't know — it's okay. The highest level API we're still fiddling around with, but the stuff I'm going to be showing you, I think, works pretty well and it's pretty stable.
So, yeah. But it's early. So, you might find problems with it and if you do, we definitely want to hear about it. We are — let me just fix that. So, git checkout... cd dev... git checkout... Okay. So you will notice that it's being very actively developed, so if you look at the commits,
there's, you know, quite a few every day, and they're making big changes. Yeah, I think it's fine for experimenting with, and if you're keen to kind of get involved you should definitely try to do as much as you can with it, but don't expect it to be super stable.
The other thing I'll say though is, once this is done, we're writing a book about it. And you know, this is designed to be kind of the definitive version of fastai. So once this is released, it'll be very different to fastai version one, where the first release was something that we kind of put together pretty quickly under a tight deadline and then we just changed things a lot,
all the time. But for this one, we're going to try super hard to have a very stable branch and minimize API changes, at least within the same major version, and stuff like that — amongst other things because, you know, there'll be a book that it has to stay consistent with.
That said, we won't freeze the API like Keras has — I think that was a big problem for Keras — because deep learning changes a lot, and in the end we want a library that suits modern best practices. But if things change so much that we need to do a major API change, we would do a version three, and we would try to backport bug fixes to version two.
The Swift development is quite separate to this. There was a lot of influence from both the Python and Swift lessons from the last part of the course in terms of what we're writing, but the Swift stuff is largely waiting for things to change in Swift for TensorFlow around kind of MLIR, and the new runtime, and the new metaprogramming stuff.
And it's going to be a while, I think, until Swift for TensorFlow is ready for a really good version of SwiftAI, so the two pieces are quite separate. Okay, so I should try to make sure that I'm explaining the questions I'm responding to, so that this makes sense.
Okay. Right. So, where are we? Yes, so one thing to be aware of is that you might be tempted to start at notebook 01, say, in terms of understanding the library. That would be a very enthusiastic approach, because notebook 01 is actually the most complicated. And the reason it's the most complicated is because this is the stuff that kind of sets up Python the way we want it.
It like starts with metaclasses, and then decorators, and type checking, and monkey patching and context managers. So that's, that's going to be tough. If you're keen to learn a lot of advanced Python tricks it's a good way to do it, because kind of all of the advanced Python tricks are in the first couple of notebooks pretty much.
That's not where we're going to start today. We're going to start somewhere much more convenient, which is at 08. Notebook 08 is actually a tutorial notebook. It doesn't actually define any functionality. But instead it shows you how to use some functionality from the first seven notebooks. So, here is, here is notebook eight.
And so let's have a look at how these notebooks kind of work first. Note that the notebooks start with a number of imports. And the imports are coming from local.blah. And actually, some of these can be simplified. Instead of, see how I've got quite a few things coming from local.data.something.
I could actually delete all of those, and just have local.data.all. And that'll bring in everything from those sub modules. Why local? Well, the reason is, is that as you can see, we have a local directory here. And what happens is, in the notebooks. So let's pick a notebook. Let's do, say, vision core.
Simplify this one as well, to do local.data.all. So you'll see, for example, in 07 vision core, there's a couple of interesting things. The first is the default export: vision.core. What that means is that this notebook is going to create a Python module called vision/core.py. Let's check — vim vision/... it'll be local though: local/vision/core.py.
So here it is, right? So here's the library, the package, the module, that is, as it says at the top, auto-generated from this notebook. And so this cell starts with #export. And that's a special tag, which means: please include everything in this cell in the exported Python module.
But it makes a few changes. In particular, see this line here says from local.imports import *. That's been changed here to from ..imports import *. So the thing which creates Python modules from notebooks knows that local is special, and local refers to this local repo's version of fastai v2.
And so in order to export a proper set of modules, it has to replace local with appropriate relative module paths — so . or .. as appropriate. So there's a little special thing that happens. But on the whole, most of the time when you see something that says #export —
so here's patch_property n_px — you will just see it exactly the same in the .py file: patch_property n_px. Okay. All right. So what happens is we have these notebooks, and they have these #export cells in them. And what we do is then, at the bottom of every notebook, you'll see there's a cell that says notebook2script.
And when you run this cell, as you can see, it converts all the notebooks — it takes about a second. And that's the actual thing that creates these Python files. So the Python files — we could go ctags, and we could go vim — the Python files are just a normal module.
So for example, here's a function called image2byte. So I could go :tag image2<Tab>, and there it is, there's image2byte. Right. So we can jump around in the usual way. So here's something called TensorImage, so I could jump to that.
So you can use your editor in the usual way and treat it just like a normal Python module. But what we would request you do most of the time is instead to read, edit and use the notebooks. And the reason why is that as you can see the notebooks have a lot more.
They have examples, they have pictures, they have prose, they have notes, they have explanations, they have context. And so, you know, this is where you should generally be looking at the code. If you want to see what's really going on. So the only other thing at the top here is something which says default class level.
That's not particularly important, but basically what happens is we actually create documentation automatically from these notebooks. So, for example — let's have a look at one, say core.layers — you can see that the notebooks are turned into hyperlinked documentation. Here's data pipeline. And you can see that all the different classes and methods and stuff automatically get documentation added from the docstrings and also from any code and markdown around them.
And the different, you know, classes and functions all come in at different heading levels in the HTML. So default class level just says: normally the classes are documented at HTML heading level two; this says please document them at HTML heading level three. That's all that's for — it's not at all important.
Something else that you see as you look at the notebooks is lots of things that say test. So like there's a couple of ways we could have done this right. If we wanted to say like, how do these things work, so we've opened up an image, and we can have a look at the image.
Gosh, it's a tiny tiny tiny image, because that's been resized to 30 by 20. Fair enough. So we could then say, oh, we've just defined a thing called n_px. So I could go im.n_px, and it's 600. So one way we could document this in the notebook is by writing something that prints out answers.
But what we do instead is we add a test which says test for equality: test_eq(im.n_px, 30*20). So, in other words, im.n_px will be equal to 30 times 20. So we use the tests, and they're going to appear in the documentation.
Right. So when you read the documentation it's telling you what the result of this will be, but it's doing two things: as well as telling you in the documentation what the result will be, so you understand how it works, it's also creating a test to see whether it's working correctly.
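To make that pattern concrete, here's a minimal sketch of what such a cell looks like (the image-opening line is illustrative; path_to_a_pet is a hypothetical variable, and n_px is the property defined earlier in the notebook):

    from PIL import Image
    im = Image.open(path_to_a_pet).resize((30, 20))  # path_to_a_pet: hypothetical example path
    test_eq(im.n_px, 30*20)  # documents the result (600) and fails loudly if it ever changes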
So then if you look in the repo. You'll see in the readme there is a way to run all of the notebooks as tests in parallel. So you can check. And basically, they should all basically pass pretty much all the time, except, you know, at the moment we're still under pretty heavy development so not everything passes all the time.
But if you notice something failing, feel free to tell us, because, yeah, it's we want to have everything passing. Okay, so we'll get back to actually what things like at patch property, do later, but this is more of a demonstration of how we end up with all of this from local dot stuff right so all of these have come from notebooks.
So this notebooks, this the test ones are coming from. So yeah, so for example where does test underscore EQ come from. It comes from a notebook called 00 test. And so here, let's find it. Here is test equals. Okay. So, and we'll eventually get through it and see how this code works.
So what's going to happen eventually is that when we create conda and pip packages out of this, rather than creating a directory called local, it's just going to create a directory called fastai. And then internally it's all going to be exactly the same. So this will be super easy to generate normal pip and conda packages from.
So, let's take a look. So here is 08 pets tutorial. And so let's go through and see what we've got here. So the first thing we import is local.imports. local.imports is special: it's the only Python module which is not created by a notebook. It's just created by hand.
So here it is. And as you can see it's basically just a bunch of imports. I am thinking we haven't quite decided but I am thinking we might split some of this out into a kind of a core library that doesn't require PyTorch. And another bit that does and maybe split out some of the things like the testing and the notebook stuff into separate repos, but at the moment everything's in one place.
But other than those imports, there are — one, two, three, four, five, six — six functions defined in imports. And, yeah, that's about it, as you can see. So that's what imports is. Test is the test notebook we just saw. Core we'll be talking a lot about, but it's basically the kind of core functionality that I wish was in Python or PyTorch but isn't.
The data modules are all the stuff to do with transformation pipelines and stuff like that. Then we've got the separate vision, text, tabular, etc. applications. And so that's that. So, this will look pretty familiar from fastai version one: I can call untar_data, for instance, so we can start to look at the pets dataset.
And we can always, of course, just go untar_data?? to see the source code. And generally speaking, we're trying to make it so that all of the functions and classes pretty much fit into a screen of code, as you can see happens with this one. So this, just like in version one, will download from URLs.PETS and untar it, if not already done, and will return a Path object for where it's been put.
And the default paths are still the same as version one, so ~/.fastai/data. So, again, something which is basically the same as fastai v1 is get_image_files, which is a super fast way to grab a bunch of files. get_image_files just calls get_files, passing in a list of image extensions.
So it's just going to get a subset of the files by extension, and then get_files is pretty much the same as it was in version one — it's an extremely fast way to grab, optionally recursively, all of the files in a directory. So, for pets, we can grab all of the files like so.
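Put together, the first few lines of the tutorial look roughly like this (a sketch, using the names exactly as described above):

    # downloads and untars the pets dataset to ~/.fastai/data if not already there
    source = untar_data(URLs.PETS)
    items  = get_image_files(source)   # recursively finds image files; returns an L of Paths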
And so what we're going to try and do is load pets using a minimum of fastai functionality — we're going to try to use only the lowest level functionality. So we're going to do a lot of stuff very much by hand.
So as you can see here, one thing we're doing is using RandomSplitter. It looks like a class — well, actually it's not a class, it's something that returns a function. But basically it's something which will give us a function that will return a random bunch of integers of some size.
So, for example, in this case we've got split_idx. That has two things in it, and they are the indexes of the training set and the validation set. So, why are we using capital letters, which we normally only use for classes? We basically decided that anything that, when you call it, returns something which you still then have to call things on, we use capital letters for.
So RandomSplitter is a function that returns a function, so it kind of looks a lot like a class — that's why it's capitalized. So, okay, so this is going to be a list of the indexes in the training set and the validation set, and that will then be used, combined with our items, to decide which paths are in which split.
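In other words, roughly this (a sketch; whether RandomSplitter takes arguments such as a validation percentage is an assumption on my part):

    splitter  = RandomSplitter()       # returns a function (hence the capital letters)
    split_idx = splitter(items)        # a pair of index lists: (training idxs, validation idxs)
    train_idx, valid_idx = split_idx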
Okay, so let's see how we can use this. The goal will be to get to this: to show a picture of a dog, and tell us what kind of dog it is. So we've got our list of paths, and we've got the split between training set and validation set.
So we need a function that can open an image and resize it to some fixed size, and then turn that into a tensor with the channel axis first, with values between zero and one. So let's write that function. And then the next thing we need is a class with a method called show, which is capable of displaying an image and a title.
So one of the things you'll see that we have is a function called show_titled_image, which simply calls show_image, passing in the first thing it's given as the image and the second thing as the title. And what we're going to do is have our datasets basically return tuples.
So in this case the x will be the tensor containing the image, and the y will be the name of the class. And so why do we have this class with a special thing called show. So here is the first bit of kind of new API to tell you about, which is this.
So a class that has a show method is something which can be kind of used by the whole fast.ai transformation system, so that you can basically say .show. And that's how .show is going to work. It's going to try and find a type, which has a show method. And so the first thing, you know, one of the steps we need to do therefore to get to this point is to create a class with a show method.
And we already have lots in fastai, but we're trying to show you how to do this using no outside functionality from fastai, basically except for Transform. So that's why we're doing everything by hand. So, you'll see that this says TitledImage. And then you'll see in the next bit we create a pet transform, which has a type annotation on the return.
This is a big difference between fast.ai version two and fast.ai version one, and between fast.ai version two and every other Python library I've seen, which is that we use type annotations in version two, a lot. Not for type checking, but actually to make certain behaviors happen. So I'm going to be telling you a little bit about this class called transform today, but a transform is basically a function.
It's not necessarily a reversible function; it's something that can encode — turn, in this case, a path into a tuple of an image and a label — or it can decode, so it can go in the opposite direction, going from a tuple of an image and a label into a TitledImage object.
Here, it's actually returning a tuple, it's not returning a TitledImage object. And the reason that works is that the return type annotations in fastai transforms actually cast — they actually cast the data into this type. So this is not just descriptive, it's not just documentation. It's actually changing the behavior.
So let me show you. So let's run these cells. I'll go through all these lines in a moment, but if I create a pets transform object, and I pass an item to it, and then I go decode, you can see I can go decode then .show.
And if I go type on the decoded result, it's of type TitledImage. So, this is something that we use all the time in fastai version two, and to people who are experienced Python programmers, this is going to be extremely surprising, because Python doesn't normally work this way. But we found it unbelievably useful; it's made so much of our code simpler, shorter, less buggy.
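Here's a rough sketch of the shape of the transform being described — the class name, arguments and exact body are approximate, but the key point is the return annotation on decodes, which casts the plain tuple into a TitledImage so that .show works:

    # hedged sketch of the pets transform described above
    class PetTfm(Transform):
        def __init__(self, vocab, o2i, labeller):
            self.vocab, self.o2i, self.labeller = vocab, o2i, labeller
        def encodes(self, o):                      # path -> (image tensor, label int)
            return resized_image(o), self.o2i[self.labeller(o)]
        def decodes(self, x) -> TitledImage:       # the annotation casts the result
            return (x[0], self.vocab[x[1]])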
So we are using types a lot in version two, but in a totally new way. And so we'll be showing you how we do that over time. But for now, I'm just going to tell you what you have to do. So, if you want to create a function that can convert — for example, I've got here items[0].
So items[0] is a path. And we could create a function — let's just do the function version. We could just create a function like this, right: def pet_fn, and we're going to take an item. It would be simpler, right. And so then we could go x, y = pet_fn(...)
— oh, sorry, items[0] — whoops, and that would be o2i, and that would be... labeller. Okay. There we go. There's x. Or we could go show_image(x). Right. And we can see y. But, you know, the issue with this is that x has no semantics, right?
Nor does y. These things have no semantics. What's the type of y? It's an int. How do we display that? How do we know what it is? How do we know what it represents? How do we do things to it? Same with the image, right.
Like, it's a tensor. We know it's not just a tensor. We know it's a tensor that represents a three channel image. We should be able to rotate it. We should be able to brighten it. We should be able to flip it. But in PyTorch, there are no semantics associated with, you know, kind of domain specific semantics associated with a tensor.
So a lot of what we do in fastai v2 is to add these semantics to these tensors. So in this case, what I really want to do is change this to a TitledImage. And so I now can say t = TitledImage(...) — that's a tuple —
and then I can go t.show(), right. And so now I'm getting closer, right. So we use o quite a lot; it generally just means some object. Sometimes we use x. I'm trying to get in the habit of using o. That's a bit annoying here — I've used o in one place, x in another.
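So the plain-function version being typed out above looks roughly like this (a sketch; the function name, and exactly what the TitledImage constructor takes, are my guesses):

    def pet_fn(o):                      # o: a Path to a pet image
        return resized_image(o), o2i[labeller(o)]

    x, y = pet_fn(items[0])
    show_image(x)                       # works, but y is just an int with no semantics
    t = TitledImage((x, vocab[y]))      # wrap in a type that knows how to display itself
    t.show()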
Plenty of room for improvement in our notation still. So you can kind of see how if we can add types to things, we can give things functionality so that then the users of our library can just rely on things like, oh, I can always say .show. Or if something's representing an image, I could do .rotate or .flip or whatever.
So that's kind of one piece of this. The second piece is, you know, it's not okay to see 21 — we want to be able to see Great Pyrenees. So everything that changes your data in a way that removes the ability to know what that data is should always, always be able to go back to where we came from, to be able to display it again.
So that's basically what our Transform subclasses are. Transform classes are things which know how to take some input data and convert it into something which gets you closer to modeling with it. But any time that's losing information, so that we can no longer display it and understand it, you should also show how to reverse it.
And so for something like this, where we actually need to know what's our vocab, basically, what are the possible types of pet breed we have, that's some state we need. So that's why these things have to be classes, right? So in the init here, when we create a pet transform, we're passing in a vocab.
So here vocab is just a list of all of the possible pet types that we saw. And then we have o2i, so that's object to integer — that's just the reverse mapping. As you can see, that's the exact opposite. And what we're trying to do in fastai version 2 is, where things are really commonly used, we try to make them directly available.
So the uniqueify function now has a bidir parameter. It will look through all of the values, which in this case is all of the labels. So let's look at the labels — that's annoying, let's do it this way; I'm just going to do it like this so we can see it better.
Okay, so you know these are, this is just all of the pets that are in our data set for every image. So uniqueify is going to go through and find all of the possible unique values. And it's going to return the unique list in sorted order, as well as the reverse mapping.
That's why bidir is here. So that's nearly everything we need to create our pet transform object. The other thing we need is to tell it how to convert a path like this into what kind of pet this is. And as we saw in lesson one, for pets you can use that regular expression.
So a lot of the labellers and stuff look the same — here's RegexLabeller. These things are super simple, right. And again, this is actually not a class, it's a function that returns a function, but it kind of feels like a class, so it's still got these capital letters. So, we can now put all this together.
So step one is to create a function that will label paths, so turn a path into a type-of-pet string. Step two is to label everything in the training set — that's why we use items[split_idx[0]], because we just want stuff in the training set. Step three is to create our vocab and the reverse mapping.
And then we can pass those three things to our pet transform, which simply stores them. Okay. And so now that we have those things, we can pass a path to pets. And the way that transform works is that if you just treat the transform object as if it were a function, it will call the encodes function for you.
As you'll see, it does a whole lot more, but this is the really simple version, so we're going to start there. So when I go pets(items[0]), it will call encodes, so it'll return the resized image and the o2i of the label. And so that's why we get this.
Okay, so that's something we could start to create a mini batch from for modeling. And then if we go decode, decode will call decodes, which will create the tuple, wrap it in a TitledImage, and a TitledImage has a show method. And so we can show it. Okay.
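So the whole flow, end to end, looks roughly like this (a hedged sketch: the regex itself isn't shown, and the exact parameter names for uniqueify and RegexLabeller are my guesses based on the description above):

    labeller   = RegexLabeller(pat)                          # pat: the pets filename regex from lesson one
    lbls       = [labeller(o) for o in items[split_idx[0]]]  # label only the training items
    vocab, o2i = uniqueify(lbls, sort=True, bidir=True)      # sorted unique labels + reverse mapping
    pets       = PetTfm(vocab, o2i, labeller)                 # the transform sketched earlier

    x, y = pets(items[0])           # calling the object runs encodes
    dec  = pets.decode((x, y))      # decode runs decodes, casting to TitledImage
    dec.show()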
So, that is step one. So, understanding the transform pipeline is one of the more interesting things if you want to learn how to really extend fastai version two into other domains and stuff like that. And here's a good place to start, because as you can see, we're using almost no functionality other than this one class called Transform — and even from the Transform class, we're only using the very simplest functionality.
Everything else is in this notebook. This is stuff which, hopefully, you can pretty much understand yourself. I'll give you a little hint of something though, which is: what is items? Items is the result of get_image_files. It's a list of paths. But as you can see, it doesn't look like most lists.
It starts out by saying how many items are in the list. And then it only shows the first 10 things and then dot dot dot. It's got much nicer behavior than normal lists. Also, we are indexing into it with a list, which you can't normally do with Python lists.
So one of the things we actually put a lot of effort into in fastai version two is that we didn't really like the Python list, so we created our own. So if you look at the type of items, you'll see it's actually of type L. This is the only thing that has a one-letter name in all of fastai version two, in terms of stuff that's in the API.
But since it's actually something that we want to be a total replacement for list, and we always use this instead of a list, we think it deserves that very special one-letter name. So L is a class we're going to be learning a lot about; it does a lot of things.
But basically, it looks like this. It looks just like a list — so I could write list here instead of L, right, and I can do that; it's just a bit more awkward and more to remember. Anyway, you can write list, or you can write L, and it'll do exactly the same thing.
And just like a normal list you can index into it. But unlike normal lists, you can index into it with lists. Unlike normal lists. You don't have to remember to put things in square brackets. Unlike normal lists, you can index into it with masks. There's all kinds of things you can do and even like simple little things like with a normal list.
Let's do both — let's go b = list(...). So with a you can just go a + 9, say, whereas with b you'd need to wrap it: with a normal list you have to remember to listify anything that you append to it, and with an L you don't. Also, with a normal list
you can't stick stuff on the front, whereas with these lists you can. So there's a lot of these nice little things. Anyway, there's a thousand more things, but generally speaking, you'll find any time you see something that looks like this — hash, then the number of things, then the list —
that's an L. You can also do things like multiply, and as you can see, now it's going to show a "..." because there's more than 10 things in it. It does a lot more than list comprehensions, but you certainly can use list comprehensions: [o*2 for o in a].
But perhaps more interestingly, you can do things like L.mapped(operator.neg) — what did I do wrong here? Oh, not L.mapped, a.mapped. So there's the negative of everything. Anyway, so that's what items is, and that's why we were able to index into it with a list.
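For reference, here's a small sketch of the L behaviours mentioned above (illustrative — a few details of the real class may differ):

    import operator
    a = L(range(12))          # displays as (#12) [0,1,2,3...] — the count, then the items
    b = list(range(12))
    a[[2, 5, 7]]              # index with a list of indexes; a plain list can't do this
    a.append(99)              # no need to wrap the 99 in a list first
    a * 2                     # works, and shows "..." once there are more than 10 items
    a.mapped(operator.neg)    # map a function over every element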
Now this is not a great transform because we kind of created all this stuff externally. So "try two" in 08 is where we're going to do all that stuff internally. So this time we're going to create something where these cells are exactly the same as we had before, but this time we're just going to pass in the list of path names and the index of the training set.
And so now we'll create the labeller inside there. We'll create the vocab and o2i inside there. We'll look up the path in there. And so, again, I'm not showing you any new functionality here; I'm showing you, more, just a nicer design of a transform. There's just a tiny little bit more going on.
So with this we can now create a transform. And just like before we can decode it. And away we go. Okay, so that's the very simplest version of transform. In practice, as you know, in deep learning, you almost never have just one step to go from your source path or whatever to your modeling data, only there's a number of steps.
And so, to do a number of steps, we've created a simple class called Pipeline, as you can see here. What Pipeline does is it lets you apply a number of transforms in order, and it composes them all together for you. And I'm really excited about this example — I think it's super nice.
I'm going to show you how to create a Siamese dataset. So for those of you that haven't done Siamese deep learning before: Siamese deep learning is where — here's an example — your data will generally have two things; if it's a vision model it will have two images, and a Boolean.
The two images, in this case, are either the same breed of pet or a different breed of pet, and the Boolean is true or false depending on whether they're the same or different. So, for example: true, false, false. The reason that we do Siamese models is, for example, for face recognition.
You don't want to create a model of every single person's face and have some huge dependent variable of the entire population of the world. So instead, all you do is create a model that says: is this face of the same person as this face?
So, here, I've created a transform, which again as input, it's going to take a list of paths. And it's going to take a list of labels which is what breed is each one. And I sort the labels and I create a map from each breed to where they are.
Oh yes, so basically, I don't need to go through all the details of the code here because this is just Python. But the interesting bit is in the encodes here. I basically randomly decide 50%, whether or not the second image is going to be of the same breed as the first image or a different breed to the first image.
And then I return a triplet — a tuple — with the two images. So this is the i-th image, the one you asked for; this is some random other image, which is either the same breed or a different breed with 50% chance; and a boolean saying whether they are the same or not.
And then, secondly, I need to use that resized_image function that's going to take our path, turn it into an image, and resize them all to the same size. And then I need to do those two things in order: first this transform, then this transform. And that's why I then use a Pipeline.
So I create my Siamese pair transform. And then I create a pipeline, which contains two things, first the Siamese pair transform. And then this open and resize transform. And now I can pass in say zero into my pipeline and I'm going to get back three, one, two, three things, which is my first image, my second image and my boolean.
And so here they are: my first image, my second image and my boolean. And of course, to be able to show that, we need something with a show method. Okay, so here is a SiameseImage with a show method. Which means we can now create a pipeline, and let's now put three things in it:
the Siamese pair transform, the open-and-resize, and then SiameseImage.create, which is simply going to cast it. And so now we can display. That's pretty cool. And I've got a few cool things to show you, one of which is going to be slightly mind-squishing.
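Roughly, the pipeline being assembled here looks like this (a sketch — the class names follow the walkthrough, but the constructor arguments and whether each step is passed as a class or an instance are assumptions):

    sp   = SiamesePair(items, labels)   # labels: the breed for each path, as described above
    pipe = Pipeline([sp, OpenAndResize, SiameseImage.create])
    s = pipe(0)                         # a SiameseImage: (image, image, same-breed boolean)
    s.show()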
First of all, I'll show you the quick easy one, which is another nice part of L: itemgot. For those that don't know, Python's got something called operator.itemgetter, which basically works like this: you can create a function, and then if I create something which has a number of different things in it, like this,
I can go — let's see, f — yes, I can go list(map(f, t)), and it gets me the zeroth thing from each element, right. So that's what itemgetter does. So one of the nice things in L is itemgot, which basically does those two steps at once.
Where it gets really cool, though, is that it actually also has something called attrgot, which will grab a particular named attribute from everything in a list, and it's got some things which make it a lot nicer than Python's version — like it handles defaults for things that are missing, and stuff like that.
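Here's the itemgetter example spelled out, plus the L shortcut (a small sketch; the itemgot call mirrors what's described above):

    import operator
    f = operator.itemgetter(0)
    t = [[1, 2], [3, 4], [5, 6]]
    list(map(f, t))        # [1, 3, 5] — the zeroth thing from each element
    L(t).itemgot(0)        # the same two steps done at once on an L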
Anyway, you'll keep seeing these little handy things in L, which make life much easier. Now for the one that's kind of amazing. Let's have a think about what's happened here. This has returned a tuple. The tuple has a pillow image, another pillow image, and a boolean. And that is the first thing in our pipeline.
The second thing in our pipeline says return resized_image. So how can we possibly be applying this function to this tuple? You can't, right? This is meant to be a path, not a tuple containing two paths and a boolean. So the trick is another nifty thing in Transform, which you may have guessed is this type annotation.
This is not a return annotation this time; this time it's a parameter annotation. And what this does is it tells the transform system that this particular encodes only works on paths. And so what this does — actually, we can make this a little simpler: TupleTransform. Let's get this the right way around.
Yeah, there it is. So there's a subclass of Transform called TupleTransform, which, if it gets a tuple as x, will run encodes on every element of the tuple separately. And it will only apply it to those of this type. For those that aren't of this type, it will just ignore them and pass them through.
So this is like, it'll turn out to be super, super handy for things like data augmentation, where basically you can define. So you can actually literally, as you'll see, have multiple different encodes methods for multiple different types, and it'll automatically pick the right one. And so we'll use this a lot for things like defining a single data augmentation, which behaves differently for images versus masks versus point clouds versus etc.
So again, this is something that's used all the time in fast.ai version two. It's not at all weird in lots of languages. So for example in something like Julia, this is how all of Julia's type system works. But it's not something that is very often seen in Python. It's not unheard of, but it's not common.
I'm not sure I've seen anything that quite combines OO and dispatch in this way before in Python. And I've seen some things that are similar. So this has been quite nice for us because we were able to create this single pipeline, which will only open and resize those things which can be opened and resized.
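To picture it, here's a minimal sketch of such a transform under the assumptions above (the class and function names match the walkthrough, but the body is illustrative):

    from pathlib import Path

    class OpenAndResize(TupleTransform):
        def encodes(self, o: Path):        # the annotation: only apply this to Paths
            return resized_image(o)
    # Applied to (path1, path2, True), it opens and resizes the two paths and
    # passes the boolean through untouched, because bool doesn't match Path.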
And specifically that is paths. Okay, so a question was, if I have a list of NumPy arrays and torch tensors mixed, can I use this to convert only the NumPy arrays to tensors? And the answer is yes, absolutely you could, except it can't be a list, it would have to be a tuple.
So this just specifically works on tuples and the reason for that is that PyTorch's data loaders all work on tuples. So that's how this works too. So you can see here a Siamese image then we've got our show. And yeah, that's that. So that one is a pretty interesting example.
Lots to think about there, as you can see. And by the way, if you're wondering what the difference is between TupleTransform and Transform — that's the entirety of TupleTransform. A TupleTransform is simply a transform which has a single parameter set.
So basically all transforms know whether they act over tuples, or whether they act over a whole item at a time. So it's not really a separate class, it's just a convenience. Okay, so we've got transforms, and we've got pipelines — kind of building up this intermediate data processing pipeline.
The next step is that you would generally want to apply a pipeline to a list of items — so, for example, in this case, a list of paths to pets. So we have something called a transformed dataset, TfmdDS. And as you can see, it applies transforms, or pipelines, to the items of a collection.
So basically, what this is going to look like is we're going to be creating a TfmdDS, passing in our items — again, that's still our list of paths, just to remind you — and a list of transforms. And this time, it's going to give us something which we can subscript into, so it'll behave a lot like a list, except it's a list where these transforms will be lazily applied to each item.
So items[0] is this particular pet, whereas tds[0] will be that particular pet with this list of transforms applied. However, it's not just a list of transforms; it's a list of lists of transforms. There's list number one, there's list number two. And that's because in practice in deep learning, you nearly always have at least an independent variable and a dependent variable.
So an x and a y. So list number one defines the transforms in the pipeline to create your independent variable. List number two is the transforms in your pipeline to create the dependent variable. So, and you can have more than that. You can have three, four, one, as many as you like, but the vast majority of the time, it will be two.
So let's look at an example, right? items[0] is this path. So in pipeline number one, all these are going to be composed together: first of all, it's going to create a pillow image; and then it's going to resize it to 128 by 128 — as you can see, we've got a little transform above here which knows how to resize;
and then it's going to convert it into a tensor; and then it's going to convert the byte tensor to a float tensor. And so that is a pretty standard, simple pipeline for creating a ready-to-model tensor from a path — except we haven't batched it yet. Pipeline number two is going to start with our labeller, which was that regex labeller we saw earlier, and then it will pass it through Categorize, which is something which knows how to create a vocab.
And then it will pass it through categorize, which is something which knows how to create a vocab. And so the two things together is going to give us a tuple containing an image. And let's do it this way. Let's see, what have I done here? Okay, so this is good to have some debugging to do.
So in our labeler, it looks like we already have a label. So why is that? So we're passing in items, and going setup. That's our setup path, that's our category, and then unique. Okay, I'm not immediately sure what that is. It's something that I was just changing earlier today, so I've just introduced a bug that I will fix this afternoon.
So, the question being asked is about what we mean by lazily applied here. So the key issue is that there's two ways we could do this, we could take this list of items. And we could apply this pipeline and create all the images we want. And this pipeline and create all the labels we want and we could do that once, and kind of store that all in memory and then just grab them when we need them.
There's two problems with doing that though. The first is that for something like ImageNet, the images are too big to store them all in memory. So we, we wouldn't want to do that, and it would take a really long time to open them all. And the second is that often transforms will have random stuff in them, such as my Siamese transform, or if you have data augmentation.
So in practice, what we want is for these pipelines to be applied only at the moment when I actually asked for a particular item. And so that's what lazily applied means, it means that the function is only called, the transforms are only called when we request the particular item.
So I've had a couple of questions now about CPU and GPU, so I may as well mention them now. So, when we do stuff like ImageNet, it tends to use up a lot of our CPU — often it's actually our CPU doing the transformations which is the bottleneck. And this is why: because, you know, pretty much all libraries work this way — they lazily apply the transformations, and even just opening a JPEG takes a long time.
So one of the things we've actually done, as you'll see, is that nearly all of the transformations — or the more complex transformations — in fastai v2 are actually run on the GPU, not on the CPU. And part of the reason we're able to do that is because of this type system that I was telling you about.
So you will see that shortly. Something else to notice here in ImageResizer is that encodes is defined twice, which normally would be meaningless in Python — the second definition would just override the first. But because this is a Transform subclass, we actually have this magic dispatch stuff happening.
So if this transform receives a pillow image, then it's going to resize it using self.resample, which is a bilinear resampling. If it receives a pillow mask, it will resample it with nearest-neighbors resampling. And this is important because for masks you can't do bilinear resampling — you end up with floats instead of ints.
And so this is an example of where you can use this type dispatch system to automatically dispatch your function to the right behavior depending on what kind of data you're using. And so you'll see that not only do we now have types for TensorImage, TensorImageBW, TensorMask, but we also have types for PILImage and PILMask and so forth.
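A minimal sketch of the double-encodes pattern just described (simplified relative to the notebook's actual ImageResizer — the resample constants and exactly how the size and resample mode are stored are assumptions here):

    from PIL import Image

    class ImageResizer(Transform):
        def __init__(self, size): self.size = (size, size)
        def encodes(self, o: PILImage): return o.resize(self.size, resample=Image.BILINEAR)
        def encodes(self, o: PILMask):  return o.resize(self.size, resample=Image.NEAREST)
    # Both encodes survive: the Transform machinery dispatches on the input's type,
    # so images get bilinear resampling and masks get nearest-neighbor.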
So we've got a lot more semantic subtypes of commonly used libraries, and that's used to make all this stuff super, super, super easy. So now, if our bug were fixed, we could grab the zeroth item of our transformed dataset; it will grab item zero, pass it through the two pipelines and return a tuple, which we can then later on decode to get back a decoded tuple, which we can show.
So this is basically, yeah, the same behavior. But as you can see, we've got a lot more functionality going on, and we're also starting to use some built-in stuff from fastai version two, like Categorize and PILImage.create and ToTensor and stuff like that.
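Putting it together, the transformed dataset being described looks roughly like this (a hedged sketch — TfmdDS and the individual transform names follow what's said above, but the exact spellings, whether each step is a class or an instance, and ByteToFloatTensor as the name for the byte-to-float step are my assumptions):

    tds = TfmdDS(items, [
        [PILImage.create, ImageResizer(128), ToTensor(), ByteToFloatTensor()],  # x pipeline
        [labeller, Categorize()],                                               # y pipeline
    ])
    x, y = tds[0]               # transforms run lazily, only when an item is requested
    dec  = tds.decode((x, y))   # decodable back into something we can show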
Generally speaking, most of the fastai version two types will have a .create class method, which is used to create one of those things. So PILImage.create, for example, can be passed a path, like so. And so that's used in a number of places in the library: when it's expecting to be able to create something, it's going to look for a .create method, most of the time.
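For instance, something like this (a one-line sketch of the pattern just mentioned):

    img = PILImage.create(items[0])   # most fastai v2 types expose a .create class method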
Okay. Well, I think that is enough for now — data sources can wait. That's quite a lot to sink in, so I'll let you have a think about that, do some reading, and ask any questions you like. Actually, I've got a couple of requests.
One is that Sylvain and I are pretty busy trying to write the library, so if somebody asks a question on the forum that you can have a reasonable guess at the answer to — yeah, please try to answer it, that would be super helpful. The second is that on the forums
there are some topics, you see, like fastai v2 core, tabular, text, transforms — so it would be super helpful to go in and start to fill in this wiki topic (which I think is a wiki topic — yes, it is) with some kind of documentation and notes from today and stuff like that; that would be super helpful.
And then, yeah, tomorrow we'll start to look at data — well, we will look at DataSource, that won't take too long — and we'll see the segmentation example. And then from there — yeah, I'm not sure where we'll go from there. We'll see. Happy to take suggestions after some of you have looked at the code.
Thank you very much for joining in. And yeah, a reminder: next time, if anybody wants to have more of a conversation about something, feel free to say so and I can add you as a participant and you can talk as well as text. Okay, thanks everybody. Bye.