
Practical Deep Learning for Coders: Lesson 1


Chapters

0:00 Introduction
0:25 What has changed since 2015
1:20 Is it a bird
2:09 Images are made of numbers
3:29 Downloading images
4:25 Creating a DataBlock and Learner
5:18 Training the model and making a prediction
7:20 What can deep learning do now
10:33 Pathways Language Model (PaLM)
15:40 How the course will be taught. Top down learning
19:25 Jeremy Howard’s qualifications
22:38 Comparison between modern deep learning and 2012 machine learning practices
24:31 Visualizing layers of a trained neural network
27:40 Image classification applied to audio
28:08 Image classification applied to time series and fraud
30:16 PyTorch vs TensorFlow
31:43 Example of how Fastai builds off PyTorch (AdamW optimizer)
35:18 Using cloud servers to run your notebooks (Kaggle)
38:45 Bird or not bird? & explaining some Kaggle features
40:15 How to import libraries like Fastai in Python
40:42 Best practice - viewing your data between steps
42:00 Datablocks API overarching explanation
44:40 Datablocks API parameters explanation
48:40 Where to find fastai documentation
49:54 Fastai’s learner (combines model & data)
50:40 Fastai’s available pretrained models
52:02 What’s a pretrained model?
53:48 Testing your model with predict method
55:08 Other applications of computer vision. Segmentation
56:48 Segmentation code explanation
58:32 Tabular analysis with fastai
59:42 show_batch method explanation
61:25 Collaborative filtering (recommendation system) example
65:08 How to turn your notebooks into a presentation tool (RISE)
65:45 What else can you make with notebooks?
68:06 What can deep learning do presently?
70:33 The first neural network - Mark I Perceptron (1957)
72:38 Machine learning models at a high level
78:27 Homework

Whisper Transcript

00:00:00.000 | Welcome to practical deep learning for coders lesson one. This is version 5 of this course.
00:00:11.760 | And it's the first new one we've done in two years. So we've got a lot of cool things
00:00:15.560 | to cover. It's amazing how much has changed. Here is an XKCD from the end of 2015. Who here
00:00:28.520 | has seen XKCD comics before? Pretty much everybody, not surprising. So the basic joke here is
00:00:36.960 | I'll let you read it and then I'll come back to it. So it can be hard to tell what's easy
00:00:54.320 | and what's nearly impossible. And in 2015 or at the end of 2015, the idea of checking
00:00:58.800 | whether something is a photo of a bird was considered nearly impossible. So impossible
00:01:03.320 | it was the basic idea of a joke because everybody knows that that's nearly impossible. We're
00:01:08.680 | now going to build exactly that system for free in about two minutes. So let's build
00:01:15.880 | an is-it-a-bird system. So we're going to use Python. And so I'm going to run through
00:01:22.960 | this really quickly. You're not expected to run through it with me because we're going
00:01:25.900 | to come back to it. OK. But let's go ahead and run that cell. OK. So what we're doing
00:01:35.840 | is we're searching DuckDuckGo for images of bird photos and we're just going to grab one.
00:01:43.320 | And so here is the URL of the bird that we grabbed. OK. We'll download it. OK. So there
00:01:53.880 | it is. So we've grabbed a bird and so OK. We've now got something that can download
00:01:58.280 | pictures of birds. Now we're going to need to build a system that can recognize things
00:02:05.600 | that are birds versus things that aren't birds from photos. Now of course computers need
00:02:10.680 | numbers to work with. But luckily images are made of numbers. I actually found this really
00:02:18.320 | nice website called Pixspy where I can grab a bird. And if I go over it, let's pick its
00:02:31.360 | beak. You'll see here that that part of the beak was 251 brightness of red, 48 of green
00:02:41.360 | and 21 of blue. So that's RGB. And so you can see as I wave around those colors are changing
00:02:50.400 | those numbers. And so this picture, the thing that we recognize as a picture, is actually
00:02:58.300 | 256 by 171 by 3 numbers between 0 and 255 representing the amount of red, green and blue
00:03:09.240 | on each pixel. So that's going to be an input to our program that's going to try and figure
00:03:15.420 | out whether this is a picture of a bird or not. OK. So let's go ahead and run this cell.
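The "images are made of numbers" point can be seen directly in code. Here is a tiny hand-built 2x2 example (the values are arbitrary, just echoing the 251/48/21 beak pixel from the demo):

```python
# A picture is just a height x width x 3 array of red/green/blue values,
# each between 0 and 255. Here is a tiny 2x2 "image" built by hand.
import numpy as np

img = np.array([
    [[251,  48,  21], [250,  50,  20]],   # top row: two reddish "beak" pixels
    [[ 30, 120,  40], [ 28, 118,  42]],   # bottom row: two greenish pixels
], dtype=np.uint8)

print(img.shape)   # (2, 2, 3): height, width, RGB channels
print(img[0, 0])   # [251  48  21]: the red, green, blue of one pixel
```

A real photo works exactly the same way, just with more pixels, e.g. 256 by 171 by 3 numbers for the bird in the demo.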
00:03:28.080 | Which is going to go through, and I needed bird and non-bird, but you can't really search
00:03:33.840 | Google Images or DuckDuckGo Images for not a bird. This doesn't work that way. So I just
00:03:39.520 | decided to use forest. I thought, OK, pictures of forest versus pictures of bird sounds like
00:03:43.960 | a good starting point. So I go through each of forest and bird and I search for forest
00:03:51.520 | photo and bird photo, download images and then resize them to be no bigger than 400
00:03:58.480 | pixels on a side, just because we don't need particularly big ones. It takes a surprisingly
00:04:03.680 | large amount of time just for a computer to open an image. OK, so we've now got 200 of
00:04:09.720 | each. I find when I download images, I often get a few broken ones. And if you try and
00:04:14.960 | train a model with broken images, it will not work. So here's something which just verifies
00:04:20.040 | each image and unlinks, so deletes the ones that don't work. OK, so now we can create
00:04:28.080 | what's called a data block. So after I run this cell, you'll see that basically I go
00:04:40.960 | through the details of this later, but a data block gives fast AI, the library, all the
00:04:46.280 | information it needs to create a computer vision model. And so in this case, we're basically
00:04:50.560 | telling it get all the image files that we just downloaded. And then we say show me a
00:04:55.120 | few up to six. And let's see. Yeah, so we've got some birds, forest, bird, bird, forest.
00:04:59.800 | OK, so one of the nice things about doing computer vision models is it's really easy
00:05:03.600 | to check your data because you can just look at it, which is not the case for a lot of
00:05:07.960 | kinds of models. OK, so we've now downloaded 200 pictures of birds, 200 pictures of forests.
00:05:16.080 | So we'll now press run. And this model is actually running on my laptop. So it's not
00:05:25.040 | using a vast data center. It's running on my presentation laptop and it's doing it at
00:05:29.960 | the same time as my laptop is streaming video, which is possibly a bad idea. And so what
00:05:38.440 | it's going to do is it's going to run through every photo out of those 400. And for the
00:05:45.400 | ones that are forest, it's going to learn a bit more about what forest looks like. And
00:05:49.760 | for the ones that are bird, it'll learn a bit more about what bird looks like. So overall,
00:05:54.580 | it took under 30 seconds. And believe it or not, that's enough to finish doing the thing
00:06:03.440 | which was in that XKCD comic. Let's check by passing in that bird that we downloaded
00:06:11.000 | at the start. This is a bird. Probability, it's a bird. One. Rounded to the nearest four
00:06:19.280 | decimal places. So something pretty extraordinary has happened since late 2015, which is literally
00:06:28.800 | something that has gone from so impossible, it's a joke, to so easy that I can run it
00:06:35.040 | on my laptop computer in, I don't know how long it was, about two minutes. And so hopefully
00:06:42.160 | that gives you a sense that creating really interesting, real working programs with deep
00:06:52.520 | learning is something that it doesn't take a lot of code, didn't take any math, didn't
00:06:57.880 | take more than my laptop computer. It's pretty accessible, in fact. So that's really what
00:07:05.200 | we're going to be learning about over the next seven weeks.
00:07:09.280 | So where have we got to now with deep learning? Well, it moves so fast. But even in the last
00:07:17.520 | few weeks, we've taken it up another notch as a community. You might have seen that something
00:07:23.600 | called DALL-E 2 has been released, which uses deep learning to generate new pictures. And
00:07:30.000 | I thought this was an amazing thing that this guy, Nick, did, where he took his friend's
00:07:34.680 | Twitter bios and typed them into the DALL-E 2 input, and it generated these pictures.
00:07:42.460 | So this guy, he typed in commitments, empathetic, psychedelic, philosophical, and it generated
00:07:48.080 | these pictures. So I'll just show you a few of these. I'll let you read them. I love that.
00:08:05.280 | That one's pretty amazing, I reckon. Actually. I love this. Happy Sushi Fest has actually
00:08:17.680 | got a happy rock to move around. So this is like, yeah, I don't know. When I look at these,
00:08:27.240 | I still get pretty blown away that this is a computer algorithm using nothing but this
00:08:33.000 | text input to generate these arbitrary pictures, in this case of fairly complex and creative
00:08:39.720 | things. So the guy who made those points out, he spends about two minutes or so creating
00:08:48.480 | each of these. He tries a few different prompts, and he tries a few different pictures. And
00:08:52.880 | so he's given an example here of when he typed something into the system. Here's an example
00:08:56.520 | of 10 different things he gets back when he puts in expressive painting of a man shining
00:09:02.580 | rays of justice and transparency on a blue bird Twitter logo. So it's not just, you know,
00:09:14.760 | DALL-E 2, to be clear. There's, you know, a lot of different systems doing something
00:09:19.960 | like this now. There's something called Midjourney, which this Twitter account posted
00:09:19.960 | a female scientist with a laptop writing code in a symbolic, meaningful, and vibrant style.
00:09:26.200 | This one here is an HD photo of a rare psychedelic pink elephant. And this one, I think, is the
00:09:32.320 | second one here. I never know how to actually pronounce this. This one's pretty cool. A
00:09:39.660 | blind bat with big sunglasses holding a walking stick in his hand. And so when actual artists,
00:09:49.240 | you know, so this, for example, this guy said he knows nothing about art, you know, he's
00:09:54.040 | got no artistic talent, it's just something, you know, he threw together. This guy is an
00:09:58.140 | artist who actually writes his own software based on deep learning and spends, you know,
00:10:03.680 | months on building stuff. And as you can see, you can really take it to the next level.
00:10:09.200 | It's been really great, actually, to see how a lot of fast AI alumni with backgrounds as
00:10:15.080 | artists have gone on to bring deep learning and art together. And it's a very exciting
00:10:19.320 | direction. And it's not just images, to be clear. You know, one of the other interesting
00:10:25.240 | thing that's popped up in the last couple of weeks is Google's Pathways language model,
00:10:30.760 | which can take any arbitrary English as text question and can create an answer, which not
00:10:37.560 | only answers the question, but also explains its thinking, whatever it means for a language
00:10:44.640 | model to be thinking. One of the ones I found pretty amazing was that it can explain a joke.
00:10:51.080 | So, I'll let you read this. So, this is actually a joke that probably needs explanations for
00:11:12.240 | anybody who's not familiar with TPUs. So, this model just took the text as input and
00:11:18.120 | created this text as output. And so, you can see, you know, again, deep learning models
00:11:25.680 | doing things which I think very few, if any of us would have believed would be maybe possible
00:11:31.720 | to do by computers even in our lifetime. This means that there are a lot of practical and
00:11:41.680 | ethical considerations. We will touch on them during this course, but can't possibly hope
00:11:48.320 | to do them justice. So, I would certainly encourage you to check out ethics.fast.ai to see our
00:11:55.180 | whole data ethics course taught by my co-founder, Dr. Rachel Thomas, which goes into these issues
00:12:03.640 | in a lot more detail. All right. So, as well as being an AI researcher
00:12:12.880 | at the University of Queensland and fast.ai, I am also a homeschooling primary school teacher.
00:12:19.680 | And for that reason, I study education a lot. And one of the people whose work I rely on in education
00:12:26.600 | is a guy named Dylan Wiliam. And he has this great approach in his classrooms of figuring
00:12:31.880 | out how his students are getting along, which is to put a coloured cup on their desk, green
00:12:38.120 | to mean that they're doing fine, yellow cup to mean I'm not quite sure, and a red cup
00:12:44.160 | to mean I have no idea what's going on. Now, since most of you are watching this remotely,
00:12:49.680 | I can't look at your cups, and I don't think anybody brought coloured cups with them today.
00:12:54.120 | So instead, we have an online version of this. So, what I want you to do is go to cups.fast.ai/fast.
00:13:08.480 | That's cups.fast.ai/fast. And don't do this if you're like a fast.ai expert who's done
00:13:17.120 | the course five times, because if you're following along, that doesn't really mean much, obviously.
00:13:21.120 | It's really for people who are not already fast.ai experts. And so, click one of these
00:13:27.120 | coloured buttons. And what I will do is I will go to the teacher version and see what
00:13:36.800 | buttons you're pressing. All right. So, so far, people are feeling we're not going too
00:13:41.520 | fast on the whole. We've got one brief red. Okay. So, hey, Nick, this URL, it's the same
00:13:51.000 | thing with teacher on the end. Can you keep that open as well and let me know if it suddenly
00:13:57.000 | gets covered in red? If you are somebody who's red, I'm not going to come to you now because
00:14:07.160 | there's not enough of you to stop the class. So, it's up to you to ask on the forum or
00:14:11.880 | on the YouTube live chat. And there's a lot of folks, luckily, who will be able to help
00:14:16.520 | you. All right. I wanted to do a big shout out to Radek. Radek created cups.fast.ai for
00:14:29.120 | me. I said to him last week I need a way of seeing coloured cups on the internet. And
00:14:33.720 | he wrote it in one evening. And I also wanted to shout out that Radek just announced today
00:14:42.600 | that he got a job at NVIDIA AI. And I wanted to say, you know, that fast.ai alumni around
00:14:48.960 | the world very, very frequently, like every day or two, email me to say that they've
00:14:53.720 | got their dream job. And, yeah, if you're looking for inspiration of how to get into
00:15:01.640 | the field, I couldn't recommend anything better than checking out Radek's
00:15:07.200 | work. And he's actually written a book about his journey. It's got a lot of tips in particular
00:15:11.680 | about how to take advantage of fast.ai, make the most of these lessons. And so I would certainly
00:15:17.240 | say check that out as well. And if you're here live, he's one of our TAs as well. So
00:15:21.120 | you can say hello to him afterwards. He looks exactly like this picture here. So I mentioned
00:15:28.760 | I spent a lot of time studying education, both for my home schooling duties and also
00:15:34.680 | for my courses. And you'll see that there's something a bit different, very different
00:15:40.600 | about this course, which is that we started by training a model. We didn't start by doing
00:15:45.880 | an in-depth review of linear algebra and calculus. That's because two of my favorite writers
00:15:53.840 | and researchers on education, Paul Lockhart and David Perkins, and many others talk about
00:15:58.760 | how much better people learn when they learn with a context in place. So the way we learn
00:16:06.200 | math at school where we do counting and then adding and then fractions and then decimals
00:16:12.360 | and then blah, blah, blah. And 15 years later, we start doing the really interesting stuff
00:16:18.320 | at grad school. That is not the way most people learn effectively. The way most people learn
00:16:24.960 | effectively is from the way we teach sports. For example, where we show you a whole game
00:16:31.360 | of sports, we show you how much fun it is. You go and start playing sports, simple versions
00:16:36.720 | of them. You're not very good. And then you gradually put more and more pieces together.
00:16:41.800 | So that's how we do deep learning. You will go into as much depth as the most sophisticated,
00:16:52.240 | technically detailed classes you'll find later. But first, you'll learn to be very, very good
00:17:01.240 | at actually building and deploying models. And you will learn why and how things work
00:17:09.520 | as you need to to get to the next level. For those of you that have spent a lot of time
00:17:14.080 | in technical education, like if you've done a PhD or something, will find this deeply
00:17:18.080 | uncomfortable because you'll be wanting to understand why everything works from the start.
00:17:23.440 | Just do your best to go along with it. Those of you who haven't will find this very natural.
00:17:27.640 | Oh, and this is Dylan Wiliam, who I mentioned before, the guy who came up with the really
00:17:32.280 | cool coloured cups idea. There'll be a lot of tricks that have come out of the educational research
00:17:40.520 | literature scattered through this course. On the whole, I won't call them out, they'll
00:17:43.840 | just be there. But maybe from time to time we'll talk about them.
00:17:47.200 | All right, so before we start talking about how we actually built that model and how it
00:17:51.720 | works, I guess I should convince you that I'm worth listening to. I'll try to do that
00:17:56.720 | reasonably quickly, because I don't like tooting my own horn, but I know it's important. So
00:18:03.360 | the first thing I mentioned about me is that me and my friend Sylvain wrote this extremely
00:18:07.520 | popular book, Deep Learning for Coders, and that book is what this course is quite heavily
00:18:12.760 | based on. We're not going to be using any material from the book directly, and you might
00:18:18.360 | be surprised by that. But the reason actually is that the educational research literature
00:18:23.320 | shows that people learn things best when they hear the same thing in multiple different
00:18:27.480 | ways. So I want you to read the book, and you'll also see the same information presented
00:18:33.800 | in a different way in these videos. So one of the bits of homework after each lesson
00:18:39.600 | will be to read a chapter of the book. A lot of people like the book. Peter Norvig, Director
00:18:46.760 | of Research at Google, loves the book. In fact, this one's here. One of the best sources for a
00:18:50.880 | program to become proficient in deep learning. Eric Topol loves the book. Hal Varian, emeritus
00:18:57.640 | professor at Berkeley and Chief Economist at Google, likes the book. Jerome Pesenti, who
00:19:02.320 | is the Head of AI at Facebook, likes the book. A lot of people like the book. So hopefully
00:19:08.520 | you'll find that you like this material as well. I've spent about 30 years of my life
00:19:15.240 | working in and around machine learning, including building a number of companies that relied
00:19:20.440 | on it, and became the highest ranked competitor in the world on Kaggle in machine learning
00:19:27.800 | competitions. My company Enlitic, which I founded, was the first company to specialize
00:19:33.560 | in deep learning for medicine, and MIT voted it one of the 50 smartest companies in 2016,
00:19:39.040 | just above Facebook and SpaceX. I started Fast AI with Rachel Thomas, and that was quite
00:19:49.720 | a few years ago now, but it's had a big impact on the world already. Including work we've
00:19:57.520 | done with our students has been globally recognized, such as our win in the DAWNBench competition,
00:20:04.400 | which showed how we could train big neural networks faster than anybody in the world,
00:20:10.080 | and cheaper than anybody in the world. And so that was a really big step in 2018, which
00:20:18.840 | actually made a big difference. Google started using our special approaches in their models,
00:20:25.920 | Nvidia started optimizing their stuff using our approaches, so it made quite a big difference
00:20:32.360 | there. I'm the inventor of the ULMFiT algorithm, which according to the Transformers book was
00:20:38.840 | one of the two key foundations behind the modern NLP revolution. This is the paper here. And
00:20:51.040 | actually, you know, interesting point about that, it was actually invented for a fast
00:20:57.560 | AI course. So the first time it appeared was not actually in the journal, it was actually
00:21:06.360 | in lesson four of the course, I think the 2016 course, if I remember correctly. And you know,
00:21:16.000 | most importantly, of course, I've been teaching this course since version one. And this is
00:21:22.400 | actually this, I think this is the very first version of it, which even back then was getting
00:21:26.640 | HBR's attention. A lot of people have been watching the course, and it's been, you know,
00:21:34.080 | fairly widely used. YouTube doesn't show likes anymore. So I have to show you our likes for
00:21:39.720 | you. You know, it's it's been amazing to see how, yeah, how many alumni have gone from
00:21:49.320 | this to, to, you know, to really doing amazing things, you know, and so, for example, Andrej
00:21:59.000 | Karpathy told me that at Tesla, I think he said pretty much everybody who joins Tesla in AI
00:22:03.920 | is meant to do this course, I believe at OpenAI, they told me that all the residents joining
00:22:08.720 | there first do this course. So this, you know, this course is really widely used in industry
00:22:14.600 | and research for people. And they have a lot of success. Okay, so there's a bit of brief
00:22:20.640 | information about why you should hopefully get going with this. Alright, so let's get
00:22:27.000 | back to what's happened here. Why are we able to create a bird recognizer in a minute
00:22:34.800 | or two? And why couldn't we do it before? So I'm going to go back to 2012. And in 2012,
00:22:43.280 | this was how image recognition was done. This is the computational pathologist. It was a
00:22:51.240 | project done at Stanford, very successful, very famous project that was looking at the
00:22:55.760 | five year survival of breast cancer patients by looking at their histopathology images
00:23:03.200 | slides. Now, so this is like what I would call a classic machine learning approach.
00:23:09.720 | And I spoke to the senior author of this, Daphne Koller. And I asked her why they didn't use
00:23:14.440 | deep learning. And she said, Well, it just, you know, it wasn't really on the radar at
00:23:18.680 | that point. So this is like a pre deep learning approach. And so the way they did this was
00:23:24.920 | they got a big team of mathematicians and computer scientists and pathologists and so
00:23:29.040 | forth to get together and build these ideas for features like relationships between epithelial
00:23:34.160 | nuclear neighbors. Thousands and thousands of features, actually, they created. And each
00:23:38.760 | one required a lot of expertise from a cross-disciplinary group of experts at Stanford.
00:23:44.520 | So this project took years, and a lot of people and a lot of code and a lot of math. And then
00:23:50.000 | once they had all these features, they then fed them into a machine learning model, in
00:23:54.360 | this case, logistic regression to predict survival. As I say, it's very successful, right?
00:24:00.480 | But it's not something that I could create for you in a minute, at the start of a course.
00:24:07.200 | The difference with neural networks is neural networks don't require us to build these features.
00:24:11.760 | They build them for us. And so what actually happened was, in 2013, Matt
00:24:19.880 | Zeiler and Rob Fergus took a trained neural network and they looked inside it to see what
00:24:26.920 | it had learned. So we don't give it features, we ask it to learn features. So when Zeiler
00:24:34.000 | and Fergus looked inside a neural network, they looked at the actual, the weights in
00:24:41.840 | the model, and they drew a picture of them. And this was nine of the sets of weights they
00:24:45.680 | found. And this set of weights, for example, finds diagonal edges. This set of weights
00:24:51.520 | finds yellow to blue gradients. And this set of weights finds red to green gradients and
00:24:57.920 | so forth, right? And then down here are examples of some bits of photos which closely matched,
00:25:04.560 | for example, this feature detector. And deep learning is deep because we can then take
00:25:13.480 | these features and combine them to create more advanced features. So these are some
00:25:19.360 | layer two features. So there's a feature, for example, that finds corners and a feature
00:25:24.540 | that finds curves and a feature that finds circles. And here are some examples of bits
00:25:29.140 | of pictures that the circle finder found. And so remember, with a neural net, which
00:25:35.720 | is the basic function used in deep learning, we don't have to hand code any of these or
00:25:41.060 | come up with any of these ideas. You just start with actually a random neural network,
00:25:47.560 | and you feed it examples, and you have it learn to recognize things. And it turns out that
00:25:54.400 | these are the things that it creates for itself. So you can then combine these features. And
00:26:03.600 | when you combine these features, it creates a feature detector, for example, that finds
00:26:07.880 | kind of repeating geometric shapes. And it creates a feature detector, for example, that
00:26:14.440 | finds kind of really little things, which it looks like is finding the edges of flowers.
00:26:20.680 | And this feature detector here seems to be finding words. And so the deeper you get,
00:26:28.120 | the more sophisticated the features it can find are. And so you can imagine that trying
00:26:32.080 | to code these things by hand would be insanely difficult, and you wouldn't know even what
00:26:38.680 | to encode by hand. So what we're going to learn is how neural networks do this automatically.
00:26:45.520 | But this is the key difference of why we can now do things that previously we just didn't
00:26:51.040 | even conceive of as possible, because now we don't have to hand code the features we
00:26:56.880 | look for. They can all be learned. Now, this is important to recognize. We're going to
00:27:06.080 | be spending some time learning about building image-based algorithms. And image-based algorithms
00:27:14.280 | are not just for images. And in fact, this is going to be a general theme. We're going
00:27:17.680 | to show you some foundational techniques. But with creativity, these foundational techniques
00:27:23.840 | can be used very widely. So for example, an image recognizer can also be used to classify
00:27:34.080 | sounds. So this was an example from one of our students who posted on the forum and said
00:27:41.160 | for their project, they would try classifying sounds. And so they basically took sounds
00:27:47.200 | and created pictures from their waveforms. And then they used an image recognizer on
00:27:52.200 | that. And they got a state of the art result, by the way. Another of our students on the
00:27:57.320 | forum said that they did something very similar to take time series and turn them into pictures
00:28:03.200 | and then use image classifiers. Another of our students created pictures from mouse movements
00:28:12.480 | from users of a computer system. So the clicks became dots and the movements became lines
00:28:17.880 | and the speed of the movement became colors. And then use that to create an image classifier.
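One simple way to turn a 1-D signal, such as a sound waveform or a time series, into a 2-D "image" an image classifier can consume is a magnitude spectrogram built from windowed FFTs. This is an illustration of the general idea, not the exact pipeline any of the students used:

```python
import numpy as np

def spectrogram(signal, win=64, hop=32):
    """Stack |FFT| of overlapping windows into a 2-D array (frequency x time)."""
    frames = [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]
    mags = [np.abs(np.fft.rfft(f * np.hanning(win))) for f in frames]
    return np.stack(mags, axis=1)   # rows = frequency bins, columns = time steps

t = np.linspace(0, 1, 1024)
sig = np.sin(2 * np.pi * 50 * t)    # a pure 50 Hz tone
img = spectrogram(sig)
print(img.shape)                    # (33, 31): now it's picture-shaped
```

Once the signal is picture-shaped, the same image-training code shown earlier in the lesson applies to it unchanged.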
00:28:23.640 | So you can see with some creativity, there's a lot of things you can do with images. There's
00:28:33.880 | something else I wanted to point out, which is that as you saw, when we trained a real
00:28:41.720 | working bird-recognizer image model, we didn't need lots of math. There wasn't any. We didn't
00:28:48.200 | need lots of data. We had 200 pictures. We didn't need lots of expensive computers. We
00:28:52.680 | just used my laptop. This is generally the case for the vast majority of deep learning
00:28:59.640 | that you'll need in real life. There will be some math that pops up during this course,
00:29:09.560 | but we will teach it to you as needed or we'll refer you to external resources as needed.
00:29:14.040 | But it will just be the little bits that you actually need. The myth that deep learning
00:29:20.600 | needs lots of data I think is mainly passed along by big companies that want to sell you
00:29:27.480 | computers to store lots of data and to process it. We find that most real world projects
00:29:34.680 | don't need extraordinary amounts of data at all. And as you'll see, there's actually a
00:29:40.760 | lot of fantastic places you can do state-of-the-art work for free nowadays, which is great news.
00:29:51.080 | One of the key reasons for this is because of something called transfer learning, which
00:29:55.080 | we'll be learning about a lot during this course, and it's something which very few
00:30:00.080 | people are aware of. In this course, we'll be using PyTorch. For those
00:30:08.720 | of you who are not particularly close to the deep learning world, you might have heard
00:30:14.880 | of TensorFlow and not of PyTorch. You might be surprised to hear that TensorFlow has been
00:30:22.960 | dying in popularity in recent years, and PyTorch is actually growing rapidly. And in research
00:30:34.560 | repositories amongst the top papers, TensorFlow is a tiny minority now compared to PyTorch.
00:30:44.780 | This is also great research that's come out from Ryan O'Connor. He also discovered that
00:30:52.960 | the majority of people who were using TensorFlow in 2018 have now shifted to PyTorch.
00:31:00.920 | And I mention this because what people use in research is a very strong leading indicator
00:31:06.600 | of what's going to happen in industry because this is where all the new algorithms are going
00:31:10.720 | to come out, this is where all the papers are going to be written about. It's going
00:31:14.640 | to be increasingly difficult to use TensorFlow. We've been using PyTorch since before it came
00:31:20.600 | out, before the initial release, because we knew just from technical fundamentals, it
00:31:25.520 | was far better. So this course has been using PyTorch for a long time. I will say, however,
00:31:32.080 | that PyTorch requires a lot of hairy code for relatively simple things. This is the
00:31:37.280 | code required to implement a particular optimizer called AdamW in plain PyTorch. I actually
00:31:43.740 | copied this code from the PyTorch repository. So as you can see, there's a lot of it. This
00:31:51.360 | gray bit here is the code required to do the same thing with FastAI. FastAI is a library
00:31:59.120 | we built on top of PyTorch. This huge difference is not because PyTorch is bad, it's because
00:32:06.520 | PyTorch is designed to be a strong foundation to build things on top of, like FastAI. When
00:32:16.180 | you use FastAI, the library, you get access to all the power of PyTorch as well. But you
00:32:23.540 | shouldn't be writing all this code if you only need to write this much code. The problem
00:32:29.040 | of writing lots of code is that that's lots of things to make mistakes with, lots of things
00:32:33.320 | to not have best practices in, lots of things to maintain. In general, we found particularly
00:32:39.960 | with deep learning, less code is better. Particularly with FastAI, the code you don't write is code
00:32:48.520 | that we've basically found kind of best practices for you. So when you use the code that we've
00:32:54.520 | provided for you, you're generally going to get better results. So FastAI has been a really
00:33:02.480 | popular library, and it's very widely used in industry, in academia, and in teaching.
00:33:10.680 | And as we go through this course, we'll be seeing more and more pure PyTorch as we get
00:33:16.300 | deeper and deeper underneath to see exactly how things work. The paper about the FastAI
00:33:23.760 | library just won the 2020 Best Paper Award in the journal Information. So again, you can see
00:33:29.200 | it's a very well regarded library. Okay, so okay, we're still green. That's good. So you
00:33:41.880 | may have noticed something interesting, which is that I'm actually running code in these
00:33:48.800 | slides. That's because these slides are not in PowerPoint. These slides are in Jupyter
00:33:58.120 | Notebook. Jupyter Notebook is the environment in which you will be doing most of your computing.
00:34:09.080 | It's a web-based application, which is extremely popular and widely used in industry and in
00:34:19.200 | academia and in teaching, and is a very, very, very powerful way to experiment and explore
00:34:26.360 | and to build. Nowadays, I would say most people, at least most students, run Jupyter Notebooks
00:34:37.760 | not on their own computers, particularly for data science, but on a cloud server, of which
00:34:44.220 | there's quite a few. And as I mentioned earlier, if you go to course.fast.ai, you can see how
00:34:52.100 | to use various different cloud servers. One I'm going to show an example of is Kaggle.
00:35:05.440 | So Kaggle doesn't just have competitions, but it also has a cloud notebooks server.
00:35:10.380 | And I've got quite a few examples there. So let me give you a quick example of how we
00:35:21.340 | use Jupyter Notebooks to build stuff, to experiment, to explore. So on Kaggle, if you start with
00:35:30.480 | somebody else's Notebook, so why don't you start with this one, Jupyter Notebook 101.
00:35:35.700 | If it's your own Notebook, you'll see a button called edit. If it's somebody else's, that
00:35:39.080 | button will say copy and edit. If you use somebody's Notebook that you like, make sure
00:35:45.240 | you click the upvote button to encourage them and to help other people find it before you
00:35:49.720 | go ahead and copy and edit. And once we're in edit mode, we can now use this Notebook.
00:35:59.300 | And to use it, we can type in any arbitrary expression in Python and click run. And the
00:36:06.260 | very first time we do that, it says session is starting. It's basically launching a virtual
00:36:10.460 | computer for us to run our code. This is all free. In a sense, it's like the world's most
00:36:18.420 | powerful calculator. It's a calculator where you have all of the capabilities of the world's,
00:36:25.500 | I think, most popular programming language. Certainly, it and JavaScript would be the
00:36:29.260 | top two, directly at your disposal. So Python does know how to do one plus one. And so you
00:36:35.400 | can see here, it spits out the answer. I hate clicking. I always use keyboard shortcuts.
00:36:40.580 | So instead of clicking this little arrow, you just press Shift-Enter to do the same thing.
00:36:46.820 | And as you can see, there's not just calculations here. There's also prose. And so Jupyter Notebooks
00:36:53.420 | are great for explaining to the version of yourself in six months' time what on earth
00:36:59.180 | you were doing, or to your coworkers, or to people in the open source community, or to people
00:37:03.460 | you're blogging for, etc. And so you just type prose. And as you can see, when we create
00:37:09.820 | a new cell, you can create a code cell, which is a cell that lets you type calculations
00:37:16.700 | or a markdown cell, which is a cell that lets you create prose. And the prose uses this formatting
00:37:26.340 | in a little mini language called markdown. There's so many tutorials around, I won't
00:37:30.300 | explain it to you, but it lets you do things like links and so forth. So I'll let you follow
00:37:41.100 | through the tutorial in your own time because it really explains to you what to do. One
00:37:48.220 | thing to point out is that sometimes you'll see me use cells with an exclamation mark
00:37:52.060 | at the start. That's not Python. That's a bash shell command. Okay, so that's what the
00:37:58.260 | exclamation mark means. As you can see, you can put images into notebooks. And so the
00:38:06.080 | image I popped in here was the one showing that Jupyter won the 2017 ACM Software System
00:38:11.100 | award, which is pretty much the biggest award there is for this kind of software. Okay,
00:38:17.420 | so that's the basic idea of how we use notebooks. So let's have a look at how we do our, how
00:38:28.140 | we do our bird or not bird model. One thing I always like to do when I'm using something
00:38:38.460 | like Colab or Kaggle cloud platforms that I'm not controlling is make sure that I'm
00:38:43.420 | using the most recent version of any software. So my first cell here is !pip install
00:38:49.620 | -U (that means upgrade) -q (for quiet) fastai. So that makes sure that we
00:38:55.420 | have the latest version of fast AI. And if you always have that at the start of your
00:38:58.940 | notebooks, you're never going to have those awkward forum threads where you say, why
00:39:02.820 | isn't this working? And somebody says to you, oh, you're using an old version of some software.
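The "old version of some software" failure mode can also be caught programmatically. Here is a minimal sketch of a version check in plain Python; the helper names version_tuple and is_at_least are made up for illustration and are not part of fastai's API:

```python
def version_tuple(v):
    """Turn a version string like '2.7.12' into a comparable tuple (2, 7, 12)."""
    return tuple(int(part) for part in v.split(".") if part.isdigit())

def is_at_least(installed, minimum):
    """True if an installed version string meets a minimum version string."""
    return version_tuple(installed) >= version_tuple(minimum)

# Example: warn early instead of hitting a confusing error later.
if not is_at_least("2.7.12", "2.7.0"):
    print("Please upgrade: pip install -U fastai")
```

The tuple comparison works because Python compares tuples element by element, so (2, 7, 12) >= (2, 7, 0) behaves like a proper version comparison.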
00:39:09.860 | So you'll see here, this notebook is the exact thing that I was showing you at the start
00:39:16.700 | of this lesson. So if you haven't done much Python, you might be surprised about how little
00:39:31.980 | code there is here. And so Python is a concise but not too concise language, you'll see that
00:39:41.980 | there's less boilerplate than some other languages you might be familiar with. And I'm also taking
00:39:47.380 | advantage of a lot of libraries. So fast AI provides a lot of convenient things for you.
00:39:54.940 | So I forgot to import. To use an external library, we use import to import a symbol
00:40:09.260 | from a library. Fast AI has a lot of libraries we provide; they generally start with fast
00:40:15.100 | something. So for example, to make it easy to download a URL, fastdownload has download_url.
00:40:20.100 | To make it easy to create a thumbnail, we have to_thumb, and so forth.
00:40:31.180 | I always like to view, as I'm building a model, my data at every step. So that's why I first
00:40:38.740 | of all grab one bird, and then I grab one forest photo, and I look at them to make sure they
00:40:45.780 | look reasonable. And once I think okay, they look okay, then I go ahead and download. And
00:40:58.620 | so you can see fast AI has a download images, where you just provide a list of URLs. So
00:41:04.820 | that's how easy it is. And it does that in parallel. So it does that, you know, surprisingly
00:41:10.740 | quickly. One other fast AI thing I'm using here is resize images. You generally will
00:41:18.260 | find that for computer vision algorithms, you don't need particularly big images. So
00:41:23.660 | I'm resizing these to a maximum side length of 400. Because it's actually much faster.
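Resizing to "a maximum side length of 400" just means scaling the image so its longer side is 400 pixels while keeping the aspect ratio. A minimal sketch of that arithmetic (fit_within is a made-up helper name, not fastai's resize_images API):

```python
def fit_within(width, height, max_side=400):
    """Scale (width, height) down so the longer side is at most max_side,
    preserving the aspect ratio. Images already small enough are untouched."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)

print(fit_within(800, 600))   # a large landscape photo shrinks to (400, 300)
print(fit_within(200, 100))   # already small, so stays (200, 100)
```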
00:41:30.820 | GPUs are so quick that with big images, most of the time can be taken up just opening them.
00:41:37.740 | The neural net itself often takes less time. So that's another good reason to make them
00:41:41.980 | smaller. Okay. So the main thing I wanted to tell you about was this data block command.
00:41:51.420 | So the data block is the key thing that you're going to want to get familiar with, as deep
00:41:56.660 | learning practitioners at the start of your journey. Because the main thing you're going
00:42:02.820 | to be trying to figure out is how do I get this data into my model? Now that might surprise
00:42:09.340 | you. You might be thinking we should be spending all of our time talking about neural network
00:42:13.300 | architectures and matrix multiplication and gradients and stuff like that. The truth is
00:42:20.780 | very little of that comes up in practice. And the reason is that at this point, the deep
00:42:28.380 | learning community has found a reasonably small number of types of model that work for
00:42:36.780 | nearly all the main applications you'll need. And fast AI will create the right type of
00:42:42.840 | model for you the vast majority of the time. So all of that stuff about tweaking neural
00:42:50.800 | network architectures and stuff, I mean, we'll get to it eventually in this course. But you
00:42:56.480 | might be surprised to discover that it almost never comes up. Kind of like if you ever did
00:43:02.260 | like a computer science course or something, and they spent all this time on the details
00:43:06.860 | of compilers and operating systems, and then you get to the real world and you never use
00:43:10.620 | it again. So this course is called practical deep learning. And so we're going to focus
00:43:15.340 | on the stuff that is practically important. Okay, so our images are finished downloading,
00:43:22.740 | and two of them were broken, so we just deleted them. Another thing you'll note, by the way,
00:43:31.620 | if you're a keen software engineer is I tend to use a lot of functional style in my programs
00:43:39.340 | I find, for the kind of work I do, that a functional style works very well.
00:43:47.220 | A lot of people in Python are less familiar with that; it comes more from
00:43:49.900 | other languages. So yeah, that's why you'll see me using stuff like map and stuff
00:43:54.340 | quite a lot. Alright, so a data block is the key thing you need to know about if you're
00:44:01.120 | going to know how to use different kinds of data sets. And so these are all of the things
00:44:06.620 | basically that you'll be providing. And so what we did when we designed the data block
00:44:10.900 | was we actually looked and said, okay, over hundreds of projects, what are all the things
00:44:19.080 | that change from project to project to get the data into the right shape. And we realized
00:44:24.340 | we could basically split it down into these five things. So the first thing that we tell
00:44:30.460 | fast AI is what kind of input do we have. And so then so there are lots of blocks in
00:44:36.500 | fast AI for different kinds of input. So here we said, oh, the input is an image. What kind
00:44:41.340 | of output is there? What kind of label? The output is a category. So that means it's one
00:44:45.580 | of a number of possibilities. So that's enough for fast AI to know what kind of model to
00:44:52.980 | build for you. So what are the items in this model? What am I actually going to be looking
00:44:58.060 | at to train from? This is a function. In fact, you might have noticed if you were
00:45:03.800 | looking carefully that we use this function here. It's a function which returns a list
00:45:11.940 | of all of the image files in a path based on extension. So every time it's going to
00:45:17.420 | try and find out what things to train from, it's going to use that function. In this case,
00:45:21.420 | we'll get a list of image files. Now, something we'll talk about shortly is that it's critical
00:45:27.780 | that you put aside some data for testing the accuracy of your model. And that's called
00:45:32.860 | a validation set. It's so critical that fast AI won't let you train a model without one.
00:45:39.860 | So you actually have to tell it how to create a validation set, how to set aside some data.
00:45:45.020 | And in this case, we say randomly set aside 20% of the data. Okay, next question, then
00:45:55.740 | you have to tell fast AI is how do we know the correct label of a photo? How do we know
00:46:02.400 | if it's a bird photo or a forest photo? And this is another function. And this function
00:46:09.540 | simply returns the parent folder of a path. And so in this case, we saved our images into
00:46:20.580 | either forest or bird. So that's where the labels are going to come from.
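The two pieces just described, randomly holding out 20% for validation and labelling by parent folder, are simple enough to sketch in plain Python. Here random_split is a made-up name, and parent_label just mirrors the idea behind fastai's function of the same name:

```python
import random
from pathlib import Path

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle items and set aside valid_pct of them as a validation set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * valid_pct)
    return items[cut:], items[:cut]          # (training set, validation set)

def parent_label(path):
    """Label an item by its parent folder name: 'photos/bird/001.jpg' -> 'bird'."""
    return Path(path).parent.name

files = ([f"photos/bird/{i}.jpg" for i in range(8)]
         + [f"photos/forest/{i}.jpg" for i in range(2)])
train, valid = random_split(files)
print(len(train), len(valid))    # 8 2
print(parent_label(files[0]))    # bird
```

Fixing the random seed makes the split reproducible, which matters when you want validation results to be comparable from run to run.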
00:46:26.980 | And then finally, most computer vision architectures need all of your inputs as you train to be
00:46:35.260 | the same size. So item transforms are all of the bits of code that are going to run
00:46:43.100 | on every item, on every image in this case. And we're saying, okay, we want you to resize
00:46:49.300 | each of them to being 192 by 192 pixels. There's two ways you can resize, you can either crop
00:46:56.580 | out a piece in the middle, or you can squish it. And so we're saying, squish it. So that's
00:47:06.100 | the data block, that's all that you need. And from there, we create an important class
00:47:11.660 | called data loaders. Data loaders are the things that actually PyTorch iterates through
00:47:18.220 | to grab a bunch of your data at a time. The way it can do it so fast is by using a GPU,
00:47:24.580 | which is something that can do thousands of things at the same time. And that means it
00:47:28.620 | needs thousands of things to do at the same time. So a data loader will feed the training
00:47:34.820 | algorithm with a bunch of your images at once. In fact, we don't call it a bunch, we call
00:47:42.220 | it a batch, or a mini batch. And so when we say show batch, that's actually a very specific
00:47:53.420 | word in deep learning, it's saying show me an example of a batch of data that you would
00:47:58.740 | be passing into the model. And so you can see show batch gives you tells you two things,
00:48:03.860 | the input, which is the picture, and the label. And remember, the label came by calling that
00:48:11.900 | function. So when you come to building your own models, you'll be wanting to know what
00:48:20.940 | kind of splitters are there and what kinds of labeling functions are there and so forth.
00:48:30.420 | And so docs.fast.ai is where you go
00:48:35.540 | to get that information. Often the best place to go is the tutorials. So for example, here's
00:48:43.060 | a whole data block tutorial. And there's lots and lots of examples. So hopefully you can
00:48:48.900 | start out by finding something that's similar to what you want to do and see how we did
00:48:54.860 | it. But then of course, there's also the underlying API information. So here's data blocks. OK.
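The "batch" (or mini-batch) idea mentioned above is, at its core, just slicing the dataset into fixed-size chunks that can be fed to the GPU together. A toy sketch of that slicing (batches is a made-up name, not the PyTorch DataLoader API, which also handles shuffling and parallel loading):

```python
def batches(items, batch_size):
    """Yield successive fixed-size mini-batches from a list of items.
    The last batch may be smaller if the list doesn't divide evenly."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

for batch in batches(list(range(10)), batch_size=4):
    print(batch)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]
```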
00:49:07.540 | How are we doing? Still doing good. All right. So at the end of all this, we've got an object
00:49:19.380 | called dls. It stands for data loaders. And that contains iterators that PyTorch can run
00:49:27.700 | through to grab batches of randomly split out training images to train the model with
00:49:35.260 | and validation images to test the model with. So now we need a model. The critical concept
00:49:43.660 | here in fast.ai is called a learner. A learner is something which combines a model, that
00:49:50.660 | is the actual neural network function we'll be training, and the data we use to train
00:49:55.220 | it with. And that's why you have to pass in two things. The data, which is the data loaders
00:50:01.500 | object, and a model. And so the model is going to be the actual neural network function that
00:50:11.580 | you want to pass in. And as I said, there's a relatively small number that basically work
00:50:16.620 | for the vast majority of things you do. If you pass in just a bare symbol like this,
00:50:24.020 | it's going to be one of fast.ai's built-in models. But what's particularly interesting
00:50:30.100 | is that we integrate a wonderful library by Ross Wightman called timm, the PyTorch image
00:50:35.820 | models, which is the largest collection of computer vision models in the world. And at
00:50:40.580 | this point, fast.ai is the first and only framework to integrate this. So you can use
00:50:45.540 | any one of the PyTorch image models. And one of our students, Aman Arora, was kind enough
00:50:51.740 | to create this fantastic documentation where you can find out all about the different models.
00:51:03.140 | And if we click on here, you can get lots and lots of information about all the different
00:51:08.380 | models that Ross has provided. Having said that, the model family called ResNet are probably
00:51:19.680 | going to be fine for nearly all the things you want to do. But it is fun to try different
00:51:23.900 | models out. So you can type in any string here to use any one of those other models.
00:51:33.980 | Okay, so if we run that, let's see what happens. Okay, so this is interesting. So when I ran
00:51:41.820 | this, so remember on Kaggle, it's creating a new virtual computer for us. So it doesn't
00:51:47.980 | really have anything ready to go. So when I ran this, the first thing it did was it
00:51:51.260 | said downloading resnet18.pth. What's that? Well, the reason we can do this so fast is
00:52:01.100 | because somebody else has already trained this model to recognize over 1 million images of
00:52:08.980 | over 1,000 different types, something called the image net dataset. And they then made
00:52:16.140 | those weights available, those parameters available on the internet for anybody to download.
00:52:22.780 | By default, on fast.ai, when you ask for a model, we will download those weights for
00:52:29.380 | you so that you don't start with a random network that can't do anything. You actually
00:52:34.900 | start with a network that can do an awful lot. And so then something that fast.ai has
00:52:40.580 | that's unique is this fine-tune method, which what it does is it takes those pre-trained
00:52:46.500 | weights we downloaded for you and it adjusts them in a really carefully controlled way
00:52:52.660 | to just teach the model the differences between your dataset and what it was originally trained
00:53:00.060 | for. That's called fine-tuning. Hence the name. So that's why you'll see this downloading
00:53:07.500 | happen first. And so as you can see at the end of it, this is the error rate here. After
00:53:13.980 | a few seconds, it's 100% accurate. So we now have a learner. And this learner has started
00:53:26.140 | with a pre-trained model. It's been fine-tuned for the purpose of recognizing bird pictures
00:53:32.220 | from forest pictures. So you can now call .predict on it. And .predict, you pass in
00:53:43.140 | an image. And so this is how you would then deploy your model. So in your code, you have
00:53:51.020 | whatever your app needs to do. In this particular case, this person had some reason to
00:53:59.300 | need the app to check whether they're in a national park and whether it's a photo of
00:54:03.220 | a bird. So at the bit where they need to know if it's a photo of a bird, it would just call
00:54:08.380 | this one line of code, learn.predict. And so that's going to return whether it's a bird
00:54:15.660 | or not as a string, whether it's a bird or not as an integer, and the probability that
00:54:20.780 | it's a non-bird or a bird. And so that's why we can print these things out.
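To make those three return values concrete, here is a toy version of what a classifier's predict step does, with the probabilities coming from a softmax over the model's raw outputs. The softmax function is standard; toy_predict and the example logits are made up for illustration and are not fastai's implementation:

```python
import math

def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def toy_predict(logits, vocab=("bird", "forest")):
    """Mimic the (label string, label index, probabilities) triple described above."""
    probs = softmax(logits)
    idx = probs.index(max(probs))
    return vocab[idx], idx, probs

label, idx, probs = toy_predict([2.0, -1.0])
print(label, idx, [round(p, 3) for p in probs])
```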
00:54:29.100 | So that's how we can create a computer vision model. What about other kinds of models? There's
00:54:39.920 | a lot more in the world than just computer vision, a lot more than just image recognition.
00:54:45.220 | Or even within computer vision, there's a lot more than just image recognition. For
00:54:51.220 | example, there's segmentation. So segmentation, maybe the best way to explain segmentation
00:55:05.940 | is to show you the result of this model. Segmentation is where we take photos, in this case of road
00:55:14.700 | scenes, and we color in every pixel according to what is it. So in this case, we've got
00:55:22.300 | brown as cars, blue as fences, I guess, red as buildings, and so on. And so on the left here,
00:55:32.780 | some photos that somebody has already gone through and classified every pixel of every
00:55:38.060 | one of these images according to what that pixel is a pixel of. And then on the right
00:55:45.200 | is what our model is guessing. And as you can see, it's getting a lot of the pixels
00:55:52.580 | correct, and some of them it's getting wrong. It's actually amazing how many it's getting
00:55:58.900 | correct because this particular model I trained in about 20 seconds using a tiny, tiny, tiny
00:56:15.260 | amount of data. So again, you would think this would be a particularly challenging problem
00:56:22.300 | to solve, but it took about 20 seconds of training to solve it not amazingly well, but
00:56:29.260 | pretty well. If I'd trained it for another two minutes, it'd probably be close to perfect.
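"Getting a lot of the pixels correct" can be quantified as pixel accuracy: the fraction of pixels whose predicted class matches the labelled class. A minimal sketch on tiny label grids (pixel_accuracy is a made-up helper here, not a fastai metric):

```python
def pixel_accuracy(truth, pred):
    """Fraction of positions where the predicted class id equals the labelled
    class id. truth and pred are equal-sized 2D grids (lists of lists)."""
    flat_truth = [c for row in truth for c in row]
    flat_pred = [c for row in pred for c in row]
    matches = sum(t == p for t, p in zip(flat_truth, flat_pred))
    return matches / len(flat_truth)

# 0 = road, 1 = car, 2 = building -- a 2x4 toy "image"
truth = [[0, 0, 1, 1],
         [2, 2, 2, 0]]
pred  = [[0, 0, 1, 0],   # one car pixel mislabelled as road
         [2, 2, 2, 0]]
print(pixel_accuracy(truth, pred))   # 7 of 8 pixels match: 0.875
```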
00:56:34.940 | So this is called segmentation. Now, you'll see that there's very, very little code required,
00:56:45.860 | and the steps are actually going to look quite familiar. In fact, in this case, we're using
00:56:50.460 | an even simpler approach. Now, earlier on, we used data blocks. Data blocks are a kind
00:56:56.940 | of intermediate level, very flexible approach that you can take to handling almost any kind
00:57:05.740 | of data. But for the kinds of data that occur a lot, you can use these special data loaders
00:57:11.500 | classes, which kind of lets you use even less code. So in this case, to create data loaders
00:57:17.140 | for segmentation, you can just say, okay, I'm going to pass you in a function for labeling.
00:57:24.220 | And you can see here, it's got pretty similar things that we pass in to what we passed in
00:57:29.340 | for data blocks before. So our file names come from get_image_files again, and then our label
00:57:35.620 | function is something that grabs this path, and the codes, that is, what each
00:57:48.820 | code means, are going to come from this text file. But you can see the basic information we're
00:57:55.340 | providing is very, very similar, regardless of whether we're doing segmentation or object
00:58:01.700 | recognition. And then the next steps are pretty much the same. We create a learner for segmentation.
00:58:06.460 | We create something called a unet learner, which we'll learn about later. And then again,
00:58:10.740 | we call fine-tune. So that is it. And that's how we create a segmentation model. What about
00:58:19.580 | stepping away from computer vision? So perhaps the most widely used kind of model used in
00:58:25.820 | industry is tabular analysis. So taking things like spreadsheets and database tables and
00:58:30.420 | trying to predict columns of those. So in tabular analysis, it really looks very similar
00:58:42.780 | to what we've seen already. We grab some data, and you'll see when I call this untar_data,
00:58:49.140 | this is the thing in Fast.ai that downloads some data and decompresses it for you. And
00:58:53.220 | there's a whole lot of URLs provided by Fast.ai for all the kind of common data sets that
00:58:59.220 | you might want to use, all the ones that are in the book, or lots of data sets that are
00:59:03.460 | kind of widely used in learning and research. So that makes life nice and easy for you.
00:59:08.940 | So again, we're going to create data loaders, but this time it's tabular data loaders. But
00:59:12.700 | we provide pretty similar kind of information to what we have before. A couple of new things.
00:59:17.900 | We have to tell it which of the columns are categorical. So they can only take one of
00:59:22.220 | a few values, and which ones are continuous. So they can take basically any real number.
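The categorical vs continuous distinction can be sketched as a simple heuristic: non-numeric columns, or numeric columns with only a few distinct values, are treated as categorical. Note that is_categorical and its threshold are made up for illustration; fastai actually asks you to list the categorical and continuous columns explicitly rather than guessing:

```python
def is_categorical(values, max_categories=20):
    """Heuristic: a column is categorical if it isn't numeric, or if it is
    numeric but takes only a small number of distinct values."""
    distinct = set(values)
    numeric = all(isinstance(v, (int, float)) for v in distinct)
    return (not numeric) or len(distinct) <= max_categories

print(is_categorical(["Private", "State-gov", "Private"]))   # True: strings
print(is_categorical([39, 50, 38, 53, 28]))                  # True: few distinct ints
print(is_categorical([39.5 + i * 0.1 for i in range(500)]))  # False: continuous-looking
```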
00:59:30.020 | And then again, we can use the exact same show batch that we've seen before to see the
00:59:35.740 | data. And so Fast.ai uses a lot of something called type dispatch, which is a system that's
00:59:43.740 | particularly popular in a language called Julia, to basically automatically do the right
00:59:49.240 | thing for your data, regardless of what kind of data it is. So if you call show batch
00:59:54.240 | on something, you should get back something useful, regardless of what kind of information
00:59:58.700 | you provided. So for a table, it shows you the information in that table. This particular
01:00:05.980 | data set is a data set of whether people have less than $50,000 or more than $50,000 in
01:00:14.260 | salary for different districts based on demographic information in each district. So to build a
01:00:24.740 | model for that data loaders, we do, as always, something_learner. In this case, it's a tabular
01:00:33.700 | learner. Now this time we don't say fine-tune. We say fit, specifically fit one cycle. That's
01:00:40.940 | because for tabular models, there's not generally going to be a pre-trained model that already
01:00:45.300 | does something like what you want, because every table of data is very different, whereas
01:00:51.560 | pictures often have a similar theme. They're all pictures. They all have the same kind
01:00:55.860 | of general idea of what pictures are. So that's why it generally doesn't make too much sense
01:01:02.260 | to fine-tune a tabular model. So instead, you just fit. So there's one difference there.
01:01:08.060 | I'll show another example. Okay, so collaborative filtering. Collaborative filtering is the
01:01:21.520 | basis of most recommendation systems today. It's a system where we basically take a data
01:01:26.940 | set that says which users liked which products or which users used which products, and then
01:01:36.340 | we use that to guess what other products those users might like based on finding similar users
01:01:42.620 | and what those similar users liked. The interesting thing about collaborative filtering is that
01:01:47.020 | when we say similar users, we're not referring to similar demographically, but similar in
01:01:53.500 | the sense of people who liked the same kinds of products. So for example, if you use any
01:02:02.560 | of the music systems like Spotify or Apple Music or whatever, it'll ask you first like
01:02:09.020 | what's a few pieces of music you like, and you tell it. And then it says, okay, well,
01:02:15.380 | maybe let's start playing this music for you. And that's how it works. It uses collaborative
01:02:20.140 | filtering. So we can create a collaborative filtering data loaders in exactly the same
01:02:28.620 | way that we're used to by downloading and decompressing some data, create our collab
01:02:34.340 | data loaders. In this case, we can just say from CSV and pass in a CSV. And this is what
01:02:39.700 | collaborative filtering data looks like. It's going to have, generally speaking, a user
01:02:43.500 | ID, some kind of product ID, in this case, a movie, and a rating. So in this case, this
01:02:52.900 | user gave this movie a rating of 3.5 out of 5. And so again, you can see show batch, right?
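The "similar users" idea described a moment ago can be sketched in a few lines: measure how similarly two users rated the movies they have in common, then suggest something the most similar user liked. This toy version uses plain dicts; cosine and recommend are made-up names, and real collaborative filtering models learn latent factors rather than comparing users directly like this:

```python
def cosine(a, b):
    """Cosine similarity between two users' ratings over commonly rated movies."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sum(a[m] ** 2 for m in common) ** 0.5
    norm_b = sum(b[m] ** 2 for m in common) ** 0.5
    return dot / (norm_a * norm_b)

def recommend(ratings, user):
    """Suggest the movie the most similar other user rated highest,
    among movies this user hasn't seen yet."""
    best_sim, nearest = max(
        (cosine(ratings[user], ratings[other]), other)
        for other in ratings if other != user
    )
    unseen = {m: r for m, r in ratings[nearest].items() if m not in ratings[user]}
    return max(unseen, key=unseen.get)

ratings = {
    "alice": {"m1": 5.0, "m2": 4.0},
    "bob":   {"m1": 5.0, "m2": 4.0, "m3": 5.0},   # rates like alice, also saw m3
    "carol": {"m1": 1.0, "m2": 1.0, "m4": 5.0},   # different taste
}
print(recommend(ratings, "alice"))   # bob is most similar, so: m3
```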
01:03:00.420 | So use show batch, you should get back some useful visualization of your data regardless
01:03:05.420 | of what kind of data it is. And so again, we create a learner. This time it's a collaborative
01:03:15.340 | filtering learner, and you pass in your data. In this case, we give it one extra piece of
01:03:22.340 | information, which is because this is not predicting a category, but it's predicting
01:03:26.980 | a real number, we tell it what's the possible range. The actual range is 1 to 5. But for
01:03:34.800 | reasons you'll learn about later, it's a good idea to actually go from a little bit lower
01:03:38.820 | than the possible minimum to a little bit higher than the maximum. That's why I say 0.5 to 5.5. And then
01:03:45.260 | fine-tune. Now again, we don't really need to fine-tune here because there's not really
01:03:49.060 | such a thing as a pre-trained collaborative filtering model. We could just say fit or
01:03:53.140 | fit one cycle. But actually fine-tune works fine as well. So after we train it for a while,
01:04:01.060 | this here is the mean squared error. So it's basically that on average, how far off are
01:04:07.080 | we for the validation set. And you can see as we train, and it's literally so fast, it's
01:04:13.060 | less than a second each epoch, that error goes down and down. And for any kind of fast AI
01:04:20.740 | model, you can always call show results
01:04:29.820 | and get something sensible. So in this case, it's going to show a few examples of users
01:04:33.140 | and movies. Here's the actual rating that user gave that movie, and here's the rating
01:04:38.420 | that the model predicted. Okay, so apparently a lot of people on the forum are asking how
01:04:45.080 | I'm turning this notebook into a presentation. So I'd be delighted to show you because I'm
01:04:53.020 | very pleased that these people made this thing for free for us to use. It's called Rise.
01:04:59.700 | And all I do is it's a notebook extension. And in your notebook, it gives you an extra
01:05:07.020 | little thing on the side where you say which things are slides or which things are fragments.
01:05:12.500 | And a fragment just means a part that gets revealed incrementally within a slide. So if I do that, you'll
01:05:18.460 | see it starts with a slide, and then the fragment gets added in. Yeah, that's about all there
01:05:26.180 | is to it, actually. It's pretty great. And it's very well documented. I'll just mention, what
01:05:32.420 | do I make with Jupyter Notebooks? This entire book was written entirely in Jupyter Notebooks.
01:05:47.420 | Here are the notebooks. So if you go to the fastai fastbook repo, you can read the whole
01:05:54.740 | book. And because it's all in notebooks, every time we say here's how you create this plot
01:06:01.660 | or here's how you train this model, you can actually create the plot or you can actually
01:06:05.180 | train the model because it's all notebooks. The entire fastai library is actually written
01:06:17.580 | in notebooks. So you might be surprised to discover that if you go to fastai/fastai,
01:06:27.180 | the source code for the entire library is notebooks. And so the nice thing about this
01:06:40.140 | is that the source code for the fastai library has actual pictures of the actual things that
01:06:46.020 | we're building, for example. What else have we done with notebooks? Oh, blogging. I love
01:06:58.700 | blogging with notebooks because when I want to explain something, I just write the code
01:07:04.820 | and you can just see the outputs. And it all just works. Another thing you might be surprised
01:07:14.140 | by is all of our tests and continuous integration are also all in notebooks. So every time we
01:07:20.920 | change one of our notebooks, hundreds of tests
01:07:33.020 | get run automatically in parallel. And if there's any issues, we will find out about
01:07:39.700 | it. So notebooks are great. And Rise is a really nice way to do slides in notebooks.
01:07:51.180 | All right. So what can deep learning do at present? We're still scratching the tip of
01:08:02.300 | the iceberg, even though it's a pretty well-hyped, heavily marketed technology at this point.
01:08:08.180 | When we started in 2014 or so, not many people were talking about deep learning. And really,
01:08:18.860 | there was no accessible way to get started with it. There were no pre-trained models
01:08:22.380 | you could download. There was just starting to appear some of the first open source software
01:08:30.700 | that would run on GPUs. But despite the fact that today there's a lot of people talking
01:08:36.740 | about deep learning, we're just scratching the surface. Every time pretty much somebody
01:08:40.840 | says to me, "I work in domain X and I thought I might try deep learning out to see if it
can help." When I speak to them a few months later and ask, "How did it go?", they nearly
01:08:50.180 | always say, "Wow, we just broke the state-of-the-art results in our field." So when I say these
01:08:55.540 | are things that it's currently state-of-the-art for, these are kind of the ones that people
01:08:59.180 | have tried so far. But still, most things haven't been tried. So in NLP, deep learning
01:09:04.700 | is the state-of-the-art method in all these kinds of things and a lot more. Computer vision,
01:09:14.660 | medicine, biology, recommendation systems, playing games, robotics. I've tried elsewhere
01:09:28.580 | to make bigger lists and I just end up with pages and pages and pages. Generally speaking,
01:09:37.860 | if it's something that a human can do reasonably quickly, like look at a Go board and decide
01:09:47.420 | if it looks like a good Go board or not, even if it needs to be an expert human, then that's
01:09:53.820 | probably something that deep learning will be pretty good at. If it's something that
01:09:59.140 | takes a lot of logical thought processes over an extended period of time, particularly if
01:10:05.700 | it's not based on much data, maybe not, like who's going to win the next election or something
like that. That'd be kind of broadly how I would try to decide: is your thing a good
fit for deep learning or not. It's been a long time to get to this point. Yes,
01:10:26.700 | deep learning is incredibly powerful now, but it's taken decades of work. This was the
01:10:32.340 | first neural network. Remember, neural networks are the basis of deep learning. This was back
01:10:37.140 | in 1957. The basic ideas have not changed much at all, but we do have things like GPUs now
01:10:51.320 | and solid state drives and stuff like that. Of course, much more data just is available
01:10:56.980 | now, but this has been decades of really hard work by a lot of people to get to this point.
01:11:11.700 | Let's take a step back and talk about what's going on in these models. I'm going to describe
01:11:23.660 | the basic idea of machine learning, largely as it was described by Arthur Samuel in the
01:11:30.140 | late '50s when it was invented. I'm going to do it with these graphs, which, by the way,
01:11:40.500 | you might find fun. These graphs are themselves created with Jupyter Notebooks. These are
Graphviz descriptions that get turned into these diagrams. There's a little sneak
peek behind the scenes for you. Let's start with a graph of what a normal
program looks like. In the pre-deep learning, machine learning days, you still have inputs
01:12:13.020 | and you still have results. Then you code a program in the middle, which is a bunch
01:12:18.060 | of conditionals and loops and setting variables and blah, blah, blah. A machine learning model
01:12:25.620 | doesn't look that different, but the program has been replaced with something called a
01:12:36.860 | model. We don't just have inputs now. We now also have weights, which are also called parameters.
01:12:43.780 | The key thing is this. The model is not anymore a bunch of conditionals and loops and things.
01:12:51.060 | It's a mathematical function. In the case of a neural network, it's a mathematical function
01:12:55.500 | that takes the inputs, multiplies them together by one set of weights, and adds them up. It
01:13:03.420 | does that again for a second set of weights and adds them up. It does it again for a third
01:13:06.660 | set of weights and adds them up and so forth. It then takes all the negative numbers and
01:13:11.860 | replaces them with zeros. Then it takes those as inputs to a next layer. It does the same
01:13:18.700 | thing, multiplies them a bunch of times and adds them up. It does that a few times. That's
called a neural network. The model, therefore, is not going to do anything useful unless
the weights are very carefully chosen. The way it works is that we actually start out
01:13:39.140 | with these weights as being random. Initially, this thing doesn't do anything useful at all.
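That multiply-add-then-zero-the-negatives recipe can be written down in a few lines of NumPy. This is just a sketch of the idea, not fastai code; the layer sizes (4 inputs, 3 hidden units, 1 output) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two sets of random (untrained) weights for a tiny two-layer network.
w1 = rng.standard_normal((4, 3))    # first layer's weights
w2 = rng.standard_normal((3, 1))    # second layer's weights

def neural_net(x, w1, w2):
    h = x @ w1               # multiply the inputs by each set of weights and add them up
    h = np.maximum(h, 0)     # replace all the negative numbers with zeros (a "ReLU")
    return h @ w2            # do the same multiply-and-add for the next layer

x = rng.standard_normal(4)
print(neural_net(x, w1, w2))  # with random weights, the output is meaningless so far
```

As the lesson says, with random weights this function does nothing useful; everything interesting comes from how the weights get updated.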
01:13:53.020 | What we do, the way Arthur Samuel described it back in the late 50s, the inventor of machine
learning, is he said, "Okay, let's take the inputs and the weights, put them through our
01:14:02.860 | model." He wasn't talking particularly about neural networks. He's just like, "Whatever
model you like. Get the results, and then let's decide how good they are." If, for example,
we're trying to decide, "Is this a picture of a bird?", and the model, which initially
is random, says, "This isn't a bird," but actually it is a bird, we would say, "Oh, you're wrong."
01:14:28.020 | We then calculate the loss. The loss is a number that says, "How good were the results?"
01:14:35.740 | That's all pretty straightforward. We could, for example, say, "Oh, what's the accuracy?"
01:14:38.860 | We could look at 100 photos and say, "Which percentage of them did it get right?" No worries.
01:14:45.840 | Now the critical step is this arrow. We need a way of updating the weights that is coming
01:14:53.220 | up with a new set of weights that are a bit better than the previous set. By a bit better,
01:15:00.940 | we mean it should make the loss get a little bit better. We've got this number that says,
01:15:06.980 | "How good is our model?" Initially, it's terrible, right? It's random. We need some mechanism
01:15:14.140 | of making it a little bit better. If we can just do that one thing, then we just need
01:15:19.740 | to iterate this a few times because each time we put in some more inputs and put in our
01:15:25.220 | weights and get our loss and use it to make it a little bit better, then if we make it
01:15:29.820 | a little bit better enough times, eventually it's going to get good, assuming that our
01:15:35.820 | model is flexible enough to represent the thing we want to do.
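That whole loop fits in a few lines. As a deliberately tiny illustration (not Samuel's actual program): the "model" below is just y = w * x, the loss is mean squared error, and "make the weights a little bit better" is done by brute-force trying a small step in each direction, where a real framework would use gradients instead.

```python
# A toy version of Arthur Samuel's loop: inputs + weights -> results -> loss -> better weights.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # data generated by the "true" weight 2.0

def loss(w):
    # mean squared error: how far are the model's results from the real answers?
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

w, step = 0.0, 0.1            # start with an arbitrary (bad) weight
for _ in range(200):
    # the critical arrow: keep whichever small change makes the loss a bit better
    w = min((w - step, w, w + step), key=loss)

print(round(w, 2))            # iterating "a little bit better" recovers roughly 2.0
```

Each pass through the loop only improves things slightly, but as the lesson says, doing that enough times is all it takes, provided the model is flexible enough.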
01:15:40.340 | Now remember what I told you earlier about what a neural network is, which is basically
01:15:44.700 | multiplying things together and adding them up and replacing the negatives with zeros,
and you do that a few times? That is, provably, an infinitely flexible function. It actually
01:15:56.780 | turns out that that incredibly simple sequence of steps, if you repeat it a few times and
01:16:03.580 | you do enough of them, can solve any computable function, and something like generate an artwork
01:16:14.820 | based off somebody's Twitter bio is an example of a computable function, or translate English
01:16:24.140 | to Chinese is an example of a computable function. They're not the kinds of normal functions
01:16:30.780 | you do in Year 8 math, but they are computable functions. Therefore, if we can just create
01:16:38.660 | this step and use the neural network as our model, then we're good to go. In theory, we
01:16:48.140 | can solve anything given enough time and enough data. That's exactly what we do.
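A small taste of why that flexibility claim is plausible: multiply-by-weights followed by zeroing the negatives builds piecewise-linear functions, and even two such units already reproduce a genuinely non-linear function exactly. The snippet below is just an illustration of that idea, not anything from the lesson's code.

```python
def relu(x):
    # "replace the negative numbers with zeros" for a single value
    return max(x, 0.0)

# With weights +1 and -1 feeding two ReLU units, summing the outputs
# gives exactly the absolute-value function: |x| = relu(x) + relu(-x).
def abs_via_relus(x):
    return relu(1.0 * x) + relu(-1.0 * x)

for x in (-3.0, 0.0, 2.5):
    assert abs_via_relus(x) == abs(x)
```

More units mean more places where the line can bend, which is one intuition behind why stacking enough of these layers can approximate any computable function.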
01:16:59.860 | Once we've finished that training procedure, we don't need the loss anymore. Even the weights
01:17:07.980 | themselves, we can integrate them into the model. We finish changing them, so we can
01:17:12.100 | just say that's now fixed. Once we've done that, we now have something which takes inputs,
01:17:18.340 | puts them through a model, and gives us results. It looks exactly like our original idea of
01:17:25.460 | a program. That's why we can do what I described earlier. That is, once we've got that learn.predict
01:17:32.900 | for our bird recognizer, we can insert it into any piece of computer code. Once we've
01:17:38.220 | got a trained model as just another piece of code, we can call with some inputs and
01:17:43.900 | get some outputs. Deploying machine learning models in practice can come with a lot of
01:17:55.820 | little tricky details, but the basic idea in your code is that you're just going to
01:17:59.500 | have a line of code that says learn.predict, and then you just fit it in with all the rest
01:18:04.020 | of your code in the usual way. This is why, because a trained model is just another thing
01:18:08.960 | that maps inputs to results. As we come to wrap up this first lesson,
01:18:25.860 | for those of you that are already familiar with notebooks and Python, this is going to
01:18:32.140 | be pretty easy for you. You're just going to be using some stuff that you're already
01:18:37.260 | familiar with and some slightly new libraries. For those of you who are not familiar with
01:18:41.260 | Python, you're biting into a big thing here. There's obviously a lot you're going to have
01:18:47.980 | to learn. To be clear, I'm not going to be teaching Python in this course, but we do
01:18:55.300 | have links to great Python resources in the forum, so check out that thread. Regardless
01:19:04.100 | of where you're at, the most important thing is to experiment. Experimenting could be as
01:19:13.500 | simple as just running those Kaggle notebooks that I've shown you just to see them run.
01:19:21.620 | You could try changing things a little bit. I'd really love you to try doing the bird
01:19:26.100 | or forest exercise, but come up with something else. Maybe try to use three or four categories
01:19:31.140 | rather than two. Have a think about something that you think would be fun to try. Depending
01:19:39.060 | on where you're at, push yourself a little bit, but not too much. Make sure you get something
01:19:44.580 | finished before the next lesson. Most importantly, read chapter one of the book. It's got much
01:19:51.940 | the same stuff that we've seen today, but presented in a slightly different way. Then
01:19:57.540 | come back to the forums and present what you've done in the share your work here thread. After
01:20:05.900 | the first time we did this in year one of the course, we got over a thousand replies.
01:20:13.580 | Of those replies, it's amazing how many of them have ended up turning into new startups,
01:20:20.420 | scientific papers, job offers. It's been really cool to watch people's journeys. Some of them
01:20:26.140 | are just plain fun. This person classified different types of Trinidad and Tobago people.
01:20:33.020 | People do stuff based on where they live and what their interests are. I don't know if
01:20:36.580 | this person is particularly interested in zucchini and cucumber, but they made a zucchini
01:20:40.060 | and cucumber classifier. I thought this was a really interesting one, classifying satellite
01:20:44.780 | imagery into what city it's probably a picture of. Amazingly accurate actually, 85% with
110 classes. Panama City bus classifier, batik cloth classifier. This one, very practically
01:21:01.380 | important, recognizing the state of buildings. We've had quite a few students actually move
01:21:05.820 | into disaster resilience based on satellite imagery using exactly this kind of work. We've
already actually seen this example, Ethan Sutin's sound classifier. I mentioned it was state
of the art. He actually checked the dataset's website and found that he beat the state
of the art for that. Elena Harley did tumor-normal sequencing. She was at Human Longevity
01:21:30.220 | International. She actually did three different really interesting pieces of cancer work during
01:21:35.140 | that first course, if I remember correctly. I showed you this picture before. What I didn't
mention is actually this student was a software developer at Splunk, a big NASDAQ-listed
01:21:48.220 | company. This student project he did turned into a new patented product at Splunk and
01:21:55.020 | a big blog post. The whole thing turned out to be really cool. It was basically something
01:21:58.700 | to identify fraudsters using image recognition with these pictures we discussed. One of our
01:22:06.820 | students built this startup called Envision. Anyway, there's been lots and lots of examples.
01:22:14.780 | All of this is to say, have a go at starting something, create something you think would
01:22:22.860 | be fun or interesting, and share it in the forum. If you're a total beginner with Python,
01:22:29.580 | then start with something simple, but I think you'll find people very encouraging. If you've
01:22:33.660 | done this a few times before, then try to push yourself a little bit further. Don't
01:22:38.220 | forget to look at the quiz questions at the end of the book and see if you can answer
01:22:42.460 | them all correctly. Thanks, everybody, so much for coming. Bye.
01:22:52.460 | (audience clapping)