
Lesson "0": Practical Deep Learning for Coders (fast.ai)


Chapters

0:00 Two groups of students in general
2:01 The fastai book
2:55 The course
3:22 Finish the course!
4:27 Finish a project!
5:49 What can a project be?
6:53 Be tenacious!
8:27 Radek Osmulski's story
10:15 Stop endlessly preparing for doing deep learning
11:16 What will fastai teach you
12:25 How to get started with coding
14:00 The missing semester of your CS education
15:30 Share your work or learning
17:25 Four steps to do fastai lessons
20:32 Notebook Server vs Linux Server
23:43 Get started with Colab
29:37 GitHub with Colab
30:37 Clean version of notebook
31:26 Questionnaires
32:32 Share your model on your dataset
34:29 Wrong ways to do fastai
36:41 Start positive learning feedback
37:27 Read and Write code
38:07 Immerse yourself in the DL world through Twitter
40:39 Go blogging
42:03 A great thing to blog
44:03 How ML differs from other coding
45:15 Why and How to create a good validation set
46:20 Coding DL is harder than other forms of coding
47:16 Baseline for your project
49:51 Kaggle competitions as best projects
52:33 Build your portfolio for a job
55:30 Get to be the first to do part 2
56:08 How to get started with AWS EC2

Transcript

OK, hi everybody, thanks for joining. This is an entirely optional presentation, which I'll call a Lesson 0, which is all about how to fast AI. It's all about how to get the most out of this course, how to make sure you finish it, and how to make sure you feel like it's been a productive time.

And the reason I'm doing this is because a lot of people who take the course, when they get to the end of it, they say to me, "Oh, it wasn't until I got to the end of the course that I realized how I should have done the whole course, and now I'm going to go back and redo the whole thing over again." And so I'm going to pass on the messages I've heard about what people have found to be the best approaches to making the course work.

I'm also going to go through the actual mechanics of how to get set up with two systems, Google Colab and AWS EC2, and I'll talk about why you might use one versus the other. So a lot of people now, as in many hundreds of thousands, have gone through the fast AI practical deep learning for coders course, and many, many, many of them have gone on to create successful startups, to write research papers with high impact factors, to create new products at their companies.

It's a pretty well proven course at this time, but there's also a lot of people that never finish the course. And so if you're watching this, it's because you've decided you do want to learn deep learning, so I'm going to talk a bit about like what's it going to take for you to be one of the people that makes this into a great experience.

When I talk about the course, I'm also talking about the book. So just to be clear, there's a book that Sylvain Gugger and I wrote, which you can either buy from Amazon (and people seem to like it), or, believe it or not, you can read the whole thing for free.

So it's called Fastbook; it's in the fastbook repo. Honestly, I make basically nothing from the book, so I don't feel like you need to buy it to say thank you or something; buy it if you want the book. If you're happy using notebooks, use the free one, it's all good.

So the book was actually written as Jupyter notebooks, and we wrote something to turn it into an actual book. The book also, by the way, actually looks great on Kindle and online, as well as on paper. I know often technical books don't; this one actually does. And then the course goes through half of the book.

And so quite soon we'll do a part two, which will go through the other half of the book plus some other new stuff. But basically each lesson covers a chapter or so of the book. So if you're doing this course, you'll be going through the book, at least in the notebooks and you might want the paper one as well.

So here is the main thing that you should commit to right now, which is to finish the damn course, right, and or finish at least half of the book. Because everybody I think who joins comes in thinking, okay, I'm going to do this, I'm going to do deep learning, but if you, when I look at our YouTube analytics, a lot of people don't finish, okay?

So you just need to decide what day are you going to watch the course each week? What day are you going to do the assignment? What day, like how are you going to stretch your time to finish the course? And maybe you're coming in deciding, I don't want to finish it, which is fine, right?

If that's your intention up front, no problem. But if your intention is to be a really effective deep learning practitioner, you need to finish the damn course, okay? So put it in your head that that's your goal. Talk to your friends or your spouse and tell them that's my goal.

Get that social pressure that you're going to finish it. You're not just going to finish the course, but try to finish a project, right? So Christine McLeavey is one of our fantastic alumni. She's now at OpenAI, one of the world's top research organizations. She built a fantastic system for creating new music with deep learning.

She used to be a pianist herself. And I remember this discussion, I told her, focus on making one project great and polishing it off and finishing it. And she did. And that project has ended up creating music which the BBC Orchestra played, right? And amongst other things helped her get this extremely exclusive job at OpenAI.

So this is a clip from a podcast with one of our students, Sanyam, and Christine, in which Christine says this is one of her key insights. And so I'm going to be giving you a few key insights, some of which are from me and some of which are from me via students, but they're all things I've heard a bunch of times.

So this is one example. So finish the course and finish a project. The project doesn't have to be something no one's ever built before. Maybe it's just like, oh, I really love that thing that person built. Gosh, it would be a real stretch if I could build it too.

You know, great. Or it doesn't have to be world changing, you know. So one of our students built something for his fiancée, which was a cousin recognizer. He had, I think, 14 cousins. And so his fiancée could take a picture of one of the cousins and it would tell her which cousin it was, right?

In our first course, one of our students built the app for the Silicon Valley TV show which did Hot Dog or Not Hot Dog, which was actually a huge smash hit, like millions of downloads, and it was written about in the media. And it did exactly one thing: just tell you whether or not something was a hot dog.

Anyway, or it could, you know, solve medicine. That would be fine too. I mean, whatever. So finishing the course means being tenacious. And one of the things I hear a lot is a lot of the approaches people learn as they do fast AI around how to learn and how to study are useful more generally.

And in fact, this is a quote from our book: the number one difference I see between successful deep learning practitioners and the rest is tenacity, okay? And tenacity is on the whole something you can choose. Now, something you can't choose is whether you find yourself in the middle of a global pandemic or, you know, somebody in your family dies or you come down with a terrible cold or whatever; like, obstacles happen, right?

And so part of being tenacious is being understanding with yourself, right? And saying, okay, something's happened. I can't do what I hope to do right now, but then getting back to it, right? So part of tenacity is not about ignoring the bumps, but keeping going after the bumps. And maybe that's, you know, quite often I'll have a bump that's like a year long, right?

But if I've decided to finish something, you know, at the end of that year, I'll go back and finish it. So sometimes that involves me emailing somebody more than a year after they've sent me something and saying, okay, I'm ready to reply now, and they forgot that they even sent me an email.

Okay, so what I'm going to do now is I'm going to share with you a bunch of insights from this book called Meta-Learning. If you haven't seen it before, that's okay, it came out yesterday. And it was written by a guy called Radek, who is one of the top alumni of this course.

And it's a book well worth reading because his journey is extraordinary. You know, this is a guy without a degree, who couldn't code just a few years ago, with a job that he found boring, and he set out to learn deep learning and repeatedly failed to do so. But Radek is extremely tenacious, and each time he failed to do so, he tried again.

And eventually he figured out a way to do it. And the way he did it was very intensely based on fast AI, both the course and the philosophy of learning. And he is now a Kaggle competition winner. He was the only non-San Francisco person at QAI, which is one of the world's top medical AI startups.

And now he works at a new nonprofit that is literally trying to translate animal language. And so he's kind of a good example, like I always think it's a good idea to have a role model. And in the fast AI community, there's a lot of role models. And so here's somebody who's like both a role model for like trying, failing, trying, failing, trying, failing, and then, you know, finding some success.

And so I'm going to show you some things from his book. And a lot of his book is him taking stuff I say and kind of casting it into what he took away from it. That's his ideas. So one of the things we hear again and again from unsuccessful deep learning students is they keep preparing to do deep learning.

And they keep preparing to do projects. So they study linear algebra. They study calculus. They study C++. They study all these different things. They do a MOOC and then another MOOC and then they read a book and then another book. You know, and at what point are they actually going to start doing something?

So the fast AI philosophy is you start doing something week one, OK? So week one, you need to actually train a model, OK? Which is not to say that you're not going to learn theory. You will, right? As needed in the context of getting stuff done, OK? And so if you do finish it, right, particularly if you finish the full two parts of the course, right, you'll have implemented basically all of fast AI's library just about from scratch.

You'll know all about batch normalization. You'll have benchmarked various matrix multiplication approaches. You'll know how to write bare metal GPU optimized code. You'll understand how to do back propagation and the calculus of that from scratch. You'll do all of that, OK? But it will all be as you go along in the context of like solving a particular problem or understanding the next piece of the puzzle.

So yeah, really just reading books and watching videos is not going to get you there. The thing which is going to get you there is writing code, doing experiments and training models. Some of you might not be that great at coding. Fine. OK. That's a perfectly OK place to be.

But you guys are going to find it the most challenging, because being good at coding is the thing that lets you zip through quickly. So rather than thinking, oh, that's a shame, I'm not that good at coding yet, realize this is actually an opportunity, because now you have a really fun project to learn to code in.

So a lot of people have become good coders by doing the course. Because as you do the course, you'll learn about a lot of computer science concepts like object-oriented programming and functional programming and mapping over a list and list comprehensions and GPU acceleration and so on and so forth, right?

So the thing is, though, if you come across a computer science concept or a programming idea or a piece of syntax that you're not that familiar with, that's a place it's worth pausing for a moment and making sure that you do understand how that code works. Because the coding is the kind of critical foundational skill.

This is a pretty good course for getting started with basic computer science: Harvard's CS50 course, which everybody at Harvard does to get started with computer science. And that's all available for free online. So I would recommend, and so would Radek, that you start there. And these quotes are all from Radek's book, by the way.

And then the other piece: Radek talks about this four-legged table of the things that are going to help you do your deep learning experiments more effectively and efficiently. And these are the ideas: knowing the basic ideas around code; knowing your tools, so an editor or a notebook; knowing stuff like Git, like how to save your work and pull in other people's work and so forth; and understanding kind of SSH and Linux, like how to access a server and manipulate it and do stuff with it.

So there's this great course called the missing semester of your CS education, which was actually created, I believe, by students at MIT who said, oh, everybody at MIT is assuming we already know this stuff, but a lot of us don't. So there's nothing to be ashamed of if you've never used Git or you've never used SSH or whatever.

They're just tools which, at some point in the journey, most people just kind of have to figure out. So this is actually a great time to do it, and this is a great course to use to help you get there. And of course, again, the main thing is to practice these tools.

So that's the kind of foundation around coding and your kind of development environment. The next big piece of advice, which we talk about a lot in the course and that Radek talks about in his book, is sharing your work, communicating your work, and writing about your work. This is something that a lot of people feel very uncomfortable with, like tweeting or blogging or whatever.

It's like, who the hell am I to start writing about deep learning? I've just started. Well, here's the thing: no one is better placed than you to write what you would have wanted to know six months ago. You now know more than you did six months ago, and you'll know more in a week, and more in the week after that.

And so if you've got a background in, say, the hospitality industry, you know, you could probably write something very interesting for your colleagues in the hospitality industry about ideas around deep learning, for example. Or if you teach at high school, you know, you might have ideas that you could write down about what high school students might find interesting or teachers might find interesting.

So you know, everybody's got something to say. And the key thing is just to write it down because that is going to help embed your understanding a lot better, and it's going to start to build up your portfolio. And so we'll talk more about that in a moment. But a lot of people have found that this message of sharing their work has been a critical part of their journey of learning, and of also building up their personal brand that has ended up getting them a job.

OK. So what does it mean to do a fast AI lesson? So a fast AI lesson is basically a chapter of the book or one video from the course. Or both. So what does it mean to do one of these lessons? Assuming you're doing the video, then it means, OK, obviously watching the video.

So there's a couple of hours, right? And then it means running the notebook, which we'll look at in a moment. When you run the notebook, you have the whole book with all of its code and all of its outputs there, you're playing with it. You should experiment, right? You should try things out.

So if you wonder, oh, why is this done before that? Well, try removing it. Try doing it in a different order. If you're wondering, you know, what would happen if I did that, but to this other image, try it, right? The more you can start to experiment, the more you're feeding your brain with these kind of like your own deep learning happening in your brain.

Input output patterns. You try something, what happens? You try something, what happens? So after that, the next step is to try to reproduce the notebook from scratch, OK? Now you're going to have to look things up, obviously. But the idea is, can you, with a fresh new notebook, can you go back and recreate some of those models, retrain them, or redo some of that data processing pipeline?

So try to, like, type it in yourself. You can switch back to the answer as much as you like, but you're really trying to start to actually write your own code. And then the point you really want to get to is repeating some parts of the lesson with a different data set, which you collect or download.

Now this whole process often takes people a number of times through the course, right? So often the first time through, people might just watch each lecture and try to kind of run it and, you know, just get to the end to get a kind of a general sense of what's going on.

So people will often kind of go through the whole thing like three times and then come back and try to go further and further, right? So don't worry if you can't do all this right away. Certainly in lesson one, that's going to be challenging. Just take it as far as you can, right?

And as you go along, try to push yourself to do more and more, and you could even go back to an earlier notebook and see if you can understand more and more of it. So let's take a look at what that looks like. So here's the course, okay? And here's the lessons, which you can watch.

And then here are the places you can run the notebooks. So there's two types of platform for running the notebooks. There are notebook servers. These are things that as soon as you click into it, the actual environment we use, Jupyter notebook, will pop up and you can just start running it pretty much straight away.

So that is obviously the easiest. Colab has a free tier, and SageMaker is not free. So we're going to look at Colab today. The other option is to use a full Linux server, and this is something where you're going to have to basically set up Linux and install the Python system and install notebooks and get the code from GitHub and run the server and log in with SSH and do all that.

That's obviously a lot more work. You might want to skip it for now in like lesson one. But I would recommend at some point you go through this path. And the reason why is that in real life at your workplace or if you do your own startup or whatever, this is what you'll be doing.

You will be interacting with a Linux server using SSH that's running a GPU. And you'll want to understand how it all works. And once you're using your own Linux server, you'll suddenly learn about all these productivity-enhancing tips and tools that make your life easier. So I'll be showing how to set up AWS EC2, that's the Amazon platform today.

You'll find Google Cloud looks very, very similar indeed. Jarvis Labs was created by a fast.ai alum, and this is probably at this stage the best value of the full Linux servers. So that would certainly also be very much worth checking out. A couple of good things about AWS: AWS is currently the most popular platform for cloud computing.

So it's very likely that whatever company you're at or end up at is already using it. They're also pretty generous with credits for startups and students. So even though it can set you back 60 or 70 cents an hour, you might well find you can get a few hundred dollars worth of credits through your school or even a few thousand dollars worth of credits through their startup programs and so forth.

So let's have a look at what Colab looks like. So Colab is, it's wonderful how easy it is to get started. You literally just click on the chapter, so let's do chapter one, and it pops up Colab. You can pay, I think it's $10 a month for Colab Pro to get like longer sessions and more likely that you'll get a better GPU, but for most people you'll find the free version is totally fine.

One of the biggest problems with Colab is that it's not persistent, which is to say when I go to this notebook, it thinks it's never seen me before. Nothing's set up for me the way I want it, but we've set up the notebook so that the very first cell actually installs everything you need.

So if I click this little run cell button here, it will run the cell. Although what I will do is I'm going to pop over to Colab here, and let's also read the steps here. And actually it says here before running anything, you should tell Colab you're interested in using a GPU.

So if you find that when you run a cell from the course, it's going to take like half an hour or an hour or more, it's very likely you forgot to use the GPU. The GPU runs things many hundreds of times faster. So all you do, as it says here, is go Runtime, Change runtime type, and say GPU, okay?

So now I can run this cell. And this is all Python code, except that lines starting with an exclamation mark are actually sent to a terminal. So pip is something that installs Python software, and fastbook contains all of the Python software necessary for the course, and so it's going to go away and set it all up.

And so this is the mildly annoying bit. You can then connect Colab to Google Drive, and that's going to be how you can save your notebooks and save your work as you go, okay? I'm not going to do that right now, but you go to the link that it shows, it'll give you a code, and then that'll connect it up to your Google Drive.
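For reference, the first cell of the course notebooks does roughly the following. This is a sketch from memory, so run the cell that's actually in your copy of the notebook rather than retyping this:

```python
# Rough sketch of the standard first cell in the course notebooks.
# The "!" line is sent to the terminal rather than run as Python.
!pip install -Uqq fastbook    # install fastai, the book's code, and their dependencies
import fastbook
fastbook.setup_book()         # on Colab, this is the step that offers to hook up Google Drive
```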

And so at this point everything for the course is now available, and you can see the whole book is here, okay? So here's the book, and you can open up sections to read them, okay? You can go to the table of contents, okay? And so eventually we'll get to this cell here, which contains all the code needed to run a model.

So if I click run, here is where it goes. Now, it's amazing how much this little bit of code is going to do. It's going to download tens of thousands of pictures of dogs and cats. It's going to use a simple rule to recognize the dogs from the cats based on their file names.

Basically, the way that this has been set up is that you can tell from the file name whether it's a dog or a cat. It's then going to download something called a pre-trained model, which is something that already knows how to recognize various types of images. It's then going to train that model to make it particularly good at recognizing dogs from cats, and then it's going to validate that model, to see how good it is at recognizing dogs from cats, using a set of pictures that it hasn't seen before.
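The cell being described here is the cat-versus-dog example from chapter 1 of the book. From memory it looks roughly like this; newer fastai versions spell the learner `vision_learner`, older ones `cnn_learner`:

```python
from fastai.vision.all import *

# Download and extract the pet images; in this dataset, cat images have
# filenames starting with an uppercase letter, which gives us the labels.
path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()

# Build the DataLoaders, holding out 20% of the images as a validation set.
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# Start from a pretrained ResNet and fine-tune it on this task.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
```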

And that's all happening. So so far it's already downloaded the dataset, it's already downloaded the pre-trained model, and it's now busily going through the first epoch, which is to look at every picture once to try to learn how to recognize dogs from cats. And that's it. The lines starting with a hash are just comments.

Because this is also the source of an actual book, there's a few slightly weird comments that you can ignore. They're just things that are used for setting up references in the book, the captions, and so forth. Okay, so it's now testing out, I think, that first epoch. Okay, so it's finished an epoch, and so far it's got a 1% error rate.

So after 54 seconds, it has learned to recognize dogs from cats with 99% accuracy. And so yeah, we're going to let that finish off. So that's how we get started with Colab. And there's nothing else to set up. Now what you can do is you can open Notebook, and you can open a notebook from GitHub.

And here is the Fastbook repository. And you'll see in the Fastbook repository, for every notebook, there's a second copy inside the clean folder with the same name. So I was just looking at 01Intro. There's also a clean 01Intro. If I open that up, you'll see that it's got exactly the same thing as the last one I was just looking at, but all the prose is now missing.

It's just got headings and code. Also, all the outputs are missing. So the reason that we have this clean version is to help you with these stages here. Our suggestion is that once you've gone through the lesson, and you've run the notebook, and you feel like, okay, I think I get it, you open up this clean version.

And before you run each cell, try to think, okay, why is this cell here? What's it for? What's it going to do? What's the output going to look like, right? So once you remove all that context, this is a good test for you to kind of get your brain going to think what was actually going on.

So this is a kind of much more active approach to reading and recall. And so then once you've done that, and you've finished going through this, at the bottom, one thing that is kept is the questionnaire. So at the end of every chapter is a questionnaire. And so then at this point, you should now, as much as you can without looking, go through and try to answer each of those questions.

They all have answers in the notebook, in the book. So if you can't remember, you can always look it up. But if you can't remember, that's a sign to you that like, oh, did I skip over that bit too quickly? Like what's happened that I've not remembered? And then try to remind yourself, and then go back and finish the questionnaire.

So there's a lot of pieces to help take this from a passive, I'm just watching a video, I'm just reading a book, into a participatory exercise that you're a part of. So as soon as you can, we want you to create something that's yours. And so this is the easiest way to do that, is basically at the end of lesson one, once you're kind of up and running, try to do it with your own data set.

And if you go to forums.fast.ai, which is something that you're going to want to be deeply familiar with, because this is going to be full of people just like you, other people who want to learn deep learning. And these people are all asking questions, and making comments, and you can see there's like a lot going on all the time.

And so you can see here's the part one course topic. And you can see there's 1.4 thousand topics there, and each one is going to have lots and lots of replies. So this is where, amongst other things, you'll find, if you search for it, something called Share Your Work Here, which has 2,000 replies, and you can see links to and pictures of lots of examples of things that other people have done after the first week or two of the course.

And so hopefully that might help give you some inspiration. And it would be great if you could reply and add a picture or a link to what you build. And you'll see, you know, everybody is very positive to each other on the forums in general and in this topic in particular.

Nobody's going to go, "Oh my god, I could have done that years ago," right? People are going to be excited for you, that you have now joined the ranks of people that have built their first deep learning model. And I will be excited for you. So as I said, Radek, this is again from his book, expresses a way of not doing fast.ai which I have heard now probably hundreds of times.

I don't know why this is so common, but many, many people do what Radek did, which was basically to learn all these math things, right? So he started with calculus, and then once he got to a certain point in calculus, he found that he had to start understanding real analysis.

And then as he started understanding real analysis, he found he had to learn set theory, you know, and you get the idea, right? If you want to learn all of math, that's going to take a while. There's a lot of gatekeeping out there that says, like, "Oh, if you're going to be a real deep learning practitioner, you have to finish, you know, a graduate level course in linear algebra." Here's the truth: the actual linear algebra you do in basically all deep learning is matrix multiplication.

And if you've forgotten what that is, that is multiplying things together and then adding them up, okay? So what you need to be able to do is multiply things together and add them up, right? So if you can do that, you're good to go. So yeah, don't get, you know, you're not going to finish it if A, you never start it because you keep preparing, or B, you keep thinking, "Oh, I wonder exactly what's happening here," and you go all the way down to the bottom until you found yourself in the midst of set theory, right?
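If you want to see that claim made concrete, here is the whole operation in a few lines of plain Python; a toy example only, since in practice libraries like PyTorch do this for you on the GPU:

```python
# Matrix multiplication really is just "multiply things together and add them up".
a = [[1, 2],
     [3, 4]]
b = [[5, 6],
     [7, 8]]

# Each output cell is one row of `a` times one column of `b`, summed.
result = [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
print(result)  # [[19, 22], [43, 50]]
```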

Don't worry, you'll get deeper and deeper over time, but if you're learning mathematical theory, you're not coding, you're not experimenting, you're not practicing, you're not actually building deep learning models, and if you're watching this course and your goal is not to build deep learning models, you're in the wrong course, okay?

But if your goal is to build deep learning models, then don't do this. So as Radek says here, it's as you train actual models that you're going to get feedback, right? And the feedback that a lot of people get is, "Oh my God, I can already train useful models"; like, a lot of people are surprised at how early on they can actually get astonishingly good results.

Okay, so, you know, jump in and be open to surprising yourself that you can do a bit more than you thought. You can't do everything right away, okay, but start that feedback loop of figuring out what do you know, what can you do, what can you get working, what can't you get working?

So one of the key things that you're going to need to do if you're going to finish all of the course is become an even better developer than you are now, even better coder than you are now, wherever you're up to, and so to do this, you need to read code and write code.

The fast.ai source code is designed to be extremely readable, so you can read that code. You can obviously read the code in the notebooks, but yeah, you want to be spending as much time as possible reading and writing code, and particularly reading and writing deep learning code. All right, how do you find out what's going on in the world of deep learning, and how do you get yourself on the map of people doing deep learning?

Probably the best answer is Twitter. For those of you whose only knowledge of Twitter is the Kardashians and Donald Trump, this might come as a surprise, but actually, to create this slide, I opened Twitter and I copied and pasted the first three tweets that appeared on my screen. So one of them is somebody having a discussion about the costs and impacts of different approaches to labeling.

This is a fast.ai alum who's a 17 year old PhD graduate who's doing well, who shows how to mix PyTorch and fast.ai, and then Hilary Mason, who's a professor, and I guess not a professor anymore, but now in industry, talking about organizational issues in data science. So there's a whole world out there of machine learning on Twitter, and if you want to get your work noticed, that's a great place to do it because really everybody's there.

And if you want me to highlight your work, that's where I can see it and I can retweet it. So yeah, Twitter is a really good place to be. If you're just starting with Twitter and you don't know who to follow, go to my Twitter, go to my likes, and go through my likes and find tweets that you think, yeah, I actually like that tweet, and then follow the person who did that tweet.

Okay, and pretty quickly you'll have 100 people you're following, okay, and then they'll retweet things and you'll find other people you like, and before you know it, hopefully you've got a nice big lot of interesting deep learning stuff to read every day. At first you'll understand like 1% of it, which is fine, but you know, you're there, you're in it, and it'll all be washing over you, and you'll start to find the people who write stuff you find engaging and interesting, and you'll also find the people that actually you don't, and make sure you unfollow them so that your feed doesn't have stuff you don't care about.

So then beyond Twitter, you want to start blogging. Okay, and again, blogging is not about writing what you had for dinner, okay, it's about writing something that you of six months ago would have found interesting. Okay, so you know more than you did six months ago, so write that down.

We have something called Fast Pages that makes it ridiculously easy to start a blog, and so there's no reason for you not to, you know, at least create a blog. There we go. And one of the nice things about Fast Pages is you can even turn Jupyter Notebooks into blog posts, so it's great for technical writing.

So this is what a Fast Pages blog looks like. This is a Fast Pages blog about Fast Pages. I had to write Fast Pages in order to write the Fast Pages blog about Fast Pages. But basically, and one of the other nice things, it's all in GitHub, right? So as you're blogging, you're learning more about Git.

It's all written with Markdown, which is something that you're definitely going to need to know anyway. So as you're blogging, you'll be learning about a lot of the tools you need to learn about anyway. So one interesting idea for things to blog about is this example from Aman Arora, who is an Aussie Fast AI alum who is now working at Weights and Biases, which is one of the top AI startups in the world.

This is a really interesting kind of blog post. What Aman did was he took a video that I did at the launch here of the Queensland AI Hub, and he wrote down what I said. And that's an example of something that you could do. If there are videos out there that you liked and nobody's turned it into a post, be the first to do so because there's all these benefits.

When somebody sends me something saying, "I've written up this talk you gave," I'm very grateful to that person because now my talk is now available in a second medium. A lot of people prefer to read rather than listen to a talk. You know, that person's taken the time to do this.

They've taken the time to have me check their work. And kind of everybody ends up winning from this. So I've seen with Aman's post about my talk, it's got attention from people that my talk didn't. So for example, I noticed on my LinkedIn feed, the CEO of Data61, which is part of the CSIRO, the top data science body in Australia, highlighted it and said, "Check out this post from Aman Arora." So this is like an example of the kind of stuff you can do.

It's like try to be helpful, and at the same time you're also learning. So there's an example of an interesting kind of blog post which very few people are writing, and so there's a huge amount of opportunity here for you to practice your writing. Okay, now, what is the difference between machine learning and other kinds of coding?

As Radek says in this chapter of his book, "The key about machine learning is that we can generalize. We can train a model with one set of data and apply it to a different set of data and still get good results." And just about everything that we're doing in this course is all about creating models that are going to generalize well.

And we're going to be learning about how you can measure how well your model generalizes. So answering these questions about can we trust our model to be correct on new data that we feed it is absolutely critical to every model that you build, whether it be in a Kaggle competition or a little prototype or a production model you're creating at work.

One of the most important things here is creating a good validation set, and this is something that you'll hear about in lesson one of the course. But I really wanted to highlight it here, as did Radek in his book. It's a really important idea: you need a good way to measure whether your model is any good.

So you need a data set that really represents what kind of data your model is likely to have to deal with in real life. And my partner Rachel wrote this really great blog post on the fast.ai blog about this. Actually, interestingly, this kind of came out of a lesson that I did at the University of San Francisco, and then Rachel turned it into a blog post, and Rachel's blog post has ended up much more influential than my video ever was.
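As one small illustration of the kind of issue that post digs into: for time-ordered data, a purely random split lets the model peek at the future, so you would hold out the most recent slice instead. The column names below are made up purely for illustration:

```python
import pandas as pd

# Toy time-ordered data (made-up columns, just to show the idea).
df = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=100, freq="D"),
    "sales": range(100),
})

# Hold out the most recent 20% as the validation set, rather than a random 20%,
# so the model is judged on data "from the future" relative to what it trained on.
df = df.sort_values("date")
cutoff = int(len(df) * 0.8)
train_df, valid_df = df.iloc[:cutoff], df.iloc[cutoff:]
print(len(train_df), len(valid_df))  # 80 20
```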

So Rachel's post is actually also a good example of what I was talking about earlier with blogging. And she took it a lot further. Okay. The next key thing that Radek mentions, and I totally agree with, is that it's hard to write correct machine learning code. I always assume that every line of machine learning code I write is wrong.

And I'm normally correct about that. It normally is wrong, because there's lots of ways to be wrong. And unlike creating, you know, a contact management app on the web or whatever, it's much harder to see that you're wrong. You know, you can't see that the name didn't get stored in the database, or you can't see that the title isn't centered.

Right. Often the way it's wrong is that it's going to be, like, half a percent less accurate, you know, or your image is upside down, but maybe you didn't even look at it, it got straight into the system, and you end up with something that can only recognize upside down images or whatever.

So whenever you're building a project, make sure you start with a simple baseline, right? Like, create the simplest possible model you can, something that, you know, solves the problem so simply that you can't have made a mistake. So often that'll be like just taking the average of the data, or if there's two groups, taking the average of each of the two groups, or, you know, something really, really simple, and then you can gradually build up from there.
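Here is a hedged sketch of what that kind of baseline can look like for a simple two-class problem; the numbers are invented purely for illustration:

```python
import numpy as np

# Made-up validation labels for a toy two-class problem: 1 = sick, 0 = well.
y_valid = np.array([0, 0, 0, 1, 0, 1, 0, 0, 1, 0])

# Simplest possible baseline: always predict the most common class.
majority_class = int(y_valid.mean() >= 0.5)
baseline_accuracy = (y_valid == majority_class).mean()
print(f"always-predict-{majority_class} baseline: {baseline_accuracy:.0%}")  # 70%

# Any fancier model has to beat this number before it's telling you anything useful.
```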

So another very common beginner mistake with projects, remember we want you all doing projects, is that somebody in a project group will say, oh, I read about this new Bayesian learning thing with these clusters and this, you know, advanced transformers pipeline, and we could put all that together, and it's going to be better than anything before.

And they then spend months creating this complex thing. And at the end, it doesn't work. Now, why doesn't it work? Well, I don't know. It's so big and so complicated. Maybe it's a stupid idea. Maybe there's a bug in one piece of it. Maybe that one piece there shouldn't be there, but it should be somewhere else.

I don't know, right? That's not how anybody creates successful machine learning projects. Successful machine learning projects are always built, in my experience, by creating the simplest possible solution that gets something all the way from end to end first, and then very gradually making it incrementally slightly better.

Okay, so keep that in mind, right? You might feel a bit silly when you build that first model that just takes the average of the data, right? But that's how, that's how the pros do it. That's how everybody that actually gets it to work does it. So often I've had, you know, Silicon Valley startup hotshots come to me and ask me to like, check out their amazing new startup, and I'll ask them, you know, oh, you reckon this can separate, you know, sick people from well people or whatever.

Have you taken the average of each of these two groups and compared that to your model, for example? And they'll say, oh, no. And then they try it and they find out their model's worse, right? So you need to know whether your model's actually doing something useful. For projects, one of the things you might want to do is join a Kaggle competition.

That might be the last thing you see yourself as doing, being a Kaggle competitor, but actually this is one of the best possible projects you can do, because to enter a Kaggle competition, even to come last, you have to go through the entire process of downloading a dataset, getting it into the right format ready for a model, getting it through the model, saving the output, getting it into the correct submission format, and submitting it back to Kaggle, right?
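To make that last step concrete: a Kaggle submission is usually just a small CSV in whatever format the competition's sample submission file specifies, along these lines. The column names here are invented, so always copy the competition's own sample:

```python
import pandas as pd

# Hypothetical predictions for three test rows; a real competition tells you
# exactly which id column and prediction column it expects.
submission = pd.DataFrame({
    "id": [1, 2, 3],
    "prediction": [0.12, 0.87, 0.45],
})
submission.to_csv("submission.csv", index=False)  # this file is what you upload to Kaggle
```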

So getting a model actually up onto the Kaggle leaderboard is really going to test out your end-to-end understanding, right? And once you've done that, you can start to iterate. You can start to make it slightly better, slightly better, slightly better. So although in a lot of ways, Kaggle is not representative of the real world, you know, you don't have to worry about deployment.

You don't particularly have to worry about kind of inference speed, stuff like that. In a lot of ways, it is closer to the real world than you might expect and that it really does force you to go through the whole process and also to think about kind of planning your project carefully.

So enter a competition with your kind of goal that I want to win. Now obviously on your first one, you're not going to win, but the whole point is it's a competition. So you've got to try to do your best, right? And so to do your best, join a competition that's early, right?

Give yourself plenty of time. And every single day try to make a small improvement. And then you'll find that, you know, if you keep reading the forums on Kaggle and keep trying a bit more every day, you'd be amazed at the end of the three months how much you've learned, how much of the stuff that at the start you thought, I have no idea what's going on here.

And then you'll realize, oh, suddenly I do know what's going on. And you might find you get in the top 50%, which might be better than you expected. So this is, you know, highly recommended at some point during this course: have a real go at a Kaggle competition.

So at the end of all of this, you might be looking for a job. Now this could mean a number of things. A lot of people just want to bring some deep learning into their current job. And so, you know, that's, if your organization's already doing some deep learning, that might be easier than if it's not.

If it's not, you might just have to start prototyping some things and try to build up some kind of, you know, proof of concepts internally. Or maybe you're going to try and go out and get a new role as a researcher or a data scientist or whatever. Most people are not going to be able to rely on their, you know, Stanford PhD to get them there, right?

Most people are going to have to rely on their portfolio. So your portfolio is going to be all the stuff you build along the way. It's your footprint on the deep learning community. And that footprint is going to include, you know, things like your contributions to the fast.ai forums and your tweets and your stuff on Discord.

I would say pretty much every one of the Fast AI alumni that have come to my attention as being thoughtful and effective community members all have very, very, very good jobs now. And so like people really, really notice this footprint, right? So your blog posts, your GitHub projects, these are the things that are going to get you a job.

They probably won't get you a job at a big old company in, you know, a kind of standard established IT job, right? That's going to go through HR, and HR, like, they're not going to understand any of your GitHub code or know anything about your community impact.

They're just going to know about credentials, right? And you'll come up against somebody with a Stanford PhD and they'll get the job, right? But startups, really, startups from other people who've got similar backgrounds, of which there are many, are going to appreciate you, or companies that don't really have an established AI group yet, or the startup you built yourself, will certainly appreciate you, right?

So the more of a portfolio you've got, and the more you can show that you've really built stuff, the better. And so start early. Another reason to finish this first course is that it's going to allow you to do the second course. And if you're doing this live, we're going to be doing a whole new part two, you know, basically shortly after this is finished, right?

So if you finish this and do a good job of it, then you could actually be one of the first to do part two. Now, we've seen how easy Colab is to get started. We've also talked about some of the downsides of it, right? It's kind of ephemeral. You start from scratch every time.

You've got this kind of hacky stuff of saving notebooks into your Google Drive, blah, blah, blah. AWS, on the other hand, and Google Cloud and Jarvis Labs and so forth, are going to give you a real Linux server. Okay. And it's going to cost you, Jarvis Labs is the cheapest, about 40 cents, AWS I think about 60 cents US, per hour.

It's not going to send you broke, but it's, you know, it's not nothing. But it's a good idea to try it if you can. And I'm going to show you how to get started there. And what we might do, Michael, is I'll do some Q&A while things are running.

So I'm going to head over to AWS EC2. Okay. So one of the tricky things about AWS is they've got hundreds of products. This is Amazon Web Services, and they all have names that are totally meaningless. Okay. So you just have to know, EC2 is the name of the thing that you go to, to rent a computer.

Okay. So they don't call it Amazon Computer Rental, they call it EC2. So the first thing you need to do is you need to sign up to AWS. And one of the things that they get is a lot of fraud. So a lot of people try to use their GPUs to mine Bitcoin.

So you have to ask them to give you permission to use their GPUs. Now that's called requesting a service limit increase. So you'll need to follow the steps here to ask them for a limit increase. If you write these exact words with this exact formatting, it might come through a little bit quicker.

If you're from a country where there's a lot of fraud, you might not even get this permission. Maybe Jarvis Labs is going to be easier; I'm not sure Jarvis Labs even has the fraud check. So anyway, there's quite a few places you can try to get an instance. So if AWS has a problem with your quota, try somewhere else.

But generally speaking, most people should get a response pretty quickly saying you've now got approved. So for you doing this course, if you're going to try out AWS EC2, I suggest you log in and request this service limit increase right away. So that, you know, by the time you come back tomorrow or the next day, it'll be done.

And so what I'm currently doing is I'm on course fast.ai and I've gone Linux servers, AWS EC2, and we're following through that project process. Okay. Now to log in to your server, you're going to need to use something called SSH, Secure Shell. So this is something where on your computer screen, that server's computer screen effectively is going to appear and the stuff you type is actually running on that remote server, not on your computer.

Nowadays, pretty much nobody uses usernames and passwords for SSH. Instead, we use something called public key cryptography, which is where you basically have a secret number, which only you know. And then there's another public number that you tell other people. And basically there's a really cool math trick, which allows people to check whether you have the secret number without actually anybody, without actually telling them the secret number.

So that's what an SSH key is. So there's this thing called a public key, and that's the code that you're going to give to anything you want to be able to log in to. And then there's your private key, which you're going to keep for yourself.

So you're going to need a terminal. So on Windows, in the store, there's something called the Windows Terminal, which Microsoft provides for free, which is pretty good. Mac has a terminal that comes with it. Linux has a terminal that comes with it. So I'm using Windows, but it'll basically look the same for everybody.

Now on Windows, you need a Ubuntu Linux shell, not a normal Windows shell. So to do that, you need something called WSL, Windows Subsystem for Linux. And that will give you a full Ubuntu system on your Windows computer. Again, it's free. It only takes a couple of minutes to set up.

So there's a link to how to do it here. So once you've done that, whether you're on Mac or Linux or Windows, it's going to look basically the same, right? And so you'll create your SSH key by following the instructions in the documentation, which is basically: you run ssh-keygen, and it's just going to go through and create these two files.

So you just run it, it creates these two files. And so this is the one that we have to give Amazon. This is the one that we're going to keep for ourselves. So following along in the documentation here, it says to click on services, EC2, find key pairs. Okay, and then we'll go here, import key pair, and whatever, AWS.

And this is where we're going to find the id_rsa.pub that we just created. And you can see it, here it is, right? It's just a big long code. And it's fine, you can all look at this. This is public, not secret. This is the cool thing, right?

There's no passwords. And I say import. And so now we have an SSH key, and we can use that to log in. Okay, so this is just all this is. Here's all those steps. So renting a server in AWS speak is called launching an instance. So to launch an instance, we'll scroll back up to the top to instances, and we will say launch instance.

Okay. And it'll say, okay, what kind of thing do you want to run Amazon Linux or Windows or Red Hat or whatever? I strongly, strongly suggest you use Ubuntu and the latest version, which is currently 20. So I'm just going to say select. Okay. And then it'll say, okay, what kind of server do you want?

For playing around, there's actually one that you can get for free. Now, it's kind of slow, right? But for learning about SSH and Linux and stuff, this is actually a great one to use. It's no good for deep learning, though; it doesn't have a GPU.

So if I go to g4dn, that's the cheapest kind of good GPU we can get. And I'll get the smallest one there, g4dn.xlarge. And then I'll say next, next. So how big a hard drive do I want? I normally say about 100 gig. Launch. And launch. And so now it's going to say, okay, when you log into this, which key pair are you going to use?

Okay. So you just select the one that you just imported and say, yep, I know that I have that. And then launch. And you'll see it says this has now been initiated. It's got a code. So this is the thing that I've just launched. So if I click on it, here it shows me, here's my instance.

Okay. So as you, if you haven't done much with servers and Linux and SSH and stuff, there's going to be this whole world of new stuff for you to learn about. But this is an opportunity. It's not a problem. So if you're not familiar with things like IP addresses, that's cool.

There's lots of tutorials around at the moment. But for now, just know this is the unique address, like a street address that your new computer has. And so we're going to connect to it. So this button here will click, will copy that address. Okay. So we can then go to our terminal and we can type SSH and paste in the address.

And then the only other thing I do need to do is provide a username, and AWS always uses the username ubuntu for all of its Ubuntu images. So you say ubuntu@, and then the IP. And so if I now press enter, we're in. Okay.

So now everything I type here is actually being typed on that remote computer. So for example, to list the contents of a directory, I type ls. Okay. So the thing I'm actually typing into here is bash, a bash shell. So bash is another of these things you need to be familiar with, and you can learn about it in that Missing Semester MIT course I mentioned.

You know, it takes a few weeks to get somewhat comfortable with bash. It's a very different feel to using a GUI, if you're more familiar with Explorer or Finder or whatever, but you'll find you'll be much more productive soon enough, because you can replicate things quickly, you can script things, you can copy and paste things, and so forth.

Anyway, so here's my, here's my computer. It's going to sit here running until you tell it not to. Even if you turn your computer off, your server is still running and that means you're still paying for it. Okay. So one of the things I guarantee you're going to learn the hard way by wasting money is that you're going to forget to turn it off.

Okay. So to turn it off, you're just going to go stop instance. Okay. So make sure you do that. All right. Let's see how we're going here. So we've launched our instance and we've SSHed into it. Okay. So keeping a Linux server up to date and running used to be kind of annoying, but luckily I've created something called fastsetup for you, which makes it easy.

And all you need to do is copy this and paste it into your terminal. And this is one of the really cool things about Linux and using bash: in Windows or with the Mac Finder, you'd have pages and pages of click this and drag that and scroll here.

But I've just scripted the whole thing. So I'm just going to go ahead and paste it over here and it's off. Okay. Now what this is going to do is it's going to fully set up this Linux server. It's going to make it automatically update with the latest software.

It's going to configure it all correctly. And so forth. And it's going to ask a minimum number of questions. So I'm just going to show you the questions it's going to ask you. It's going to ask for a host name. So a host name is just a more convenient way to access a server.

And so you can basically write anything you like, as long as it's got at least two dots in it. So I'm going to call this course-test.fast.ai, for example. Okay. And then it asks for an email address. Now, the email address is basically just where it's going to send kind of error logs and stuff to.

So maybe we'll say info@fast.ai. Okay. Do you want to set a password? Probably, yes, so hit enter for yes. So I'm going to put in a password, and it'll ask you to type it again. Okay. Reboot automatically when required? I'll say yes. And that's it. Okay. So that's all the information that it needed.

So behind the scenes, what's actually happening here is it's grabbed the latest git repo from fastsetup, and it's running this thing called ubuntu-initial. And, you know, this is something you can check out if you're interested. It's basically 125 lines of bash script, which is going to set up your firewall for you, set up SSH security for you, set up your swap file for you, set up your SSH configuration for you, install all the software you need for you, set up your logging and upgrades for you, and set up your password and host name for you.

Okay. So it's going to do all that. And, you know, this is the kind of thing where, from time to time, you might think, oh, I'm interested in how X works. And since everything is open source, you can just go in and see how X works.

And at first none of this might make any sense. And so you go, oh, all right, let's pick something and learn about it. Enable firewall. UFW. Now, what's UFW? Copy, paste. UFW. Probably not United Farm Workers. Uncomplicated Firewall. Did Jeremy mention firewall? Okay, what the hell's a firewall? And, you know, you can start reading, right?

And then you could be like oh maybe firewall tutorial. Often adding tutorial can be helpful. Okay. So you know you can start to just jump in here and there. Okay. Don't get too distracted. We want to spend as much time as possible training models. But this is how we learn about our tools.

Okay. So this is now going and downloading the latest version of all the software that it's going to need for Linux. So it may be a good time for questions, if we have any, Michael. What's your current opinion regarding Swift and Julia as replacements for Python? So Swift is basically out of the picture now.

So Google has basically archived the Swift for TensorFlow project. So you can safely ignore that. Yeah. Julia is interesting. You know I think it's a lovely language. Nothing has the ecosystem that Python does. So you know if you use Julia you're going to have to figure out a lot more stuff on your own and you'll find a lot more hard edges.

But I do think at some point Python is going to have to be replaced, and Julia seems like one of, if not the, most likely thing to replace it. Or maybe it won't be replaced by Julia. Maybe it will be replaced by something else that's kind of Python-like, like JAX, which actually takes Python and compiles it, using something called XLA, into a much faster thing than Python otherwise would be.

Okay. Do you think that deep learning or more traditional ML or stats approaches are more useful for traditional industry applications right now? So before I answer that question I'm going to press Y which is going to reboot our computer now that it's all updated. And obviously when we reboot that computer running at the AWS data center it closes the connection because it's busy rebooting.

Okay, so we'll give it a couple of minutes. There's not a single good answer to that question, and you don't really need to answer it, because basically any time you want to try any kind of machine learning model on a problem, you should try a few different algorithms.

And switching from a random forest to a gradient boosting machine to logistic regression to deep learning is, you know, an extra half hour. So you should just try a few different approaches. Personally, I find deep learning is increasingly turning out to be the easiest thing to get started with, and it gives me the best results for most projects I seem to do nowadays.

But you know, have a look at Kaggle competitions from time to time: there are still things where gradient boosting machines work better, or very often people use both and ensemble them. But yeah, it's not a question that you actually need to answer, because you want to get to a point where it just takes you a few minutes to try another algorithm out, so you don't need to be wedded to one or the other.

So I don't know how long it's going to take to reboot, so I just pressed the up arrow to get back my last SSH command, and I'll press enter and we'll see if we're back... we're back. OK, so this has finished rebooting.

Oh, actually, this time it says to do something slightly different, which is to add this -L flag here. This is the thing that's going to let us connect to Jupyter Notebook. So I'm going to type exit to exit from the server, and this time I'm going to add the extra bit to the command.
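
In case it's useful, the command ends up looking roughly like this. The user name and host name here are just examples (ubuntu is the default user on Ubuntu EC2 images, and the host name is the one from earlier); yours will differ.

    # Example only -- substitute your own user name and host name.
    ssh -L 8888:localhost:8888 ubuntu@course-test.fast.ai
    # -L forwards port 8888 on your own machine to port 8888 on the server,
    # which is the port Jupyter Notebook listens on by default.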

There we go. OK, all right, so the next thing is we're going to install something called Miniconda. Miniconda is a very nice distribution of Python, the programming language. A lot of people have bad experiences of their computers getting really confused, with Python packages conflicting and all kinds of stuff like that.

That's because pretty much all the major operating systems now come with a version of Python that is used by your computer for, you know, important operating system tasks. You should not be using that Python to train your machine learning models. Leave that Python alone. Right. You should always install Miniconda, which is going to give you your own version of Python, which has nothing to do with your operating system, and which you can play around with as you like.

It's really easy: you can just delete the whole folder and create it again in like three minutes. You can create new environments, which are like little testing grounds, and you can try different things. So this is a very strong recommendation: even if you're just playing around on Windows or a Mac, not on a server, make sure that you install Miniconda.
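
If you want to see roughly what that looks like from a terminal, a typical Miniconda install on Linux is something like the following sketch; the installer file name changes over time, so check the Miniconda download page for the current one.

    # Rough sketch of a command-line Miniconda install (installer name may change).
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh -b    # -b runs the installer non-interactively
    ~/miniconda3/bin/conda init bash             # set up your shell to use this Python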

It's cross-platform; you can use it everywhere, and use that Python. OK. So Miniconda is now installed, so we now have our own Python setup. The last setup step is that we have to install drivers for the GPU, and Ubuntu actually comes with something that figures out for you what the best drivers are for your device.

So this is just what this step here is. And so I'm going to look down... look, here it says "recommended". OK, so here's the driver I want. But what I actually recommend is that you use that one, but add "-server" to the end. That's going to make it not install the stuff for playing computer games and so on.

OK. So let's go ahead and run these lines of code. See this bit here where it says 460? Depending on your graphics card, when you run this you might see a different number. But since I wrote this today, it's still 460. So we'll go ahead and do that.
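
Roughly speaking, the two commands involved look like this; the exact driver version number depends on your card and on when you run it, so use whatever your machine marks as recommended.

    ubuntu-drivers devices                        # lists available drivers, marking one as "recommended"
    sudo apt install -y nvidia-driver-460-server  # the -server variant skips the desktop/gaming extras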

And this is going to go ahead and install this. Oh, sudo: sudo is a special thing you can add to the front of a command that runs it as an administrator. So by default, the commands you run basically can't break your system, right, whereas for things like installing new software, you have to tell it to run as an administrator.

So when you do that, it'll ask you for your password. This is the password that you put in just a moment ago during the setup. There we go. OK. Is there a section of the course that people skip over too quickly? Yes: part two. Not enough people do part two.

And the difference between part one and part two is the difference between being a pretty handy practitioner, who can do some pretty good work as long as it's in reasonably well-established kinds of areas, versus being somebody who understands how everything's put together, so that if you're told to create a deep learning model in a domain where there are no published models, you'll be able to create one.

You'll understand how to create models which combine multiple different data types. It's a really big thing to finish, and not enough people realize how much is there. And the same goes for the later lessons in general: after you've done three lessons, you are pretty handy, and you'll feel pretty handy, right?

But it's pretty easy to stop there, because it feels like, OK, I get it, I can train a model, I get what's going on. And to be fair, it does dramatically scale up in terms of intensity after that, because in lesson 4 you'll have to write your own optimizer from scratch, and you'll be getting into the calculus and stuff.

But it is a big difference in terms of what you can do and what you understand. So I think in general, not enough people are getting deeper into the lessons. OK. So this has now finished installing the NVIDIA drivers. Normally at this point people say to reboot, but there's actually a magic thing you can do which means you don't have to reboot.

And NVIDIA provides something called nvidia-smi, which will tell you about your installed GPUs. So if you run it and it pops up anything at all other than an error, it means you have successfully got your GPUs installed. So in this case we have a Tesla T4.

It's currently 36 degrees centigrade in there, and the most important thing to know is that it has 15 gigabytes of memory, of which we're using nothing at all. And there are no processes currently running on the GPU. So if you find something's going very slowly and you're wondering whether maybe it's not using the GPU, you can always run nvidia-smi, and if it says "No running processes found", you're not using the GPU.
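
So nvidia-smi is worth keeping handy. For instance, one common way to use it is:

    nvidia-smi              # one-off report: GPU model, temperature, memory use, running processes
    watch -n 1 nvidia-smi   # refresh that report every second while a model is training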

OK. So one more setup step, which is that we have to install all of the software, all of the Python libraries needed: so PyTorch, fastai, Jupyter Notebook, and so forth. And I've created a package which has that whole lot; it's called fastbook. If you've used Anaconda or Miniconda before, you might be surprised that this says mamba rather than conda.

You should definitely use mamba and not conda; it's way, way faster. So any time you see something saying conda install, you should instead type mamba install. OK, so off it goes. Mamba is now going to install all of this Python software for us.
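
If it helps, the pattern looks something like the sketch below. The channel and package names are my best guess rather than something confirmed here, so check the fastbook README for the exact command.

    # Get mamba itself (a faster drop-in replacement for the conda command):
    conda install -n base -c conda-forge mamba
    # Then use "mamba install" wherever instructions say "conda install", e.g.
    # (channel name is an assumption -- see the fastbook README):
    mamba install -c fastchan fastbook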

PyTorch is well over a gigabyte, so this is going to take a few minutes just because it has to download that whole thing. And yeah, that can take a while. So while this is going, do we have any more questions, Michael? Do you recommend any software for experiment tracking?

So the most popular experiment tracking software would be TensorBoard and Weights and Biases. With experiment tracking software, you basically use a fastai callback and say: train whilst tracking with TensorBoard, or train whilst tracking with Weights and Biases. And what it will do is create a little database showing you all the training results from all the different experiments you've run, and create some little graphs of them and so forth.

Personally, I don't use any experiment tracking software. And the reason I don't is that I've found that many, many people, just about everybody I know who uses them, find it incredibly distracting. So the trick to training models is: don't watch them train. If you've done any C programming, it's like: don't watch it compile, right?

Go and do something else, preferably setting up your next experiment. Experiment tracking software just makes it so tempting to look at all the pretty graphs, in my opinion. So I would suggest: get it running, leave, come back when it's done, and there should have been some reason you were running that experiment.

So check whatever that reason was, right? Having said that, if you're really sure you need the services of experiment tracking software for what you're doing, and there are some things that genuinely need it, then I think Weights and Biases is the best at the moment. I think it's really great.

And furthermore, they've hired lots of fast.ai alumni and they're super nice people, so I definitely recommend that. So that's all the installation. The last step is just to grab the book, the notebooks. You use something called git clone to grab a repository of code, and this is going to grab the fastbook repository.

Paste. So you can see it's saying it's cloning this repository. And you'll now find that there is a fastbook directory, so you can cd into it, and there is our book. OK. I think something on the Anaconda side is going slowly, so we're not going to wait for it to download.
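
Spelled out, that clone-and-enter step is just two commands:

    git clone https://github.com/fastai/fastbook.git   # download the book's notebooks
    cd fastbook                                        # move into the newly created directory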

But I do want to show the very last step, which is to run Jupyter Notebook. Then you'll be able to click on the URL that pops up, and it'll bring up something that looks basically just like what we saw in Colab. But the nice thing is that everything you save, everything you do, will be remembered.
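
That last step is a single command, run on the server from inside the fastbook directory; the --no-browser flag is optional and just stops Jupyter from trying to open a browser on the server itself.

    jupyter notebook --no-browser   # prints a URL (with a token) that you open in your local browser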

So all of your experiments are going to be there, the datasets you download are still there, and so on and so forth. So that's that. When you're done, it will remind you here to, as I mentioned before, stop your instance. So you can either choose stop in this menu, or you can choose stop here.

Or, personally, what I quite like to do is to run sudo shutdown, halting it now; remember, sudo is this thing that runs a command as an administrator. And so that shuts it down from here without having to go into the AWS GUI. And there we go. OK. So if we look back at EC2 here, in a moment this will switch from the running state to the stopped state.
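
Written out, that command is:

    sudo shutdown -h now   # halt the machine immediately, without going through the AWS console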

OK. So I think that's everything. Michael, is there anything else to cover? OK, great. All right, well, thank you everybody for listening in to Lesson 0. I look forward to hearing how you go with Lesson 1 and seeing the projects that you create. And don't forget to get involved in the forums.

If you do get stuck with something, the first thing to do is to search the forums, because out of the hundreds of thousands of people that have done this before, somebody has probably got stuck in the same way. So hopefully they can answer your question. Otherwise, feel free to ask your own question, and hopefully somebody will answer it for you.

Thanks everybody. Bye. (audience applauds)