Lesson 1 Overview

Hi, and welcome to lesson one of this deep learning MOOC. Thanks for joining us. I'm Jeremy Howard. And I'm Rachel Thomas. We're the people who put this together. You'll be seeing my face in front of the camera most of the time. You'll be hearing Rachel's voice, however. And I'll be asking questions that were coming through a Slack channel, asked by students in the in-person version of this course. So we thought before we started we should tell you a little bit about what to expect, and maybe you'll get to know us a little bit.

My background is really in coding and data. I spent 10 years in management consulting and 10 years running startups. Throughout that time I've been using data and machine learning to try to solve problems. My background is a lot more academic and theoretical. I have a PhD in math, and then I worked as a quant and later as a software engineer and data scientist at Uber.

One of the most fun and exciting parts of my life was when I spent some time really competing heavily in Kaggle competitions. I was really pleased to win some of those competitions and get to the top of the leaderboard, and I'm hoping to show you during this course some of the techniques that I used to do that. I think the techniques that allow you to win Kaggle competitions are the same as the techniques that allow you to get great results on your own models in solving your own problems.

We also both love teaching. I taught Calculus 1 and 2 when I was in graduate school, and then I later left my job as a software engineer to teach full-stack software development to women at Hackbright Academy for a year and a half. I think that's really cool. Rachel was a quant, and then she worked as both a data scientist and a full-stack engineer at Uber, but she realized that one of the highest-leverage things you can do is to teach, and it's great fun, too. I feel the same way. Even when I was running startups, I was creating course content online. For example, on the left here is an AngularJS tutorial that I originally created for my colleagues at Kaggle, but I recorded it and put it online, and it's had over 200,000 views. It makes me feel really good to know that people are learning from some of the things that I found really helpful myself.

So this is a quote from Paul Lockhart. He was actually working as a primary school math teacher, got his PhD in math at Columbia, became a math professor at Brown, and then left to go back to teaching primary school. He's written a wonderful essay called A Mathematician's Lament on everything that's horrible about how mathematics is taught in the United States. Yeah, I think that essay has been really influential to both Rachel and me. Although Rachel stuck with her math education for decades longer than I did, we both definitely felt like modern mathematics education is not done well. Paul Lockhart uses a wonderful analogy: imagine if, with music, we didn't allow children to sing or play instruments until they had spent years and even decades studying set theory and music notation and could transcribe scales, and only then, once they were in their 20s, would we let them create music. He says that's exactly what we're doing with mathematics, but that we should let people play and create and build patterns with it. And something very similar happens with deep learning and how it's taught.

In fact, one of my heroes is a guy called David Perkins, who at Harvard has created some really interesting research about effective educational techniques, and he has a very similar analogy to Paul Lockhart's, but he talks about baseball. Imagine if the way you learned baseball was that you never saw a game of baseball, but instead you learned how to stitch a baseball and the physics of a parabola, and you learned every aspect of baseball, and then after 20 years of study you could be considered good enough to go and actually watch your first game. We tend to think that this is rather the way that most mathematics, perhaps particularly including deep learning, is really taught.

So we decided when we set up our research lab, fast.ai, that the first thing we would do would be to try and fill this need. In particular, we decided to focus on deep learning because we both think that deep learning is the most exciting technology that we have ever seen. We think it's going to be even more transformative than the internet, and so the more people who can participate, the better. Andrew Ng has called it the new electricity, to say that it's going to have the impact on
society that electricity has.

So, some other kinds of problems we've seen with technical teaching. And I just want to say, we're introducing this to tell you that this course is taught in a very different style, and so we want to set your expectations ahead of time and motivate why the way we teach it here is so different. One is that a lot of the existing deep learning materials are very math-centric, and even as a mathematician and someone who loves math, I found them to be pretty unhelpful for actually building and creating practical applications. In fact, every time I see somebody ask on a forum or on Hacker News or wherever, "What do I need in order to get into deep learning?", a whole bunch of people reply by saying, "Well, first you need five years of real analysis and vector analysis, and then you need to study probability and statistics, and blah blah blah." And it really comes across to me as something which is all about being exclusive rather than inclusive. So that's why we have this little slogan, "making neural nets uncool again". We're all about not being exclusive, but about making things as simple as possible, though never about dumbing it down, right?

Another way that technical education fails is what David Perkins, the Harvard professor Jeremy mentioned a moment ago, calls elementitis. Math does this so much: it teaches each separate element, and it's only at the end, when you've learned all the elements needed, that you can put them together and see the whole thing. That's what was going on in that baseball analogy, and it happens in a lot of deep learning. It's like, you know, we need to teach you probability theory and we need to teach you information theory, and only way later on are we going to let you put it together.

You can think of it as being depth-first rather than breadth-first, if you like. The traditional depth-first approach means that you as a student have to trust that at some point all these things are going to come together and turn into something that's genuinely useful. I think with this breadth-first approach you still have to trust, but it's a different kind of trust, which is that it's okay that when we first show you an end-to-end process you don't deeply understand every part, but that you are able to actually do useful things from the very first lesson, and that as the lessons go along you're going to get a more and more in-depth understanding of each piece.

There are two ways that the elementitis, or depth-first, approach fails. One is motivation: a lot of students give up because they don't have the motivation of seeing how these things are going to fit together. And secondly, you don't have the context when you're learning all these discrete elements, and you can't learn how they're going to fit into the process until later.

Right, and in fact this goes together with the idea of using a code-centric approach instead of a math-centric approach. With a code-centric approach, and looking at the whole game, that is, an end-to-end machine learning process, from the very start, you can do experiments. You can actually run experiments and see what goes in and out of each part of the system and build up that intuition. And if this whole-game analogy intrigues you, David Perkins has a book called Making Learning Whole where he goes into a lot more detail about it.

Love that book. So then, not only are we going to be showing you end-to-end processes from this very first lesson, but these processes are not going to just end up with good-enough results. Nearly all of the deep learning educational materials I've seen so far get you to a point where you can get an okay-ish result. Now, the whole point of deep learning is that you can get state-of-the-art results, and so the very first piece of code we're going to run gives you a state-of-the-art result. We know something is a state-of-the-art result if it is better than other approaches that people have tried. The best way to know that is to try things on a Kaggle competition. Having been the president and chief scientist of Kaggle, I saw again and again that every Kaggle competition beat all previous academic state-of-the-art results. So very often in this course we're actually going to use Kaggle benchmarks and see if we can beat them, because we know that if we can, then that's truly a world best.

So this is from actually a very good book. It's the Ian Goodfellow and Yoshua Bengio Deep Learning book. But it's a very good math book, which teaches you the math of deep learning. And so in this book, when they say "here is how we gain some intuition in how backpropagation through time works", this is how they develop intuition. Rachel is a math PhD. Did you find this helped your intuitions? We'll have a very different approach to intuition. So this is a good book if you're interested in math and theorems. In this course we're really going to be focused on code.

In fact, this is what Rachel and I put together when we were trying to explain backprop, and specifically stochastic gradient descent and the use of backprop there: we created a spreadsheet. And we found each time that we taught our students in the in-person course through a spreadsheet, they could see every single piece of what was going on, every single intermediate result, and it was very easy for them to experiment with. So one of the unusual things we do is that you'll see that nearly every major idea is presented at some point using a spreadsheet.
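
The spreadsheet walk-through of stochastic gradient descent can also be sketched in a few lines of plain Python, printing intermediate results as it goes. This is only an illustrative sketch, not the course's spreadsheet: the target line y = 2x + 30, the starting guesses, the learning rate, and the number of passes are all assumptions chosen for the example.

```python
import random

random.seed(0)  # fixed seed so repeated runs shuffle the data the same way

# Assumed toy problem: recover the slope and intercept of y = 2x + 30.
TRUE_A, TRUE_B = 2.0, 30.0
data = [(0.5 * i, TRUE_A * (0.5 * i) + TRUE_B) for i in range(20)]

a, b = 1.0, 1.0        # initial guesses for slope and intercept
learning_rate = 0.005  # assumed step size, small enough to stay stable here

for epoch in range(500):
    random.shuffle(data)  # visit the points in a new order on each pass
    for x, y in data:
        prediction = a * x + b
        error = prediction - y
        # Gradients of the squared error (error ** 2) for this one point.
        grad_a = 2 * error * x
        grad_b = 2 * error
        # Step each parameter a little way downhill.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    if epoch % 100 == 0:
        # Like a spreadsheet row: show the intermediate values.
        print(f"epoch {epoch:3d}  a = {a:.4f}  b = {b:.4f}")

print(f"final      a = {a:.4f}  b = {b:.4f}")
```

Changing the learning rate or the starting guesses and rerunning is the code equivalent of editing a cell in the spreadsheet and watching every other cell update.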

We present it in many different ways, but spreadsheets, diagrams, and code are three of the key ways that we present these ideas. I believe this is the first deep learning course in the world to implement a convolutional neural network in an Excel spreadsheet, and also, as you see from this page, not just stochastic gradient descent but Adagrad, RMSprop, Adam, and even Eve, which just came out a few weeks ago, all modern examples of accelerated SGD approaches.

So I think everything you really need to know about the course comes in this very first piece of code that you see, and you can see that there's a number of things going on. The first is that this piece of code shows not just how to complete a project but how to get a state-of-the-art result on a project. This particular piece of code gives you 97% accuracy in determining cats versus dogs. As recently as about five years ago, the state of the art for this particular problem was about 80% accuracy. It's also an example of why working with code is so interesting. Rather than showing math, what we're showing here is some working code, and I'll give an example of what that means you can do.

The code environment that we're working in is something called Jupyter Notebook, and you'll be using this in every single lesson throughout the course. In Jupyter Notebook, as you can see, we provide you with prose and information about what's going on, and we draw pictures, and at any point in time you can take a look at one of these results and look to see what's going on behind the scenes. So for example, in this case we're running something called vgg.predict and we're getting back some probabilities, and you might wonder, well, what's vgg.predict actually doing? At any time you can take anything, put two question marks on the front, and run that piece of code, and it will actually show you the full documentation and source code of what you just ran.

Now, in this case it's actually running a function that we wrote for you. One of the other different things about this course is that we're not just showing you how existing libraries work, but every time we found that using somebody else's library takes more than four or five lines of code, we would make sure we found a way to do it more easily. So generally speaking we show you how to do things in one line of code, and then you can look behind the scenes and see what the lines of code are actually doing.
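
Outside a notebook, the double-question-mark lookup has a rough plain-Python counterpart in the standard library's inspect module. The sketch below is an illustration only: it inspects textwrap.shorten, a standard-library function chosen as a stand-in because the course's vgg object is not available here.

```python
import inspect
import textwrap

# In Jupyter/IPython the equivalent lookup would be typed as:  ??textwrap.shorten
# That syntax only works inside IPython, so plain Python uses inspect instead.

target = textwrap.shorten  # stand-in for something like vgg.predict

documentation = inspect.getdoc(target)   # the cleaned-up docstring
source_code = inspect.getsource(target)  # the text of the function's source

print(documentation)
print("-" * 40)
print(source_code)
```

Reading the printed source is the same habit the lesson describes: start from a one-line call, then look behind it to see what it really does.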

So for example, in this case the predict method is running some other predict method called model.predict. What I always encourage people to do is to do some experiments. So what does model.predict actually do? One thing that you can do in Jupyter Notebook at any time is to press Shift+Tab a couple of times. When you press Shift+Tab the first time, what pops up tells you what parameters you need to pass this method, and it also tells you what the method actually does. If you press it three times, it then gives you additional information about what each of those arguments is, what they're expecting, and what it returns. So it's really nice that using this method you can find out exactly what's going on behind the scenes and do some experiments.

So then, for example, you could find out: okay, well, what is the size and shape of the array that this thing returns? What are the first four elements of the classes that are in this object? And so forth. This is really the way to use this style of teaching effectively: have the code in front of you all of the time, and for every line, look and see what's being passed in and what's coming out.
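
The experiments described above (checking what a method expects, then the size and shape of what it returns, then the first few class names) can be tried on any object. The sketch below uses plain Python with made-up stand-ins: the predict function, its parameters, the probabilities, and the class names are all hypothetical and are not the course's vgg code. With a NumPy array, the shape would come from its .shape attribute rather than from the helper shown here.

```python
import inspect


def predict(images, batch_size=8):
    """Hypothetical stand-in: return one row of class probabilities per image."""
    return [[0.97, 0.03] for _ in images]


def shape_of(nested):
    """Return the dimensions of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


# Made-up class names, standing in for the list a real model would carry.
classes = ["cat", "dog", "horse", "bird", "fish"]

# 1. What does the method expect? (roughly what Shift+Tab shows in Jupyter)
signature = inspect.signature(predict)
print(signature)

# 2. What is the size and shape of what it returns?
probabilities = predict(["img1", "img2", "img3"])
print(shape_of(probabilities))

# 3. What are the first four elements of the classes?
print(classes[:4])
```

Each step is a one-line question asked of live code, which is the point of keeping the notebook open while following the lesson.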

What else could we do with that? And then even look at the documentation. So vgg.model is apparently a keras.models.Sequential. If we were to just copy that into Google, then we can click on the first item and find out exactly what's going on: what is being used here, and what are the other methods that this could take? Then we can try calling some of these other methods and see what kind of results we get.

So really what we're trying to do in the two hours of each lesson is to give you enough information to get you started with your own experiments. We're not trying to teach you everything, and we're certainly not assuming that the lesson can stand alone. We'll talk more about this in a moment, but the videos are just a small part of the course, while the IPython notebooks and the code are a huge resource. And we'll talk about some of the other resources that we have available for you.

But the important thing to realize with these six lines of code is that you can run this for anything, not just for dogs versus cats. These first six lines of code you learn, as it says here, work for any image recognition task with any number of categories. So if you can get this far in today's lesson, then you've learned to do one of the most important types of computer vision, which is image classification for any number of categories and for any type of images.

As Rachel said, we've actually run this course already. Specifically, what you're going to be seeing are the recorded lessons from an in-person course, and we thought it'd be helpful for you to see what some of our students said about that in-person course, because it might help you to be a more effective learner. And I do want to say again, because this course is taught in such a different way, that it takes some faith that this new technique is worth trying and sticking with. But you can see that almost all the students said that the homework assignments were very helpful or extremely helpful in understanding the material, and that the class resources, which include the wiki, the scripts that we give you, our forums, and our Slack channel, were very helpful or extremely helpful. We wanted to mention that because Rachel and I have both been Coursera addicts and Udacity addicts in the past, and generally speaking we would often watch a video at one-and-a-half speed or double speed and just zip through it. This course is not designed to be done that way. It is designed so that you need to use the homework assignments and the class resources. As you can see from the people who have already been through this class, they're actually finding that these are really important parts of the overall course.

In each video you're seeing an end-to-end process of solving a real problem with deep learning. That means that there's not a separate video on "this is everything you need to know about AWS and your environment" or "this is everything you need to know about this piece of code". Rather, you're seeing the end-to-end process, but you'll see it again and again throughout the lessons.

Now, it's okay if you're coming into this course with either a very large amount or a very small amount of data science background. Everybody in the in-person course simply had to have had at least a year of coding experience. Even with that very wide variety in background, nearly everybody said they found the pacing about right for them. The reason for that, I think, is that we really give people the ability to pick up as much or as little as they want. Through the forums, if you want to dig very, very deep into advanced topics, you can. Or if absolutely everything is new to you, then that's fine, too. There'll be more than enough to do just to get through the basic parts of the assignments, and of course on the forums we'd be very happy to help you with all of your questions. And if you are more advanced, we really appreciate your help in adding new material to the wiki and answering others' questions on the forums. People started their own threads on the forums around outside related topics that they were interested in. There are a lot of different ways to be involved.

So here's a couple of quotes we got from people after they completed the in-person course, and this is one that we heard again and again. For example, this person says: "I personally fell into the habit of watching the lectures too much and Googling definitions and concepts and so forth too much, without running the code. At first I thought that I should read the code quickly and then spend time researching the theory behind it. In retrospect, I should have spent the majority of my time on the actual code in the notebooks instead, in terms of running it and seeing what goes into it and seeing what comes out of it." And Rachel, I know you've seen similar things in your past teaching experience.

I've seen this with past students in teaching full-stack software development, and I also know that I've been a bit guilty of it myself sometimes. Rather than start their project, students would sometimes keep doing more and more research, reading more and more tutorials, and feeling like there's more and more they need to learn before they can start coding. There are two problems with that. One, and I mean, you do want to have some background before you begin, is that there's a point where you just need to start coding, because you can't know exactly what you're going to need until you start coding and building and seeing what errors you get and what things you don't know how to do. And secondly, the test of whether you understand something is whether you can build with it. When reading tutorials, it's very possible to think, "Oh, I understand all this," but it's not till you're writing code yourself and seeing what your error rates are, what's working and what's not, that you know whether or not you truly understand something.

Yeah, so when I saw students at the study sessions during the week at USF, I would keep telling them the same thing again and again: don't stop and wait till you feel ready to code. Start coding now. It's through that coding experience that you're actually going to figure out what you don't know and what you do know, and you'll be able to develop the intuition by running lots of experiments.

This is another interesting quote from somebody talking about this learning style. He said: "It's been very interesting learning from somebody who is an entrepreneur" (that'd be me) "a very no-nonsense approach to getting things done, very hands-on, very smart and driven. Your usual Coursera instructor is quite the opposite. So it's been refreshing and even somewhat shocking." This is possibly understating things a bit. In fact, we heard this from quite a few people at the start.

It was somewhat shocking to find so many things taught so quickly, and it can seem like such a high level. But of course by the end of the seven weeks, assuming that you're putting 10 hours into each of those weeks, you've actually got many, many full end-to-end processes under your belt, so by the end of it you actually are going to develop a very deep and complete understanding.

Yeah, I know after the first lesson I heard a number of students say things like, "Oh, I didn't really get the details from that lesson, and I feel like I need to spend all this time studying the details." And we hadn't taught the details in the first lesson. The idea is that we go more and more in depth each time: you're seeing this end-to-end process and then, as time goes on, digging into it more. But even after the first lesson you can apply it. You can actually create world-class image recognition models, and so you can go back to your organization and start trying things. This is something else we encourage people to do: try things with your own data and your own problems from the very first lesson.

So it's been an interesting experience in every way. Even the way we built this course was unusual. For example, I actually wrote most of the material while traveling from the northern tip to the southern tip of Japan.

I coded and wrote in every possible place you can imagine. This was really an experiment for me, because I've studied human learning theory a lot, and I know that in theory human creativity is meant to be better when you have a wider variety of contexts. Interestingly, I actually found I was more productive in that month than I feel I ever have been before. You'll see in the material that we show a lot of new techniques, different techniques, or different ways of thinking about things, and I think this different way of building the course was perhaps really helpful in coming up with this more creative approach.

Not everybody in the in-person course was able to put in at least eight hours a week, but the vast majority were, and those who didn't still completed the course. They just found they didn't necessarily pick everything up the way that they hoped they would. But of course the nice thing is you can always come back to it later. So our suggestion would be, now that it's a MOOC and you don't have to do it every single week: ideally you will put in the 10 hours a week.

Did you want to talk about those 10 hours a little bit, Rachel? Yeah, we wanted to give you some suggestions on how to use that time. The videos are between two and two and a half hours long, and with those videos you may find it helpful as you review them to use these notes. Yeah, so this is coming from our wiki, wiki.fast.ai.

There's a page for each lesson that has notes about the lesson. It also has links to other resources that may be useful to you. These notes are pretty complete. They're not designed to be read entirely independently from the video lesson, but they are something which you can read on your way to work, maybe, when it's not convenient to actually watch the lesson.

Sorry, I was going to say, we're expecting that you'll watch the lessons more than once. The first time through, you're watching to get maybe a lot of the high-level ideas. Then you'll probably want to read the wiki, try out the notebooks, and then go back and watch the lesson again, maybe to get more detail.

Yeah, I don't think any of our students in the in-person course watched the lessons just once. They saw them live, of course, but then they also had the recording from the next day, and I think everybody I've spoken to watched them at least twice. And then of course the other thing you've got is the notebooks. The notebooks, as you see, have quite a lot of prose in them as well. They've got quite a lot of additional detail that we don't necessarily get into in the video lesson. But most importantly, as we described, they give you an environment in which you can experiment. In fact, not only do we suggest that you experiment, we have a very specific suggestion about how to use these notebooks.

Yeah, so we recommend that you read through the notebook, and this is after you've watched the video at least once, and once everything makes sense, put it aside and try creating a new notebook where you go through that process yourself. And so this is from scratch, right? This is creating your own notebook to test that you can actually build it yourself. Yeah. We do not want you to just hit Shift+Enter, Shift+Enter and run through the existing notebook, because again, the test of whether you know something is: can you build and code with it yourself?

If you get stuck, you can always go back and refer to the class notebook, and then, rather than copying and pasting it, make sure you do understand what it's saying. Maybe look up some documentation about that concept, and then put that notebook aside and see if you can now do it yourself. So in a sense you're plagiarizing a lot from the notebook, but you're plagiarizing in a good way: not by copying and pasting, but by plagiarizing the concepts and making sure that you can recreate them yourself.

And then if you have questions, please ask them. The forums are the first place you should go, and first search to see if someone's already asked your question. As we said earlier, there's a separate thread for each lesson, and these already have tons of helpful questions and answers from the students who took our in-person course. In fact, there's a great quote, which we talk about in one of the lessons, from the head of Google Brain, who says that their rule at Google Brain is that if you have a problem, you first of all try to fix it yourself for half an hour, and if after half an hour you can't fix it yourself, you then have to ask somebody. That ensures that you always give it a go yourself and hopefully learn from the experience, but you never waste too much time on something which somebody else can help you with.

Yes, it's great advice. So as Rachel said, the forums are a really helpful resource, and when you go to the forums you'll find that there's a lot of existing discussion. There's a separate discussion for every lesson, for example, and in each of those discussions you'll see that there's a summary of the existing discussion at the start. So you may find that what you need is already in the questions and answers there. If it's not, of course, feel free to add your question, and you'll generally find it's responded to within a small number of hours, maybe by Rachel or me, or maybe by one of the other students. The other thing you may find helpful is that each lesson has a timeline on the wiki, and its entries are hyperlinks directly to the part of the video which discusses that topic. So if you're trying to remember how momentum works, you can just click on that link and you'll jump straight to me telling you about momentum.

As Rachel said, there's also a number of resources available to help you. This is taken from the front page of our wiki. There's a whole section of tools with links where you can learn more about each of the pieces that we use in the development environment. Our goal here is not to be a single source of truth. If somebody else has already done a great job of teaching one of these tools, we'll leave it to them. So we don't attempt to give you a great Bash reference or a great NumPy reference, because people have already done that. If you want to learn more about one of these things, jump onto the wiki, click through here, and you'll find some curated resources that we think are really helpful.

In this lesson you're going to cover a lot of stuff, but these are the four things to keep in mind. By the end of the lesson you want to make sure that you can create an AWS instance, that you can connect to it with SSH, that you can run a Jupyter notebook in it, and that you can run the state-of-the-art custom model code that we showed you earlier. Those first three things you're going to be doing in every single project in every single lesson, so you're going to want to be really comfortable doing them. For those of you who haven't done that before, it might take you a little while to get the hang of it, and maybe a few unsuccessful attempts first.

So this first lesson is unusual in that it's a lot more about getting your development environment set up and not as much about deep learning. Indeed, if you've got a background in Python and AWS and Linux, you may find this lesson on the easy side, in which case you can zip through it pretty fast. If you don't have a background in these tools, today's class may seem really overwhelming, and we don't want you to be discouraged by that, because this is very different from the future lessons. But it's necessary to get your environment set up so that you can be coding throughout the course.

Yeah, I mean, for the folks who didn't have that background in the in-person course, often it was a lot of work and it was pretty tough. But at the end they finally got to the point where they could say, "Okay, I've set up a GPU instance in the cloud, I've set up my development environment, and I have trained from scratch a model that can recognize dogs from cats." And it was very, very exciting.

So if this is hard work for you, just know that when you get through to the other end of it, it's going to be really exciting. So I ask that everyone who's trying to decide if this course is for them try at least the first two lessons, since the first lesson is so much about setup. Yeah, obviously we build more and more on the techniques we've learned, and so we're going to be using this infrastructure in every lesson. By the time we get to lesson seven, we're going to be looking at some pretty sophisticated and custom neural network architectures. We're going to cover every different type of SGD optimization. We're going to be covering convolutional neural networks and recurrent neural networks.

So there's going to be a lot of exciting stuff, and yeah, we really look forward to seeing you on the forums. Good luck with learning about deep learning.
