Andrew Ng: Advice on Getting Started in Deep Learning | AI Podcast Clips
Chapters
0:00
1:17 How Does One Get Started in Deep Learning
2:00 Prerequisites for Somebody to Take the Deep Learning Specialization
6:54 Which Concepts in Deep Learning Do Students Struggle the Most With
7:12 Challenges of Deep Learning
12:45 How Long Does It Take to Complete the Course
19:08 How Does One Make a Career out of an Interest in Deep Learning
21:41 Should Students Pursue a PhD
00:00:00.000 |
So let's perhaps talk about each of these areas. 00:00:10.080 |
how does a person interested in deep learning 00:00:14.680 |
- Deeplearning.ai is working to create courses 00:00:24.560 |
is one of the most popular courses on Coursera. 00:00:27.800 |
To this day, it's probably one of the courses, 00:00:34.640 |
or how did you fall in love with machine learning 00:00:37.960 |
it always goes back to Andrew Ng at some point. 00:00:41.480 |
The amount of people you've influenced is ridiculous. 00:00:45.560 |
So for that, I'm sure I speak for a lot of people 00:01:02.440 |
something like one third of our programmers are self-taught. 00:01:12.120 |
'Cause you teach yourself, I don't teach people. 00:01:16.800 |
So yeah, so how does one get started in deep learning 00:01:20.360 |
and where does deeplearning.ai fit into that? 00:01:26.480 |
I think it was Coursera's top specialization. 00:01:36.800 |
to learn about everything from neural networks 00:01:53.280 |
So you deeply understand it and can implement it 00:02:01.840 |
for somebody to take the deep learning specialization 00:02:04.440 |
in terms of maybe math or programming background? 00:02:10.360 |
since there are programming exercises in Python. 00:02:21.000 |
But deliberately try to teach that specialization 00:02:25.000 |
So I think high school math would be sufficient. 00:02:44.520 |
will find the deep learning specialization a bit easier. 00:02:48.760 |
into the deep learning specialization directly, 00:03:02.560 |
- Could you briefly mention some of the key concepts 00:03:08.680 |
in the first few months, in the first year or so? 00:03:11.640 |
- So if you take the deep learning specialization, 00:03:14.200 |
you learn the foundations of what is a neural network? 00:03:27.200 |
One thing I'm very proud of in that specialization 00:03:39.640 |
So how do you tell if the algorithm is overfitting? 00:03:42.600 |
When should you not bother to collect more data? 00:03:48.560 |
there are engineers that will spend six months 00:04:00.680 |
and could have figured out six months earlier 00:04:06.280 |
So just don't spend six months collecting more data. 00:04:12.600 |
So go through a lot of the practical know-how 00:04:17.760 |
when you take the deep learning specialization, 00:04:26.640 |
to train it, to do the inference on a particular dataset, 00:04:34.520 |
to where you spend, like you said, six months learning, 00:04:42.520 |
a small aspect of the data that could already tell you 00:04:47.960 |
- Yes, and also the systematic frameworks of thinking 00:04:51.640 |
for how to go about building practical machine learning. 00:04:57.720 |
we have to learn the syntax of some programming language, 00:05:00.160 |
right, be it Python or C++ or Octave or whatever. 00:05:07.280 |
is to understand how to string together these lines of code 00:05:16.960 |
So those frameworks are what makes a programmer efficient, 00:05:24.000 |
I remember when I was an undergrad at Carnegie Mellon, 00:05:38.920 |
Well, they would delete every single line of code 00:05:42.000 |
So really efficient for getting rid of syntax errors 00:05:51.720 |
is very different than the way you, you know, 00:05:55.440 |
or use a debugger, like trace through the code 00:06:01.320 |
but I find that the people that are really good 00:06:18.760 |
sort of going into the questions of overfitting 00:06:23.080 |
That's the logical space that the debugging is happening in 00:06:28.880 |
- Yeah, often the question is, why doesn't it work yet? 00:06:37.200 |
Change the architecture, more data, more regularization, 00:06:48.160 |
so you don't spend six months heading down the blind alley 00:06:56.360 |
do you think students struggle the most with? 00:07:12.600 |
I think one of the challenges of deep learning 00:07:28.200 |
I think one of the challenges of learning math 00:07:45.880 |
to maximize the odds of each component being understandable. 00:07:49.320 |
So when you move on to the more advanced thing, 00:07:56.320 |
to then understand why we structure ConvNets 00:08:00.960 |
And then eventually why we built RNNs and LSTMs 00:08:15.520 |
this is the hard concept moment in your teaching? 00:08:37.520 |
that are like aha moments that really inspire people. 00:08:41.800 |
I think for some reason, reinforcement learning, 00:08:47.960 |
is a really great way to really inspire people 00:08:51.960 |
and get what the use of neural networks can do. 00:08:55.920 |
Even though neural networks really are just a part 00:09:00.960 |
but it's a really nice way to paint the entirety 00:09:08.320 |
knowing nothing and explore the world and pick up lessons. 00:09:17.840 |
about neural networks, which is counterintuitive. 00:09:20.240 |
I find like a lot of the inspired sort of fire 00:09:29.400 |
to be a useful part of the teaching process or no? 00:09:37.960 |
and my PhD thesis was on reinforcement learning. 00:09:43.880 |
the most useful techniques for them to use today, 00:09:54.720 |
Maybe it'll be totally different in a couple of years, 00:10:02.920 |
- One of my teams is looking to reinforcement learning 00:10:10.040 |
of all of the impact of the types of things we do, 00:10:12.480 |
is at least today, outside of playing video games 00:10:20.800 |
Actually at NeurIPS, a bunch of us were standing around 00:10:25.200 |
"of an actual deployed reinforcement learning application?" 00:10:27.640 |
And among senior machine learning researchers. 00:10:40.560 |
The sad thing is there hasn't been a big application, 00:10:44.280 |
impactful real-world application of reinforcement learning. 00:10:47.200 |
I think its biggest impact to me has been in the toy domain, 00:10:55.920 |
it seems to be a fun thing to explore neural networks with. 00:11:01.440 |
and I think that might be the best perspective, 00:11:04.280 |
is if you're trying to educate with a simple example 00:11:07.120 |
in order to illustrate how this can actually be grown 00:11:16.080 |
of supervised learning in the context of a simple dataset, 00:11:26.280 |
I just, the amount of fun I've seen people have 00:11:30.840 |
but not in the applied impact on the real world setting. 00:11:35.200 |
So it's a trade-off, how much impact you want to have 00:11:40.560 |
And I feel like the world actually needs all sorts, 00:11:48.200 |
but the AI team shouldn't just use deep learning. 00:11:50.800 |
I find that my teams use a portfolio of tools, 00:11:54.080 |
and maybe that's not the exciting thing to say, 00:12:02.400 |
actually the other day I was sitting down with my team 00:12:08.160 |
And some days we use a probabilistic graphical model, 00:12:15.440 |
but the amount of chatter about knowledge graphs 00:12:30.240 |
it'd be sad if everyone just learned one narrow thing. 00:12:34.800 |
help you discover the right tool for the job. 00:12:46.680 |
How long does it take to complete the course, 00:12:49.880 |
- The official length of the deep learning specialization 00:12:57.840 |
So if you subscribe to the deep learning specialization, 00:13:00.800 |
there are people that finish it in less than a month 00:13:02.920 |
by working more intensely and studying more intensely. 00:13:07.840 |
When we created the deep learning specialization, 00:13:10.600 |
we wanted to make it very accessible and very affordable. 00:13:15.200 |
And with Coursera and deeplearning.ai's education mission, 00:13:18.920 |
one of the things that's really important to me 00:13:20.600 |
is that if there's someone for whom paying anything 00:13:26.600 |
then just apply for financial aid and get it for free. 00:13:29.880 |
- If you were to recommend a daily schedule for people 00:13:35.240 |
in learning, whether it's through the deeplearning.ai 00:13:39.920 |
in the world of deep learning, what would you recommend? 00:13:44.040 |
How do they go about day-to-day sort of specific advice 00:13:49.800 |
in the world of deep learning, machine learning? 00:13:52.000 |
- I think getting into the habit of learning is key, 00:13:58.360 |
So for example, we send out our weekly newsletter, 00:14:06.800 |
you can spend a little bit of time on Wednesday 00:14:08.800 |
catching up on the latest news through The Batch 00:14:17.160 |
of spending some time every Saturday and every Sunday 00:14:24.920 |
Do I feel like reading or studying today or not? 00:14:31.440 |
So I think if someone can get into that habit, 00:14:39.280 |
If I thought about it, it's a little bit annoying 00:14:43.200 |
but it's a habit that it takes no cognitive load, 00:15:05.560 |
In my own life, like I play guitar every day for, 00:15:09.920 |
I force myself to at least for five minutes play guitar. 00:15:24.560 |
exceptionally good at certain aspects of a thing 00:15:26.960 |
by just doing it every day for a very short period of time. 00:15:29.280 |
It's kind of a miracle that that's how it works. 00:15:36.200 |
the burst of sustained efforts and the all-nighters, 00:15:39.120 |
because you can only do that a limited number of times. 00:15:44.480 |
I think, you know, reading two research papers 00:15:49.200 |
but the power is not reading two research papers. 00:15:51.480 |
It's reading two research papers a week for a year. 00:15:56.120 |
and you actually learn a lot when you read a hundred papers. 00:16:07.440 |
for particularly deep learning that people should, 00:16:19.400 |
when I'm trying to study something really deeply 00:16:37.280 |
you know, and I still take some every now and then, 00:16:39.520 |
the most recent one I took was a course on clinical trials, 00:16:49.400 |
And that act, we know that that act of taking notes, 00:16:52.640 |
preferably handwritten notes, increases retention. 00:17:01.040 |
and then taking the basic insights down on paper? 00:17:07.240 |
If you search online, you find some of these studies 00:17:12.360 |
because handwriting is slower, as we're saying just now, 00:17:16.240 |
it causes you to recode the knowledge in your own words more 00:17:20.360 |
and that process of recoding promotes long-term retention. 00:17:36.760 |
For a lot of people, they can handwrite notes. 00:17:40.160 |
they're more likely to just transcribe verbatim 00:17:46.280 |
and that actually results in less long-term retention. 00:17:49.640 |
- I don't know what the psychological effect there is, 00:17:58.800 |
as just the time it takes to write is slower. 00:18:01.560 |
- Yeah, and because you can't write as many words, 00:18:22.480 |
I really love learning how to more efficiently 00:18:26.400 |
Yeah, one of the things I do both when creating videos 00:18:36.400 |
going to be a more efficient learning experience 00:18:47.360 |
So when we're editing, I often tell my teams, 00:18:57.160 |
- Oh, it's so amazing that you think that way, 00:19:15.880 |
We just talked about sort of the beginning, early steps, 00:19:18.560 |
but if you want to make it an entire life's journey, 00:19:29.160 |
- And I think in the early parts of a career, 00:19:32.640 |
coursework, like the deep learning specialization, 00:19:36.880 |
is a very efficient way to master this material. 00:19:41.000 |
So, because instructors, be it me or someone else, 00:19:46.000 |
or Laurence Moroney teaches our TensorFlow specialization, 00:19:50.520 |
spend effort to try to make it time efficient 00:19:55.200 |
So coursework is actually a very efficient way 00:19:59.880 |
at the beginning parts of breaking into a new field. 00:20:10.120 |
look, in your first couple of years as a PhD student, 00:20:22.120 |
there's material that doesn't exist in courses, 00:20:28.600 |
that we're not yet that good at teaching in a course. 00:20:31.800 |
And I think after exhausting the efficient coursework, 00:21:09.000 |
to whatever, doing your own fun hobby project. 00:21:16.040 |
I find this to be true at the individual level 00:21:20.880 |
For a company to become good at machine learning, 00:21:56.920 |
- I think that there are multiple good options 00:22:12.840 |
Or if someone gets a job at a top organization, 00:22:20.360 |
There are some things you still need a PhD to do. 00:22:38.680 |
Where are the places where you can get a job? 00:22:40.240 |
Where are the places where you can get into a PhD program? 00:22:42.200 |
And kind of weigh the pros and cons of those. 00:22:44.800 |
- So just to linger on that for a little bit longer, 00:23:03.280 |
that already have huge teams of machine learning engineers. 00:23:09.480 |
that kind of like Google Research, Google Brain. 00:23:22.280 |
Is there anything that stands out between those options 00:23:29.880 |
- I think the thing that affects your experience more 00:23:31.960 |
is less are you in this company versus that company 00:23:37.280 |
I think the thing that affects your experience most 00:23:38.800 |
is who are the people you're interacting with 00:23:42.640 |
So even if you look at some of the large companies, 00:23:46.680 |
the experience of individuals in different teams 00:23:50.200 |
And what matters most is not the logo above the door 00:23:53.360 |
when you walk into the giant building every day. 00:23:55.600 |
What matters the most is who are the 10 people, 00:24:09.600 |
We tend to become more like the people around us. 00:24:17.680 |
if you get a job at a great company or a great university, 00:24:28.320 |
And then that's actually a really bad experience. 00:24:35.160 |
For small companies, you can kind of figure out 00:24:40.880 |
if a company refuses to tell you who you will work with, 00:24:55.360 |
with great peers and great people to work with. 00:25:02.160 |
We don't consider too rigorously or carefully. 00:25:13.840 |
So it's not about whether you learn this thing or that thing, 00:25:18.840 |
or like you said, the logo that hangs up top, 00:25:26.480 |
of finding, just like finding the right friends 00:25:31.240 |
and somebody to get married with and that kind of thing. 00:25:34.560 |
It's a very hard search, it's a people search problem. 00:25:50.880 |
"Well, maybe that's 'cause you don't have a good answer." 00:26:03.480 |
That's a really important signal to consider. 00:26:11.760 |
I think I gave like an hour-long talk on career advice, 00:26:15.440 |
including on the job search process and some of these.