Most Research in Deep Learning is a Total Waste of Time - Jeremy Howard | AI Podcast Clips


00:00:00.000 | (gentle music)
00:00:02.580 | - So much of fast.ai, the students and researchers
00:00:10.960 | and the things you teach, is pragmatically minded,
00:00:15.440 | practically minded, figuring out
00:00:18.160 | how to solve real problems, and fast.
00:00:21.080 | So from your experience, what's the difference
00:00:23.440 | between theory and practice of deep learning?
00:00:28.960 | - Well, most of the research in the deep learning world
00:00:32.800 | is a total waste of time.
00:00:35.080 | - Right, that's what I was getting at.
00:00:36.320 | - Yeah.
00:00:37.160 | (laughing)
00:00:37.980 | It's a problem in science in general.
00:00:41.520 | Scientists need to be published,
00:00:44.880 | which means they need to work on things
00:00:46.760 | that their peers are extremely familiar with
00:00:49.340 | and can recognize as an advance in that area.
00:00:51.480 | So that means that they all need to work on the same thing.
00:00:54.280 | And with the thing they work on,
00:00:58.320 | there's nothing to encourage them to work on things
00:01:00.960 | that are practically useful.
00:01:04.160 | So you get just a whole lot of research,
00:01:06.440 | which is minor advances on stuff
00:01:08.540 | that's been very highly studied
00:01:09.960 | and has no significant practical impact.
00:01:14.640 | Whereas the things that really make a difference,
00:01:16.200 | like I mentioned, transfer learning:
00:01:18.080 | if we can do better at transfer learning,
00:01:20.920 | then it's this world-changing thing
00:01:23.520 | where suddenly lots more people
00:01:25.080 | can do world-class work with fewer resources and less data.
00:01:30.080 | But almost nobody works on that.
00:01:33.800 | Or another example, active learning,
00:01:36.080 | which is the study of
00:01:37.160 | how to get more out of the human beings in the loop.
00:01:41.160 | - That's my favorite topic.
00:01:42.440 | - Yeah, so active learning is great,
00:01:43.840 | but there's almost nobody working on it
00:01:46.480 | because it's just not a trendy thing right now.
00:01:49.080 | - You know what, sorry to interrupt.
00:01:52.320 | You were saying that nobody is publishing on active learning,
00:01:56.800 | but there are people inside companies,
00:01:58.720 | anybody who actually has to solve a problem,
00:02:02.080 | they're going to innovate on active learning.
00:02:04.920 | - Yeah, everybody kind of reinvents active learning
00:02:07.360 | when they actually have to make it work in practice
00:02:09.040 | because they start labeling things and they think,
00:02:11.640 | gosh, this is taking a long time and it's very expensive.
00:02:14.560 | And then they start thinking,
00:02:16.520 | well, why am I labeling everything?
00:02:17.920 | The machine's only making mistakes
00:02:20.120 | on those two classes; they're the hard ones.
00:02:22.160 | Maybe I'll just start labeling those two classes
00:02:24.120 | and then you start thinking,
00:02:25.640 | well, why did I do that manually?
00:02:26.840 | Why can't I just get the system to tell me
00:02:28.280 | which things are going to be hardest?
00:02:30.040 | It's an obvious thing to do, but yeah,
00:02:33.600 | it's just like transfer learning,
00:02:36.680 | it's understudied and the academic world
00:02:39.400 | just has no reason to care about practical results.
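
The loop described here is usually called uncertainty sampling. Below is a minimal sketch in Python with scikit-learn; the synthetic dataset, the logistic-regression model, and the budget of 20 labels per round are all illustrative assumptions, not details from the conversation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative setup: a small labeled seed set plus a large "unlabeled" pool.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
labeled = np.arange(50)        # indices we have labels for
pool = np.arange(50, 2000)     # indices still waiting to be labeled

for round_num in range(5):
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])

    # Ask the model which pool examples are hardest: the smallest gap
    # between its top two class probabilities (margin sampling).
    probs = model.predict_proba(X[pool])
    top2 = np.sort(probs, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    hardest = pool[np.argsort(margin)[:20]]

    # "Label" the hardest examples (here we simply reveal y) and repeat.
    labeled = np.concatenate([labeled, hardest])
    pool = np.setdiff1d(pool, hardest)
    print(f"round {round_num}: {len(labeled)} labeled examples")
```

Margin sampling is only one heuristic; predictive entropy or ensemble disagreement slot into the same loop. The point is the one made above: the model itself can tell you where labeling effort pays off.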
00:02:42.720 | The funny thing is,
00:02:43.560 | like I've only really ever written one paper.
00:02:45.240 | I hate writing papers and I didn't even write it.
00:02:48.040 | It was my colleague, Sebastian Ruder, who actually wrote it.
00:02:50.760 | I just did the research for it,
00:02:53.320 | but it was basically introducing
00:02:55.840 | successful transfer learning to NLP for the first time.
00:02:59.160 | And the algorithm is called ULMFiT.
00:03:01.280 | And I actually wrote it for the course,
00:03:07.200 | for the fast.ai course.
00:03:08.920 | I wanted to teach people NLP
00:03:10.560 | and I thought I only want to teach people practical stuff.
00:03:12.720 | And I think the only practical stuff is transfer learning.
00:03:15.760 | And I couldn't find any examples of transfer learning in NLP.
00:03:18.560 | So I just did it.
00:03:19.760 | And I was shocked to find that as soon as I did it,
00:03:22.520 | the basic prototype, which took a couple of days,
00:03:26.280 | smashed the state of the art
00:03:27.720 | on one of the most important data sets
00:03:29.480 | in a field that I knew nothing about.
00:03:31.920 | And I just thought, well, this is ridiculous.
00:03:35.600 | And so I spoke to Sebastian about it
00:03:39.000 | and he kindly offered to write up the results.
00:03:42.880 | And so it ended up being published in ACL,
00:03:46.560 | which is the top computational linguistics conference.
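
For reference, the ULMFiT recipe is: start from a language model pretrained on a large general corpus, fine-tune that language model on the target-domain text, then fine-tune a classifier on top of the resulting encoder. A minimal sketch using the fastai library's v2 API, with the IMDb sentiment dataset standing in as the target task; the dataset choice, epoch counts, and learning rates are illustrative, not the paper's exact settings.

```python
from fastai.text.all import *

path = untar_data(URLs.IMDB)

# Stages 1-2: fine-tune a Wikipedia-pretrained AWD-LSTM language model
# on the target corpus (labels are ignored; the text itself is the signal).
dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)
lm = language_model_learner(dls_lm, AWD_LSTM, metrics=accuracy)
lm.fine_tune(1, 1e-2)
lm.save_encoder('finetuned_enc')

# Stage 3: reuse the fine-tuned encoder inside a text classifier,
# sharing the language model's vocabulary.
dls_clf = TextDataLoaders.from_folder(path, valid='test',
                                      text_vocab=dls_lm.vocab)
clf = text_classifier_learner(dls_clf, AWD_LSTM, metrics=accuracy)
clf.load_encoder('finetuned_enc')
clf.fine_tune(4, 1e-2)
```

The published recipe also uses discriminative learning rates and gradual unfreezing in the classifier stage; `fine_tune` here is a simplified stand-in for that schedule.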
00:03:50.760 | So people do actually care once you do it,
00:03:54.080 | but I guess it's difficult, maybe, for junior researchers.
00:03:58.000 | Me, I don't care whether I get citations
00:04:01.800 | or papers or whatever.
00:04:02.960 | There's nothing in my life that makes that important,
00:04:04.840 | which is why I've never actually bothered
00:04:06.720 | to write a paper myself.
00:04:08.240 | But for people who do,
00:04:09.200 | I guess they have to pick the kind of safe option,
00:04:14.800 | which is to make a slight improvement
00:04:17.520 | on something that everybody's already working on.
00:04:22.520 | (upbeat music)