François Chollet: Limits of Deep Learning | AI Podcast Clips
What do you think are the current limits of deep learning, if we look specifically at these function approximators that try to generalize from data?
You've talked about local versus extreme generalization.
You mentioned that neural networks don't generalize well, humans do.
And you've also mentioned that extreme generalization requires something like reasoning to fill in the gaps.
So how can we start trying to build systems like that?
Deep learning models are like huge parametric models, differentiable, so continuous.
So they're trained pretty much point by point.
They're learning a continuous geometric morphing from an input vector space to an output vector space.
And because this is done point by point, a deep neural network can only make sense of points in experience space that are very close to things it has already seen in the training data.
At best, it can do interpolation across points.
But that means that in order to train your network, you need a dense sampling of the input-cross-output space, almost a point-by-point sampling, which can be very expensive if you're dealing with complex real-world problems like autonomous driving, for instance, or robotics.
It's doable if you're looking at a subset of the visual space, but even then it's still very expensive.
And it's only going to be able to make sense of things that are very close to what it has seen before.
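To make the point about local generalization concrete, here is a minimal sketch (my own illustration, not from the conversation), assuming scikit-learn is available: a network fit on a dense sampling of one region of the input space interpolates well inside that region, but its predictions typically degrade on inputs far outside it.

```python
# A network trained on a dense sampling of [0, 2*pi] interpolates there,
# but usually fails on inputs far from anything it has seen.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 2 * np.pi, size=(2000, 1))  # dense sampling of the input region
y_train = np.sin(x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(x_train, y_train)

x_inside = np.array([[1.0], [3.0], [5.0]])    # close to points seen in training
x_outside = np.array([[10.0], [20.0]])        # far from anything seen in training

print(model.predict(x_inside), np.sin(x_inside).ravel())    # usually a close match
print(model.predict(x_outside), np.sin(x_outside).ravel())  # usually far off
```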
And in contrast to that, well, of course you have human intelligence, but even if you're not looking at human intelligence, you can look at very simple rules, algorithms.
If you have a symbolic rule, it can actually apply to a very, very large set of inputs.
It is not obtained by doing a point-by-point mapping.
So for instance, if you try to learn a sorting algorithm using a deep neural network, well, you're very much limited to learning, point by point, what the sorted representation of each particular list looks like.
But instead you could have a very, very simple sorting algorithm written in a few lines.
And it can process any list at all, because it is abstract, because it is a set of rules.
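The sorting example is easy to make concrete. A handful of lines of explicit rules (here, a simple insertion sort sketch of my own) apply to any list of comparable values, with no point-by-point sampling of inputs and outputs:

```python
# A few lines of abstract rules that sort any list whatsoever.
def insertion_sort(items):
    result = []
    for item in items:
        i = 0
        while i < len(result) and result[i] < item:
            i += 1
        result.insert(i, item)  # place the item at its sorted position
    return result

print(insertion_sort([3, 1, 2]))           # [1, 2, 3]
print(insertion_sort([9.5, -4, 0, 9.5]))   # works for any comparable values, any length
```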
So deep learning is really like point-by-point geometric morphings, trained with gradient descent.
And meanwhile, abstract rules can generalize much better.
And I think the future is really to combine the two.
How do we combine good point-by-point functions with programs, which is what the symbolic AI approach offers?
I mean, obviously we're jumping into a realm where there are no good answers, just kind of ideas and intuitions and so on.
Well, if you look at the really successful AI systems today, I think they are already hybrid systems that are combining symbolic AI with deep learning.
For instance, successful robotics systems are already mostly model-based and rule-based.
At the same time, they're using deep learning as perception modules.
Sometimes they're using deep learning as a way to inject a fuzzy intuition into a rule-based process.
If you look at a system like a self-driving car, it's not just one big end-to-end neural network; that wouldn't work at all, precisely because in order to train that, you would need a dense sampling of experience space when it comes to driving, which is completely unrealistic.
Instead, the self-driving car is mostly symbolic.
It's mostly based on explicit models, in this case mostly 3D models of the environment around the car, but it's interfacing with the real world using deep learning modules.
So the deep learning there serves as a way to convert the raw sensory information into something the symbolic system can operate on.
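As a rough sketch of the hybrid pattern being described (all names here are hypothetical, not from any real self-driving stack): a learned perception module turns raw sensory input into a symbolic description, and explicit hand-written rules operate on that description.

```python
# Hypothetical hybrid pipeline: neural perception feeding a rule-based planner.
from dataclasses import dataclass

@dataclass
class Obstacle:
    distance_m: float        # symbolic facts extracted from raw pixels
    lateral_offset_m: float

def perceive(camera_frame) -> list[Obstacle]:
    """Stand-in for a trained neural network: raw sensory input -> symbolic facts."""
    # A real system would run a detector here; this sketch returns a fixed example.
    return [Obstacle(distance_m=12.0, lateral_offset_m=0.2)]

def plan(obstacles: list[Obstacle]) -> str:
    """Explicit, hand-written rules operating on the symbolic description."""
    for obs in obstacles:
        if obs.distance_m < 15.0 and abs(obs.lateral_offset_m) < 1.0:
            return "brake"
    return "keep_lane"

print(plan(perceive(camera_frame=None)))  # -> "brake"
```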
Okay, well, let's linger on that a little more.
So dense sampling from input to output, you said it's obviously very difficult.
Self-driving, for many people... let's not even talk about self-driving, let's talk about lane following.
Lane following, yeah, it's definitely a problem you can solve with an end-to-end deep learning model.
I don't know why you're jumping from the extreme so easily, because I disagree with you on that.
I think, well, it's not obvious to me that you can solve lane following.
I think in general, there are no hard limitations to what you can learn with a deep neural network, as long as the search space is rich enough, flexible enough, and as long as you have this dense sampling of the input-cross-output space.
The problem is that this dense sampling could mean anything from 10,000 examples to trillions of examples.
And if you could just give it a chance and think about what kinds of problems can be solved by getting a huge amount of data and thereby creating a dense mapping.
So let's think about natural language dialogue, the Turing test.
Do you think the Turing test can be solved with a neural network alone?
Well, the Turing test is all about tricking people into believing they're talking to a human.
And I don't think that's actually very difficult, because it's more about exploiting human perception and not so much about intelligence.
There's a big difference between mimicking intelligent behavior and actual intelligent behavior.
So, okay, let's look at maybe the Alexa Prize and so on, the different formulations of natural language conversation that are less about mimicking and more about maintaining a fun conversation that lasts for 20 minutes.
That's a little less about mimicking and more about, I mean, it's still mimicking, but it's more about being able to carry forward a conversation with all the tangents that can happen in a dialogue.
Do you think that problem is learnable with this kind of neural network that does a point-by-point mapping?
So I think it would be very, very challenging to do this with deep learning.
I don't think it's out of the question either.
The space of problems that can be solved with a large neural network: what's your sense about the space of those problems?
In practice, well, deep learning is a great fit for perception problems.
In general, any problem which is not really amenable to explicit handcrafted rules, or rules that you can generate by exhaustive search over some program space.
So perception, artificial intuition, as long as you have a sufficient training dataset.
I mean, perception, there's interpretation and understanding of the scene, which seems to be outside the reach of current perception systems.
So do you think larger networks will be able to start to understand the physics of the scene, the three-dimensional structure and relationships of objects in the scene?
Or is that really where symbolic AI has to step in?
Well, it's always possible to solve these problems with deep learning.
It's just that an explicit, rule-based, abstract model would be a far better, more compressed representation than learning just this mapping of: in this situation, this thing happens; if you change the situation slightly, then this other thing happens; and so on.
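A tiny illustration of that compression argument (my own example, not from the conversation): one explicit physical rule answers for any situation, whereas a memorized situation-to-outcome mapping only covers the cases it has stored.

```python
# Explicit rule versus a stored point-by-point mapping.
G = 9.81  # m/s^2

def fall_distance(t_seconds: float) -> float:
    """Explicit physical rule: d = 1/2 * g * t^2, valid for any t."""
    return 0.5 * G * t_seconds ** 2

# The "mapping" alternative: a table of observed (situation -> outcome) pairs.
observed = {1.0: 4.9, 2.0: 19.6, 3.0: 44.1}

print(fall_distance(2.5))   # the rule answers for an unseen situation
print(observed.get(2.5))    # the stored mapping has nothing to say: None
```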
Do you think it's possible to automatically generate the programs that would require that kind of reasoning?
The way expert systems failed is that so many facts about the world had to be hand-coded.
Do you think it's possible to learn those logical statements that are true about the world?
Do you think, I mean, that's kind of what theorem proving at a basic level is trying to do.
Yeah, except it's much harder to formulate statements about the world compared to formulating mathematical statements.
Statements about the world tend to be subjective.
However, today we just don't really know how to do it.
So it's very much a graph search or tree search problem.
And so we are limited to the sort of tree search and graph search algorithms that we have today.
But certainly I think genetic algorithms are very promising.
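As a toy sketch of what search over a program space can look like (all details here are my own assumptions, not a statement of how this should actually be done): a simple mutation-only evolutionary search over small arithmetic expression trees, scored by how well they fit a target specification.

```python
# Toy program synthesis as evolutionary search: evolve a small expression
# tree (programs built from +, *, x, and constants) to fit target data.
import random

XS = list(range(-5, 6))
TARGET = [x * x + 1 for x in XS]   # the "specification" to satisfy

OPS = ["+", "*"]
TERMINALS = ["x", 1, 2]

def random_expr(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    return (random.choice(OPS), random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, a, b = expr
    va, vb = evaluate(a, x), evaluate(b, x)
    return va + vb if op == "+" else va * vb

def fitness(expr):
    # Lower is better: squared error against the specification.
    return sum((evaluate(expr, x) - t) ** 2 for x, t in zip(XS, TARGET))

def mutate(expr):
    # Crude mutation: sometimes replace the whole program with a fresh one.
    return random_expr() if random.random() < 0.5 else expr

random.seed(0)
population = [random_expr() for _ in range(200)]
for _ in range(50):
    population.sort(key=fitness)
    parents = population[:50]  # keep the fittest programs
    population = parents + [mutate(random.choice(parents)) for _ in range(150)]

best = min(population, key=fitness)
print(best, fitness(best))  # often finds ('+', ('*', 'x', 'x'), 1) or an equivalent
```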