Yann LeCun: Can Neural Networks Reason? | AI Podcast Clips
Chapters
0:00 Can Neural Networks Reason
0:15 Discrete Reasoning
2:15 Working Memory
3:44 Transformer
5:18 Energy Minimization
00:00:02.580 | - Do you think neural networks can be made to reason?
00:00:44.560 | that don't use kind of gradient information, if you want.
00:00:52.160 | It's just that it's incompatible with learning
00:01:00.240 | deep learning has been kind of looked at with suspicion
00:01:12.320 | the kind of math you do in electrical engineering than the kind of math you do in computer science.
00:01:17.360 | And nothing in machine learning is exact, right?
00:01:28.800 | And you can prove that an algorithm is correct, right?
00:01:31.760 | Machine learning is the science of sloppiness, really.
00:01:38.080 | So, okay, maybe let's feel around in the dark
00:01:46.360 | or a system that works with continuous functions
00:01:58.840 | build on previous knowledge, build on extra knowledge, generalize outside of any training set ever built.
00:02:23.640 | whose main research interest is actually exactly that, right?
00:02:26.800 | So what you need to have is a working memory.
00:02:29.880 | So you need to have some device, if you want, some subsystem that can store a relatively large number
00:02:48.440 | there are kind of three main types of memory.
00:02:50.320 | One is the sort of memory of the state of your cortex.
00:02:56.120 | And that sort of disappears within 20 seconds.
00:02:58.280 | You can't remember things for more than about 20 seconds or a minute if you don't have any other form of memory.
00:03:04.960 | The second type of memory, which is longer term,
00:03:08.840 | So you can, you know, you came into this building, you remember where the exit is, where the elevators are.
00:03:20.280 | You might remember something about what I said.
00:03:30.200 | And then the longer term memory is in the synapse,
00:03:38.320 | is that you want the hippocampus-like thing, right?
00:03:46.520 | neural Turing machines and stuff like that, right?
00:03:48.520 | And now with transformers, which have sort of a memory in their kind of self-attention system,
00:03:59.800 | Another thing you need is some sort of network
00:04:05.720 | get the information back, and then kind of crunch on it,
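The self-attention memory mentioned here can be sketched as a soft key-value lookup: a query is compared against stored keys, and the read-out is a similarity-weighted blend of the stored values. A minimal illustration with made-up toy vectors, not any particular transformer's implementation:

```python
import numpy as np

def attention_read(query, keys, values):
    """Soft key-value memory read, in the spirit of self-attention:
    compare the query to every stored key, then return a
    similarity-weighted average of the stored values."""
    scores = keys @ query / np.sqrt(len(query))  # scaled dot-product similarity
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax over memory slots
    return weights @ values                      # blend of the stored values

# Toy memory: three slots with 4-dim keys and 2-dim values (invented numbers).
keys = np.eye(3, 4)
values = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
query = np.array([5.0, 0.0, 0.0, 0.0])           # strongly matches slot 0
out = attention_read(query, keys, values)
```

Because the query lines up with the first key, the read-out is dominated by the first stored value, with small contributions from the others.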
00:04:35.680 | so that seems to be too small to contain the knowledge
00:04:43.800 | - Well, a transformer doesn't have this idea of recurrence.
00:04:51.680 | - But recurrence would build on the knowledge somehow.
00:04:59.280 | and expand the amount of information, perhaps,
00:05:04.880 | But is this something that just can emerge with size?
00:05:09.320 | Because it seems like everything we have now is too small.
00:05:17.600 | way, I mean, sort of the original memory network maybe had something like the right architecture,
00:05:25.080 | so that the memory contains all of Wikipedia,
00:05:29.640 | - So there's a need for new ideas there, okay.
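The access-memory, crunch, go-back loop discussed above, combined with the memory-network idea of attending over an external store, can be caricatured as a multi-hop soft read. The dimensions, update rule, and normalization below are illustrative assumptions, not the original memory network architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_read(query, memory, hops=3):
    """Caricature of a memory-network controller: attend over an
    external memory, fold the retrieved vector back into the state,
    and read again -- recurrent 'crunching' on retrieved facts."""
    state = query
    for _ in range(hops):
        weights = softmax(memory @ state)      # address slots by similarity
        retrieved = weights @ memory           # soft read of memory contents
        state = state + retrieved              # combine with current state
        state = state / np.linalg.norm(state)  # keep the state bounded
    return state

memory = np.eye(4)  # four one-hot memory slots (toy stand-in for a real store)
state = multi_hop_read(np.array([1.0, 0.3, 0.0, 0.0]), memory)
```

Each hop sharpens the state toward the memory slot most similar to the query; the recurrence is what lets the controller build on what it just retrieved.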
00:05:41.600 | and it's based on, let's call it, energy minimization.
00:06:26.640 | You have a model of what's gonna happen in the world
00:06:30.080 | And that allows you to, by energy minimization,
00:06:34.360 | that optimizes a particular objective function,
00:06:48.080 | And perhaps what led to the ability of humans to reason is the fact that, or species that appear before us
00:07:04.160 | And so it's the same capacity that you need to have.
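Planning by energy minimization, as described here, can be illustrated with a deliberately tiny world model: a 1-D state that moves by the sum of the chosen actions, and an action sequence found by gradient descent on a goal-distance-plus-effort energy. All of the numbers and the model itself are toy assumptions:

```python
import numpy as np

def plan_by_minimization(start, goal, steps=10, iters=300, lr=0.01):
    """Toy 'planning as energy minimization': the world model is just
    x_final = start + sum(actions), and we run gradient descent on
    energy = (x_final - goal)**2 + 0.01 * sum(actions**2)."""
    actions = np.zeros(steps)
    for _ in range(iters):
        final = start + actions.sum()                 # roll the model forward
        grad = 2.0 * (final - goal) + 0.02 * actions  # d(energy)/d(action_t)
        actions -= lr * grad
    return actions, start + actions.sum()

actions, final = plan_by_minimization(0.0, 5.0)
```

The effort penalty keeps the plan from overshooting; minimizing the same energy through a learned (rather than hand-written) world model is the harder version of this idea.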
00:07:21.280 | is not a useful way to think about knowledge?
00:07:24.840 | - Graphs are a little brittle, or logic representation.
00:07:35.840 | is a little too rigid and too brittle, right?
00:07:45.560 | So a rule, if you have this and that symptom,
00:08:09.480 | So there is, I mean, certainly a lot of interesting work
00:08:15.960 | The main issue with this is knowledge acquisition.
00:08:18.400 | How do you reduce a bunch of data to a graph of this type?
00:08:26.360 | on the human being to encode, to add knowledge.
00:08:35.960 | do you want to represent knowledge as symbols and do you want to manipulate them with logic?
00:08:41.760 | And again, that's incompatible with learning.
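The brittleness of hard symbolic rules versus learning-compatible continuous scores can be shown side by side. The symptom names, weights, and thresholds below are purely hypothetical:

```python
import math

def hard_rule(fever, cough):
    """Brittle symbolic rule: fires only when both symptoms are
    exactly present; no notion of partial or uncertain evidence."""
    return fever and cough

def soft_score(fever_level, cough_level, w_f=1.5, w_c=1.0, bias=-2.0):
    """Learning-compatible alternative: a differentiable weighted sum
    squashed to (0, 1). The weights here are invented for illustration;
    in practice they would be fit to data by gradient descent."""
    z = w_f * fever_level + w_c * cough_level + bias
    return 1.0 / (1.0 + math.exp(-z))
```

The hard rule gives nothing to learn from when a symptom is borderline; the soft score degrades gracefully and exposes gradients, which is the compatibility with learning that the rigid representation lacks.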
00:09:27.680 | and then put the result back in the same space.
00:09:29.440 | So it's this idea of working memory, basically.
00:09:41.480 | I mean, you can learn basic logic operations there.
00:09:47.960 | There's a big debate on sort of how much prior structure you have to put in for this kind of stuff to emerge.
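That last point, learning basic logic operations in a continuous space, can be demonstrated with the classic perceptron rule on a linearly separable gate such as AND (XOR would need a hidden layer). A minimal sketch:

```python
import numpy as np

def learn_logic_gate(truth_table, epochs=20):
    """Learn a logic operation from its truth table with the classic
    perceptron rule: continuous-style weight updates rather than
    hand-written symbolic rules. Converges for linearly separable
    gates such as AND and OR (not XOR)."""
    w = np.zeros(2)
    b = 0.0
    for _ in range(epochs):
        for x, target in truth_table:
            pred = 1 if np.dot(w, x) + b > 0 else 0
            err = target - pred
            w = w + err * np.asarray(x, dtype=float)  # perceptron update
            b = b + err
    return lambda x: 1 if np.dot(w, x) + b > 0 else 0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
gate = learn_logic_gate(AND)
```

No symbolic rule is ever written down; the gate's behavior emerges from repeated small weight corrections, which is the minimal version of the debate about how much prior structure such abilities require.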