Jay McClelland: Neural Networks and the Emergence of Cognition | Lex Fridman Podcast #222
Chapters
0:00 Introduction
0:43 Beauty in neural networks
5:02 Darwin and evolution
10:47 The origin of intelligence
17:29 Explorations in cognition
23:33 Learning representations by back-propagating errors
29:58 Dave Rumelhart and cognitive modeling
43:01 Connectionism
65:54 Geoffrey Hinton
67:49 Learning in a neural network
84:42 Mathematics & reality
91:50 Modeling intelligence
102:28 Noam Chomsky and linguistic cognition
116:49 Advice for young people
127:56 Psychiatry and exploring the mind
140:35 Legacy
146:24 Meaning of life
00:00:00.000 |
The following is a conversation with Jay McClelland, 00:00:12.300 |
Having written the parallel distributed processing book 00:00:27.580 |
machine learning revolution of the past 15 years. 00:00:33.500 |
please check out our sponsors in the description. 00:00:38.820 |
and here is my conversation with Jay McClelland. 00:00:54.180 |
as the most beautiful aspect about neural networks, 00:00:59.520 |
- The fundamental thing I think about with neural networks 00:01:26.100 |
cognitive psychology had just become a field. 00:01:44.540 |
It wasn't gonna tell us anything about the mind. 00:02:07.980 |
So there was a sense with cognitive psychology 00:02:11.660 |
that in understanding the sort of neuronal structure 00:02:19.980 |
And then your sense is if we study these neural networks, 00:02:25.860 |
to understanding the fundamentals of the human mind. 00:02:29.260 |
I used to think, or I used to talk about the idea 00:02:35.220 |
So Descartes, you know, thought about these things, right? 00:02:41.580 |
He was walking in the gardens of Versailles one day, 00:02:46.220 |
and he stepped on a stone, and a statue moved. 00:02:52.140 |
And he walked a little further, he stepped on another stone, 00:03:02.860 |
and he found out that they had a hydraulic system 00:03:05.820 |
that allowed the physical contact with the stone 00:03:10.620 |
to cause water to flow in various directions, 00:03:15.840 |
And he used this as the beginnings of a theory 00:03:26.460 |
And he had this notion that these little fibers 00:03:33.260 |
that people had identified that weren't carrying the blood, 00:03:39.860 |
that if you touch something, there would be pressure, 00:03:49.220 |
So he had a mechanistic theory of animal behavior. 00:03:54.220 |
And he thought that the human had this animal body, 00:04:10.540 |
So the physical world includes the body in action, 00:04:15.540 |
but it doesn't include thought, according to Descartes, 00:04:19.500 |
- And so the study of physiology at that time 00:04:22.900 |
was the study of sensory systems and motor systems 00:04:30.060 |
when you stimulated neurons and stuff like that. 00:04:33.620 |
And the study of cognition was something that, you know, 00:04:38.140 |
was tied in with abstract computer algorithms 00:04:48.660 |
and so when I'm studying cognitive psychology 00:04:53.700 |
wait a minute, the whole thing is biological, right? 00:05:13.140 |
- Well, obvious and not obvious at the same time. 00:05:18.140 |
And I think about Darwin in this context, too, 00:05:20.360 |
because Darwin knew very early on that none of the ideas 00:05:25.360 |
that anybody had ever offered gave him a sense 00:05:30.320 |
of understanding how evolution could have worked. 00:05:34.560 |
But he wanted to figure out how it could have worked. 00:05:42.540 |
And he spent a lot of time working on this idea 00:05:52.300 |
and thinking they were interesting but not knowing why, 00:05:54.620 |
and drawing more and more pictures of different birds 00:05:57.500 |
that differ slightly from each other and so on. 00:06:02.500 |
But after he figured it out, he had nightmares about it. 00:06:06.940 |
He would dream about the complexity of the eye 00:06:16.140 |
that that could have ever emerged from some sort of 00:06:20.340 |
unguided process, that it hadn't been the product of design. 00:06:31.980 |
in part because he was scared of his own ideas. 00:06:39.160 |
But then by the time the 20th century rolls around, 00:06:51.480 |
many people understand or believe that evolution produced 00:07:02.180 |
And Descartes' idea starts to seem a little wonky 00:07:09.560 |
There's the apes and the chimpanzees and the bonobos, 00:07:24.520 |
They don't, there's no hippocampus in the monkey brain. 00:07:30.160 |
Huxley had to do a surgery in front of many, many people 00:07:36.240 |
there's actually a hippocampus in the chimpanzee's brain. 00:07:40.340 |
So the continuity of the species is another element 00:07:53.640 |
that we are ourselves a total product of nature. 00:08:05.920 |
how nature could actually give rise to organisms 00:08:20.120 |
- So it's interesting because even the idea of evolution 00:08:23.000 |
is hard for me to keep all together in my mind. 00:08:30.120 |
it's hard to imagine that like the development 00:08:33.600 |
of the human eye would give me nightmares too, 00:08:39.920 |
And it's very tempting to think about kind of a growth 00:08:44.680 |
And it's like, how is it possible that such a thing 00:08:50.160 |
'Cause also me from a robotics engineering perspective, 00:09:08.680 |
that would have been equally interesting to me 00:09:10.600 |
would have been to actually study the process 00:09:20.880 |
into brain development and the exquisite sort of 00:09:27.360 |
laying down of pathways and so on that occurs in the brain. 00:09:32.320 |
And I know the slightest bit about that is not my field, 00:09:35.760 |
but there are fascinating aspects to this process 00:09:40.760 |
that eventually result in the complexity of various brains. 00:10:02.680 |
in the study of vision, the continuity between humans 00:10:20.160 |
The monkey's visual system and the human visual system, 00:10:22.960 |
extremely similar up to certain levels, of course. 00:10:35.600 |
and the first few layers of cortex or cortical areas, 00:10:40.600 |
I guess one would say, are extremely similar. 00:10:47.060 |
- Yeah, so on the cognition side is where the leap 00:11:00.320 |
or if there's other intelligent alien civilizations 00:11:06.040 |
So one special thing seems to be the origin of life itself. 00:11:09.320 |
However you define that, there's a gray area. 00:11:11.880 |
And the other leap, this is very biased perspective 00:11:24.440 |
An important one is how difficult is that leap? 00:11:33.720 |
and some apes had to touch a monolith to get it? 00:11:41.640 |
- Exactly, but it just seems one heck of a leap 00:11:51.540 |
argued that some genetic fluke occurred 100,000 years ago. 00:12:12.140 |
had this one genetic tweak that resulted in language. 00:12:20.420 |
And language then provided this special thing 00:12:40.540 |
but I think it comes along with the evolution 00:12:46.020 |
of a lot of other related things related to sociality 00:13:12.980 |
- Right, so language is a tool that allows you 00:13:25.260 |
And it's interesting to think about that one fluke, 00:13:39.260 |
Like evolution just kind of opens the door a little bit 00:13:41.540 |
and then time and selection takes care of the rest. 00:13:45.900 |
- You know, there's so many fascinating aspects 00:13:49.060 |
So we think of evolution as continuous, right? 00:13:54.060 |
We think, oh yes, okay, over 500 million years, 00:13:58.700 |
there could have been this relatively continuous changes. 00:14:12.540 |
evolutionary biologists found from the fossil record. 00:14:27.100 |
Well, suddenly on that scale is a million years or something, 00:14:38.940 |
was a very important concept in evolutionary biology. 00:14:55.220 |
We seem to have a certain kind of mindset at a certain age. 00:15:04.900 |
oh my God, how could they have thought that way? 00:15:07.220 |
So Piaget was known for this kind of stage theory 00:15:13.580 |
and suddenly those stages are so discrete, transitions. 00:15:20.820 |
And that's another thing that's always interested me 00:15:32.020 |
where something like an insight or a transition 00:15:47.620 |
And so evolutionary biology, developmental biology, 00:15:57.820 |
that have been approached in this kind of way. 00:16:02.060 |
I find both fascinating those early years of human life, 00:16:10.220 |
of the embryonic development to how from embryos 00:16:19.300 |
Again, from the engineering perspective, it's fascinating. 00:16:42.660 |
that self-assembly of a mechanism from the DNA material, 00:16:55.300 |
that just generate a system, this mushy thing 00:17:14.140 |
- Yeah, ultimately that is a very important part 00:17:22.340 |
this sort of emergence of mind from brain kind of thing. 00:17:27.340 |
- And the whole thing seems to be pretty continuous. 00:17:35.220 |
You wrote parallel distributed processing books 00:17:37.940 |
that explored ideas of neural networks in the 1980s 00:17:43.220 |
But the books you wrote with David Rumelhart, 00:17:47.180 |
who is the first author on the backpropagation paper 00:18:02.580 |
- I'm gonna start sort of with my own process 00:18:15.340 |
when I met Geoff Hinton and he came to San Diego 00:18:32.700 |
okay, I'm really interested in human cognition, 00:18:35.660 |
but this disembodied sort of way of thinking about it 00:18:40.180 |
that I'm getting from the current mode of thought about it 00:19:06.260 |
And the book was called "Explorations in Cognition." 00:19:25.420 |
I'm coming to this community where people can get together 00:19:29.020 |
and feel like they're collectively exploring ideas. 00:19:33.180 |
And it was a book that had a lot of, I don't know, 00:19:41.020 |
And Don Norman, who was the more senior figure 00:19:46.020 |
than Rumelhart at that time, who led that project, 00:19:49.820 |
always created this spirit of playful exploration of ideas. 00:19:58.340 |
But I was also still trying to get from the neurons 00:20:16.740 |
where I heard a talk by a man named James Anderson, 00:20:24.580 |
in a psychology department who had used linear algebra 00:20:29.580 |
to create neural network models of perception 00:20:41.180 |
that one could create a model that was simulating neurons, 00:20:46.180 |
not just kind of engaged in a stepwise algorithmic process 00:20:58.580 |
but it was simulating, remembering, and recalling, 00:21:03.540 |
and recognizing the prior occurrence of a stimulus 00:21:08.820 |
So for me, this was a bridge between the mind and the brain. 00:21:13.500 |
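A minimal sketch, purely illustrative, of the kind of linear-algebra memory model just described: stimulus-response patterns are stored by Hebbian outer-product increments to a weight matrix and recalled with a single matrix-vector product, so the simulated neurons are only summing weighted inputs, yet the system remembers and recognizes a previously seen stimulus. The pattern size, number of stored pairs, and noise level are arbitrary choices of mine, not anything from Anderson's work.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64  # number of simulated neurons in each pattern

def random_pattern():
    v = rng.standard_normal(n)
    return v / np.linalg.norm(v)  # unit-length activation pattern

# Store three stimulus -> response pairs in one weight matrix
# using Hebbian (outer-product) increments.
pairs = [(random_pattern(), random_pattern()) for _ in range(3)]
W = np.zeros((n, n))
for stimulus, response in pairs:
    W += np.outer(response, stimulus)

# Recall: present a noisy version of the first stored stimulus.
stimulus, response = pairs[0]
probe = stimulus + 0.2 * rng.standard_normal(n)
recalled = W @ probe

print("match to the stored response: ", round(float(recalled @ response), 3))        # close to 1
print("match to an unrelated pattern:", round(float(recalled @ random_pattern()), 3))  # near 0
```

Because random high-dimensional patterns are nearly orthogonal, the stored pairs barely interfere with one another, which is the property that let these simple linear models act as content-addressable memories.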
And I just like, and I remember I was walking 00:21:20.500 |
and I almost felt like St. Paul on the road to Damascus. 00:21:25.020 |
I said to myself, you know, if I think about the mind 00:21:31.900 |
it will help me answer the questions about the mind 00:21:51.980 |
who had been writing about neural networks since the '60s. 00:22:00.820 |
And his PhD dissertation showed up in an applicant pool 00:22:11.780 |
that Dave and Don, the two men I mentioned before, 00:22:30.620 |
who came and joined this group of postdoctoral scholars 00:22:33.820 |
that was funded by this wonderful grant that they got. 00:22:53.060 |
organized a conference at UCSD where we were. 00:22:59.620 |
And it was called Parallel Models of Associative Memory. 00:23:06.460 |
who had been thinking about these kinds of ideas 00:23:30.180 |
So let me talk about Rumelhart now for a minute, 00:23:37.740 |
So first of all, for people who are not familiar, 00:23:41.060 |
neural networks are at the core of the machine learning, 00:23:45.400 |
Geoffrey Hinton that we mentioned is one of the figures 00:23:48.660 |
that were important in the history, like yourself, 00:23:53.180 |
artificial neural networks that are then used 00:24:14.020 |
how you thought at the time about neural networks, 00:24:23.100 |
the word parallel in this comes from the idea 00:24:26.620 |
that each neuron is an independent computational unit. 00:24:40.920 |
And it's a very simple little computational unit, 00:24:51.460 |
It's in a biological medium where it's getting nutrients 00:24:58.600 |
But you can think of it as almost like a little computer 00:25:16.020 |
almost a billion of these little neurons, right? 00:25:25.460 |
So it's like instead of just a single central processor 00:25:29.900 |
that's engaged in chug, chug, one step after another, 00:25:34.360 |
we have a billion of these little computational units 00:25:44.460 |
maybe you can comment, it seems to me, even still to me, 00:25:49.160 |
quite a revolutionary way to think about computation 00:25:55.140 |
of theoretical computer science alongside of that, 00:25:58.060 |
where it's very much like sequential computer. 00:26:06.440 |
why don't we take a really dumb, very simple computer 00:26:11.380 |
and just have a lot of them interconnected together? 00:26:14.380 |
And they're all operating in their own little world 00:26:23.440 |
trying to understand how things like certain characteristics 00:26:31.620 |
- That's quite a revolutionary way of thinking, I would say. 00:26:47.560 |
not sort of knowing how we kind of get all the way there, 00:26:57.420 |
at the core of the questions that everybody's asking 00:27:02.840 |
But if I could just play this out a little bit, 00:27:15.380 |
is a set of, you could think of it biologically 00:27:26.400 |
Each one had, each collection has maybe 10,000 neurons in it 00:27:40.040 |
but others are closer to the biological brain 00:27:49.600 |
we have thousands of neurons or tens of thousands maybe. 00:27:54.440 |
Well, in the brain, we probably have millions in each layer, 00:27:59.440 |
but we're getting sort of similar in a certain way, right? 00:28:02.780 |
And then we think, okay, at the bottom level, 00:28:13.320 |
They respond to the amount of light of a certain wavelength 00:28:24.600 |
And then there's several further stages going up, 00:28:47.980 |
there's a cell phone, there's a water bottle. 00:29:02.240 |
they are doing this massively parallel computation 00:29:08.560 |
in each of those layers is thought of as computing 00:29:17.320 |
simultaneously with all the other ones in the same layer. 00:29:28.200 |
one activation pattern that's computed in a single step. 00:29:36.200 |
but it's still that parallel and distributed processing. 00:29:53.880 |
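As a concrete, hedged illustration of what "one activation pattern per layer, computed in a single step" looks like in code: below is a toy feedforward stack in which every unit in a layer takes a weighted sum of the previous layer's pattern, and the whole layer updates at once as a single matrix multiply. The layer sizes, random weights, and tanh nonlinearity are placeholders of mine, not a model of any real visual pathway.

```python
import numpy as np

rng = np.random.default_rng(1)

def layer(inputs, weights):
    # Each row of `weights` holds one unit's incoming connection strengths;
    # all units in the layer update in parallel in one matrix product.
    return np.tanh(weights @ inputs)

sizes = [256, 128, 64, 10]  # input pattern -> two intermediate layers -> output pattern
weights = [rng.standard_normal((m, n)) / np.sqrt(n)
           for n, m in zip(sizes[:-1], sizes[1:])]

activation = rng.standard_normal(sizes[0])  # the input ("retinal") activation pattern
for W in weights:
    activation = layer(activation, W)       # one new activation pattern per layer

print(activation.shape)  # the final pattern over the 10 output units
```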
You can start getting all the beautiful things 00:30:02.280 |
but it's Parallel and Something Associative Memory 00:30:04.440 |
and so on, very exciting, technical and exciting title. 00:30:08.700 |
And you started talking about Dave Rumelhart. 00:30:17.280 |
Can you tell me about him, his ideas, his mind, 00:30:35.920 |
and his father was the editor of the newspaper. 00:30:58.520 |
They competed in sports and they competed in mind games. 00:31:13.880 |
He went at a younger age than most people do to college 00:31:21.440 |
at the University of South Dakota and majored in mathematics. 00:31:24.960 |
And I don't know how he got interested in psychology 00:31:29.960 |
but he applied to the mathematical psychology program 00:31:34.520 |
at Stanford and was accepted as a PhD student 00:31:37.800 |
to study mathematical psychology at Stanford. 00:31:40.400 |
So mathematical psychology is the use of mathematics 00:32:23.240 |
what the probability that the subject will be correct 00:32:30.560 |
So it's a use of mathematics to descriptively characterize 00:32:40.000 |
And Stanford at that time was the place where 00:32:48.720 |
mathematical thinkers who were also connected 00:32:53.600 |
who brought a lot of really exciting ideas onto the table. 00:33:07.200 |
He was a very strong student within that program. 00:33:11.420 |
And he got this job at this brand new university 00:33:19.120 |
in San Diego in 1967 where he's one of the first 00:33:23.080 |
assistant professors in the Department of Psychology at UCSD. 00:33:44.200 |
mathematical modeling, but he had gotten interested 00:33:49.200 |
in cognition, he'd gotten interested in understanding 00:34:11.080 |
like how would we know if we really understood something? 00:34:23.640 |
So for example, one of his favorite things at that time was, 00:34:32.840 |
when she heard the familiar jingle of the good humor man. 00:34:38.400 |
She remembered her birthday money and ran into the house. 00:34:45.680 |
Well, there's a couple of ideas you could have, 00:34:50.200 |
but the most natural one is that the good humor man 00:35:00.160 |
so she's gonna run into the house and get her money 00:35:03.960 |
It's a huge amount of inference that has to happen 00:35:06.160 |
to get those things to link up with each other. 00:35:08.560 |
And he was interested in how the hell that could happen. 00:35:13.160 |
And he was trying to build good old-fashioned AI style 00:35:47.400 |
to actually build something that looks like a web 00:35:55.040 |
something like understanding, whatever the heck that is. 00:36:00.980 |
that they grappled with at the end of that book 00:36:03.360 |
that I was describing, Explorations in Cognition. 00:36:14.320 |
- By the way, that's called good old-fashioned AI now. 00:36:37.120 |
of the recognition that this wasn't all working. 00:36:39.640 |
Anyway, so he started thinking in terms of the idea 00:36:51.480 |
to integrate multiple simultaneous constraints 00:36:55.280 |
in a way that would be mutually influencing each other. 00:37:05.000 |
first time I read it, I thought, oh, well, yeah, 00:37:11.960 |
But after a while, it just got under my skin. 00:37:15.240 |
And it was called an Interactive Model of Reading. 00:37:27.520 |
to read, our interpretation of what's coming off the page 00:37:32.520 |
when we read, at every level of analysis you can think of 00:37:40.960 |
actually depends on all the other levels of analysis. 00:37:45.920 |
So what are the actual pixels making up each letter? 00:37:54.000 |
And what do those pixels signify about which letters 00:38:05.640 |
And what do those words tell us about what ideas 00:38:12.560 |
And so he had this model where we have these little tiny 00:38:36.440 |
And at that time, his idea was there's this set of experts. 00:38:43.140 |
There's an expert about how to construct a line 00:38:50.700 |
which sets of lines go together to make which letters, 00:38:53.280 |
and another one about which letters go together 00:38:59.580 |
and another one about how the meanings fit together, 00:39:04.180 |
And all these experts are looking at this data, 00:39:06.220 |
and they're updating hypotheses at other levels. 00:39:12.700 |
So the word expert can tell the letter expert, 00:39:17.260 |
because I think there should be a word the here, 00:39:20.580 |
and the bottom-up sort of feature-to-letter expert 00:39:23.620 |
can say, I think there should be a T there too, 00:39:28.700 |
And so there's a top-down, bottom-up, interactive process, 00:39:32.600 |
but it's going on at all layers simultaneously. 00:39:34.860 |
So everything can filter all the way down from the top, 00:39:38.900 |
and it's a completely interactive, bidirectional, 00:39:45.180 |
- That is somehow, because of the abstractions, 00:39:47.740 |
it's hierarchical, so there's different layers 00:39:51.460 |
of responsibilities, different levels of responsibilities. 00:39:54.700 |
First of all, it's fascinating to think about it 00:40:02.140 |
of a neural network, or something like a neural network, 00:40:06.860 |
that work on letters, and then the letters become words, 00:40:14.780 |
that from that kind of hierarchical structure 00:40:34.620 |
one for the features, and one for the letters, 00:40:36.860 |
and one for how the letters make the words, and so on, 00:40:43.060 |
sort of evaluating various propositions about, 00:40:48.980 |
going to be one that looks like the letter T, and so on? 00:40:59.400 |
and hearing about Jim Anderson's linear algebra book, 00:41:12.720 |
which just would have their connection weights 00:41:24.160 |
called the Interactive Activation Model of Letter Perception, 00:41:41.860 |
but now we built it out of a set of neuron-like 00:41:45.000 |
processing units that are just connected to each other 00:41:52.220 |
has a connection to the unit for the letter T 00:41:56.200 |
and the letter I in the second position, so on, 00:41:59.920 |
and because these connections are bidirectional, 00:42:03.720 |
if you have prior knowledge that it might be the word time, 00:42:08.800 |
that starts to prime the letters and the features, 00:42:12.000 |
and if you don't, then it has to start bottom-up, 00:42:24.240 |
they can convergently result in an emergent perception, 00:42:27.720 |
and that was the piece of work that we did together 00:42:32.720 |
that sort of got us both completely convinced 00:42:44.560 |
was going to be able to actually address the questions 00:42:48.460 |
that we were interested in as cognitive psychologists. 00:42:50.800 |
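A drastically simplified sketch in the spirit of the model being described: letter units and word units excite one another in both directions, so word-level knowledge can fill in a letter the input leaves obscured. The three-word vocabulary, rates, and iteration count are illustrative choices of mine, not the parameters of the published interactive activation model, and the feature level is omitted entirely.

```python
import numpy as np

WORDS = ["TIME", "TAME", "CAVE"]
LETTER_UNITS = sorted({(p, w[p]) for w in WORDS for p in range(4)})

letter_act = {u: 0.0 for u in LETTER_UNITS}
word_act = {w: 0.0 for w in WORDS}

# Bottom-up evidence: T, M, E are seen clearly; the letter in position 1
# is completely obscured and gets no external support at all.
external = {(0, "T"): 1.0, (2, "M"): 1.0, (3, "E"): 1.0}

def step(up=0.1, down=0.1, inhibit=0.05, decay=0.1):
    # Bottom-up: each letter unit excites every word containing it in that position.
    word_net = {w: sum(letter_act[(p, w[p])] for p in range(4)) for w in WORDS}
    total = sum(max(a, 0.0) for a in word_act.values())
    for w in WORDS:  # words also compete with one another (within-level inhibition)
        word_act[w] += (up * word_net[w]
                        - inhibit * (total - max(word_act[w], 0.0))
                        - decay * word_act[w])
        word_act[w] = float(np.clip(word_act[w], -0.2, 1.0))
    # Top-down: each word unit feeds activation back down to its own letters.
    for (p, ch) in LETTER_UNITS:
        support = external.get((p, ch), 0.0) + down * sum(
            max(word_act[w], 0.0) for w in WORDS if w[p] == ch)
        letter_act[(p, ch)] += 0.1 * support - decay * letter_act[(p, ch)]
        letter_act[(p, ch)] = float(np.clip(letter_act[(p, ch)], -0.2, 1.0))

for _ in range(60):
    step()

# After settling, the obscured position shows graded top-down support
# from the word level, even though it received no bottom-up input.
print({ch: round(letter_act[(1, ch)], 3) for (p, ch) in LETTER_UNITS if p == 1})
print({w: round(a, 3) for w, a in word_act.items()})
```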
- So the algorithmic side, the optimization side, 00:42:53.160 |
those are all details, like when you first start, 00:43:17.320 |
is in the connections between the units, right? 00:43:24.780 |
There's just the connections between the units. 00:43:36.920 |
The unit for the word time isn't a unit for the word time 00:43:40.040 |
for any other reason than it's got the connections 00:43:46.040 |
Those are the units on the input that excite it 00:43:48.360 |
when it's excited that it in a sense represents 00:43:52.680 |
in the system that there's support for the hypothesis 00:44:03.120 |
the word time isn't written anywhere inside the model. 00:44:08.360 |
It's only written there in the picture we drew of the model 00:44:11.780 |
to say that's the unit for the word time, right? 00:44:21.080 |
You have to use the connections from that out 00:44:27.800 |
- That's such a, that's a counterintuitive idea. 00:44:35.020 |
This idea of connectionism, it doesn't, it's weird. 00:44:43.520 |
- Yeah, but let's go back to that CNN, right? 00:44:46.120 |
That CNN with all those layers of neuron-like processing 00:44:51.520 |
it's gonna come out and say, this is a cat, that's a dog. 00:45:02.040 |
like from the very first layer to the, you know, 00:45:07.880 |
they just get numbered after a while because they, 00:45:17.200 |
but it's a graded and continuous sort of process 00:45:24.440 |
very, very specific to much more sort of global, 00:45:28.880 |
but it's still, you know, another sort of pattern 00:45:33.980 |
And then at the output side, it says it's cat or it's a dog. 00:45:37.360 |
And when we, when I open my eyes and say, oh, that's Lex, 00:45:49.280 |
which is a member of the same species as many other dogs, 00:45:57.440 |
I don't know how to describe what it is that makes me know 00:46:01.420 |
that I'm looking at Lex or at my particular dog, right? 00:46:04.680 |
Or even that I'm looking at a particular brand of car. 00:46:09.420 |
but if I wrote you a paragraph about the car, 00:46:16.760 |
So the idea that we have propositional knowledge 00:46:36.560 |
You couldn't ever write down a set of propositions 00:47:00.140 |
You cannot, you don't read the contents of the connections. 00:47:04.060 |
The connections only cause outputs to occur based on inputs. 00:47:22.200 |
But, you know, each layer is probably equally as important 00:47:27.280 |
Like there's no reason why the cat versus dog 00:47:30.240 |
is more important than the lower level activations. 00:47:34.100 |
I mean, all of it is just this beautiful stacking 00:47:37.660 |
And we humans live in these particular layers for us. 00:47:43.400 |
to use those cat versus dog, predator versus prey, 00:47:55.300 |
you ask, are we able to introspect and convert 00:47:58.300 |
the very things that allow us to tell the difference 00:48:00.740 |
between cat and dog into logic, into formal logic. 00:48:06.620 |
I would say that's still part of the dream of symbolic AI. 00:48:10.460 |
And I've recently talked to Doug Lenat who created Cyc. 00:48:15.460 |
And that's a project that lasted for many decades 00:48:23.180 |
and still carries a sort of dream in it, right? 00:48:30.700 |
It seems like connectionism is really powerful, 00:48:34.820 |
but it also seems like there's this building of knowledge. 00:48:38.740 |
And so how do we, how do you square those two? 00:48:41.420 |
Like, do you think the connections can contain 00:48:46.940 |
of what Dave Rumelhart was thinking about, of understanding? 00:49:04.660 |
You know, I think that from the emergentist side, 00:49:34.380 |
that I wanted to build like anything into the machine. 00:49:38.280 |
But I don't like the word eliminative anymore 00:49:44.420 |
because it makes it seem like it's wrong to think 00:49:50.580 |
that there is this emergent level of understanding. 00:50:06.900 |
rather than eliminative connectionist, right? 00:50:09.460 |
Because I want to acknowledge that these higher level 00:50:28.980 |
And there was an example that Doug Hofstadter used to use 00:50:36.660 |
Just the idea that we can think about sand dunes 00:50:41.300 |
as entities and talk about like how many there are even, 00:50:51.380 |
but we also know that a sand dune is a very fluid thing. 00:50:56.380 |
It's a pile of sand that is capable of moving around 00:51:10.100 |
And if we think about our thoughts as like sand dunes, 00:51:28.140 |
yes, they exist as such, but they also, you know, 00:51:33.140 |
we shouldn't treat them as completely monolithic entities 00:51:41.540 |
sort of all of the stuff that allows them to change 00:51:58.620 |
then it doesn't mean that the contents of thought 00:52:06.960 |
but it's more fluid maybe than is easier to capture 00:52:15.380 |
- Yeah, that's a heck of a sort of thing to put 00:52:18.420 |
at the top of a resume, radical emergentist connectionist. 00:52:44.380 |
It seems like maybe all of reality is emergent. 00:53:06.780 |
that start looking very quickly like organisms 00:53:10.260 |
that you forget how the actual thing operates. 00:53:13.620 |
They start looking like they're moving around, 00:53:21.900 |
And it seems like maybe it's something about the human mind 00:53:24.980 |
that wants to operate in some layer of the emergent 00:53:35.240 |
also it seems like unfair to eliminate the magic 00:53:43.040 |
Like eliminate the fact that that emergent is real. 00:53:54.140 |
- Yeah, because it seemed like that was trying to say 00:54:13.480 |
many people have confronted that possibility over time, 00:54:17.760 |
but it's still important to accept it as magic, right? 00:54:30.260 |
I think of others who have appreciated the role of magic, 00:54:38.040 |
of actual trickery in creating illusions that move us. 00:54:49.860 |
give rise to something much deeper than that. 00:55:01.020 |
we'll just accept it as given that that occurs. 00:55:09.900 |
We won't try to really, really, really deeply understand 00:55:16.660 |
Okay, but you worked closely with Dave Rumelhart. 00:55:38.140 |
and his demise was actually one of the most poignant 00:55:43.720 |
and relevant tragedies relevant to our conversation. 00:55:59.720 |
He started to undergo a progressive neurological condition 00:56:14.980 |
That is to say his particular course isn't fully understood 00:56:19.760 |
because brain scans weren't done at certain stages 00:56:28.140 |
and no autopsy was done or anything like that. 00:56:33.140 |
So we don't know as much about the underlying pathology 00:56:37.640 |
as we might, but I had begun to get interested 00:56:42.640 |
in this neurological condition that might have been 00:56:52.500 |
as my own efforts to understand another aspect 00:57:09.380 |
The disorder is something my colleagues and collaborators 00:57:17.100 |
So it's a specific form of loss of mind related to meaning, 00:57:27.600 |
And it's progressive in the sense that the patient 00:57:44.680 |
either from touch, from sight, from sound, from language. 00:57:51.600 |
I hear sounds, but I don't know what they mean 00:57:55.180 |
So as this illness progresses, it starts with 00:58:21.960 |
But as it progresses, it becomes more and more striking 00:58:26.960 |
and the patient loses the ability to recognize 00:58:42.600 |
and can't recognize rabbits and rodents anymore. 00:58:55.840 |
So there was this one patient who went through 00:59:03.040 |
any four-legged animal, he would call it either a horse 00:59:08.120 |
And if it was big, he would tend to call it a horse. 00:59:25.300 |
So my collaborator in this work, Karalyn Patterson, 00:59:28.800 |
developed a test called the pyramids and palm trees test. 00:59:33.320 |
So you give the patient a picture of pyramids 00:59:39.520 |
and they have a choice, which goes with the pyramids? 00:59:46.580 |
And she showed that this wasn't just a matter of language 00:59:50.980 |
because the patient's loss of this ability shows up 01:00:00.220 |
The pictures, they can't put the pictures together 01:00:05.280 |
They can't relate the pictures to the words either. 01:00:15.780 |
And so that's why it's called semantic dementia. 01:00:30.460 |
a pattern of activation represents the concepts. 01:00:39.220 |
And then, so the difference between the dog and the goat 01:00:42.380 |
sort of is no longer part of the pattern anymore. 01:00:49.340 |
And we understand that in the way the models work and learn. 01:00:57.300 |
So on the one hand, it's a fascinating aspect 01:01:07.180 |
of distributed representation in a very nice way, 01:01:11.500 |
But at the same time, it was extremely poignant 01:01:33.420 |
thoughtful person who was willing to work for years 01:01:42.340 |
to solve a hard problem, he starts to disappear. 01:01:47.340 |
And there was a period of time when it was like, 01:02:12.660 |
Was he, I mean, this is one of the big scientists 01:02:36.180 |
the Hawking of cognitive science to me in some ways. 01:02:39.820 |
Both of them suffered from a degenerative condition. 01:02:46.200 |
In Hawking's case, it affected the motor system. 01:02:49.300 |
In Rumelhart's case, it's affecting the semantics. 01:03:18.660 |
But on the other hand, at some level, he sort of did. 01:03:28.620 |
that he had really become profoundly impaired. 01:03:35.220 |
and it wasn't just like he was distracted that day 01:03:40.180 |
So he retired from his professorship at Stanford 01:03:45.020 |
and he lived with his brother for a couple years 01:04:06.780 |
And I would spend time with him during that period. 01:04:17.140 |
And we would go bowling and he could still bowl. 01:04:37.260 |
And I said, "Okay, well, where do you wanna go?" 01:04:41.580 |
So he still had a certain amount of spatial cognition 01:04:53.020 |
And he couldn't come up with any of the words 01:04:56.860 |
but he knew where on the menu the thing was that he wanted. 01:05:05.340 |
but he knew that that's what he wanted to eat. 01:05:20.060 |
graded in certain kinds of ways but also multi-partite. 01:05:27.980 |
certain sort of partial competencies still exist 01:05:32.140 |
in the absence of other aspects of these competencies. 01:05:39.340 |
about what used to be called cognitive neuropsychology, 01:05:50.060 |
But in particular, this gradual disintegration part. 01:05:54.300 |
I'm a big believer that the loss of a human being 01:05:58.700 |
that you value is as powerful as first falling in love 01:06:03.340 |
I think it's all a celebration of the human being. 01:06:06.940 |
So the disintegration itself too is a celebration in a way. 01:06:12.180 |
But just to say something more about the scientist 01:06:17.740 |
and the backpropagation idea that you mentioned. 01:06:37.740 |
and then there was this opportunity to bring him back. 01:06:47.560 |
And Rumelhart and I had decided we wanted to do this 01:06:54.900 |
And the papers on the interactive activation model 01:06:59.420 |
that I was telling you about had just been published. 01:07:01.580 |
And we both sort of saw a huge potential for this work 01:07:07.460 |
And so the three of us started a research group 01:07:23.660 |
'Cause Geoff was known among Brits to be brilliant 01:07:28.340 |
and Francis was well connected with his British friends. 01:07:35.620 |
- And several, Paul Smolensky was one of the other postdocs. 01:08:09.300 |
"with the way you guys have been approaching this 01:08:13.220 |
"is that you've been looking for inspiration from biology 01:08:26.400 |
He said, "That's the wrong way to go about it. 01:08:31.120 |
"What you should do is you should think in terms of 01:08:35.340 |
"how you can adjust connection weights to solve a problem. 01:08:45.140 |
"So you define your problem, and then you figure out 01:08:50.140 |
"how the adjustment of the connection weights 01:08:54.260 |
And Rumelhart heard that and said to himself, 01:09:00.380 |
"Okay, so I'm gonna start thinking about it that way. 01:09:21.640 |
"I can measure how well they're doing on each image, 01:09:35.960 |
"so as to minimize my loss or reduce the error." 01:09:53.860 |
And in fact, there was an algorithm called the Delta Rule 01:10:04.380 |
in the electrical engineering department at Stanford, 01:10:08.380 |
Widrow, Bernie Widrow, and a collaborator named Hoff. 01:10:13.260 |
Anyway, so gradient descent in continuous neural networks 01:10:32.940 |
We want the output to produce a certain pattern. 01:10:36.200 |
We can define the difference between our target 01:10:41.620 |
and we can figure out how to change the connection weights 01:10:52.100 |
from earlier layers of units to the ones at a hidden layer 01:11:04.060 |
because it's just an extension of the gradient descent idea. 01:11:08.540 |
And interestingly enough, Hinton was thinking 01:11:15.500 |
So Hinton had his own alternative algorithm at the time 01:11:20.500 |
based on the concept of the Boltzmann machine 01:11:25.100 |
So the paper on the Boltzmann machine came out in, 01:11:27.760 |
learning in Boltzmann machines came out in 1985, 01:11:30.520 |
but it turned out that backprop worked better 01:11:35.820 |
than the Boltzmann machine learning algorithm. 01:11:46.180 |
And probably that name is opaque to many people. 01:11:52.660 |
What it meant was that in order to figure out 01:12:00.140 |
to the connections from the input to the hidden layer, 01:12:08.580 |
from the output layer through the connections 01:12:14.900 |
to get the signals that would be the error signals 01:12:23.100 |
It was like, well, we know what the error signals 01:12:26.340 |
Let's see if we can get a signal at the hidden layer 01:12:32.860 |
So it's backpropagating through the connections 01:12:36.460 |
from the hidden to the output to get the signals 01:12:40.540 |
to tell the hidden units how to change their weights 01:12:43.100 |
from the input, and that's why it's called backprop. 01:12:45.700 |
Yeah, so it came from Hinton having introduced the concept 01:13:02.580 |
so that they make progress towards your goal. 01:13:04.940 |
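A minimal sketch of the step being described, with made-up data and sizes: the error measured at the output layer is sent backwards through the hidden-to-output connections so that the input-to-hidden weights also receive a gradient signal, which is exactly what the single-layer delta rule could not provide on its own.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))                 # arbitrary input patterns
y = (X[:, :1] * X[:, 1:2] > 0).astype(float)      # a nonlinear target a single layer can't learn

W1 = rng.standard_normal((4, 8)) * 0.5            # input -> hidden connections
W2 = rng.standard_normal((8, 1)) * 0.5            # hidden -> output connections
lr = 1.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(5000):
    h = sigmoid(X @ W1)                           # hidden activation patterns
    out = sigmoid(h @ W2)                         # output activation patterns
    err_out = (out - y) * out * (1 - out)         # error signal at the output layer...
    err_hidden = (err_out @ W2.T) * h * (1 - h)   # ...propagated back through W2
    W2 -= lr * h.T @ err_out / len(X)             # gradient descent on both weight layers
    W1 -= lr * X.T @ err_hidden / len(X)

print("mean squared error after training:", float(np.mean((out - y) ** 2)))
```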
- So stop thinking about biology for a second, 01:13:15.820 |
You've gotten a chance to work with him in that little, 01:13:20.460 |
the set of people involved there, it's quite incredible, 01:13:52.500 |
So what kind of ideas have you learned from him? 01:14:01.220 |
what stands out to you in the full space of ideas here 01:14:06.220 |
at the intersection of computation and cognition? 01:14:27.660 |
He had two papers in 1981, just to give one example, 01:14:32.660 |
one of which was essentially the idea of transformers, 01:14:48.500 |
on semantic cognition, which inspired him and Rumelhart 01:15:01.980 |
and still I think sort of grounds my own thinking 01:15:16.040 |
He also, in a small paper that was never published 01:15:27.900 |
or maybe a couple of years even before that, I don't know, 01:15:56.660 |
you need to save the state that you had when you called it 01:16:04.940 |
And the idea was that you would save the state 01:16:11.260 |
by making fast changes to connection weights, 01:16:14.820 |
and then when you finished with the subroutine call, 01:16:22.320 |
would allow you to go back to where you had been before 01:16:35.660 |
And I always thought, okay, that's really, you know, 01:16:45.300 |
and many of them in the 1970s and early 1980s. 01:16:50.720 |
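A toy illustration of the fast-weights idea just described, with invented names and constants: the activation pattern present before the "subroutine call" is imprinted in one step into a rapidly changing set of connection weights, and afterwards a partial cue plus those fast weights reinstates the saved state.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100

# Fast weights: a second set of connection strengths that can change in a
# single step and decay quickly, alongside the ordinary slowly learned weights.
fast_W = np.zeros((n, n))

state_before_call = np.sign(rng.standard_normal(n))   # the pattern we want to save

# "Push": one large Hebbian step imprints the current state into the fast weights.
fast_W += 0.1 * np.outer(state_before_call, state_before_call)

# ... the "subroutine" runs and overwrites the units' activations ...
activations = np.sign(rng.standard_normal(n))

# "Pop": a partial cue plus the fast weights settles back to the saved pattern.
cue = state_before_call.copy()
cue[: n // 2] = 0.0                                    # half of the original pattern is missing
restored = np.sign(fast_W @ cue)

print("fraction of the saved state recovered:",
      float(np.mean(restored == state_before_call)))

fast_W *= 0.5   # the fast weights decay once the saved state is no longer needed
```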
So another thing about Geoff Hinton's way of thinking, 01:17:18.120 |
you don't get up at the board and write equations 01:17:20.280 |
like you do in everybody else's machine learning lab. 01:17:44.680 |
for what's happening as this gradient descent process. 01:17:59.760 |
And it speaks to me of the fundamentally intuitive character 01:18:13.480 |
that together with a commitment to really understanding 01:18:18.480 |
in a way that's absolutely, ultimately explicit and clear, 01:18:33.720 |
He's an example, some kind of weird mix of visual 01:18:40.640 |
Feynman is another example, different style of thinking, 01:18:58.260 |
Just having experienced that unique way of thinking 01:19:02.000 |
transforms you and makes your work much better. 01:19:13.640 |
It's not always the exact ideas that you talk about, 01:19:16.680 |
but it's the process of generating those ideas, 01:19:19.700 |
being around that, spending time with that human being, 01:19:26.760 |
as it was a little bit in your case with Geoff. 01:19:30.640 |
- Yeah, Geoff is a descendant of the logician Boole. 01:19:38.640 |
He comes from a long line of English academics. 01:19:43.640 |
And together with the deeply intuitive thinking ability 01:20:00.000 |
and I think he's mentioned it from time to time 01:20:02.360 |
in other interviews that he's had with people. 01:20:07.800 |
He's wanted to be able to sort of think of himself 01:20:11.400 |
as contributing to the understanding of reasoning itself, 01:20:25.720 |
It's about what can we conclude from what else 01:20:41.800 |
how we derive truths from givens and things like this. 01:20:48.520 |
And the work that Geoff was doing in the early to mid '80s 01:20:48.520 |
was his way of connecting with that Boolean tradition 01:21:08.200 |
probabilistic-graded constraint satisfaction realm. 01:21:31.240 |
I've always been inspired by the Boltzmann machine too. 01:21:31.240 |
It's like, well, if the neurons are probabilistic 01:21:36.720 |
rather than deterministic in their computations, 01:21:40.440 |
then maybe this somehow is part of the serendipity 01:21:45.440 |
or adventitiousness of the moment of insight, right? 01:21:54.240 |
It might not have occurred at that particular instant. 01:22:01.520 |
And that too is part of the magic of the emergence 01:22:08.120 |
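A minimal sketch of the probabilistic units being referred to, with arbitrary weights and temperature: in a Boltzmann-machine-style network a unit does not switch on deterministically; it turns on with a probability given by a sigmoid of its net input, so the very same situation can settle differently on different occasions.

```python
import numpy as np

rng = np.random.default_rng(3)

def stochastic_update(states, W, bias, temperature=1.0):
    new = states.copy()
    for i in rng.permutation(len(states)):              # visit units in random order
        net = W[i] @ new + bias[i]                       # net input from the other units
        p_on = 1.0 / (1.0 + np.exp(-net / temperature))  # probability of switching on
        new[i] = 1.0 if rng.random() < p_on else 0.0
    return new

n = 8
W = rng.standard_normal((n, n))
W = (W + W.T) / 2.0           # symmetric connections, as in a Boltzmann machine
np.fill_diagonal(W, 0.0)      # no self-connections

bias = rng.standard_normal(n)
states = (rng.random(n) < 0.5).astype(float)
for _ in range(20):
    states = stochastic_update(states, W, bias)
print(states)
```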
- Well, you're right with the Boolean lineage 01:22:09.880 |
and the dream of computer science is somehow, 01:22:14.880 |
I mean, I certainly think of humans this way, 01:22:17.440 |
that humans are one particular manifestation of intelligence, 01:22:26.200 |
The mechanisms of intelligence, the mechanisms of cognition 01:22:31.120 |
- Yeah, so I think of, I started using the phrase 01:22:43.040 |
people like Geoff Hinton and many of the people 01:23:11.000 |
But at the same time, I feel like that's where 01:23:16.000 |
a huge amount of the excitement of deep learning 01:23:27.720 |
we may be able to even go beyond what we can achieve 01:23:40.200 |
not limited in the ways that we are by our own biology. 01:23:47.000 |
- Perhaps allowing us to scale the very mechanisms 01:23:50.780 |
of human intelligence, just increase its power through scale. 01:23:55.560 |
- Yes, and I think that that, obviously that's the, 01:24:00.360 |
that's being played out massively at Google Brain, 01:24:07.240 |
at OpenAI, and to some extent at DeepMind as well. 01:24:28.240 |
- Still not as many synapses and neurons as the human brain. 01:24:32.560 |
So we still got, we're still beating them on that. 01:24:41.220 |
You write about modeling of mathematical cognition. 01:24:46.360 |
So let me first ask about mathematics in general. 01:24:54.280 |
to Mathematical Cognition, where in the introduction, 01:24:57.400 |
there's some beautiful discussion of mathematics. 01:25:04.600 |
who criticizes a narrow, formal view of mathematics 01:25:17.160 |
So from that perspective, what do you think is mathematics? 01:25:23.720 |
- Well, I think of mathematics as a set of tools 01:26:09.320 |
so as to allow the implications of certain facts 01:26:16.320 |
to then allow you to derive other facts with certainty. 01:26:28.160 |
and you know that there is an angle in the first one 01:26:36.600 |
that has the same measure as an angle in the second one, 01:26:45.560 |
adjacent to that angle in each of the two triangles, 01:26:50.080 |
the corresponding sides adjacent to that angle 01:26:57.600 |
then you can then conclude that the triangles are congruent. 01:27:02.240 |
That is to say, they have all of their properties in common. 01:27:18.540 |
In fact, you know, we built bridges out of triangles, 01:27:30.940 |
by extending these ideas about triangles a little further. 01:27:42.960 |
all of the ability to get a tiny speck of matter 01:27:53.520 |
to intersect with some tiny, tiny little body 01:28:03.520 |
is something that depends on these ideas, right? 01:28:31.060 |
these triangles or these distances or these points, 01:28:34.020 |
whatever they are, that allow for this set of tools 01:28:39.020 |
to be created that then gives human beings the, 01:28:48.500 |
that they didn't have without these concepts. 01:29:19.420 |
Natural numbers, zero, one, two, three, four, 01:29:40.060 |
there were 23 sheep, you came back with only 22. 01:29:49.220 |
- It's a fundamental problem of human society 01:29:54.660 |
the same number of sheep as you started with. 01:30:00.300 |
it allows contracts, it allows the establishment 01:30:20.100 |
sort of abstract and idealized and generalizable 01:30:26.860 |
potentially very, very grounded and concrete. 01:30:35.980 |
for the incredible achievements of the human mind 01:30:42.060 |
is the fact that humans invented these idealized systems 01:30:53.500 |
in such a way as to allow all this kind of thing to happen. 01:31:01.820 |
is the development of systems for thinking about 01:31:05.620 |
the properties and relations among sets of idealized objects. 01:31:12.300 |
And you know, the mathematical notation system 01:31:43.880 |
They're not necessarily the deep representation 01:31:53.400 |
such powerful mathematical reasoning, would you say? 01:31:58.400 |
What are some ideas you have for capturing this in a model? 01:32:02.480 |
- The insights that human mathematicians have had 01:32:05.680 |
is a combination of the kind of the intuitive 01:32:17.400 |
that makes it so that something is just like obviously true 01:32:22.400 |
so that you don't have to think about why it's true. 01:32:31.040 |
That then makes it possible to then take the next step 01:32:37.480 |
and ponder and reason and figure out something 01:32:41.960 |
that you previously didn't have that intuition about. 01:32:46.600 |
It then ultimately becomes a part of the intuition 01:32:50.560 |
that the next generation of mathematical thinkers 01:32:59.400 |
so that they can extend the ideas even further. 01:33:02.160 |
I came across this quotation from Henri Poincaré 01:33:08.040 |
while I was walking in the woods with my wife 01:33:15.860 |
in a state park in Northern California late last summer. 01:33:32.960 |
And so what for me the essence of the project 01:33:41.280 |
the intuitive connectionist resources to bear on 01:33:51.700 |
from engagement in thinking with this formal system. 01:33:59.120 |
So I think of the ability of somebody like Hinton 01:34:09.260 |
or Newton or Einstein or Rumelhart or Poincaré 01:34:30.800 |
simultaneous constraints that somehow or other 01:34:38.960 |
that it never did before and give rise to a new idea 01:34:53.700 |
How do I write down the steps of that theorem 01:34:56.440 |
that allow me to make it rigorous and certain? 01:35:08.220 |
that we're beginning to see deep learning systems do 01:35:13.220 |
of their own accord kind of gives me this feeling 01:35:40.660 |
have become really interested in thinking about, 01:35:48.700 |
with massive amounts of text can be given a prompt 01:35:53.700 |
and they can then sort of generate some really interesting, 01:36:03.900 |
And there's kind of like a sense that they've somehow 01:36:16.820 |
all of the particulars of all of the billions 01:36:21.520 |
and billions of experiences that went into the training data 01:36:25.440 |
that gives rise to something like this sort of 01:36:47.800 |
as a input to get it to start to generate its own thoughts. 01:36:52.800 |
And to me that sort of represents the potential 01:37:03.360 |
I don't know if you find them as captivating as, 01:37:06.040 |
you know, on the deep mind side with AlphaZero, 01:37:17.680 |
It feels very like there's brilliant moments of insight. 01:37:34.500 |
that has the intuition of looking at a board, 01:37:42.580 |
And the next few positions, how good are those? 01:37:47.860 |
Grandmasters have this and understanding positionally, 01:37:54.900 |
how can it be improved without doing this full, 01:37:58.180 |
like deep search, and then maybe doing a little bit 01:38:02.320 |
of what human chess players call calculation, 01:38:05.760 |
which is the search, taking a particular set of steps 01:38:11.040 |
But there is moments of genius in those systems too. 01:38:26.360 |
Yes, and I think that, I think Demis Hassabis is, 01:38:50.120 |
to kind of collaborate with some of those guys at DeepMind. 01:38:54.920 |
So I think though that what I like to really emphasize here 01:39:12.540 |
is that philosophers and logicians going back 01:39:21.260 |
three or even a little more than 3,000 years ago 01:39:30.060 |
And gradually the whole idea about thinking formally 01:39:42.940 |
And it preceded Euclid, certainly pre-Euclid, 01:39:50.500 |
certainly present in the work of Thales and others. 01:39:58.740 |
But Euclid's elements were the kind of the touch point 01:40:15.240 |
within which these objects were characterized 01:40:19.180 |
and the system of inference that allowed new truths 01:40:49.920 |
who is capable of thinking in this abstract formal way 01:41:08.700 |
that we now begin to think of our understanding 01:41:13.180 |
So we immerse ourselves in a particular language, 01:41:18.180 |
in a particular world of objects and their relationships, 01:41:24.800 |
And we develop intuitive understanding of the real world. 01:41:29.200 |
In a similar way, we can think that what academia 01:41:34.200 |
has created for us, what those early philosophers 01:41:49.340 |
of these schools of thought, modes of thought 01:42:06.260 |
that systematic thought is the essential characteristic 01:42:28.640 |
- Would you say it's more fundamental than like language? 01:42:39.520 |
is it unfair to draw a line between mathematical cognition 01:42:48.640 |
- I think that's a very interesting question. 01:42:53.960 |
that I'm actually very interested in right now. 01:42:56.320 |
But I think the answer is, in important ways, 01:43:35.120 |
His father was a professor of rabbinical studies 01:43:38.120 |
at a small rabbinical college in Philadelphia. 01:43:55.480 |
and brought to the effort to understand natural language 01:44:02.560 |
this profound engagement with these formal systems. 01:44:07.560 |
And I think that there was tremendous power in that 01:44:25.360 |
but that, and I'm gonna use the word but there, 01:44:32.120 |
the actual intuitive knowledge of these things 01:44:49.160 |
who was actually trained in the same linguistics department 01:44:53.020 |
So what Lila discovered was that the intuitions 01:45:00.640 |
that linguists had about even the meaning of a phrase, 01:45:11.120 |
but about what they thought a phrase must mean 01:45:18.800 |
of an ordinary person who wasn't a formally trained thinker. 01:45:23.800 |
And well, it recently has become much more salient. 01:45:29.100 |
I happen to have learned about this when I myself 01:45:31.480 |
was a PhD student at the University of Pennsylvania, 01:45:37.480 |
with all of my other thinking about these things. 01:46:17.140 |
to be more organized around the systematicity 01:46:26.780 |
and ability to be conformant with the principles 01:46:38.560 |
of the natural human mind without that immersion. 01:46:48.700 |
actually take you away from the natural operation 01:47:04.900 |
and so-called knowledge that we consider private, 01:47:29.220 |
what to believe about certain kinds of things. 01:47:39.420 |
Well, they are the product of this sort of immersion 01:47:57.280 |
- Does that limit you from having a good model 01:48:05.900 |
- So when you look at mathematical or linguistics, 01:48:16.240 |
Are you, when you're focusing on mathematical thinking, 01:48:24.380 |
I think that's a great way of characterizing it. 01:48:38.460 |
and another concept called the expert blind spot. 01:48:46.180 |
So the expert blind spot is much more prosaic-seeming 01:48:59.200 |
when they try to communicate their understanding 01:49:03.580 |
And that is that things are self-evident to them 01:49:42.060 |
God made the natural numbers, all else is the work of man, 01:49:50.380 |
that somehow or other, the basic fundamentals 01:49:56.460 |
of discrete quantities being countable and enumerable 01:50:22.340 |
There was a long period of time where the natural numbers 01:50:27.340 |
were considered to be a part of the innate endowment 01:50:31.860 |
of core knowledge or to use the kind of phrases 01:51:00.540 |
and sort of like study those few people who still exist 01:51:08.940 |
and where a certain mode of thinking about language itself 01:51:21.540 |
so it becomes so second nature that you don't know 01:51:46.120 |
Some of the students in the class sort of like, 01:51:49.100 |
they get it, they start to get the way of thinking 01:51:58.020 |
But most of the students who don't kind of engage 01:52:05.460 |
And we think, oh, that man must be brilliant. 01:52:17.780 |
That makes him so that he or she could have that insight. 01:52:26.660 |
biological individual differences completely, 01:52:35.740 |
it was that difference in the dinner table conversation 01:52:44.820 |
that made it so that he had that cast of mind. 01:52:48.020 |
- Yeah, and there's a few topics we talked about 01:52:53.740 |
'Cause I wonder, the better I get at certain things, 01:52:57.460 |
we humans, the deeper we understand something, 01:53:07.180 |
We talked about David and his degenerative mind. 01:53:26.300 |
Like what, if I can, having thought about language 01:53:38.180 |
What is in my blind spot and how big is that? 01:53:44.220 |
out of your deep structure that you formed for yourself 01:54:00.540 |
How is your mind less powerful than it used to be 01:54:04.100 |
or more powerful or different, powerful in different ways? 01:54:10.820 |
'cause we're living, we're looking at the world 01:54:18.460 |
but it seems necessary if you want to make progress. 01:54:22.820 |
- You know, one of the threads of psychological research 01:54:40.980 |
aren't necessarily actually part of the process 01:54:54.820 |
or even valid observations of the set of constraints 01:55:07.180 |
that we can give based on information at our disposal 01:55:11.900 |
about what might have contributed to the result 01:55:23.340 |
in a very important paper by Nisbett and Wilson 01:55:28.060 |
about the limits on our ability to be aware of the factors 01:55:33.060 |
that cause us to make the choices that we make. 01:55:38.680 |
And I think it's something that's very important 01:55:45.240 |
and it's something that we really ought to be 01:55:50.240 |
much more cognizant of in general as human beings 01:55:59.740 |
we hold the beliefs that we do and we hold the attitudes 01:56:03.120 |
and make the choices and feel the feelings that we do 01:56:15.740 |
And it's subject to our culturally transmitted 01:56:28.180 |
that we give to explain these things when asked to do so 01:56:43.440 |
is a product of culture and belief, practice. 01:56:48.080 |
- So let me ask you the big question of advice. 01:56:53.940 |
So you've lived an incredible life in terms of the ideas 01:57:00.900 |
in terms of the trajectory you've taken through your career, 01:57:04.420 |
What advice would you give to young people today 01:57:07.220 |
in high school and college about how to be more open 01:57:17.200 |
- Finding the thing that you are intrinsically motivated 01:57:24.120 |
to engage with and then celebrating that discovery 01:57:34.760 |
When I was in college, I struggled with that. 01:57:46.020 |
because I think I was interested in human psychology 01:57:51.620 |
And at that time, the only sort of information I had 01:58:00.260 |
and sort of popular psychiatry kinds of things. 01:58:03.940 |
And so, well, they were psychiatrists, right? 01:58:08.820 |
And that meant I had to go to medical school. 01:58:11.380 |
And I got to college and I find myself taking, you know, 01:58:16.260 |
the first semester of a three-quarter physics class 01:58:21.260 |
And this was so far from what it was I was interested in, 01:58:30.860 |
But I wandered about the rest of my freshman year 01:58:38.860 |
until I found myself in the midst of this situation 01:58:56.020 |
Columbia's building a gym in Morningside Heights, 01:59:00.300 |
And people are thinking, oh, the big, bad rich guys 01:59:26.860 |
and the whole university blew up and got shut down. 01:59:33.860 |
why people were behaving the way they were in this context. 01:59:44.620 |
I happened to have been taking psychology that quarter, 01:59:49.140 |
And somehow things in that space all ran together in my mind 01:59:53.540 |
and got me really excited about asking questions 02:00:00.300 |
go into the buildings and not others and things like that. 02:00:06.140 |
and I had just been wandering around aimlessly. 02:00:08.900 |
And at the different points in my career, you know, 02:00:11.900 |
when I think, okay, well, should I take this class 02:00:25.300 |
some idea that I wanna understand better, you know, 02:00:58.580 |
And I wasn't even getting honors based on my grades. 02:01:02.100 |
They just happened to have thought I was interested enough 02:01:13.300 |
through accidents of too-early-morning classes 02:01:13.300 |
and then you celebrate the fact that this happens 02:01:39.300 |
all the answers to it and I don't think I wanna, 02:01:44.060 |
I want anybody to think that you should be sort of 02:02:13.460 |
that mere mortals would ever do an experiment 02:02:15.740 |
in those sciences, except one that was in the textbook 02:02:39.060 |
and to bring together a certain set of things 02:02:51.180 |
you know, profoundly amazing musical geniuses, right? 02:02:56.180 |
They get immersed in it at an early enough point 02:03:04.900 |
So my little brother had intrinsic motivation for music 02:03:22.400 |
because he could sort of see which ones had which scratches, 02:03:31.580 |
- And he enjoyed that, that connected with him somehow. 02:03:33.860 |
- Yeah, and there was something that it fed into 02:03:40.940 |
and if you can nurture it and can let it grow 02:03:44.860 |
and let it be an important part of your life. 02:03:49.140 |
is like be attentive enough to feel it when it comes. 02:03:59.140 |
I really like tabular data, like Excel sheets. 02:04:07.820 |
I don't know how useful that is for anything. 02:04:24.180 |
but then the other part that you're mentioning, 02:04:25.900 |
which is the nurture, is take time and stay with it, 02:04:29.140 |
stay with it a while and see where that takes you in life. 02:04:33.380 |
- Yeah, and I think the motivational engagement 02:04:45.340 |
So we could call it the Mozart effect, right? 02:04:55.860 |
as the fourth member of the Family String Quartet, right? 02:04:59.220 |
And they handed him the violin when he was six weeks old. 02:05:09.720 |
So the level of immersion there was amazingly profound, 02:05:20.040 |
maybe this is where the more sort of the genetic part 02:05:38.520 |
So that's what I really consider to be the Mozart effect. 02:05:41.420 |
It's sort of the synergy of something with experience 02:05:52.080 |
So I know my siblings and I are all very different 02:06:00.260 |
We've all gone in our own different directions. 02:06:02.700 |
And I mentioned my younger brother who was very musical. 02:06:08.980 |
was like this amazing like intuitive engineer. 02:06:25.000 |
such a hugely important issue that it is today. 02:06:29.320 |
So we all sort of somehow find a different thing. 02:06:43.440 |
but it's also when that happens, where you can find that, 02:06:47.040 |
then you can do your thing and you can be excited about it. 02:06:50.800 |
So people can be excited about fitting people on bicycles 02:06:53.640 |
as well as excited about making neural networks, 02:06:56.040 |
achieve insights into human cognition, right? 02:07:09.920 |
is since I was a child, just observing people around me 02:07:25.520 |
just to think of your own career and your own life. 02:07:38.880 |
and there may be a lot of other people similar to you 02:07:51.680 |
And ultimately the whole ride, it's undirected. 02:07:58.240 |
in terms of psychiatry when you were younger? 02:08:08.080 |
and just those kind of popular psychiatry ideas. 02:08:13.080 |
And that was a dream for me early on in high school 02:08:15.840 |
to like, I hope to understand the human mind by, 02:08:20.000 |
somehow psychiatry felt like the right discipline for that. 02:08:27.200 |
Does that make you sad that psychiatry is not 02:08:43.920 |
and biochemistry is involved in the discipline of psychiatry 02:08:58.760 |
And that's why I kind of went to computer science 02:09:01.440 |
and thinking like, maybe you can explore the human mind 02:09:08.660 |
sort of the biomedical/pharmacological aspects 02:09:14.780 |
of psychiatry at that point because I didn't, 02:09:21.920 |
that I never even found out about that until much later. 02:09:40.480 |
who advised the director of the National Institute 02:09:47.600 |
And in fact, at that time, the man who came in 02:09:53.040 |
as the new director, I had been on this board for a year 02:10:11.160 |
and that's what we're gonna do with schizophrenia. 02:10:18.240 |
And we're not gonna listen to anybody's grandmother anymore. 02:10:26.200 |
is not something we're going to support any further. 02:10:30.200 |
And he completely alienated me from the institute 02:11:17.000 |
and sorry to romanticize the whole philosophical 02:11:30.080 |
In the same way that physicists are the deep thinkers 02:11:37.920 |
And I think that flag has been taken from them 02:11:51.280 |
you can like intuit about the functioning of the mind 02:12:08.360 |
where you're starting to actually be able to observe, 02:12:12.000 |
you know, do certain experiments on human beings 02:12:14.240 |
and observe how the brain is actually functioning. 02:12:25.180 |
a cognitive psychologist can become the philosopher 02:12:28.260 |
and psychiatrists become much more like doctors. 02:12:44.840 |
to do great low level mathematics and physics 02:12:52.100 |
- Yeah, I think it was Fromm and Jung more than Freud 02:12:57.100 |
that was sort of initially kind of like made me feel like, 02:13:06.680 |
I actually, when I got to college and I lost that thread, 02:13:10.640 |
I found more of it in sociology and literature 02:13:20.520 |
So I took quite a lot of both of those disciplines 02:13:25.820 |
And, you know, I was actually deeply ambivalent 02:13:31.880 |
about the psychology because I was doing experiments 02:13:39.440 |
in why people would occupy buildings during an insurrection 02:13:44.240 |
and consider, you know, be sort of like so overcommitted 02:13:56.480 |
And so I had this profound sort of like dissonance 02:14:01.200 |
between, okay, the kinds of issues that would be explored 02:14:09.280 |
in modern British literature versus what I could study 02:14:17.480 |
That got resolved when I went to graduate school 02:14:22.120 |
And so for me, that was the path out of this sort of like, 02:14:37.080 |
actual mechanistically oriented thinking about it. 02:14:40.100 |
And I think we've come a long way in that regard 02:14:46.080 |
and that you're absolutely right that nowadays 02:14:50.520 |
this is something that's accessible to people 02:14:53.080 |
through the pathway in through computer science 02:15:02.900 |
You know, you can get derailed in neuroscience 02:15:10.260 |
where you might find the cures of various conditions, 02:15:18.120 |
So it's in the systems and cognitive neuroscience 02:15:32.980 |
by having had the opportunity to fall into that space. 02:15:41.260 |
speaking of which, you happen to be a human being 02:15:48.300 |
That seems to be a fundamental part of the human condition 02:15:54.780 |
Do you think about the fact that you're going to die one day? 02:16:01.380 |
- I would say that I am not as much afraid of death 02:16:12.880 |
And I say that in part for reasons of having, you know, 02:16:19.380 |
seen some tragic degenerative situations unfold. 02:16:28.080 |
It's exciting when you can continue to participate 02:16:41.020 |
where the wave is breaking on the shore, if you like. 02:16:46.160 |
And I think about, you know, my own future potential 02:16:53.600 |
if I were to undergo, begin to suffer from dementia, 02:17:09.840 |
I would sort of gradually lose the thread of that ability. 02:17:19.960 |
for a decade after, you know, sort of having to retire 02:17:24.880 |
because one no longer has these kinds of abilities to engage. 02:17:29.880 |
And I think that's the thing that I fear the most. 02:17:35.560 |
- The losing of that, like the breaking of the wave, 02:17:40.560 |
the flourishing of the mind where you have these ideas 02:17:44.120 |
and they're swimming around, you're able to play with them. 02:17:46.720 |
- Yeah, and collaborate with other people who, you know, 02:17:51.000 |
are themselves really helping to push these ideas forward. 02:18:09.520 |
and, you know, sort of continuous sort of way 02:18:12.800 |
of thinking about most things makes it so that, 02:18:21.640 |
is less apparent than it seems to be to most people. 02:18:29.780 |
Yeah, I wonder, so I don't know if you know the work 02:18:33.880 |
of Ernest Becker and so on, I wonder what role mortality 02:18:44.840 |
what role that plays in our reasoning of the world. 02:18:49.840 |
- I think that it can be motivating to people 02:19:06.160 |
on decision making that were satisfying in a certain way 02:19:16.480 |
on whether the model fit the data perfectly or not. 02:19:21.480 |
And I could see how one could test, you know, 02:19:30.880 |
But I just realized, hey, wait a minute, you know, 02:19:34.240 |
I may only have about 10 or 15 years left here 02:19:37.800 |
and I don't feel like I'm getting towards the answers 02:19:43.360 |
while I'm doing this particular level of work. 02:19:48.640 |
okay, let's pick something that's hard, you know? 02:19:54.800 |
So that's when I started working on mathematical cognition. 02:20:03.300 |
well, I got 15 more years possibly of useful life left, 02:20:09.960 |
I'm actually getting close to the end of that now, 02:20:15.980 |
well, I probably have another five after that. 02:20:18.100 |
So, okay, I'll give myself another six or eight. 02:20:25.560 |
And so, yeah, I gotta keep thinking about the questions 02:20:37.480 |
You've done some incredible work in your life 02:20:43.000 |
When the aliens come and human civilization is long gone 02:20:51.640 |
what do you hope is the paragraph written about you? 02:21:32.360 |
other than that I'd had the right context prior to that, 02:21:35.140 |
but that I had gone ahead and followed that lead. 02:21:41.800 |
but I said in this preface that the joy of science 02:21:46.800 |
is the moment in which a partially formed thought 02:22:11.460 |
concrete piece of actual scientific progress. 02:22:22.020 |
and when Rumelhart heard Hinton talk about gradient descent 02:22:51.100 |
by finding exciting collaborative opportunities 02:23:07.700 |
So it's the old Robert Frost, road less taken. 02:23:11.640 |
So maybe, 'cause you said like this incomplete initial idea, 02:23:16.640 |
that step you take is a little bit off the beaten path. 02:23:40.220 |
was completely empirical experimental project. 02:23:44.320 |
And I wrote a paper based on the two main experiments 02:24:08.080 |
that would explain the data that I had collected. 02:24:16.480 |
So I got back a letter from the editor saying, 02:24:20.600 |
"Thank you very much, these are great experiments. 02:24:32.200 |
And so I did, I took that part out of the paper. 02:24:59.200 |
And so when I got to my assistant professorship, 02:25:11.920 |
submitted my first article to Psychological Review, 02:25:26.120 |
"You should keep thinking about it this time." 02:25:28.320 |
And then that was what got me going to think, 02:25:38.700 |
"You don't have to be, you can do it as a mere mortal." 02:25:47.600 |
don't succumb to the labels of a particular reviewer. 02:26:05.920 |
- Yeah, I'm a connectionist or a cognitive scientist 02:26:16.800 |
that can completely revolutionize in totally new areas. 02:26:34.920 |
and you wanna know why are they doing this stuff. 02:26:45.160 |
why do you think we're all doing what we're doing? 02:26:51.740 |
We seem to be very busy doing a bunch of stuff 02:26:54.480 |
and we seem to be kind of directed towards somewhere, 02:27:00.640 |
- Well, I myself think that we make meaning for ourselves 02:27:43.720 |
But I do believe that we are an emergent result 02:27:48.720 |
of a process that happened naturally without guidance 02:28:01.000 |
and that the creation of efforts to reify meaning 02:28:14.320 |
is just a part of the expression of that goal 02:28:18.240 |
that we have to not find out what the meaning is 02:28:29.640 |
So to me, it's something that's very personal, 02:28:38.160 |
it's very individual, it's like meaning will come for you 02:28:43.160 |
through the particular combination of synergistic elements 02:29:04.760 |
it's all made in a certain kind of a local context though. 02:29:08.840 |
Here I am at UCSD with this brilliant man, Rumelhart, 02:29:38.760 |
there's some kind of peculiar little emergent process 02:29:43.600 |
that then, which is basically each one of us, 02:30:04.840 |
It's an emergent process that lives for a time, 02:30:09.260 |
is defined by its local pocket and context in time and space 02:30:17.040 |
and then we celebrate how nice the stories are 02:30:23.260 |
and eventually we'll colonize, hopefully, other planets, 02:30:37.240 |
Jay, you're speaking of peculiar emergent processes 02:30:49.760 |
of cognitive science, of psychology, of computation. 02:30:57.000 |
It's a huge honor that you would talk to me today, 02:31:08.320 |
and this has been an amazing opportunity for me 02:31:11.640 |
to let ideas that I've never fully expressed before come out 02:31:16.300 |
'cause you ask such a wide range of the deeper questions 02:31:20.760 |
that we've all been thinking about for so long. 02:31:31.000 |
please check out our sponsors in the description. 02:31:36.920 |
"In the long run, curiosity-driven research works best. 02:31:40.840 |
"Real breakthroughs come from people focusing 02:31:45.040 |
Thanks for listening and hope to see you next time.