Jay McClelland: Neural Networks and the Emergence of Cognition | Lex Fridman Podcast #222

Chapters
0:00 Introduction
0:43 Beauty in neural networks
5:02 Darwin and evolution
10:47 The origin of intelligence
17:29 Explorations in cognition
23:33 Learning representations by back-propagating errors
29:58 Dave Rumelhart and cognitive modeling
43:01 Connectionism
1:05:54 Geoffrey Hinton
1:07:49 Learning in a neural network
1:24:42 Mathematics & reality
1:31:50 Modeling intelligence
1:42:28 Noam Chomsky and linguistic cognition
1:56:49 Advice for young people
2:07:56 Psychiatry and exploring the mind
2:20:35 Legacy
2:26:24 Meaning of life
00:00:00.000 | 
The following is a conversation with Jay McClelland, 00:00:12.300 | 
Having written the parallel distributed processing book 00:00:27.580 | 
machine learning revolution of the past 15 years. 00:00:33.500 | 
please check out our sponsors in the description. 00:00:38.820 | 
and here is my conversation with Jay McClelland. 00:00:54.180 | 
as the most beautiful aspect about neural networks, 00:00:59.520 | 
- The fundamental thing I think about with neural networks 00:01:26.100 | 
cognitive psychology had just become a field. 00:01:44.540 | 
It wasn't gonna tell us anything about the mind. 00:02:07.980 | 
So there was a sense with cognitive psychology 00:02:11.660 | 
that in understanding the sort of neuronal structure 00:02:19.980 | 
And then your sense is if we study these neural networks, 00:02:25.860 | 
to understanding the fundamentals of the human mind. 00:02:29.260 | 
I used to think, or I used to talk about the idea 00:02:35.220 | 
So Descartes, you know, thought about these things, right? 00:02:41.580 | 
He was walking in the gardens of Versailles one day, 00:02:46.220 | 
and he stepped on a stone, and a statue moved. 00:02:52.140 | 
And he walked a little further, he stepped on another stone, 00:03:02.860 | 
and he found out that they had a hydraulic system 00:03:05.820 | 
that allowed the physical contact with the stone 00:03:10.620 | 
to cause water to flow in various directions, 00:03:15.840 | 
And he used this as the beginnings of a theory 00:03:26.460 | 
And he had this notion that these little fibers 00:03:33.260 | 
that people had identified that weren't carrying the blood, 00:03:39.860 | 
that if you touch something, there would be pressure, 00:03:49.220 | 
So he had a mechanistic theory of animal behavior. 00:03:54.220 | 
And he thought that the human had this animal body, 00:04:10.540 | 
So the physical world includes the body in action, 00:04:15.540 | 
but it doesn't include thought, according to Descartes, 00:04:19.500 | 
- And so the study of physiology at that time 00:04:22.900 | 
was the study of sensory systems and motor systems 00:04:30.060 | 
when you stimulated neurons and stuff like that. 00:04:33.620 | 
And the study of cognition was something that, you know, 00:04:38.140 | 
was tied in with abstract computer algorithms 00:04:48.660 | 
and so when I'm studying cognitive psychology 00:04:53.700 | 
wait a minute, the whole thing is biological, right? 00:05:13.140 | 
- Well, obvious and not obvious at the same time. 00:05:18.140 | 
And I think about Darwin in this context, too, 00:05:20.360 | 
because Darwin knew very early on that none of the ideas 00:05:25.360 | 
that anybody had ever offered gave him a sense 00:05:30.320 | 
of understanding how evolution could have worked. 00:05:34.560 | 
But he wanted to figure out how it could have worked. 00:05:42.540 | 
And he spent a lot of time working on this idea 00:05:52.300 | 
and thinking they were interesting but not knowing why, 00:05:54.620 | 
and drawing more and more pictures of different birds 00:05:57.500 | 
that differ slightly from each other and so on. 00:06:02.500 | 
But after he figured it out, he had nightmares about it. 00:06:06.940 | 
He would dream about the complexity of the eye 00:06:16.140 | 
that that could have ever emerged from some sort of 00:06:20.340 | 
unguided process, that it hadn't been the product of design. 00:06:31.980 | 
in part because he was scared of his own ideas. 00:06:39.160 | 
But then by the time the 20th century rolls around, 00:06:51.480 | 
many people understand or believe that evolution produced 00:07:02.180 | 
And Descartes' idea starts to seem a little wonky 00:07:09.560 | 
There's the apes and the chimpanzees and the bonobos, 00:07:24.520 | 
They don't, there's no hippocampus in the monkey brain. 00:07:30.160 | 
Huxley had to do a surgery in front of many, many people 00:07:36.240 | 
there's actually a hippocampus in the chimpanzee's brain. 00:07:40.340 | 
So the continuity of the species is another element 00:07:53.640 | 
that we are ourselves a total product of nature. 00:08:05.920 | 
how nature could actually give rise to organisms 00:08:20.120 | 
- So it's interesting because even the idea of evolution 00:08:23.000 | 
is hard for me to keep all together in my mind. 00:08:30.120 | 
it's hard to imagine that like the development 00:08:33.600 | 
of the human eye would give me nightmares too, 00:08:39.920 | 
And it's very tempting to think about kind of a growth 00:08:44.680 | 
And it's like, how is it possible that such a thing 00:08:50.160 | 
'Cause also me from a robotics engineering perspective, 00:09:08.680 | 
that would have been equally interesting to me 00:09:10.600 | 
would have been to actually study the process 00:09:20.880 | 
into brain development and the exquisite sort of 00:09:27.360 | 
laying down of pathways and so on that occurs in the brain. 00:09:32.320 | 
And I know the slightest bit about that is not my field, 00:09:35.760 | 
but there are fascinating aspects to this process 00:09:40.760 | 
that eventually result in the complexity of various brains. 00:10:02.680 | 
in the study of vision, the continuity between humans 00:10:20.160 | 
The monkey's visual system and the human visual system, 00:10:22.960 | 
extremely similar up to certain levels, of course. 00:10:35.600 | 
and the first few layers of cortex or cortical areas, 00:10:40.600 | 
I guess one would say, are extremely similar. 00:10:47.060 | 
- Yeah, so on the cognition side is where the leap 00:11:00.320 | 
or if there's other intelligent alien civilizations 00:11:06.040 | 
So one special thing seems to be the origin of life itself. 00:11:09.320 | 
However you define that, there's a gray area. 00:11:11.880 | 
And the other leap, this is very biased perspective 00:11:24.440 | 
An important one is how difficult is that leap? 00:11:33.720 | 
and some apes had to touch a monolith to get it? 00:11:41.640 | 
- Exactly, but it just seems one heck of a leap 00:11:51.540 | 
argued that some genetic fluke occurred 100,000 years ago. 00:12:12.140 | 
had this one genetic tweak that resulted in language. 00:12:20.420 | 
And language then provided this special thing 00:12:40.540 | 
but I think it comes along with the evolution 00:12:46.020 | 
of a lot of other related things related to sociality 00:13:12.980 | 
- Right, so language is a tool that allows you 00:13:25.260 | 
And it's interesting to think about that one fluke, 00:13:39.260 | 
Like evolution just kind of opens the door a little bit 00:13:41.540 | 
and then time and selection takes care of the rest. 00:13:45.900 | 
- You know, there's so many fascinating aspects 00:13:49.060 | 
So we think of evolution as continuous, right? 00:13:54.060 | 
We think, oh yes, okay, over 500 million years, 00:13:58.700 | 
there could have been these relatively continuous changes. 00:14:12.540 | 
evolutionary biologists found from the fossil record. 00:14:27.100 | 
Well, "suddenly" on that scale is a million years or something, 00:14:38.940 | 
was a very important concept in evolutionary biology. 00:14:55.220 | 
We seem to have a certain kind of mindset at a certain age. 00:15:04.900 | 
oh my God, how could they have thought that way? 00:15:07.220 | 
So Piaget was known for this kind of stage theory 00:15:13.580 | 
and suddenly those stages are so discrete, transitions. 00:15:20.820 | 
And that's another thing that's always interested me 00:15:32.020 | 
where something like an insight or a transition 00:15:47.620 | 
And so evolutionary biology, developmental biology, 00:15:57.820 | 
that have been approached in this kind of way. 00:16:02.060 | 
I find both fascinating those early years of human life, 00:16:10.220 | 
of the embryonic development to how from embryos 00:16:19.300 | 
Again, from the engineering perspective, it's fascinating. 00:16:42.660 | 
that self-assembly of a mechanism from the DNA material, 00:16:55.300 | 
that just generate a system, this mushy thing 00:17:14.140 | 
- Yeah, ultimately that is a very important part 00:17:22.340 | 
this sort of emergence of mind from brain kind of thing. 00:17:27.340 | 
- And the whole thing seems to be pretty continuous. 00:17:35.220 | 
You wrote parallel distributed processing books 00:17:37.940 | 
that explored ideas of neural networks in the 1980s 00:17:43.220 | 
But the books you wrote with David Rumelhart, 00:17:47.180 | 
who is the first author on the backpropagation paper 00:18:02.580 | 
- I'm gonna start sort of with my own process 00:18:15.340 | 
when I met Geoff Hinton and he came to San Diego 00:18:32.700 | 
okay, I'm really interested in human cognition, 00:18:35.660 | 
but this disembodied sort of way of thinking about it 00:18:40.180 | 
that I'm getting from the current mode of thought about it 00:19:06.260 | 
And the book was called "Explorations in Cognition." 00:19:25.420 | 
I'm coming to this community where people can get together 00:19:29.020 | 
and feel like they're collectively exploring ideas. 00:19:33.180 | 
And it was a book that had a lot of, I don't know, 00:19:41.020 | 
And Don Norman, who was the more senior figure 00:19:46.020 | 
than Rumelhart at that time, who led that project, 00:19:49.820 | 
always created this spirit of playful exploration of ideas. 00:19:58.340 | 
But I was also still trying to get from the neurons 00:20:16.740 | 
where I heard a talk by a man named James Anderson, 00:20:24.580 | 
in a psychology department who had used linear algebra 00:20:29.580 | 
to create neural network models of perception 00:20:41.180 | 
that one could create a model that was simulating neurons, 00:20:46.180 | 
not just kind of engaged in a stepwise algorithmic process 00:20:58.580 | 
but it was simulating, remembering, and recalling, 00:21:03.540 | 
and recognizing the prior occurrence of a stimulus 00:21:08.820 | 
So for me, this was a bridge between the mind and the brain. 00:21:13.500 | 
And I just like, and I remember I was walking 00:21:20.500 | 
and I almost felt like St. Paul on the road to Damascus. 00:21:25.020 | 
I said to myself, you know, if I think about the mind 00:21:31.900 | 
it will help me answer the questions about the mind 00:21:51.980 | 
who had been writing about neural networks since the '60s. 00:22:00.820 | 
And his PhD dissertation showed up in an applicant pool 00:22:11.780 | 
that Dave and Don, the two men I mentioned before, 00:22:30.620 | 
who came and joined this group of postdoctoral scholars 00:22:33.820 | 
that was funded by this wonderful grant that they got. 00:22:53.060 | 
organized a conference at UCSD where we were. 00:22:59.620 | 
And it was called Parallel Models of Associative Memory. 00:23:06.460 | 
who had been thinking about these kinds of ideas 00:23:30.180 | 
So let me talk about Rumelhart now for a minute, 00:23:37.740 | 
So first of all, for people who are not familiar, 00:23:41.060 | 
neural networks are at the core of the machine learning, 00:23:45.400 | 
Geoffrey Hinton that we mentioned is one of the figures 00:23:48.660 | 
that were important in the history, like yourself, 00:23:53.180 | 
artificial neural networks that are then used 00:24:14.020 | 
how you thought at the time about neural networks, 00:24:23.100 | 
the word parallel in this comes from the idea 00:24:26.620 | 
that each neuron is an independent computational unit. 00:24:40.920 | 
And it's a very simple little computational unit, 00:24:51.460 | 
It's in a biological medium where it's getting nutrients 00:24:58.600 | 
But you can think of it as almost like a little computer 00:25:16.020 | 
almost a billion of these little neurons, right? 00:25:25.460 | 
So it's like instead of just a single central processor 00:25:29.900 | 
that's engaged in chug, chug, one step after another, 00:25:34.360 | 
we have a billion of these little computational units 00:25:44.460 | 
maybe you can comment, it seems to me, even still to me, 00:25:49.160 | 
quite a revolutionary way to think about computation 00:25:55.140 | 
of theoretical computer science alongside of that, 00:25:58.060 | 
where it's very much like sequential computer. 00:26:06.440 | 
why don't we take a really dumb, very simple computer 00:26:11.380 | 
and just have a lot of them interconnected together? 00:26:14.380 | 
And they're all operating in their own little world 00:26:23.440 | 
trying to understand how things like certain characteristics 00:26:31.620 | 
- That's quite a revolutionary way of thinking, I would say. 00:26:47.560 | 
not sort of knowing how we kind of get all the way there, 00:26:57.420 | 
at the core of the questions that everybody's asking 00:27:02.840 | 
But if I could just play this out a little bit, 00:27:15.380 | 
is a set of, you could think of it biologically 00:27:26.400 | 
Each one had, each collection has maybe 10,000 neurons in it 00:27:40.040 | 
but others are closer to the biological brain 00:27:49.600 | 
we have thousands of neurons or tens of thousands maybe. 00:27:54.440 | 
Well, in the brain, we probably have millions in each layer, 00:27:59.440 | 
but we're getting sort of similar in a certain way, right? 00:28:02.780 | 
And then we think, okay, at the bottom level, 00:28:13.320 | 
They respond to the amount of light of a certain wavelength 00:28:24.600 | 
And then there's several further stages going up, 00:28:47.980 | 
there's a cell phone, there's a water bottle. 00:29:02.240 | 
they are doing this massively parallel computation 00:29:08.560 | 
in each of those layers is thought of as computing 00:29:17.320 | 
simultaneously with all the other ones in the same layer. 00:29:28.200 | 
one activation pattern that's computed in a single step. 00:29:36.200 | 
but it's still that parallel and distributed processing. 00:29:53.880 | 
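To make this concrete, here is a minimal sketch (my own illustration, not anything from the conversation) of the parallel distributed processing picture: each layer is a collection of simple units, every unit just takes a weighted sum of its inputs and squashes it, and all the units in a layer compute their activations from the layer below at once. The layer sizes and the choice of logistic units are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, bias):
    # Every unit takes a weighted sum of its inputs and squashes it;
    # conceptually, all units in a layer do this at the same time.
    return 1.0 / (1.0 + np.exp(-(weights @ x + bias)))

# Hypothetical sizes: a small "retina", two intermediate layers, a few output units.
sizes = [16, 32, 32, 4]
params = [(rng.normal(0, 0.1, (n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

pattern = rng.random(sizes[0])      # an input pattern of activation
for W, b in params:                 # each layer computes one new pattern in a single step
    pattern = layer(pattern, W, b)
print(pattern)                      # the activation pattern at the top layer
```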
You can start getting all the beautiful things 00:30:02.280 | 
but it's Parallel and Something Associative Memory 00:30:04.440 | 
and so on, very exciting, technical and exciting title. 00:30:08.700 | 
And you started talking about Dave Rumelhart. 00:30:17.280 | 
Can you tell me about him, his ideas, his mind, 00:30:35.920 | 
and his father was the editor of the newspaper. 00:30:58.520 | 
They competed in sports and they competed in mind games. 00:31:13.880 | 
He went at a younger age than most people do to college 00:31:21.440 | 
at the University of South Dakota and majored in mathematics. 00:31:24.960 | 
And I don't know how he got interested in psychology 00:31:29.960 | 
but he applied to the mathematical psychology program 00:31:34.520 | 
at Stanford and was accepted as a PhD student 00:31:37.800 | 
to study mathematical psychology at Stanford. 00:31:40.400 | 
So mathematical psychology is the use of mathematics 00:32:23.240 | 
what the probability that the subject will be correct 00:32:30.560 | 
So it's a use of mathematics to descriptively characterize 00:32:40.000 | 
And Stanford at that time was the place where 00:32:48.720 | 
mathematical thinkers who were also connected 00:32:53.600 | 
who brought a lot of really exciting ideas onto the table. 00:33:07.200 | 
He was a very strong student within that program. 00:33:11.420 | 
And he got this job at this brand new university 00:33:19.120 | 
in San Diego in 1967 where he's one of the first 00:33:23.080 | 
assistant professors in the Department of Psychology at UCSD. 00:33:44.200 | 
mathematical modeling, but he had gotten interested 00:33:49.200 | 
in cognition, he'd gotten interested in understanding 00:34:11.080 | 
like how would we know if we really understood something? 00:34:23.640 | 
So for example, one of his favorite things at that time was, 00:34:32.840 | 
when she heard the familiar jingle of the good humor man. 00:34:38.400 | 
She remembered her birthday money and ran into the house. 00:34:45.680 | 
Well, there's a couple of ideas you could have, 00:34:50.200 | 
but the most natural one is that the good humor man 00:35:00.160 | 
so she's gonna run into the house and get her money 00:35:03.960 | 
It's a huge amount of inference that has to happen 00:35:06.160 | 
to get those things to link up with each other. 00:35:08.560 | 
And he was interested in how the hell that could happen. 00:35:13.160 | 
And he was trying to build good old-fashioned AI style 00:35:47.400 | 
to actually build something that looks like a web 00:35:55.040 | 
something like understanding, whatever the heck that is. 00:36:00.980 | 
that they grappled with at the end of that book 00:36:03.360 | 
that I was describing, Explorations in Cognition. 00:36:14.320 | 
- By the way, that's called good old-fashioned AI now. 00:36:37.120 | 
of the recognition that this wasn't all working. 00:36:39.640 | 
Anyway, so he started thinking in terms of the idea 00:36:51.480 | 
to integrate multiple simultaneous constraints 00:36:55.280 | 
in a way that would be mutually influencing each other. 00:37:05.000 | 
first time I read it, I thought, oh, well, yeah, 00:37:11.960 | 
But after a while, it just got under my skin. 00:37:15.240 | 
And it was called an Interactive Model of Reading. 00:37:27.520 | 
to read, our interpretation of what's coming off the page 00:37:32.520 | 
when we read, at every level of analysis you can think of 00:37:40.960 | 
actually depends on all the other levels of analysis. 00:37:45.920 | 
So what are the actual pixels making up each letter? 00:37:54.000 | 
And what do those pixels signify about which letters 00:38:05.640 | 
And what do those words tell us about what ideas 00:38:12.560 | 
And so he had this model where we have these little tiny 00:38:36.440 | 
And at that time, his idea was there's this set of experts. 00:38:43.140 | 
There's an expert about how to construct a line 00:38:50.700 | 
which sets of lines go together to make which letters, 00:38:53.280 | 
and another one about which letters go together 00:38:59.580 | 
and another one about how the meanings fit together, 00:39:04.180 | 
And all these experts are looking at this data, 00:39:06.220 | 
and they're updating hypotheses at other levels. 00:39:12.700 | 
So the word expert can tell the letter expert, 00:39:17.260 | 
because I think there should be a word the here, 00:39:20.580 | 
and the bottom-up sort of feature-to-letter expert 00:39:23.620 | 
can say, I think there should be a T there too, 00:39:28.700 | 
And so there's a top-down, bottom-up, interactive process, 00:39:32.600 | 
but it's going on at all layers simultaneously. 00:39:34.860 | 
So everything can filter all the way down from the top, 00:39:38.900 | 
and it's a completely interactive, bidirectional, 00:39:45.180 | 
- That is somehow, because of the abstractions, 00:39:47.740 | 
it's hierarchical, so there's different layers 00:39:51.460 | 
of responsibilities, different levels of responsibilities. 00:39:54.700 | 
First of all, it's fascinating to think about it 00:40:02.140 | 
of a neural network, or something like a neural network, 00:40:06.860 | 
that work on letters, and then the letters become words, 00:40:14.780 | 
that from that kind of hierarchical structure 00:40:34.620 | 
one for the features, and one for the letters, 00:40:36.860 | 
and one for how the letters make the words, and so on, 00:40:43.060 | 
sort of evaluating various propositions about, 00:40:48.980 | 
going to be one that looks like the letter T, and so on? 00:40:59.400 | 
and hearing about Jim Anderson's linear algebra book, 00:41:12.720 | 
which just would have their connection weights 00:41:24.160 | 
called the Interactive Activation Model of Letter Perception, 00:41:41.860 | 
but now we built it out of a set of neuron-like 00:41:45.000 | 
processing units that are just connected to each other 00:41:52.220 | 
has a connection to the unit for the letter T 00:41:56.200 | 
and the letter I in the second position, so on, 00:41:59.920 | 
and because these connections are bidirectional, 00:42:03.720 | 
if you have prior knowledge that it might be the word time, 00:42:08.800 | 
that starts to prime the letters and the features, 00:42:12.000 | 
and if you don't, then it has to start bottom-up, 00:42:24.240 | 
they can convergently result in an emergent perception, 00:42:27.720 | 
and that was the piece of work that we did together 00:42:32.720 | 
that sort of got us both completely convinced 00:42:44.560 | 
was going to be able to actually address the questions 00:42:48.460 | 
that we were interested in as cognitive psychologists. 00:42:50.800 | 
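As a toy illustration of the interactive activation idea described here (a hypothetical miniature, not the published model; the two words, the weights, and the update rule are all invented for the example): letter units and word units are connected in both directions, so weak bottom-up evidence for letters and top-down support from candidate words settle together into a coherent percept.

```python
import numpy as np

words = ["TIME", "TAME"]
letters = "TIMEA"                                # the letters these words use

# Bidirectional word<->letter connections: word w supports letter c if c occurs in w.
W = np.array([[1.0 if c in w else 0.0 for c in letters] for w in words])

word_act = np.zeros(len(words))
bottom_up = np.array([0.4, 0.4, 0.4, 0.4, 0.0])  # weak evidence for T, I, M, E off the "page"

for step in range(20):
    # Letter units are driven by the input plus top-down support from candidate words.
    letter_act = np.clip(bottom_up + 0.5 * (W.T @ word_act), 0.0, 1.0)
    # Word units are driven bottom-up by the letters they contain.
    word_act = np.clip(W @ letter_act / W.sum(axis=1) - 0.2, 0.0, 1.0)

print(dict(zip(words, word_act.round(2))))       # TIME ends up more active than TAME
```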
- So the algorithmic side, the optimization side, 00:42:53.160 | 
those are all details, like when you first start, 00:43:17.320 | 
is in the connections between the units, right? 00:43:24.780 | 
There's just the connections between the units. 00:43:36.920 | 
The unit for the word time isn't a unit for the word time 00:43:40.040 | 
for any other reason than it's got the connections 00:43:46.040 | 
Those are the units on the input that excite it 00:43:48.360 | 
when it's excited that it in a sense represents 00:43:52.680 | 
in the system that there's support for the hypothesis 00:44:03.120 | 
the word time isn't written anywhere inside the model. 00:44:08.360 | 
It's only written there in the picture we drew of the model 00:44:11.780 | 
to say that's the unit for the word time, right? 00:44:21.080 | 
You have to use the connections from that out 00:44:27.800 | 
- That's such a, that's a counterintuitive idea. 00:44:35.020 | 
This idea of connectionism, it doesn't, it's weird. 00:44:43.520 | 
- Yeah, but let's go back to that CNN, right? 00:44:46.120 | 
That CNN with all those layers of neuron-like processing 00:44:51.520 | 
it's gonna come out and say, this is a cat, that's a dog. 00:45:02.040 | 
like from the very first layer to the, you know, 00:45:07.880 | 
they just get numbered after a while because they, 00:45:17.200 | 
but it's a graded and continuous sort of process 00:45:24.440 | 
very, very specific to much more sort of global, 00:45:28.880 | 
but it's still, you know, another sort of pattern 00:45:33.980 | 
And then at the output side, it says it's cat or it's a dog. 00:45:37.360 | 
And when we, when I open my eyes and say, oh, that's Lex, 00:45:49.280 | 
which is a member of the same species as many other dogs, 00:45:57.440 | 
I don't know how to describe what it is that makes me know 00:46:01.420 | 
that I'm looking at Lex or at my particular dog, right? 00:46:04.680 | 
Or even that I'm looking at a particular brand of car. 00:46:09.420 | 
but if I wrote you a paragraph about the car, 00:46:16.760 | 
So the idea that we have propositional knowledge 00:46:36.560 | 
You couldn't ever write down a set of propositions 00:47:00.140 | 
You cannot, you don't read the contents of the connections. 00:47:04.060 | 
The connections only cause outputs to occur based on inputs. 00:47:22.200 | 
But, you know, each layer is probably equally as important 00:47:27.280 | 
Like there's no reason why the cat versus dog 00:47:30.240 | 
is more important than the lower level activations. 00:47:34.100 | 
I mean, all of it is just this beautiful stacking 00:47:37.660 | 
And we humans live in this particular layers for us. 00:47:43.400 | 
to use those cat versus dog, predator versus prey, 00:47:55.300 | 
you ask, are we able to introspect and convert 00:47:58.300 | 
the very things that allow us to tell the difference 00:48:00.740 | 
between cat and dog into logic, into formal logic. 00:48:06.620 | 
I would say that's still part of the dream of symbolic AI. 00:48:10.460 | 
And I've recently talked to Doug Lenat who created Cyc. 00:48:15.460 | 
And that's a project that lasted for many decades 00:48:23.180 | 
and still carries a sort of dream in it, right? 00:48:30.700 | 
It seems like connectionism is really powerful, 00:48:34.820 | 
but it also seems like there's this building of knowledge. 00:48:38.740 | 
And so how do we, how do you square those two? 00:48:41.420 | 
Like, do you think the connections can contain 00:48:46.940 | 
of what Dave Rumelhart was thinking about of understanding? 00:49:04.660 | 
You know, I think that from the emergentist side, 00:49:34.380 | 
that I wanted to build like anything into the machine. 00:49:38.280 | 
But I don't like the word eliminative anymore 00:49:44.420 | 
because it makes it seem like it's wrong to think 00:49:50.580 | 
that there is this emergent level of understanding. 00:50:06.900 | 
rather than eliminative connectionist, right? 00:50:09.460 | 
Because I want to acknowledge that these higher level 00:50:28.980 | 
And there was an example that Doug Hofstadter used to use 00:50:36.660 | 
Just the idea that we can think about sand dunes 00:50:41.300 | 
as entities and talk about like how many there are even, 00:50:51.380 | 
but we also know that a sand dune is a very fluid thing. 00:50:56.380 | 
It's a pile of sand that is capable of moving around 00:51:10.100 | 
And if we think about our thoughts as like sand dunes, 00:51:28.140 | 
yes, they exist as such, but they also, you know, 00:51:33.140 | 
we shouldn't treat them as completely monolithic entities 00:51:41.540 | 
sort of all of the stuff that allows them to change 00:51:58.620 | 
then it doesn't mean that the contents of thought 00:52:06.960 | 
but it's more fluid maybe than is easier to capture 00:52:15.380 | 
- Yeah, that's a heck of a sort of thing to put 00:52:18.420 | 
at the top of a resume, radical emergentist connectionist. 00:52:44.380 | 
It seems like maybe all of reality is emergent. 00:53:06.780 | 
that start looking very quickly like organisms 00:53:10.260 | 
that you forget how the actual thing operates. 00:53:13.620 | 
They start looking like they're moving around, 00:53:21.900 | 
And it seems like maybe it's something about the human mind 00:53:24.980 | 
that wants to operate in some layer of the emergent 00:53:35.240 | 
also it seems like unfair to eliminate the magic 00:53:43.040 | 
Like eliminate the fact that that emergent is real. 00:53:54.140 | 
- Yeah, because it seemed like that was trying to say 00:54:13.480 | 
many people have confronted that possibility over time, 00:54:17.760 | 
but it's still important to accept it as magic, right? 00:54:30.260 | 
I think of others who have appreciated the role of magic, 00:54:38.040 | 
of actual trickery in creating illusions that move us. 00:54:49.860 | 
give rise to something much deeper than that. 00:55:01.020 | 
we'll just accept it as given that that occurs. 00:55:09.900 | 
We won't try to really, really, really deeply understand 00:55:16.660 | 
Okay, but you worked closely with Dave Rumelhart. 00:55:38.140 | 
and his demise was actually one of the most poignant 00:55:43.720 | 
and relevant tragedies relevant to our conversation. 00:55:59.720 | 
He started to undergo a progressive neurological condition 00:56:14.980 | 
That is to say his particular course isn't fully understood 00:56:19.760 | 
because brain scans weren't done at certain stages 00:56:28.140 | 
and no autopsy was done or anything like that. 00:56:33.140 | 
So we don't know as much about the underlying pathology 00:56:37.640 | 
as we might, but I had begun to get interested 00:56:42.640 | 
in this neurological condition that might have been 00:56:52.500 | 
as my own efforts to understand another aspect 00:57:09.380 | 
The disorder is something my colleagues and collaborators 00:57:17.100 | 
So it's a specific form of loss of mind related to meaning, 00:57:27.600 | 
And it's progressive in the sense that the patient 00:57:44.680 | 
either from touch, from sight, from sound, from language. 00:57:51.600 | 
I hear sounds, but I don't know what they mean 00:57:55.180 | 
So as this illness progresses, it starts with 00:58:21.960 | 
But as it progresses, it becomes more and more striking 00:58:26.960 | 
and the patient loses the ability to recognize 00:58:42.600 | 
and can't recognize rabbits and rodents anymore. 00:58:55.840 | 
So there was this one patient who went through 00:59:03.040 | 
any four-legged animal, he would call it either a horse 00:59:08.120 | 
And if it was big, he would tend to call it a horse. 00:59:25.300 | 
So my collaborator in this work, Karalyn Patterson, 00:59:28.800 | 
developed a test called the pyramids and palm trees test. 00:59:33.320 | 
So you give the patient a picture of pyramids 00:59:39.520 | 
and they have a choice, which goes with the pyramids? 00:59:46.580 | 
And she showed that this wasn't just a matter of language 00:59:50.980 | 
because the patient's loss of this ability shows up 01:00:00.220 | 
The pictures, they can't put the pictures together 01:00:05.280 | 
They can't relate the pictures to the words either. 01:00:15.780 | 
And so that's why it's called semantic dementia. 01:00:30.460 | 
a pattern of activation represents the concepts. 01:00:39.220 | 
And then, so the difference between the dog and the goat 01:00:42.380 | 
sort of is no longer part of the pattern anymore. 01:00:49.340 | 
And we understand that in the way the models work and learn. 01:00:57.300 | 
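A toy sketch of that point (my own construction, not one of the published simulations): concepts are distributed patterns, and what separates similar concepts like dog and goat is a small, concept-specific part of the pattern; when that distinctive part fades, similar concepts collapse together while very different ones stay far apart.

```python
import numpy as np

rng = np.random.default_rng(3)
shared_animal = rng.random(50)                 # features shared by four-legged animals
distinct = {name: 0.15 * rng.random(50)        # small concept-specific differences
            for name in ["dog", "goat", "horse"]}
cellphone = rng.random(50)                     # an unrelated concept

for strength in [1.0, 0.3, 0.0]:               # how much of the distinctive part survives
    dog = shared_animal + strength * distinct["dog"]
    goat = shared_animal + strength * distinct["goat"]
    print(strength,
          round(float(np.linalg.norm(dog - goat)), 2),       # shrinks toward 0
          round(float(np.linalg.norm(dog - cellphone)), 2))  # stays large
```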
So on the one hand, it's a fascinating aspect 01:01:07.180 | 
of distributed representation in a very nice way, 01:01:11.500 | 
But at the same time, it was extremely poignant 01:01:33.420 | 
thoughtful person who was willing to work for years 01:01:42.340 | 
to solve a hard problem, he starts to disappear. 01:01:47.340 | 
And there was a period of time when it was like, 01:02:12.660 | 
Was he, I mean, this is one of the big scientists 01:02:36.180 | 
the Hawking of cognitive science to me in some ways. 01:02:39.820 | 
Both of them suffered from a degenerative condition. 01:02:46.200 | 
In Hawking's case, it affected the motor system. 01:02:49.300 | 
In Rumelhart's case, it's affecting the semantics. 01:03:18.660 | 
But on the other hand, at some level, he sort of did. 01:03:28.620 | 
that he had really become profoundly impaired. 01:03:35.220 | 
and it wasn't just like he was distracted that day 01:03:40.180 | 
So he retired from his professorship at Stanford 01:03:45.020 | 
and he lived with his brother for a couple years 01:04:06.780 | 
And I would spend time with him during that period. 01:04:17.140 | 
And we would go bowling and he could still bowl. 01:04:37.260 | 
And I said, "Okay, well, where do you wanna go?" 01:04:41.580 | 
So he still had a certain amount of spatial cognition 01:04:53.020 | 
And he couldn't come up with any of the words 01:04:56.860 | 
but he knew where on the menu the thing was that he wanted. 01:05:05.340 | 
but he knew that that's what he wanted to eat. 01:05:20.060 | 
graded in certain kinds of ways but also multi-partite. 01:05:27.980 | 
certain sort of partial competencies still exist 01:05:32.140 | 
in the absence of other aspects of these competencies. 01:05:39.340 | 
about what used to be called cognitive neuropsychology, 01:05:50.060 | 
But in particular, this gradual disintegration part. 01:05:54.300 | 
I'm a big believer that the loss of a human being 01:05:58.700 | 
that you value is as powerful as first falling in love 01:06:03.340 | 
I think it's all a celebration of the human being. 01:06:06.940 | 
So the disintegration itself too is a celebration in a way. 01:06:12.180 | 
But just to say something more about the scientist 01:06:17.740 | 
and the backpropagation idea that you mentioned. 01:06:37.740 | 
and then there was this opportunity to bring him back. 01:06:47.560 | 
And Rumelhart and I had decided we wanted to do this 01:06:54.900 | 
And the papers on the interactive activation model 01:06:59.420 | 
that I was telling you about had just been published. 01:07:01.580 | 
And we both sort of saw a huge potential for this work 01:07:07.460 | 
And so the three of us started a research group 01:07:23.660 | 
'Cause Geoff was known among Brits to be brilliant 01:07:28.340 | 
and Francis was well connected with his British friends. 01:07:35.620 | 
- And several, Paul Smolensky was one of the other postdocs. 01:08:09.300 | 
"with the way you guys have been approaching this 01:08:13.220 | 
"is that you've been looking for inspiration from biology 01:08:26.400 | 
He said, "That's the wrong way to go about it. 01:08:31.120 | 
"What you should do is you should think in terms of 01:08:35.340 | 
"how you can adjust connection weights to solve a problem. 01:08:45.140 | 
"So you define your problem, and then you figure out 01:08:50.140 | 
"how the adjustment of the connection weights 01:08:54.260 | 
And Rumelhart heard that and said to himself, 01:09:00.380 | 
"Okay, so I'm gonna start thinking about it that way. 01:09:21.640 | 
"I can measure how well they're doing on each image, 01:09:35.960 | 
"so as to minimize my loss or reduce the error." 01:09:53.860 | 
And in fact, there was an algorithm called the Delta Rule 01:10:04.380 | 
in the electrical engineering department at Stanford, 01:10:08.380 | 
Widrow, Bernie Widrow, and a collaborator named Hoff. 01:10:13.260 | 
Anyway, so gradient descent in continuous neural networks 01:10:32.940 | 
We want the output to produce a certain pattern. 01:10:36.200 | 
We can define the difference between our target 01:10:41.620 | 
and we can figure out how to change the connection weights 01:10:52.100 | 
from earlier layers of units to the ones at a hidden layer 01:11:04.060 | 
because it's just an extension of the gradient descent idea. 01:11:08.540 | 
And interestingly enough, Hinton was thinking 01:11:15.500 | 
So Hinton had his own alternative algorithm at the time 01:11:20.500 | 
based on the concept of the Boltzmann machine 01:11:25.100 | 
So the paper on the Boltzmann machine came out in, 01:11:27.760 | 
learning in Boltzmann machines came out in 1985, 01:11:30.520 | 
but it turned out that backprop worked better 01:11:35.820 | 
than the Boltzmann machine learning algorithm. 01:11:46.180 | 
And probably that name is opaque to many people. 01:11:52.660 | 
What it meant was that in order to figure out 01:12:00.140 | 
to the connections from the input to the hidden layer, 01:12:08.580 | 
from the output layer through the connections 01:12:14.900 | 
to get the signals that would be the error signals 01:12:23.100 | 
It was like, well, we know what the error signals 01:12:26.340 | 
Let's see if we can get a signal at the hidden layer 01:12:32.860 | 
So it's backpropagating through the connections 01:12:36.460 | 
from the hidden to the output to get the signals 01:12:40.540 | 
to tell the hidden units how to change their weights 01:12:43.100 | 
from the input, and that's why it's called backprop. 01:12:45.700 | 
Yeah, so it came from Hinton having introduced the concept 01:13:02.580 | 
so that they make progress towards your goal. 01:13:04.940 | 
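Here is a rough sketch of the mechanics being described, on toy data with invented layer sizes (plain NumPy, not the original formulation): the error measured at the output drives the hidden-to-output weights directly, delta-rule style, and the same error signal is then sent back through those connections to tell the input-to-hidden weights how to change.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((200, 2))                           # toy input patterns
y = (X[:, 0] > X[:, 1]).astype(float)[:, None]     # toy target pattern

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)     # input -> hidden connections
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)     # hidden -> output connections
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for epoch in range(2000):
    h = sigmoid(X @ W1 + b1)                       # hidden activation pattern
    out = sigmoid(h @ W2 + b2)                     # output activation pattern
    err_out = (out - y) * out * (1 - out)          # error signal at the output layer
    err_hid = (err_out @ W2.T) * h * (1 - h)       # error propagated BACK through W2
    W2 -= lr * h.T @ err_out / len(X); b2 -= lr * err_out.mean(axis=0)
    W1 -= lr * X.T @ err_hid / len(X); b1 -= lr * err_hid.mean(axis=0)

print(((out > 0.5) == y).mean())                   # accuracy on the toy task
```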
- So stop thinking about biology for a second, 01:13:15.820 | 
You've gotten a chance to work with him in that little, 01:13:20.460 | 
the set of people involved there, it's quite incredible, 01:13:52.500 | 
So what kind of ideas have you learned from him? 01:14:01.220 | 
what stands out to you in the full space of ideas here 01:14:06.220 | 
at the intersection of computation and cognition? 01:14:27.660 | 
He had two papers in 1981, just to give one example, 01:14:32.660 | 
one of which was essentially the idea of transformers, 01:14:48.500 | 
on semantic cognition, which inspired him and Rumelhart 01:15:01.980 | 
and still I think sort of grounds my own thinking 01:15:16.040 | 
He also, in a small paper that was never published 01:15:27.900 | 
or maybe a couple of years even before that, I don't know, 01:15:56.660 | 
you need to save the state that you had when you called it 01:16:04.940 | 
And the idea was that you would save the state 01:16:11.260 | 
by making fast changes to connection weights, 01:16:14.820 | 
and then when you finished with the subroutine call, 01:16:22.320 | 
would allow you to go back to where you had been before 01:16:35.660 | 
And I always thought, okay, that's really, you know, 01:16:45.300 | 
and many of them in the 1970s and early 1980s. 01:16:50.720 | 
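A purely illustrative toy of that subroutine idea (my own construction, not Hinton's actual proposal; the shapes and magnitudes are arbitrary): the caller's state is stashed as a fast, additive change to the connection weights, and popping that change on return restores the earlier state of processing.

```python
import numpy as np

rng = np.random.default_rng(2)
W_slow = rng.normal(0, 0.1, (8, 8))        # slowly learned connection weights
fast_stack = []                            # one fast weight change per pending call

def call_subroutine(W):
    delta = rng.normal(0, 0.05, W.shape)   # fast change encoding the caller's state
    fast_stack.append(delta)
    return W + delta                       # weights in effect while the subroutine runs

def return_from_subroutine(W):
    return W - fast_stack.pop()            # undo the fast change: back where we were

W_during = call_subroutine(W_slow)
W_after = return_from_subroutine(W_during)
print(np.allclose(W_after, W_slow))        # True: the earlier state is recovered
```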
So another thing about Geoff Hinton's way of thinking, 01:17:18.120 | 
you don't get up at the board and write equations 01:17:20.280 | 
like you do in everybody else's machine learning lab. 01:17:44.680 | 
for what's happening as this gradient descent process. 01:17:59.760 | 
And it speaks to me of the fundamentally intuitive character 01:18:13.480 | 
that together with a commitment to really understanding 01:18:18.480 | 
in a way that's absolutely, ultimately explicit and clear, 01:18:33.720 | 
He's an example, some kind of weird mix of visual 01:18:40.640 | 
Feynman is another example, different style of thinking, 01:18:58.260 | 
Just having experienced that unique way of thinking 01:19:02.000 | 
transforms you and makes your work much better. 01:19:13.640 | 
It's not always the exact ideas that you talk about, 01:19:16.680 | 
but it's the process of generating those ideas, 01:19:19.700 | 
being around that, spending time with that human being, 01:19:26.760 | 
as it was a little bit in your case with Geoff. 01:19:30.640 | 
- Yeah, Geoff is a descendant of the logician Boole. 01:19:38.640 | 
He comes from a long line of English academics. 01:19:43.640 | 
And together with the deeply intuitive thinking ability 01:20:00.000 | 
and I think he's mentioned it from time to time 01:20:02.360 | 
in other interviews that he's had with people. 01:20:07.800 | 
He's wanted to be able to sort of think of himself 01:20:11.400 | 
as contributing to the understanding of reasoning itself, 01:20:25.720 | 
It's about what can we conclude from what else 01:20:41.800 | 
how we derive truths from givens and things like this. 01:20:48.520 | 
And the work that Geoff was doing in the early to mid '80s 01:20:59.880 | 
was his way of connecting with that Boolean tradition 01:21:08.200 | 
probabilistic-graded constraint satisfaction realm. 01:21:31.240 | 
I've always been inspired by the Boltzmann machine too. 01:21:33.480 | 
It's like, well, if the neurons are probabilistic 01:21:36.720 | 
rather than deterministic in their computations, 01:21:40.440 | 
then maybe this somehow is part of the serendipity 01:21:45.440 | 
or adventitiousness of the moment of insight, right? 01:21:54.240 | 
It might not have occurred at that particular instant. 01:22:01.520 | 
And that too is part of the magic of the emergence 01:22:08.120 | 
- Well, you're right with the Boolean lineage 01:22:09.880 | 
and the dream of computer science is somehow, 01:22:14.880 | 
I mean, I certainly think of humans this way, 01:22:17.440 | 
that humans are one particular manifestation of intelligence, 01:22:26.200 | 
The mechanisms of intelligence, the mechanisms of cognition 01:22:31.120 | 
- Yeah, so I think of, I started using the phrase 01:22:43.040 | 
people like Geoff Hinton and many of the people 01:23:11.000 | 
But at the same time, I feel like that's where 01:23:16.000 | 
a huge amount of the excitement of deep learning 01:23:27.720 | 
we may be able to even go beyond what we can achieve 01:23:40.200 | 
not limited in the ways that we are by our own biology. 01:23:47.000 | 
- Perhaps allowing us to scale the very mechanisms 01:23:50.780 | 
of human intelligence, just increase its power through scale. 01:23:55.560 | 
- Yes, and I think that that, obviously that's the, 01:24:00.360 | 
that's being played out massively at Google Brain, 01:24:07.240 | 
at OpenAI, and to some extent at DeepMind as well. 01:24:28.240 | 
- Still not as many synapses and neurons as the human brain. 01:24:32.560 | 
So we still got, we're still beating them on that. 01:24:41.220 | 
You write about modeling of mathematical cognition. 01:24:46.360 | 
So let me first ask about mathematics in general. 01:24:54.280 | 
to Mathematical Cognition, where in the introduction, 01:24:57.400 | 
there's some beautiful discussion of mathematics. 01:25:04.600 | 
who criticizes a narrow view of mathematics 01:25:17.160 | 
So from that perspective, what do you think is mathematics? 01:25:23.720 | 
- Well, I think of mathematics as a set of tools 01:26:09.320 | 
so as to allow the implications of certain facts 01:26:16.320 | 
to then allow you to derive other facts with certainty. 01:26:28.160 | 
and you know that there is an angle in the first one 01:26:36.600 | 
that has the same measure as an angle in the second one, 01:26:45.560 | 
adjacent to that angle in each of the two triangles, 01:26:50.080 | 
the corresponding sides adjacent to that angle 01:26:57.600 | 
then you can conclude that the triangles are congruent. 01:27:02.240 | 
That is to say, they have all of their properties in common. 01:27:18.540 | 
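Stated compactly (my own restatement of the side-angle-side idea being sketched here, with hypothetical triangle labels):

```latex
\[
\angle A = \angle A', \quad AB = A'B', \quad AC = A'C'
\;\Longrightarrow\;
\triangle ABC \cong \triangle A'B'C'
\]
```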
In fact, you know, we built bridges out of triangles, 01:27:30.940 | 
by extending these ideas about triangles a little further. 01:27:42.960 | 
all of the ability to get a tiny speck of matter 01:27:53.520 | 
to intersect with some tiny, tiny little body 01:28:03.520 | 
is something that depends on these ideas, right? 01:28:31.060 | 
these triangles or these distances or these points, 01:28:34.020 | 
whatever they are, that allow for this set of tools 01:28:39.020 | 
to be created that then gives human beings the, 01:28:48.500 | 
that they didn't have without these concepts. 01:29:19.420 | 
Natural numbers, zero, one, two, three, four, 01:29:40.060 | 
there were 23 sheep, you came back with only 22. 01:29:49.220 | 
- It's a fundamental problem of human society 01:29:54.660 | 
the same number of sheep as you started with. 01:30:00.300 | 
it allows contracts, it allows the establishment 01:30:20.100 | 
sort of abstract and idealized and generalizable 01:30:26.860 | 
potentially very, very grounded and concrete. 01:30:35.980 | 
for the incredible achievements of the human mind 01:30:42.060 | 
is the fact that humans invented these idealized systems 01:30:53.500 | 
in such a way as to allow all this kind of thing to happen. 01:31:01.820 | 
is the development of systems for thinking about 01:31:05.620 | 
the properties and relations among sets of idealized objects. 01:31:12.300 | 
And you know, the mathematical notation system 01:31:43.880 | 
They're not necessarily the deep representation 01:31:53.400 | 
such powerful mathematical reasoning, would you say? 01:31:58.400 | 
What are some ideas you have for capturing this in a model? 01:32:02.480 | 
- The insights that human mathematicians have had 01:32:05.680 | 
is a combination of the kind of the intuitive 01:32:17.400 | 
that makes it so that something is just like obviously true 01:32:22.400 | 
so that you don't have to think about why it's true. 01:32:31.040 | 
That then makes it possible to then take the next step 01:32:37.480 | 
and ponder and reason and figure out something 01:32:41.960 | 
that you previously didn't have that intuition about. 01:32:46.600 | 
It then ultimately becomes a part of the intuition 01:32:50.560 | 
that the next generation of mathematical thinkers 01:32:59.400 | 
so that they can extend the ideas even further. 01:33:02.160 | 
I came across this quotation from Henri Poincaré 01:33:08.040 | 
while I was walking in the woods with my wife 01:33:15.860 | 
in a state park in Northern California late last summer. 01:33:32.960 | 
And so what for me the essence of the project 01:33:41.280 | 
the intuitive connectionist resources to bear on 01:33:51.700 | 
from engagement in thinking with this formal system. 01:33:59.120 | 
So I think of the ability of somebody like Hinton 01:34:09.260 | 
or Newton or Einstein or Rumelhart or Poincaré 01:34:30.800 | 
simultaneous constraints that somehow or other 01:34:38.960 | 
that it never did before and give rise to a new idea 01:34:53.700 | 
How do I write down the steps of that theorem 01:34:56.440 | 
that allow me to make it rigorous and certain? 01:35:08.220 | 
that we're beginning to see deep learning systems do 01:35:13.220 | 
of their own accord kind of gives me this feeling 01:35:40.660 | 
have become really interested in thinking about, 01:35:48.700 | 
with massive amounts of text can be given a prompt 01:35:53.700 | 
and they can then sort of generate some really interesting, 01:36:03.900 | 
And there's kind of like a sense that they've somehow 01:36:16.820 | 
all of the particulars of all of the billions 01:36:21.520 | 
and billions of experiences that went into the training data 01:36:25.440 | 
that gives rise to something like this sort of 01:36:47.800 | 
as a input to get it to start to generate its own thoughts. 01:36:52.800 | 
And to me that sort of represents the potential 01:37:03.360 | 
I don't know if you find them as captivating as, 01:37:06.040 | 
you know, on the deep mind side with AlphaZero, 01:37:17.680 | 
It feels very like there's brilliant moments of insight. 01:37:34.500 | 
that has the intuition of looking at a board, 01:37:42.580 | 
And the next few positions, how good are those? 01:37:47.860 | 
Grandmasters have this and understanding positionally, 01:37:54.900 | 
how can it be improved without doing this full, 01:37:58.180 | 
like deep search, and then maybe doing a little bit 01:38:02.320 | 
of what human chess players call calculation, 01:38:05.760 | 
which is the search, taking a particular set of steps 01:38:11.040 | 
But there is moments of genius in those systems too. 01:38:26.360 | 
Yes, and I think that, I think Demis Hassabis is, 01:38:50.120 | 
to kind of collaborate with some of those guys at DeepMind. 01:38:54.920 | 
So I think though that what I like to really emphasize here 01:39:12.540 | 
is that philosophers and logicians going back 01:39:21.260 | 
three or even a little more than 3,000 years ago 01:39:30.060 | 
And gradually the whole idea about thinking formally 01:39:42.940 | 
And it preceded Euclid, certainly present before Euclid, 01:39:50.500 | 
certainly present in the work of Thales and others. 01:39:58.740 | 
But Euclid's elements were the kind of the touch point 01:40:15.240 | 
within which these objects were characterized 01:40:19.180 | 
and the system of inference that allowed new truths 01:40:49.920 | 
who is capable of thinking in this abstract formal way 01:41:08.700 | 
that we now begin to think of our understanding 01:41:13.180 | 
So we immerse ourselves in a particular language, 01:41:18.180 | 
in a particular world of objects and their relationships, 01:41:24.800 | 
And we develop intuitive understanding of the real world. 01:41:29.200 | 
In a similar way, we can think that what academia 01:41:34.200 | 
has created for us, what those early philosophers 01:41:49.340 | 
of these schools of thought, modes of thought 01:42:06.260 | 
that systematic thought is the essential characteristic 01:42:28.640 | 
- Would you say it's more fundamental than like language? 01:42:39.520 | 
is it unfair to draw a line between mathematical cognition 01:42:48.640 | 
- I think that's a very interesting question. 01:42:53.960 | 
that I'm actually very interested in right now. 01:42:56.320 | 
But I think the answer is, in important ways, 01:43:35.120 | 
His father was a professor of rabbinical studies 01:43:38.120 | 
at a small rabbinical college in Philadelphia. 01:43:55.480 | 
and brought to the effort to understand natural language 01:44:02.560 | 
this profound engagement with these formal systems. 01:44:07.560 | 
And I think that there was tremendous power in that 01:44:25.360 | 
but that, and I'm gonna use the word but there, 01:44:32.120 | 
the actual intuitive knowledge of these things 01:44:49.160 | 
who was actually trained in the same linguistics department 01:44:53.020 | 
So what Lila discovered was that the intuitions 01:45:00.640 | 
that linguists had about even the meaning of a phrase, 01:45:11.120 | 
but about what they thought a phrase must mean 01:45:18.800 | 
of an ordinary person who wasn't a formally trained thinker. 01:45:23.800 | 
And well, it recently has become much more salient. 01:45:29.100 | 
I happen to have learned about this when I myself 01:45:31.480 | 
was a PhD student at the University of Pennsylvania, 01:45:37.480 | 
with all of my other thinking about these things. 01:46:17.140 | 
to be more organized around the systematicity 01:46:26.780 | 
and ability to be conformant with the principles 01:46:38.560 | 
of the natural human mind without that immersion. 01:46:48.700 | 
actually take you away from the natural operation 01:47:04.900 | 
and so-called knowledge that we consider private, 01:47:29.220 | 
what to believe about certain kinds of things. 01:47:39.420 | 
Well, they are the product of this sort of immersion 01:47:57.280 | 
- Does that limit you from having a good model 01:48:05.900 | 
- So when you look at mathematical or linguistics, 01:48:16.240 | 
Are you, when you're focusing on mathematical thinking, 01:48:24.380 | 
I think that's a great way of characterizing it. 01:48:38.460 | 
and another concept called the expert blind spot. 01:48:46.180 | 
So the expert blind spot is much more prosaic-seeming 01:48:59.200 | 
when they try to communicate their understanding 01:49:03.580 | 
And that is that things are self-evident to them 01:49:42.060 | 
God made the natural numbers, all else is the work of man, 01:49:50.380 | 
that somehow or other, the basic fundamentals 01:49:56.460 | 
of discrete quantities being countable and enumerable 01:50:22.340 | 
There was a long period of time where the natural numbers 01:50:27.340 | 
were considered to be a part of the innate endowment 01:50:31.860 | 
of core knowledge or to use the kind of phrases 01:51:00.540 | 
and sort of like study those few people who still exist 01:51:08.940 | 
and where a certain mode of thinking about language itself 01:51:21.540 | 
so it becomes so second nature that you don't know 01:51:46.120 | 
Some of the students in the class sort of like, 01:51:49.100 | 
they get it, they start to get the way of thinking 01:51:58.020 | 
But most of the students who don't kind of engage 01:52:05.460 | 
And we think, oh, that man must be brilliant. 01:52:17.780 | 
That makes him so that he or she could have that insight. 01:52:26.660 | 
biological individual differences completely, 01:52:35.740 | 
it was that difference in the dinner table conversation 01:52:44.820 | 
that made it so that he had that cast of mind. 01:52:48.020 | 
- Yeah, and there's a few topics we talked about 01:52:53.740 | 
'Cause I wonder, the better I get at certain things, 01:52:57.460 | 
we humans, the deeper we understand something, 01:53:07.180 | 
We talked about David and his degenerative mind. 01:53:26.300 | 
Like what, if I can, having thought about language 01:53:38.180 | 
What is in my blind spot and how big is that? 01:53:44.220 | 
out of your deep structure that you formed for yourself 01:54:00.540 | 
How is your mind less powerful than it used to be 01:54:04.100 | 
or more powerful or different, powerful in different ways? 01:54:10.820 | 
'cause we're living, we're looking at the world 01:54:18.460 | 
but it seems necessary if you want to make progress. 01:54:22.820 | 
- You know, one of the threads of psychological research 01:54:40.980 | 
aren't necessarily actually part of the process 01:54:54.820 | 
or even valid observations of the set of constraints 01:55:07.180 | 
that we can give based on information at our disposal 01:55:11.900 | 
about what might have contributed to the result 01:55:23.340 | 
in a very important paper by Nisbett and Wilson 01:55:23.340 | 
about the limits on our ability to be aware of the factors 01:55:33.060 | 
that cause us to make the choices that we make. 01:55:38.680 | 
And I think it's something that's very important 01:55:45.240 | 
and it's something that we really ought to be 01:55:50.240 | 
much more cognizant of in general as human beings 01:55:59.740 | 
we hold the beliefs that we do and we hold the attitudes 01:56:03.120 | 
and make the choices and feel the feelings that we do 01:56:15.740 | 
And it's subject to our culturally transmitted 01:56:28.180 | 
that we give to explain these things when asked to do so 01:56:43.440 | 
is a product of culture and belief, practice. 01:56:48.080 | 
- So let me ask you the big question of advice. 01:56:53.940 | 
So you've lived an incredible life in terms of the ideas 01:57:00.900 | 
in terms of the trajectory you've taken through your career, 01:57:04.420 | 
What advice would you give to young people today 01:57:07.220 | 
in high school and college about how to be more open 01:57:17.200 | 
- Finding the thing that you are intrinsically motivated 01:57:24.120 | 
to engage with and then celebrating that discovery 01:57:34.760 | 
When I was in college, I struggled with that. 01:57:46.020 | 
because I think I was interested in human psychology 01:57:51.620 | 
And at that time, the only sort of information I had 01:58:00.260 | 
and sort of popular psychiatry kinds of things. 01:58:03.940 | 
And so, well, they were psychiatrists, right? 01:58:08.820 | 
And that meant I had to go to medical school. 01:58:11.380 | 
And I got to college and I find myself taking, you know, 01:58:16.260 | 
the first semester of a three-quarter physics class 01:58:21.260 | 
And this was so far from what it was I was interested in, 01:58:30.860 | 
But I wandered about the rest of my freshman year 01:58:38.860 | 
until I found myself in the midst of this situation 01:58:56.020 | 
Columbia's building a gym in Morningside Heights, 01:59:00.300 | 
And people are thinking, oh, the big, bad rich guys 01:59:26.860 | 
and the whole university blew up and got shut down. 01:59:33.860 | 
why people were behaving the way they were in this context. 01:59:44.620 | 
I happened to have been taking psychology that quarter, 01:59:49.140 | 
And somehow things in that space all ran together in my mind 01:59:53.540 | 
and got me really excited about asking questions 02:00:00.300 | 
go into the buildings and not others and things like that. 02:00:06.140 | 
and I had just been wandering around aimlessly. 02:00:08.900 | 
And at the different points in my career, you know, 02:00:11.900 | 
when I think, okay, well, should I take this class 02:00:25.300 | 
some idea that I wanna understand better, you know, 02:00:58.580 | 
And I wasn't even getting honors based on my grades. 02:01:02.100 | 
They just happened to have thought I was interested enough 02:01:13.300 | 
through accidents of too early morning of classes 02:01:25.380 | 
and then you celebrate the fact that this happens 02:01:39.300 | 
all the answers to it and I don't think I wanna, 02:01:44.060 | 
I want anybody to think that you should be sort of 02:02:13.460 | 
that mere mortals would ever do an experiment 02:02:15.740 | 
in those sciences, except one that was in the textbook 02:02:39.060 | 
and to bring together a certain set of things 02:02:51.180 | 
you know, profoundly amazing musical geniuses, right? 02:02:56.180 | 
They get immersed in it at an early enough point 02:03:04.900 | 
So my little brother had intrinsic motivation for music 02:03:22.400 | 
because he could sort of see which ones had which scratches, 02:03:31.580 | 
- And he enjoyed that, that connected with him somehow. 02:03:33.860 | 
- Yeah, and there was something that it fed into 02:03:40.940 | 
and if you can nurture it and can let it grow 02:03:44.860 | 
and let it be an important part of your life. 02:03:49.140 | 
is like be attentive enough to feel it when it comes. 02:03:59.140 | 
I really like tabular data, like Excel sheets. 02:04:07.820 | 
I don't know how useful that is for anything. 02:04:24.180 | 
but then the other part that you're mentioning, 02:04:25.900 | 
which is the nurture, is take time and stay with it, 02:04:29.140 | 
stay with it a while and see where that takes you in life. 02:04:33.380 | 
- Yeah, and I think the motivational engagement 02:04:45.340 | 
So we could call it the Mozart effect, right? 02:04:55.860 | 
as the fourth member of the Family String Quartet, right? 02:04:59.220 | 
And they handed him the violin when he was six weeks old. 02:05:09.720 | 
So the level of immersion there was amazingly profound, 02:05:20.040 | 
maybe this is where the more sort of the genetic part 02:05:38.520 | 
So that's what I really consider to be the Mozart effect. 02:05:41.420 | 
It's sort of the synergy of something with experience 02:05:52.080 | 
So I know my siblings and I are all very different 02:06:00.260 | 
We've all gone in our own different directions. 02:06:02.700 | 
And I mentioned my younger brother who was very musical. 02:06:08.980 | 
was like this amazing like intuitive engineer. 02:06:25.000 | 
such a hugely important issue that it is today. 02:06:29.320 | 
So we all sort of somehow find a different thing. 02:06:43.440 | 
but it's also when that happens, where you can find that, 02:06:47.040 | 
then you can do your thing and you can be excited about it. 02:06:50.800 | 
So people can be excited about fitting people on bicycles 02:06:53.640 | 
as well as excited about making neural networks 02:06:56.040 | 
achieve insights into human cognition, right? 02:07:09.920 | 
is since I was a child, just observing people around me 02:07:25.520 | 
just to think of your own career and your own life. 02:07:38.880 | 
and there may be a lot of other people similar to you 02:07:51.680 | 
And ultimately the whole ride, it's undirected. 02:07:58.240 | 
in terms of psychiatry when you were younger? 02:08:08.080 | 
and just those kind of popular psychiatry ideas. 02:08:13.080 | 
And that was a dream for me early on in high school 02:08:15.840 | 
to like, I hoped to understand the human mind, and 02:08:20.000 | 
somehow psychiatry felt like the right discipline for that. 02:08:27.200 | 
Does that make you sad that psychiatry is not 02:08:43.920 | 
and biochemistry is involved in the discipline of psychiatry 02:08:58.760 | 
And that's why I kind of went to computer science 02:09:01.440 | 
and thinking like, maybe you can explore the human mind 02:09:08.660 | 
sort of the biomedical/pharmacological aspects 02:09:14.780 | 
of psychiatry at that point because I didn't, 02:09:21.920 | 
that I never even found out about that until much later. 02:09:40.480 | 
who advised the director of the National Institute 02:09:47.600 | 
And in fact, at that time, the man who came in 02:09:53.040 | 
as the new director, I had been on this board for a year 02:10:11.160 | 
and that's what we're gonna do with schizophrenia. 02:10:18.240 | 
And we're not gonna listen to anybody's grandmother anymore. 02:10:26.200 | 
is not something we're going to support any further. 02:10:30.200 | 
And he completely alienated me from the institute 02:11:17.000 | 
and sorry to romanticize the whole philosophical 02:11:30.080 | 
In the same way that physicists are the deep thinkers 02:11:37.920 | 
And I think that flag has been taken from them 02:11:51.280 | 
you can like intuit about the functioning of the mind 02:12:08.360 | 
where you're starting to actually be able to observe, 02:12:12.000 | 
you know, do certain experiments on human beings 02:12:14.240 | 
and observe how the brain is actually functioning. 02:12:25.180 | 
a cognitive psychologist can become the philosopher 02:12:28.260 | 
and psychiatrists become much more like doctors. 02:12:44.840 | 
to do great low level mathematics and physics 02:12:52.100 | 
- Yeah, I think it was Fromm and Jung more than Freud 02:12:57.100 | 
that was sort of initially kind of like made me feel like, 02:13:06.680 | 
I actually, when I got to college and I lost that thread, 02:13:10.640 | 
I found more of it in sociology and literature 02:13:20.520 | 
So I took quite a lot of both of those disciplines 02:13:25.820 | 
And, you know, I was actually deeply ambivalent 02:13:31.880 | 
about the psychology because I was doing experiments 02:13:39.440 | 
in why people would occupy buildings during an insurrection 02:13:44.240 | 
and consider, you know, be sort of like so overcommitted 02:13:44.240 | 
And so I had this profound sort of dissonance 02:13:56.480 | 
between, okay, the kinds of issues that would be explored 02:14:09.280 | 
in modern British literature versus what I could study 02:14:17.480 | 
That got resolved when I went to graduate school 02:14:22.120 | 
And so for me, that was the path out of this sort of like, 02:14:37.080 | 
actual mechanistically oriented thinking about it. 02:14:40.100 | 
And I think we've come a long way in that regard 02:14:46.080 | 
and that you're absolutely right that nowadays 02:14:50.520 | 
this is something that's accessible to people 02:14:53.080 | 
through the pathway in through computer science 02:15:02.900 | 
You know, you can get derailed in neuroscience 02:15:10.260 | 
where you might find the cures for various conditions, 02:15:10.260 | 
So it's in the systems and cognitive neuroscience 02:15:32.980 | 
by having had the opportunity to fall into that space. 02:15:41.260 | 
speaking of which, you happen to be a human being 02:15:48.300 | 
That seems to be a fundamental part of the human condition 02:15:54.780 | 
Do you think about the fact that you're going to die one day? 02:16:01.380 | 
- I would say that I am not as much afraid of death 02:16:12.880 | 
And I say that in part for reasons of having, you know, 02:16:19.380 | 
seen some tragic degenerative situations unfold. 02:16:28.080 | 
It's exciting when you can continue to participate 02:16:41.020 | 
where the wave is breaking on the shore, if you like. 02:16:46.160 | 
And I think about, you know, my own future potential 02:16:53.600 | 
if I were to begin to suffer from dementia, 02:17:09.840 | 
I would sort of gradually lose the thread of that ability. 02:17:19.960 | 
for a decade after, you know, sort of having to retire 02:17:24.880 | 
because one no longer has these kinds of abilities to engage. 02:17:29.880 | 
And I think that's the thing that I fear the most. 02:17:35.560 | 
- The losing of that, like the breaking of the wave, 02:17:40.560 | 
the flourishing of the mind where you have these ideas 02:17:44.120 | 
and they're swimming around, you're able to play with them. 02:17:46.720 | 
- Yeah, and collaborate with other people who, you know, 02:17:51.000 | 
are themselves really helping to push these ideas forward. 02:18:09.520 | 
and, you know, sort of continuous sort of way 02:18:12.800 | 
of thinking about most things makes it so that, 02:18:21.640 | 
is less apparent than it seems to be to most people. 02:18:29.780 | 
Yeah, I wonder, so I don't know if you know the work 02:18:33.880 | 
of Ernest Becker and so on, I wonder what role mortality 02:18:44.840 | 
what role that plays in our reasoning about the world. 02:18:49.840 | 
- I think that it can be motivating to people 02:19:06.160 | 
on decision making that were satisfying in a certain way 02:19:16.480 | 
on whether the model fit the data perfectly or not. 02:19:21.480 | 
And I could see how one could test, you know, 02:19:30.880 | 
But I just realized, hey, wait a minute, you know, 02:19:34.240 | 
I may only have about 10 or 15 years left here 02:19:37.800 | 
and I don't feel like I'm getting towards the answers 02:19:43.360 | 
while I'm doing this particular level of work. 02:19:48.640 | 
okay, let's pick something that's hard, you know? 02:19:54.800 | 
So that's when I started working on mathematical cognition. 02:20:03.300 | 
well, I got 15 more years possibly of useful life left, 02:20:09.960 | 
I'm actually getting close to the end of that now, 02:20:15.980 | 
well, I probably have another five after that. 02:20:18.100 | 
So, okay, I'll give myself another six or eight. 02:20:25.560 | 
And so, yeah, I gotta keep thinking about the questions 02:20:37.480 | 
You've done some incredible work in your life 02:20:43.000 | 
When the aliens come and the human civilization is long gone 02:20:51.640 | 
what do you hope is the paragraph written about you? 02:21:32.360 | 
other than that I'd had the right context prior to that, 02:21:35.140 | 
but that I had gone ahead and followed that lead. 02:21:41.800 | 
but I said in this preface that the joy of science 02:21:46.800 | 
is the moment in which a partially formed thought 02:22:11.460 | 
concrete piece of actual scientific progress. 02:22:22.020 | 
and when Rumelhart heard Hinton talk about gradient descent 02:22:51.100 | 
by finding exciting collaborative opportunities 02:23:07.700 | 
So it's the old Robert Frost, road less taken. 02:23:11.640 | 
So maybe, 'cause you said like this incomplete initial idea, 02:23:16.640 | 
that step you take is a little bit off the beaten path. 02:23:40.220 | 
was a completely empirical, experimental project. 02:23:44.320 | 
And I wrote a paper based on the two main experiments 02:24:08.080 | 
that would explain the data that I had collected. 02:24:16.480 | 
So I got back a letter from the editor saying, 02:24:20.600 | 
"Thank you very much, these are great experiments. 02:24:32.200 | 
And so I did, I took that part out of the paper. 02:24:59.200 | 
And so when I got to my assistant professorship, 02:25:11.920 | 
submitted my first article to Psychological Review, 02:25:26.120 | 
"You should keep thinking about it this time." 02:25:28.320 | 
And then that was what got me going to think, 02:25:38.700 | 
"You don't have to be, you can do it as a mere mortal." 02:25:47.600 | 
don't succumb to the labels of a particular reviewer. 02:26:05.920 | 
- Yeah, I'm a connectionist or a cognitive scientist 02:26:16.800 | 
that can completely revolutionize in totally new areas. 02:26:34.920 | 
and you wanna know why are they doing this stuff. 02:26:45.160 | 
why do you think we're all doing what we're doing? 02:26:51.740 | 
We seem to be very busy doing a bunch of stuff 02:26:54.480 | 
and we seem to be kind of directed towards somewhere, 02:27:00.640 | 
- Well, I myself think that we make meaning for ourselves 02:27:43.720 | 
But I do believe that we are an emergent result 02:27:48.720 | 
of a process that happened naturally without guidance 02:28:01.000 | 
and that the creation of efforts to reify meaning 02:28:14.320 | 
is just a part of the expression of that goal 02:28:18.240 | 
that we have to not find out what the meaning is 02:28:29.640 | 
So to me, it's something that's very personal, 02:28:38.160 | 
it's very individual, it's like meaning will come for you 02:28:43.160 | 
through the particular combination of synergistic elements 02:29:04.760 | 
it's all made in a certain kind of a local context though. 02:29:08.840 | 
Here I am at UCSD with this brilliant man, Rumelhart, 02:29:38.760 | 
there's some kind of peculiar little emergent process 02:29:43.600 | 
that then, which is basically each one of us, 02:30:04.840 | 
It's an emergent process that lives for a time, 02:30:09.260 | 
is defined by its local pocket and context in time and space 02:30:17.040 | 
and then we celebrate how nice the stories are 02:30:23.260 | 
and eventually we'll colonize, hopefully, other planets, 02:30:37.240 | 
Jay, you're speaking of peculiar emergent processes 02:30:49.760 | 
of cognitive science, of psychology, of computation. 02:30:57.000 | 
It's a huge honor that you would talk to me today, 02:31:08.320 | 
and this has been an amazing opportunity for me 02:31:11.640 | 
to let ideas that I've never fully expressed before come out 02:31:16.300 | 
'cause you ask such a wide range of the deeper questions 02:31:20.760 | 
that we've all been thinking about for so long. 02:31:31.000 | 
please check out our sponsors in the description. 02:31:36.920 | 
"In the long run, curiosity-driven research works best. 02:31:40.840 | 
"Real breakthroughs come from people focusing 02:31:45.040 | 
Thanks for listening and hope to see you next time.