
Jay McClelland: Neural Networks and the Emergence of Cognition | Lex Fridman Podcast #222


Chapters

0:00 Introduction
0:43 Beauty in neural networks
5:02 Darwin and evolution
10:47 The origin of intelligence
17:29 Explorations in cognition
23:33 Learning representations by back-propagating errors
29:58 Dave Rumelhart and cognitive modeling
43:01 Connectionism
1:05:54 Geoffrey Hinton
1:07:49 Learning in a neural network
1:24:42 Mathematics & reality
1:31:50 Modeling intelligence
1:42:28 Noam Chomsky and linguistic cognition
1:56:49 Advice for young people
2:07:56 Psychiatry and exploring the mind
2:20:35 Legacy
2:26:24 Meaning of life

Transcript

00:00:00.000 | The following is a conversation with Jay McClelland,
00:00:03.380 | a cognitive scientist at Stanford
00:00:05.380 | and one of the seminal figures
00:00:06.980 | in the history of artificial intelligence
00:00:09.520 | and specifically neural networks.
00:00:12.300 | Having written the Parallel Distributed Processing books
00:00:15.880 | with David Rumelhart,
00:00:17.540 | who co-authored the backpropagation paper
00:00:19.820 | with Geoff Hinton.
00:00:21.660 | In their collaborations,
00:00:23.260 | they've paved the way for many of the ideas
00:00:25.580 | at the center of the neural network based
00:00:27.580 | machine learning revolution of the past 15 years.
00:00:32.020 | To support this podcast,
00:00:33.500 | please check out our sponsors in the description.
00:00:36.300 | This is the Lex Fridman Podcast
00:00:38.820 | and here is my conversation with Jay McClelland.
00:00:42.300 | You are one of the seminal figures
00:00:45.420 | in the history of neural networks
00:00:47.340 | at the intersection of cognitive psychology
00:00:49.780 | and computer science.
00:00:51.660 | What to you has over the decades emerged
00:00:54.180 | as the most beautiful aspect about neural networks,
00:00:57.460 | both artificial and biological?
00:00:59.520 | - The fundamental thing I think about with neural networks
00:01:03.820 | is how they allow us to link
00:01:05.860 | biology with the mysteries of thought.
00:01:13.420 | When I was first entering the field myself
00:01:19.980 | in the late 60s, early 70s,
00:01:26.100 | cognitive psychology had just become a field.
00:01:29.580 | There was a book published in '67
00:01:31.460 | called Cognitive Psychology.
00:01:33.440 | And the author said that, you know,
00:01:40.100 | the study of the nervous system
00:01:42.060 | was only of peripheral interest.
00:01:44.540 | It wasn't gonna tell us anything about the mind.
00:01:47.140 | And I didn't agree with that.
00:01:51.980 | I always felt, oh, look,
00:01:54.780 | I'm a physical being.
00:01:56.860 | From dust to dust, you know, ashes to ashes,
00:02:02.340 | and somehow I emerged from that.
00:02:04.840 | - So that's really interesting.
00:02:07.980 | So there was a sense with cognitive psychology
00:02:11.660 | that in understanding the sort of neuronal structure
00:02:16.540 | of things, you're not going to be able
00:02:18.180 | to understand the mind.
00:02:19.980 | And then your sense is if we study these neural networks,
00:02:23.660 | we might be able to get at least very close
00:02:25.860 | to understanding the fundamentals of the human mind.
00:02:28.220 | - Yeah.
00:02:29.260 | I used to think, or I used to talk about the idea
00:02:32.540 | of awakening from the Cartesian dream.
00:02:35.220 | So Descartes, you know, thought about these things, right?
00:02:41.580 | He was walking in the gardens of Versailles one day,
00:02:46.220 | and he stepped on a stone, and a statue moved.
00:02:52.140 | And he walked a little further, he stepped on another stone,
00:02:54.620 | and another statue moved.
00:02:55.860 | And he, like, why did the statue move
00:02:59.280 | when I stepped on the stone?
00:03:00.500 | And he went and talked to the gardeners,
00:03:02.860 | and he found out that they had a hydraulic system
00:03:05.820 | that allowed the physical contact with the stone
00:03:10.620 | to cause water to flow in various directions,
00:03:12.740 | which caused water to flow into the statue
00:03:14.740 | and move the statue.
00:03:15.840 | And he used this as the beginnings of a theory
00:03:22.820 | about how animals act.
00:03:26.460 | And he had this notion that these little fibers
00:03:33.260 | that people had identified that weren't carrying the blood,
00:03:36.400 | you know, were these little hydraulic tubes
00:03:39.860 | that if you touch something, there would be pressure,
00:03:42.100 | and it would send a signal of pressure
00:03:43.660 | to the other parts of the system,
00:03:46.220 | and that would cause action.
00:03:49.220 | So he had a mechanistic theory of animal behavior.
00:03:54.220 | And he thought that the human had this animal body,
00:03:59.040 | but that some divine something else
00:04:03.740 | had to have come down and been placed in him
00:04:06.940 | to give him the ability to think, right?
00:04:10.540 | So the physical world includes the body in action,
00:04:15.540 | but it doesn't include thought, according to Descartes,
00:04:18.580 | right? - Right.
00:04:19.500 | - And so the study of physiology at that time
00:04:22.900 | was the study of sensory systems and motor systems
00:04:26.380 | and things that you could directly measure
00:04:30.060 | when you stimulated neurons and stuff like that.
00:04:33.620 | And the study of cognition was something that, you know,
00:04:38.140 | was tied in with abstract computer algorithms
00:04:41.140 | and things like that.
00:04:42.360 | But when I was an undergraduate,
00:04:45.060 | I learned about the physiological mechanisms
00:04:48.660 | and so when I'm studying cognitive psychology
00:04:51.180 | as a first year PhD student, I'm saying,
00:04:53.700 | wait a minute, the whole thing is biological, right?
00:04:56.700 | (laughs)
00:04:57.940 | - You had that intuition right away.
00:04:59.580 | That seemed obvious to you.
00:05:00.820 | - Yeah, yeah.
00:05:02.940 | - Isn't that magical, though, that from just
00:05:05.820 | the little bit of biology can emerge
00:05:08.180 | the full beauty of the human experience?
00:05:11.540 | Why is that so obvious to you?
00:05:13.140 | - Well, obvious and not obvious at the same time.
00:05:18.140 | And I think about Darwin in this context, too,
00:05:20.360 | because Darwin knew very early on that none of the ideas
00:05:25.360 | that anybody had ever offered gave him a sense
00:05:30.320 | of understanding how evolution could have worked.
00:05:34.560 | But he wanted to figure out how it could have worked.
00:05:40.500 | That was his goal.
00:05:42.540 | And he spent a lot of time working on this idea
00:05:47.540 | and reading about things that gave him hints
00:05:52.300 | and thinking they were interesting but not knowing why,
00:05:54.620 | and drawing more and more pictures of different birds
00:05:57.500 | that differ slightly from each other and so on.
00:06:00.380 | And then he figured it out.
00:06:02.500 | But after he figured it out, he had nightmares about it.
00:06:06.940 | He would dream about the complexity of the eye
00:06:10.020 | and the arguments that people had given
00:06:12.700 | about how ridiculous it was to imagine
00:06:16.140 | that that could have ever emerged from some sort of
00:06:20.340 | unguided process, that it hadn't been the product of design.
00:06:27.100 | And so he didn't publish for a long time,
00:06:31.980 | in part because he was scared of his own ideas.
00:06:35.420 | He didn't think they could possibly be true.
00:06:38.320 | Yeah.
00:06:39.160 | But then by the time the 20th century rolls around,
00:06:45.960 | we all, we understand that,
00:06:51.480 | many people understand or believe that evolution produced
00:06:57.640 | the entire range of animals that there are.
00:07:02.180 | And Descartes' idea starts to seem a little wonky
00:07:07.160 | after a while, right?
00:07:08.240 | Like, well, wait a minute.
00:07:09.560 | There's the apes and the chimpanzees and the bonobos,
00:07:15.360 | and they're pretty smart in some ways.
00:07:18.460 | Oh, somebody comes up,
00:07:22.040 | oh, there's a certain part of the brain
00:07:23.680 | that's still different.
00:07:24.520 | They don't, there's no hippocampus in the monkey brain.
00:07:28.720 | It's only in the human brain.
00:07:30.160 | Huxley had to do a surgery in front of many, many people
00:07:34.240 | in the late 19th century to show to them
00:07:36.240 | there's actually a hippocampus in the chimpanzee's brain.
00:07:40.340 | So the continuity of the species is another element
00:07:46.960 | that contributes to this sort of idea
00:07:53.640 | that we are ourselves a total product of nature.
00:08:01.920 | And that to me is the magic and the mystery,
00:08:05.920 | how nature could actually give rise to organisms
00:08:10.920 | that have the capabilities that we have.
00:08:20.120 | - So it's interesting because even the idea of evolution
00:08:23.000 | is hard for me to keep all together in my mind.
00:08:27.080 | So because we think of a human time scale,
00:08:30.120 | it's hard to imagine that like the development
00:08:33.600 | of the human eye would give me nightmares too,
00:08:36.160 | because you have to think across many, many,
00:08:38.200 | many generations.
00:08:39.920 | And it's very tempting to think about kind of a growth
00:08:43.280 | of a complicated object.
00:08:44.680 | And it's like, how is it possible for such a thing
00:08:49.320 | to be built?
00:08:50.160 | 'Cause also me from a robotics engineering perspective,
00:08:53.240 | it's very hard to build these systems.
00:08:55.360 | How, through an undirected process,
00:08:58.600 | can a complex thing be designed?
00:09:00.960 | It seems wrong.
00:09:03.440 | - Yeah, so that's absolutely right.
00:09:05.640 | And a slightly different career path
00:09:08.680 | that would have been equally interesting to me
00:09:10.600 | would have been to actually study the process
00:09:15.600 | of embryological development flowing on
00:09:20.880 | into brain development and the exquisite sort of
00:09:27.360 | laying down of pathways and so on that occurs in the brain.
00:09:32.320 | And I know the slightest bit about that is not my field,
00:09:35.760 | but there are fascinating aspects to this process
00:09:40.760 | that eventually result in the complexity of various brains.
00:09:53.960 | At least in one part of the field,
00:09:58.400 | the study of vision, I think people have felt
00:10:02.680 | for a long time that the continuity between humans
00:10:07.400 | and non-human animals has been second nature
00:10:11.000 | for a lot longer.
00:10:12.320 | I had this conversation with somebody
00:10:16.160 | who's a vision scientist and he was saying,
00:10:17.960 | oh, we don't have any problem with this.
00:10:20.160 | The monkey's visual system and the human visual system,
00:10:22.960 | extremely similar up to certain levels, of course.
00:10:27.960 | They diverge after a while, but the first,
00:10:32.040 | the visual pathway from the eye to the brain
00:10:35.600 | and the first few layers of cortex or cortical areas,
00:10:40.600 | I guess one would say, are extremely similar.
00:10:47.060 | - Yeah, so on the cognition side is where the leap
00:10:52.360 | seems to happen with humans.
00:10:54.240 | That it does seem we're kind of special.
00:10:56.680 | And that's a really interesting question
00:10:58.520 | when thinking about alien life
00:11:00.320 | or if there's other intelligent alien civilizations
00:11:03.120 | out there is how special is this leap?
00:11:06.040 | So one special thing seems to be the origin of life itself.
00:11:09.320 | However you define that, there's a gray area.
00:11:11.880 | And the other leap, this is very biased perspective
00:11:14.880 | of a human, is the origin of intelligence.
00:11:19.760 | And again, from an engineering perspective,
00:11:22.120 | it's a difficult question to ask.
00:11:24.440 | An important one is how difficult is that leap?
00:11:27.960 | How special were humans?
00:11:30.060 | Did a monolith come down?
00:11:32.380 | Did aliens bring down a monolith
00:11:33.720 | and some apes had to touch a monolith to get it?
00:11:38.120 | - That's a lot like Descartes' idea, right?
00:11:41.640 | - Exactly, but it just seems one heck of a leap
00:11:46.640 | to get to this level of intelligence.
00:11:48.520 | - Yeah, and so Chomsky,
00:11:51.540 | argued that some genetic fluke occurred 100,000 years ago.
00:11:56.540 | And just happened that some human,
00:12:07.140 | some hominin predecessor of current humans
00:12:12.140 | had this one genetic tweak that resulted in language.
00:12:20.420 | And language then provided this special thing
00:12:25.420 | that separates us from all other animals.
00:12:31.140 | I think there's a lot of truth
00:12:37.940 | to the value and importance of language,
00:12:40.540 | but I think it comes along with the evolution
00:12:46.020 | of a lot of other related things related to sociality
00:12:50.820 | and mutual engagement with others
00:12:54.180 | and establishment of, I don't know,
00:12:59.180 | rich mechanisms for organizing
00:13:06.260 | and understanding of the world,
00:13:08.260 | which language then plugs into.
00:13:12.980 | - Right, so language is a tool that allows you
00:13:17.020 | to do this kind of collective intelligence.
00:13:19.020 | And whatever is at the core of the thing
00:13:21.540 | that allows for this collective intelligence
00:13:23.780 | is the main thing.
00:13:25.260 | And it's interesting to think about that one fluke,
00:13:28.180 | one mutation could lead to the first crack,
00:13:33.340 | opening of the door to human intelligence.
00:13:37.900 | Like all it takes is one.
00:13:39.260 | Like evolution just kind of opens the door a little bit
00:13:41.540 | and then time and selection takes care of the rest.
00:13:45.900 | - You know, there's so many fascinating aspects
00:13:48.180 | to these kinds of things.
00:13:49.060 | So we think of evolution as continuous, right?
00:13:54.060 | We think, oh yes, okay, over 500 million years,
00:13:58.700 | there could have been this relatively continuous changes.
00:14:08.220 | But that's not what anthropologists,
00:14:12.540 | evolutionary biologists found from the fossil record.
00:14:15.660 | They found hundreds of millions
00:14:20.740 | of years of stasis.
00:14:23.100 | And then suddenly a change occurs.
00:14:27.100 | Well, suddenly on that scale is a million years or something,
00:14:32.100 | or even 10 million years.
00:14:34.100 | But the concept of punctuated equilibrium
00:14:38.940 | was a very important concept in evolutionary biology.
00:14:42.780 | And that also feels somehow right
00:14:49.260 | about the stages of our mental abilities.
00:14:55.220 | We seem to have a certain kind of mindset at a certain age.
00:14:59.260 | And then at another age,
00:15:02.660 | we like look at that four-year-old and say,
00:15:04.900 | oh my God, how could they have thought that way?
00:15:07.220 | So Piaget was known for this kind of stage theory
00:15:10.140 | of child development, right?
00:15:11.580 | And you look at it closely
00:15:13.580 | and suddenly those stages are so discreet, transitions.
00:15:17.180 | But the difference between the four-year-old
00:15:18.860 | and the seven-year-old is profound.
00:15:20.820 | And that's another thing that's always interested me
00:15:24.300 | is how we, something happens over the course
00:15:28.220 | of several years of experience,
00:15:30.220 | where at some point we reached the point
00:15:32.020 | where something like an insight or a transition
00:15:35.940 | or a new stage of development occurs.
00:15:38.260 | And these kinds of things can be understood
00:15:43.100 | in complex systems research.
00:15:47.620 | And so evolutionary biology, developmental biology,
00:15:52.620 | cognitive development are all things
00:15:57.820 | that have been approached in this kind of way.
00:16:00.060 | - Yeah, just like you said,
00:16:02.060 | I find both fascinating those early years of human life,
00:16:06.660 | but also the early minutes, days
00:16:10.220 | of the embryonic development to how from embryos
00:16:15.220 | you get the brain, that development.
00:16:19.300 | Again, from the engineering perspective, it's fascinating.
00:16:22.140 | When you deploy the brain to the human world
00:16:27.380 | and it gets to explore that world and learn,
00:16:29.340 | that's fascinating.
00:16:30.500 | But just like the assembly of the mechanism
00:16:33.380 | that is capable of learning, that's amazing.
00:16:36.700 | The stuff they're doing with brain organoids
00:16:39.660 | where you can build many brains and study
00:16:42.660 | that self-assembly of a mechanism from the DNA material,
00:16:47.660 | that's like, what the heck?
00:16:51.780 | You have literally biological programs
00:16:55.300 | that just generate a system, this mushy thing
00:17:00.180 | that's able to be robust and learn
00:17:02.740 | in a very unpredictable world
00:17:05.660 | and learn seemingly arbitrary things
00:17:08.340 | or like a very large number of things
00:17:11.980 | that enable survival.
00:17:14.140 | - Yeah, ultimately that is a very important part
00:17:19.140 | of the whole process of understanding
00:17:22.340 | this sort of emergence of mind from brain kind of thing.
00:17:27.340 | - And the whole thing seems to be pretty continuous.
00:17:29.820 | So let me step back to neural networks
00:17:32.580 | for another brief minute.
00:17:35.220 | You wrote the Parallel Distributed Processing books
00:17:37.940 | that explored ideas of neural networks in the 1980s
00:17:42.140 | together with a few folks.
00:17:43.220 | But the books you wrote with David Rumelhart,
00:17:47.180 | who was the first author on the backpropagation paper
00:17:50.940 | with Geoff Hinton.
00:17:52.460 | So these are just some figures at the time
00:17:54.460 | that were thinking about these big ideas.
00:17:56.740 | What are some memorable moments of discovery
00:18:00.380 | and beautiful ideas from those early days?
00:18:02.580 | - I'm gonna start sort of with my own process
00:18:08.460 | in the mid '70s and then into the late '70s
00:18:15.340 | when I met Geoff Hinton and he came to San Diego
00:18:21.100 | and we were all together.
00:18:23.020 | In my time in graduate school,
00:18:28.900 | as I've already described to you,
00:18:30.260 | I had this sort of feeling of,
00:18:32.700 | okay, I'm really interested in human cognition,
00:18:35.660 | but this disembodied sort of way of thinking about it
00:18:40.180 | that I'm getting from the current mode of thought about it
00:18:44.820 | isn't working fully for me.
00:18:47.220 | And when I got my assistant professorship,
00:18:52.220 | I went to UCSD and that was in 1974.
00:18:57.300 | Something amazing had just happened.
00:19:00.940 | Dave Rumelhart had written a book together
00:19:03.620 | with another man named Don Norman.
00:19:06.260 | And the book was called "Explorations in Cognition."
00:19:09.140 | And it was a series of chapters exploring
00:19:14.860 | interesting questions about cognition,
00:19:17.860 | but in a completely sort of abstract,
00:19:21.580 | non-biological kind of way.
00:19:23.460 | And I'm saying, gee, this is amazing.
00:19:25.420 | I'm coming to this community where people can get together
00:19:29.020 | and feel like they've collectively exploring ideas.
00:19:33.180 | And it was a book that had a lot of, I don't know,
00:19:39.900 | lightness to it.
00:19:41.020 | And Don Norman, who was the more senior figure
00:19:46.020 | than Rumelhart at that time, who led that project,
00:19:49.820 | always created this spirit of playful exploration of ideas.
00:19:55.100 | And so I'm like, wow, this is great.
00:19:58.340 | But I was also still trying to get from the neurons
00:20:03.340 | to the cognition.
00:20:10.460 | And I realized at one point,
00:20:14.140 | I got this opportunity to go to a conference
00:20:16.740 | where I heard a talk by a man named James Anderson,
00:20:20.460 | who was an engineer, but by then a professor
00:20:24.580 | in a psychology department who had used linear algebra
00:20:29.580 | to create neural network models of perception
00:20:34.740 | and categorization and memory.
00:20:37.540 | And it just blew me out of the water
00:20:41.180 | that one could create a model that was simulating neurons,
00:20:46.180 | not just kind of engaged in a stepwise algorithmic process
00:20:52.940 | that was construed abstractly,
00:20:58.580 | but it was simulating, remembering, and recalling,
00:21:03.540 | and recognizing the prior occurrence of a stimulus
00:21:07.980 | or something like that.
00:21:08.820 | So for me, this was a bridge between the mind and the brain.
00:21:13.500 | And I just like, and I remember I was walking
00:21:17.900 | across campus one day in 1977,
00:21:20.500 | and I almost felt like St. Paul on the road to Damascus.
00:21:25.020 | I said to myself, you know, if I think about the mind
00:21:28.980 | in terms of a neural network,
00:21:31.900 | it will help me answer the questions about the mind
00:21:34.060 | that I'm trying to answer.
00:21:36.140 | And that really excited me.
00:21:38.820 | So I think that a lot of people
00:21:43.160 | were becoming excited about that.
00:21:45.100 | And one of those people was Jim Anderson,
00:21:49.160 | who I had mentioned.
00:21:50.060 | Another one was Steve Grossberg,
00:21:51.980 | who had been writing about neural networks since the '60s.
00:21:56.480 | And Jeff Hinton was yet another.
00:22:00.820 | And his PhD dissertation showed up in an applicant pool
00:22:05.820 | to a postdoctoral training program
00:22:11.780 | that Dave and Don, the two men I mentioned before,
00:22:16.300 | Rumelhart and Norman, were administering.
00:22:19.400 | And Rumelhart got really excited
00:22:21.560 | about Hinton's PhD dissertation.
00:22:24.180 | And so Hinton was one of the first people
00:22:30.620 | who came and joined this group of postdoctoral scholars
00:22:33.820 | that was funded by this wonderful grant that they got.
00:22:39.500 | Another one who is also well-known
00:22:41.980 | in neural network circles is Paul Smolensky.
00:22:45.740 | He was another one of that group.
00:22:48.060 | Anyway, Jeff and Jim Anderson
00:22:53.060 | organized a conference at UCSD where we were.
00:22:59.620 | And it was called Parallel Models of Associative Memory.
00:23:04.620 | And it brought all the people together
00:23:06.460 | who had been thinking about these kinds of ideas
00:23:09.140 | in 1979 or 1980.
00:23:11.900 | And this began to kind of really resonate
00:23:16.900 | with some of Rumelhart's own thinking,
00:23:23.260 | some of his reasons for wanting something
00:23:26.500 | other than the kinds of computation
00:23:28.660 | he'd been doing so far.
00:23:30.180 | So let me talk about Rumelhart now for a minute,
00:23:32.100 | okay, with that context.
00:23:33.100 | - Well, let me also just pause
00:23:34.460 | 'cause he said so many interesting things
00:23:37.740 | before we go to Rumelhart.
00:23:37.740 | So first of all, for people who are not familiar,
00:23:41.060 | neural networks are at the core of the machine learning,
00:23:43.220 | deep learning revolution of today.
00:23:45.400 | Geoffrey Hinton that we mentioned is one of the figures
00:23:48.660 | that were important in the history, like yourself,
00:23:51.180 | in the development of these neural networks,
00:23:53.180 | artificial neural networks that are then used
00:23:55.260 | for the machine learning application.
00:23:57.020 | Like I mentioned, the backpropagation paper
00:23:59.420 | is one of the optimization mechanisms
00:24:02.120 | by which these networks can learn.
00:24:05.980 | And the word parallel is really interesting.
00:24:09.580 | So it's almost like synonymous
00:24:12.140 | from a computational perspective
00:24:14.020 | how you thought at the time about neural networks,
00:24:17.380 | that it's parallel computation.
00:24:19.060 | Would that be fair to say?
00:24:21.060 | - Well, yeah, the parallel,
00:24:23.100 | the word parallel in this comes from the idea
00:24:26.620 | that each neuron is an independent computational unit.
00:24:31.620 | It gathers data from other neurons,
00:24:36.380 | it integrates it in a certain way,
00:24:39.300 | and then it produces a result.
00:24:40.920 | And it's a very simple little computational unit,
00:24:44.900 | but it's autonomous in the sense that
00:24:47.820 | it does its thing, right?
00:24:51.460 | It's in a biological medium where it's getting nutrients
00:24:54.740 | and various chemicals from that medium.
00:24:58.600 | But you can think of it as almost like a little computer
00:25:05.260 | in and of itself.
00:25:06.980 | So the idea is that each, our brains have,
00:25:11.700 | oh, look, on the order of
00:25:16.020 | a hundred billion of these little neurons, right?
00:25:20.580 | And they're all capable of doing their work
00:25:24.380 | at the same time.
00:25:25.460 | So it's like instead of just a single central processor
00:25:29.900 | that's engaged in chug, chug, one step after another,
00:25:34.360 | we have a billion of these little computational units
00:25:41.120 | working at the same time.
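To make the "each neuron is its own little computer" idea concrete, here is a minimal sketch (an illustration only, not code from the PDP work; the sizes, weights, and sigmoid squashing are assumptions): every unit gathers signals from other units, integrates them into a weighted sum, and produces its own output, and a whole layer of such units can be computed at once.

```python
import numpy as np

def unit_output(inputs, weights, bias):
    """One neuron-like unit: gather signals from other units, integrate, produce a result."""
    net = np.dot(inputs, weights) + bias        # weighted sum of incoming activity
    return 1.0 / (1.0 + np.exp(-net))           # squash into a bounded activation value

def layer_output(inputs, weight_matrix, biases):
    """A whole layer of such units; conceptually, every unit does its work at the same time."""
    return 1.0 / (1.0 + np.exp(-(inputs @ weight_matrix + biases)))

# toy example: 4 incoming signals feeding a layer of 3 units (sizes are arbitrary)
rng = np.random.default_rng(0)
x = rng.random(4)
W = rng.normal(size=(4, 3))
b = np.zeros(3)
print(layer_output(x, W, b))                    # three activations, computed "in parallel"
```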
00:25:42.540 | - So at the time, that's, I don't know,
00:25:44.460 | maybe you can comment, it seems to me, even still to me,
00:25:49.160 | quite a revolutionary way to think about computation
00:25:52.860 | relative to the development
00:25:55.140 | of theoretical computer science alongside of that,
00:25:58.060 | where it's very much like sequential computer.
00:26:00.420 | You're analyzing algorithms
00:26:02.020 | that are running on a single computer.
00:26:03.940 | - That's right.
00:26:04.780 | - You're saying, wait a minute,
00:26:06.440 | why don't we take a really dumb, very simple computer
00:26:11.380 | and just have a lot of them interconnected together?
00:26:14.380 | And they're all operating in their own little world
00:26:16.580 | and they're communicating with each other.
00:26:18.560 | And thinking of computation in that way,
00:26:20.940 | and from that kind of computation,
00:26:23.440 | trying to understand how things like certain characteristics
00:26:28.580 | of the human mind can emerge.
00:26:30.780 | - Right.
00:26:31.620 | - That's quite a revolutionary way of thinking, I would say.
00:26:35.020 | - Well, yes, I agree with you.
00:26:37.500 | And there's still this sort of sense of,
00:26:47.560 | not sort of knowing how we kind of get all the way there,
00:26:52.560 | I think, and this very much remains
00:26:57.420 | at the core of the questions that everybody's asking
00:27:00.000 | about the capabilities of deep learning
00:27:01.860 | and all these kinds of things.
00:27:02.840 | But if I could just play this out a little bit,
00:27:05.460 | a convolutional neural network or a CNN,
00:27:11.060 | which many people may have heard of,
00:27:15.380 | is a set of, you could think of it biologically
00:27:20.380 | as a set of collections of neurons.
00:27:26.400 | Each collection has maybe 10,000 neurons in it,
00:27:31.400 | but there's many layers, right?
00:27:35.760 | Some of these things are hundreds
00:27:37.740 | or even a thousand layers deep,
00:27:40.040 | but others are closer to the biological brain
00:27:43.680 | and maybe they're like 20 layers deep
00:27:45.520 | or something like that.
00:27:46.960 | So we have, within each layer,
00:27:49.600 | we have thousands of neurons or tens of thousands maybe.
00:27:54.440 | Well, in the brain, we probably have millions in each layer,
00:27:59.440 | but we're getting sort of similar in a certain way, right?
00:28:02.780 | And then we think, okay, at the bottom level,
00:28:09.240 | there's an array of things
00:28:10.600 | that are like the photoreceptors in the eye.
00:28:13.320 | They respond to the amount of light of a certain wavelength
00:28:16.600 | at a certain location on the pixel array.
00:28:21.160 | So that's like the biological eye.
00:28:24.600 | And then there's several further stages going up,
00:28:27.280 | layers of these neuron-like units.
00:28:29.620 | And you go from that raw input,
00:28:35.540 | array of pixels to a classification,
00:28:39.280 | you've actually built a system
00:28:41.920 | that could do the same kind of thing
00:28:44.160 | that you and I do when we open our eyes
00:28:46.000 | and we look around and we see there's a cup,
00:28:47.980 | there's a cell phone, there's a water bottle.
00:28:51.000 | And these systems are doing that now, right?
00:28:54.960 | So they are, in terms of the parallel idea
00:28:59.960 | that we were talking about before,
00:29:02.240 | they are doing this massively parallel computation
00:29:05.560 | in the sense that each of the neurons
00:29:08.560 | in each of those layers is thought of as computing
00:29:12.320 | its little bit of something about the input
00:29:17.320 | simultaneously with all the other ones in the same layer.
00:29:20.860 | We get to the point of abstracting that away
00:29:24.100 | and thinking, oh, it's just one whole vector
00:29:26.880 | that's being computed,
00:29:28.200 | one activation pattern that's computed in a single step.
00:29:32.040 | And that abstraction is useful,
00:29:36.200 | but it's still that parallel and distributed processing.
00:29:41.200 | Each one of these guys is just contributing
00:29:43.200 | a tiny bit to that whole thing.
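A rough sketch of the layered picture described here, leaving out the convolutional weight-sharing of a real CNN (the layer sizes and the cat/dog output are invented for illustration): an input array of "photoreceptor" values is transformed layer by layer, each layer's whole activation pattern computed as a single vector, ending in one score per category.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(pixels, layer_weights):
    """Pass a 'pixel array' up through successive layers of units.
    Each layer's entire activation pattern is one vector, computed in a single step."""
    activation = pixels
    for W in layer_weights[:-1]:
        activation = relu(activation @ W)       # increasingly abstract patterns of activation
    return activation @ layer_weights[-1]       # final layer: one score per category

rng = np.random.default_rng(1)
image = rng.random(64)                          # stand-in for an 8x8 patch of photoreceptor values
sizes = [64, 32, 16, 2]                         # made-up layer sizes; final two units: "cat" vs "dog"
weights = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
scores = forward(image, weights)
print("cat" if scores[0] > scores[1] else "dog")  # weights are untrained, so the label is arbitrary
```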
00:29:45.120 | - And that's the excitement that you felt
00:29:46.760 | that from these simple things,
00:29:49.320 | you can emerge when you add these level
00:29:52.000 | of abstractions on it.
00:29:53.880 | You can start getting all the beautiful things
00:29:56.040 | that we think about as cognition.
00:29:58.240 | And so, okay, so you have this conference,
00:30:01.200 | I forgot the name already,
00:30:02.280 | but it's Parallel and Something Associative Memory
00:30:04.440 | and so on, very exciting, technical and exciting title.
00:30:08.700 | And you started talking about Dave Rumelhart.
00:30:11.720 | So who is this person that was so,
00:30:15.160 | you've spoken very highly of him.
00:30:17.280 | Can you tell me about him, his ideas, his mind,
00:30:21.780 | who he was as a human being, as a scientist?
00:30:24.960 | - So Dave came from a little tiny town
00:30:28.480 | in Western South Dakota.
00:30:31.800 | And his mother was the librarian
00:30:35.920 | and his father was the editor of the newspaper.
00:30:38.580 | And I know one of his brothers pretty well.
00:30:44.060 | They grew up, there were four brothers
00:30:49.600 | and they grew up together
00:30:53.720 | and their father encouraged them
00:30:55.580 | to compete with each other a lot.
00:30:58.520 | They competed in sports and they competed in mind games.
00:31:02.500 | I don't know, things like Sudoku and chess
00:31:06.960 | and various things like that.
00:31:08.880 | And Dave was a standout undergraduate.
00:31:13.880 | He went at a younger age than most people do to college
00:31:21.440 | at the University of South Dakota and majored in mathematics.
00:31:24.960 | And I don't know how he got interested in psychology
00:31:29.960 | but he applied to the mathematical psychology program
00:31:34.520 | at Stanford and was accepted as a PhD student
00:31:37.800 | to study mathematical psychology at Stanford.
00:31:40.400 | So mathematical psychology is the use of mathematics
00:31:45.400 | to model mental processes.
00:31:50.680 | - So something that I think these days
00:31:52.680 | might be called cognitive modeling,
00:31:54.320 | that whole space.
00:31:55.360 | - Yeah, it's mathematical in the sense that
00:31:58.480 | you say if this is true and that is true
00:32:05.560 | then I can derive that this should follow.
00:32:08.240 | And so you say these are my stipulations
00:32:10.320 | about the fundamental principles
00:32:12.080 | and this is my prediction about behavior
00:32:15.160 | and it's all done with equations.
00:32:16.800 | It's not done with a computer simulation.
00:32:18.860 | So you solve the equation and that tells you
00:32:23.240 | what the probability that the subject will be correct
00:32:27.440 | on the seventh trial of the experiment is
00:32:29.360 | or something like that.
00:32:30.560 | So it's a use of mathematics to descriptively characterize
00:32:35.560 | aspects of behavior.
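As a concrete illustration of this style of modeling (a classic textbook example of mathematical psychology, not necessarily a model Rumelhart himself worked on): in an all-or-none learning model, one stipulates that the item is learned with probability c on each study trial and that the subject guesses correctly with probability g until then; from those stipulations the predicted probability of a correct response on any trial follows directly.

```python
def p_correct(trial, c=0.3, g=0.25):
    """All-or-none learning model (illustrative parameter values): the item is learned
    with probability c on each study trial; until then the subject guesses with probability g."""
    p_unlearned = (1 - c) ** (trial - 1)
    return p_unlearned * g + (1 - p_unlearned)

# the derived prediction for, say, the seventh trial of the experiment
print(round(p_correct(7), 3))
```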
00:32:40.000 | And Stanford at that time was the place where
00:32:43.880 | there were several really, really strong
00:32:48.720 | mathematical thinkers who were also connected
00:32:51.120 | with three or four others around the country
00:32:53.600 | who brought a lot of really exciting ideas onto the table.
00:32:58.600 | And it was a very, very prestigious part
00:33:02.860 | of the field of psychology at that time.
00:33:05.020 | So Rumelhart comes into this.
00:33:07.200 | He was a very strong student within that program.
00:33:11.420 | And he got this job at this brand new university
00:33:19.120 | in San Diego in 1967 where he's one of the first
00:33:23.080 | assistant professors in the Department of Psychology at UCSD.
00:33:29.020 | So I got there in '74, seven years later
00:33:35.200 | and Rumelhart at that time was still doing
00:33:44.200 | mathematical modeling, but he had gotten interested
00:33:49.200 | in cognition, he'd gotten interested in understanding
00:33:55.520 | and understanding I think remains,
00:34:02.520 | what does it mean to understand anyway?
00:34:06.560 | It's an interesting sort of curious,
00:34:11.080 | like how would we know if we really understood something?
00:34:14.240 | But he was interested in building machines
00:34:18.800 | that would hear a couple of sentences
00:34:21.580 | and have an insight about what was going on.
00:34:23.640 | So for example, one of his favorite things at that time was,
00:34:27.200 | Margie was sitting on the front step
00:34:32.840 | when she heard the familiar jingle of the Good Humor man.
00:34:38.400 | She remembered her birthday money and ran into the house.
00:34:42.120 | What is Margie doing?
00:34:45.680 | Well, there's a couple of ideas you could have,
00:34:50.200 | but the most natural one is that the Good Humor man
00:34:54.240 | brings ice cream, she likes ice cream,
00:34:56.600 | she knows she needs money to buy ice cream
00:35:00.160 | so she's gonna run into the house and get her money
00:35:02.120 | so she can buy herself an ice cream.
00:35:03.960 | It's a huge amount of inference that has to happen
00:35:06.160 | to get those things to link up with each other.
00:35:08.560 | And he was interested in how the hell that could happen.
00:35:13.160 | And he was trying to build good old-fashioned AI style
00:35:18.160 | models of representation of language
00:35:25.000 | and content of things like has money.
00:35:30.000 | - So like formal logic and knowledge basis,
00:35:35.480 | like that kind of stuff.
00:35:36.840 | So he was integrating that with his thinking
00:35:38.880 | about cognition.
00:35:40.640 | The mechanisms cognition,
00:35:42.920 | how can they mechanistically be applied
00:35:45.800 | to build these knowledge,
00:35:47.400 | to actually build something that looks like a web
00:35:51.480 | of knowledge and thereby from there emerges
00:35:55.040 | something like understanding, whatever the heck that is.
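To give a flavor of the kind of representation being referred to, here is a tiny sketch in the style of a symbolic knowledge base with hand-written inference rules (every predicate and rule is made up for illustration; this is not Rumelhart's actual system). The point is that, in this approach, each inferential step linking the story to the conclusion has to be written down explicitly.

```python
# toy GOFAI-style facts and rules for the "Margie" story (all invented for illustration)
facts = {
    ("sells", "good_humor_man", "ice_cream"),
    ("likes", "margie", "ice_cream"),
    ("has", "margie", "birthday_money"),
    ("hears", "margie", "good_humor_man"),
}

def apply_rules(facts):
    """One round of forward chaining: derive whatever the hand-written rules license."""
    derived = set()
    for (_p, person, vendor) in [f for f in facts if f[0] == "hears"]:
        for (_q, v, thing) in [f for f in facts if f[0] == "sells"]:
            if v == vendor and ("likes", person, thing) in facts:
                derived.add(("wants", person, thing))             # she wants the ice cream
    for (_p, person, thing) in [f for f in facts if f[0] == "wants"]:
        if ("has", person, "birthday_money") in facts:
            derived.add(("fetches", person, "birthday_money"))    # so she runs in for her money
    return derived

while True:                          # chain until no new facts are derived
    new = apply_rules(facts) - facts
    if not new:
        break
    facts |= new

print(sorted(f for f in facts if f[0] in ("wants", "fetches")))
```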
00:35:57.800 | - Yeah, he was grappling, this was something
00:36:00.980 | that they grappled with at the end of that book
00:36:03.360 | that I was describing, Explorations in Cognition.
00:36:06.520 | But he was realizing that the paradigm
00:36:09.320 | of good old-fashioned AI wasn't giving him
00:36:12.720 | the answers to these questions.
00:36:14.320 | - By the way, that's called good old-fashioned AI now.
00:36:18.840 | It wasn't called that at the time.
00:36:20.680 | - Well, it was.
00:36:21.520 | It was beginning to be called that.
00:36:23.640 | - Oh, 'cause it was from the '60s.
00:36:24.880 | - Yeah, yeah, by the late '70s,
00:36:27.760 | it was kind of old-fashioned
00:36:29.080 | and it hadn't really panned out.
00:36:31.680 | People were beginning to recognize that.
00:36:33.680 | And Rumelhart was like, yeah, he was part
00:36:37.120 | of the recognition that this wasn't all working.
00:36:39.640 | Anyway, so he started thinking in terms of the idea
00:36:44.640 | that we needed systems that allowed us
00:36:51.480 | to integrate multiple simultaneous constraints
00:36:55.280 | in a way that would be mutually influencing each other.
00:37:00.000 | So he wrote a paper that just really,
00:37:05.000 | first time I read it, I thought, oh, well, yeah,
00:37:10.600 | but is this important?
00:37:11.960 | But after a while, it just got under my skin.
00:37:15.240 | And it was called an Interactive Model of Reading.
00:37:18.280 | And in this paper, he laid out the idea
00:37:21.680 | that every aspect of our brain's ability
00:37:27.520 | to read, our interpretation of what's coming off the page
00:37:32.520 | when we read, at every level of analysis you can think of
00:37:40.960 | actually depends on all the other levels of analysis.
00:37:45.920 | So what are the actual pixels making up each letter?
00:37:54.000 | And what do those pixels signify about which letters
00:37:59.000 | they are, and what do those letters tell us
00:38:02.040 | about what words are there?
00:38:05.640 | And what do those words tell us about what ideas
00:38:10.000 | the author is trying to convey?
00:38:12.560 | And so he had this model where we have these little tiny
00:38:18.560 | elements that represent each of the pixels
00:38:23.560 | of each of the letters, and then other ones
00:38:30.320 | that represent the line segments in them,
00:38:32.160 | and other ones that represent the letters,
00:38:33.920 | and other ones that represent the words.
00:38:36.440 | And at that time, his idea was there's this set of experts.
00:38:43.140 | There's an expert about how to construct a line
00:38:47.660 | out of pixels, and another expert about how,
00:38:50.700 | which sets of lines go together to make which letters,
00:38:53.280 | and another one about which letters go together
00:38:55.340 | to make which words, and another one about
00:38:57.780 | what the meanings of the words are,
00:38:59.580 | and another one about how the meanings fit together,
00:39:02.860 | and things like that.
00:39:04.180 | And all these experts are looking at this data,
00:39:06.220 | and they're updating hypotheses at other levels.
00:39:12.700 | So the word expert can tell the letter expert,
00:39:15.580 | oh, I think there should be a T there,
00:39:17.260 | because I think there should be the word "the" here,
00:39:20.580 | and the bottom-up sort of feature-to-letter expert
00:39:23.620 | can say, I think there should be a T there too,
00:39:25.540 | and if they agree, then you see a T, right?
00:39:28.700 | And so there's a top-down, bottom-up, interactive process,
00:39:32.600 | but it's going on at all layers simultaneously.
00:39:34.860 | So everything can filter all the way down from the top,
00:39:37.180 | as well as all the way up from the bottom,
00:39:38.900 | and it's a completely interactive, bidirectional,
00:39:42.700 | parallel, distributed process.
00:39:45.180 | - That is somehow, because of the abstractions,
00:39:47.740 | it's hierarchical, so there's different layers
00:39:51.460 | of responsibilities, different levels of responsibilities.
00:39:54.700 | First of all, it's fascinating to think about it
00:39:56.620 | in this kind of mechanistic way.
00:39:58.440 | So not thinking purely from the structure
00:40:02.140 | of a neural network, or something like a neural network,
00:40:04.980 | but thinking about these little guys
00:40:06.860 | that work on letters, and then the letters become words,
00:40:09.860 | and words become sentences,
00:40:11.620 | and that's a very interesting hypothesis,
00:40:14.780 | that from that kind of hierarchical structure
00:40:18.400 | can emerge understanding.
00:40:21.580 | - Yeah, but the thing is, though,
00:40:23.260 | I wanna just sort of relate this
00:40:25.700 | to the earlier part of the conversation.
00:40:27.700 | When Rumelhart was first thinking about it,
00:40:31.220 | there were these experts on the side,
00:40:34.620 | one for the features, and one for the letters,
00:40:36.860 | and one for how the letters make the words, and so on,
00:40:39.900 | and they would each be working,
00:40:43.060 | sort of evaluating various propositions about,
00:40:47.020 | is this combination of features here
00:40:48.980 | going to be one that looks like the letter T, and so on?
00:40:52.020 | And what he realized,
00:40:56.700 | kind of after reading Hinton's dissertation,
00:40:59.400 | and hearing about Jim Anderson's linear
00:41:04.400 | algebra-based neural network models
00:41:06.120 | that I was telling you about before,
00:41:07.680 | was that he could replace those experts
00:41:10.840 | with neuron-like processing units,
00:41:12.720 | which just would have their connection weights
00:41:14.760 | that would do this job.
00:41:16.040 | So what ended up happening was
00:41:20.400 | that Rumelhart and I got together,
00:41:22.320 | and we created a model
00:41:24.160 | called the Interactive Activation Model of Letter Perception,
00:41:28.240 | which takes these little pixel-level,
00:41:34.480 | inputs, constructs line segment features,
00:41:39.480 | letters, and words,
00:41:41.860 | but now we built it out of a set of neuron-like
00:41:45.000 | processing units that are just connected to each other
00:41:48.360 | with connection weights.
00:41:49.560 | So the unit for the word time
00:41:52.220 | has a connection to the unit for the letter T
00:41:55.080 | in the first position,
00:41:56.200 | and the letter I in the second position, so on,
00:41:59.920 | and because these connections are bidirectional,
00:42:03.720 | if you have prior knowledge that it might be the word time,
00:42:08.800 | that starts to prime the letters and the features,
00:42:12.000 | and if you don't, then it has to start bottom-up,
00:42:14.960 | but the directionality just depends
00:42:17.360 | on where the information comes in first,
00:42:20.160 | and if you have context together
00:42:22.080 | with features at the same time,
00:42:24.240 | they can convergently result in an emergent perception,
00:42:27.720 | and that was the piece of work that we did together
00:42:32.720 | that sort of got us both completely convinced
00:42:40.760 | that this neural network way of thinking
00:42:44.560 | was going to be able to actually address the questions
00:42:48.460 | that we were interested in as cognitive psychologists.
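A heavily simplified sketch of that interactive activation idea (a toy illustration with invented weights and update rules, not the actual 1981 model): letter units and word units share bidirectional excitatory connections, bottom-up evidence for letters supports consistent words, and active words prime their letters top-down, so the two levels settle together on a mutually consistent interpretation.

```python
import numpy as np

# toy lexicon and letter units (position, letter); sizes and weights are invented
words = ["TIME", "TAKE"]
letters = sorted({(i, ch) for w in words for i, ch in enumerate(w)})
W = np.zeros((len(letters), len(words)))            # bidirectional excitatory connections
for j, w in enumerate(words):
    for i, ch in enumerate(w):
        W[letters.index((i, ch)), j] = 1.0

# bottom-up evidence: the input clearly shows "T" and "I" in the first two positions
letter_act = np.zeros(len(letters))
for cue in [(0, "T"), (1, "I")]:
    letter_act[letters.index(cue)] = 1.0
word_act = np.zeros(len(words))

for _ in range(10):                                  # pass activation both ways until it settles
    word_act = 0.5 * word_act + 0.5 * (W.T @ letter_act) / 4.0     # letters support consistent words
    letter_act = np.clip(letter_act + 0.1 * (W @ word_act), 0, 1)  # words prime their letters top-down

print(dict(zip(words, np.round(word_act, 2))))       # TIME ends up more active than TAKE
for (i, ch), a in zip(letters, np.round(letter_act, 2)):
    print(f"position {i}, letter {ch}: {a}")         # "M" and the final "E" get primed top-down
```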
00:42:50.800 | - So the algorithmic side, the optimization side,
00:42:53.160 | those are all details, like when you first start,
00:42:55.800 | the idea that you can get far
00:42:57.560 | with this kind of way of thinking,
00:42:59.400 | that in itself is a profound idea.
00:43:01.440 | So do you like the term connectionism
00:43:05.040 | to describe this kind of set of ideas?
00:43:07.760 | - I think it's useful.
00:43:08.860 | It highlights the notion that the knowledge
00:43:15.120 | that the system exploits
00:43:17.320 | is in the connections between the units, right?
00:43:21.360 | There isn't a separate dictionary.
00:43:24.780 | There's just the connections between the units.
00:43:28.000 | So I already sort of laid that on the table
00:43:32.000 | with the connections from the letter units
00:43:34.160 | to the unit for the word time, right?
00:43:36.920 | The unit for the word time isn't a unit for the word time
00:43:40.040 | for any other reason than it's got the connections
00:43:43.220 | to the letters that make up the word time.
00:43:46.040 | Those are the units on the input that excite it
00:43:48.360 | when it's excited that it in a sense represents
00:43:52.680 | in the system that there's support for the hypothesis
00:43:57.680 | that the word time is present in the input.
00:44:00.120 | But it's not,
00:44:03.120 | the word time isn't written anywhere inside the model.
00:44:08.360 | It's only written there in the picture we drew of the model
00:44:11.780 | to say that's the unit for the word time, right?
00:44:14.920 | And if somebody wants to tell me,
00:44:18.640 | well, how do you spell that word?
00:44:21.080 | You have to use the connections from that out
00:44:24.320 | to then get those letters, for example.
00:44:27.800 | - That's such a, that's a counterintuitive idea.
00:44:31.600 | We humans want to think in this logic way.
00:44:35.020 | This idea of connectionism, it doesn't, it's weird.
00:44:41.240 | It's weird that this is how it all works.
00:44:43.520 | - Yeah, but let's go back to that CNN, right?
00:44:46.120 | That CNN with all those layers of neuron-like processing
00:44:49.200 | units that we were talking about before,
00:44:51.520 | it's gonna come out and say, this is a cat, that's a dog.
00:44:54.400 | But it has no idea why it said that.
00:44:57.720 | It's just got all these connections
00:44:59.440 | between all these layers of neurons,
00:45:02.040 | like from the very first layer to the, you know,
00:45:05.200 | like whatever these layers are,
00:45:07.880 | they just get numbered after a while because they,
00:45:10.520 | you know, they somehow further in you go,
00:45:13.640 | the more abstract the features are,
00:45:17.200 | but it's a graded and continuous sort of process
00:45:20.320 | of abstraction anyway.
00:45:21.640 | And, you know, it goes from very local,
00:45:24.440 | very, very specific to much more sort of global,
00:45:28.880 | but it's still, you know, another sort of pattern
00:45:32.000 | of activation over an array of units.
00:45:33.980 | And then at the output side, it says it's cat or it's a dog.
00:45:37.360 | And when we, when I open my eyes and say, oh, that's Lex,
00:45:42.480 | or, oh, you know, there's my own dog.
00:45:47.480 | And I recognize my dog,
00:45:49.280 | which is a member of the same species as many other dogs,
00:45:53.080 | but I know this one because of some
00:45:55.880 | slightly unique characteristics.
00:45:57.440 | I don't know how to describe what it is that makes me know
00:46:01.420 | that I'm looking at Lex or at my particular dog, right?
00:46:04.680 | Or even that I'm looking at a particular brand of car.
00:46:07.680 | Like I could say a few words about it,
00:46:09.420 | but if I wrote you a paragraph about the car,
00:46:12.480 | you would have trouble figuring out
00:46:14.200 | which car is he talking about, right?
00:46:16.760 | So the idea that we have propositional knowledge
00:46:19.400 | of what it is that allows us to recognize
00:46:23.360 | that this is an actual instance
00:46:25.300 | of this particular natural kind
00:46:27.760 | has always been, you know, something that,
00:46:32.360 | it never worked, right?
00:46:36.560 | You couldn't ever write down a set of propositions
00:46:38.920 | for, you know, visual recognition.
00:46:41.540 | And so in that space,
00:46:44.380 | it sort of always seemed very natural
00:46:46.260 | that something more implicit, you know,
00:46:51.260 | you don't have access to what the details
00:46:54.080 | of the computation were in between,
00:46:56.520 | you just get the result.
00:46:58.360 | So that's the other part of connectionism.
00:47:00.140 | You cannot, you don't read the contents of the connections.
00:47:04.060 | The connections only cause outputs to occur based on inputs.
00:47:09.060 | - Yeah, and for us that like final layer
00:47:13.720 | or some particular layer is very important.
00:47:16.560 | The one that tells us that it's our dog
00:47:19.480 | or like it's a cat or a dog.
00:47:22.200 | But, you know, each layer is probably equally as important
00:47:25.440 | in the grand scheme of things.
00:47:27.280 | Like there's no reason why the cat versus dog
00:47:30.240 | is more important than the lower level activations.
00:47:33.120 | It doesn't really matter.
00:47:34.100 | I mean, all of it is just this beautiful stacking
00:47:36.820 | on top of each other.
00:47:37.660 | And we humans live in this particular layers for us.
00:47:40.740 | For us, it's useful to survive,
00:47:43.400 | to use those cat versus dog, predator versus prey,
00:47:47.860 | all those kinds of things.
00:47:49.180 | It's fascinating that it's all continuous.
00:47:51.260 | But then you then ask, you know,
00:47:53.700 | the history of artificial intelligence,
00:47:55.300 | you ask, are we able to introspect and convert
00:47:58.300 | the very things that allow us to tell the difference
00:48:00.740 | between cat and dog into logic, into formal logic.
00:48:05.380 | That's been the dream.
00:48:06.620 | I would say that's still part of the dream of symbolic AI.
00:48:10.460 | And I've recently talked to Doug Lennard who created Psych.
00:48:15.460 | And that's a project that lasted for many decades
00:48:23.180 | and still carries a sort of dream in it, right?
00:48:28.900 | - And we still don't know the answer, right?
00:48:30.700 | It seems like connectionism is really powerful,
00:48:34.820 | but it also seems like there's this building of knowledge.
00:48:38.740 | And so how do we, how do you square those two?
00:48:41.420 | Like, do you think the connections can contain
00:48:44.180 | the depth of human knowledge and the depth
00:48:46.940 | of what Dave Rumelhart was thinking about, of understanding?
00:48:51.500 | - Well, that remains the $64 question.
00:48:58.020 | - With inflation, that number is higher.
00:48:59.700 | - Okay, $64,000.
00:49:01.820 | Maybe it's the $64 billion question now.
00:49:04.660 | You know, I think that from the emergentist side,
00:49:13.820 | which, you know, I place myself on.
00:49:22.440 | So I used to sometimes tell people
00:49:25.980 | I was a radical eliminative connectionist
00:49:29.620 | because I didn't want them to think
00:49:34.380 | that I wanted to build like anything into the machine.
00:49:38.280 | But I don't like the word eliminative anymore
00:49:44.420 | because it makes it seem like it's wrong to think
00:49:50.580 | that there is this emergent level of understanding.
00:49:55.860 | And I disagree with that.
00:50:00.100 | So I think, you know, I would call myself
00:50:02.260 | a radical emergentist connectionist
00:50:06.900 | rather than eliminative connectionist, right?
00:50:09.460 | Because I want to acknowledge that these higher level
00:50:14.460 | kinds of aspects of our cognition are real,
00:50:19.380 | but they're not, they don't,
00:50:24.420 | they don't exist as such.
00:50:28.980 | And there was an example that Doug Hofstadter used to use
00:50:33.540 | that I thought was helpful in this respect.
00:50:36.660 | Just the idea that we can think about sand dunes
00:50:41.300 | as entities and talk about like how many there are even,
00:50:51.380 | but we also know that a sand dune is a very fluid thing.
00:50:56.380 | It's a pile of sand that is capable of moving around
00:51:01.380 | under the wind and, you know,
00:51:06.380 | reforming itself in somewhat different ways.
00:51:10.100 | And if we think about our thoughts as like sand dunes,
00:51:13.000 | as being things that, you know, emerge from
00:51:18.060 | just the way all the lower level elements
00:51:21.260 | sort of work together and are constrained
00:51:24.380 | by external forces, then we can say,
00:51:28.140 | yes, they exist as such, but they also, you know,
00:51:33.140 | we shouldn't treat them as completely monolithic entities
00:51:37.860 | that we can understand without understanding
00:51:41.540 | sort of all of the stuff that allows them to change
00:51:46.100 | in the ways that they do.
00:51:47.500 | And that's where I think the connectionist
00:51:49.180 | feeds into the cognitive.
00:51:52.180 | It's like, okay, so if the substrate
00:51:55.340 | is parallel distributed connectionist,
00:51:58.620 | then it doesn't mean that the contents of thought
00:52:03.300 | isn't, you know, like abstract and symbolic,
00:52:06.960 | but it's more fluid maybe than is easier to capture
00:52:13.040 | with a set of logical expressions.
00:52:15.380 | - Yeah, that's a heck of a sort of thing to put
00:52:18.420 | at the top of a resume, radical emergentist connectionist.
00:52:23.420 | So there is just like you said,
00:52:26.060 | a beautiful dance between that,
00:52:27.580 | between the machinery of intelligence,
00:52:30.380 | like the neural network side of it
00:52:32.340 | and the stuff that emerges.
00:52:34.340 | I mean, the stuff that emerges seems to be,
00:52:39.220 | I don't know, I don't know what that is.
00:52:44.380 | It seems like maybe all of reality is emergent.
00:52:48.920 | What I think about,
00:52:52.860 | this is made most distinctly rich to me
00:52:57.380 | when I look at cellular automata,
00:52:59.620 | look at game of life,
00:53:01.340 | that from very, very simple things,
00:53:03.620 | very rich, complex things emerge
00:53:06.780 | that start looking very quickly like organisms
00:53:10.260 | that you forget how the actual thing operates.
00:53:13.620 | They start looking like they're moving around,
00:53:15.620 | they're eating each other.
00:53:16.460 | Some of them are generating offspring.
00:53:20.100 | You forget very quickly.
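For reference, the Game of Life really is that simple; a minimal update step under the standard rules looks like the sketch below, and the glider used as the starting pattern is one of the smallest configurations that appears to "move."

```python
import numpy as np

def life_step(grid):
    """One step of Conway's Game of Life: every cell applies the same trivial local rule."""
    neighbors = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0))
    # birth with exactly 3 live neighbors; survival with 2 or 3
    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(int)

# a "glider": five live cells whose pattern appears to crawl across the grid,
# even though no rule anywhere mentions movement, eating, or organisms
grid = np.zeros((8, 8), dtype=int)
for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    grid[r, c] = 1
for _ in range(4):
    grid = life_step(grid)
print(grid)   # after four steps the same shape reappears, shifted one cell diagonally
```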
00:53:21.900 | And it seems like maybe it's something about the human mind
00:53:24.980 | that wants to operate in some layer of the emergent
00:53:28.620 | and forget about the mechanism
00:53:30.580 | of how that emergence happens.
00:53:32.220 | So it just like you are in your radicalness,
00:53:35.240 | also it seems like unfair to eliminate the magic
00:53:41.340 | of that emergent.
00:53:43.040 | Like eliminate the fact that that emergent is real.
00:53:48.040 | - Yeah, no, I agree.
00:53:49.700 | That's why I got rid of eliminative, right?
00:53:53.300 | - Eliminative, yeah.
00:53:54.140 | - Yeah, because it seemed like that was trying to say
00:53:56.580 | that it's all completely like--
00:54:01.580 | - An illusion of some kind that's not--
00:54:03.380 | - Well, who knows whether there aren't
00:54:06.580 | some illusory characteristics there.
00:54:09.680 | And I think that philosophically,
00:54:13.480 | many people have confronted that possibility over time,
00:54:17.760 | but it's still important to accept it as magic, right?
00:54:22.760 | So I think of Fellini in this context,
00:54:30.260 | I think of others who have appreciated the role of magic,
00:54:38.040 | of actual trickery in creating illusions that move us.
00:54:43.040 | And Plato was onto this too.
00:54:47.360 | It's like somehow or other these shadows
00:54:49.860 | give rise to something much deeper than that.
00:54:55.980 | So we won't try to figure out what it is,
00:55:01.020 | we'll just accept it as given that that occurs.
00:55:06.740 | But he was still onto the magic of it.
00:55:08.640 | - Yeah, yeah.
00:55:09.900 | We won't try to really, really, really deeply understand
00:55:13.040 | how it works, we'll just enjoy the fact
00:55:15.320 | that it's kind of fun.
00:55:16.660 | Okay, but you worked closely with Dave Rommelhart.
00:55:21.660 | He passed away as a human being.
00:55:24.920 | What do you remember about him?
00:55:27.000 | Do you miss the guy?
00:55:28.000 | - Absolutely.
00:55:33.140 | He passed away 15-ish years ago now,
00:55:38.140 | and his demise was actually one of the most poignant
00:55:43.720 | and relevant tragedies relevant to our conversation.
00:55:59.720 | He started to undergo a progressive neurological condition
00:56:04.720 | that isn't fully understood.
00:56:14.980 | That is to say his particular course isn't fully understood
00:56:19.760 | because brain scans weren't done at certain stages
00:56:28.140 | and no autopsy was done or anything like that.
00:56:31.780 | The wishes of the family.
00:56:33.140 | So we don't know as much about the underlying pathology
00:56:37.640 | as we might, but I had begun to get interested
00:56:42.640 | in this neurological condition that might have been
00:56:49.700 | the very one that he was succumbing to
00:56:52.500 | as my own efforts to understand another aspect
00:56:56.640 | of this mystery that we've been discussing
00:56:58.740 | while he was beginning to get progressively
00:57:02.720 | more and more affected.
00:57:04.360 | So I'm gonna talk about the disorder
00:57:07.020 | and not about Rommelhart for a second, okay?
00:57:09.380 | The disorder is something my colleagues and collaborators
00:57:12.580 | have chosen to call semantic dementia.
00:57:17.100 | So it's a specific form of loss of mind related to meaning,
00:57:25.340 | semantic dementia.
00:57:27.600 | And it's progressive in the sense that the patient
00:57:32.120 | loses the ability to appreciate the meaning
00:57:39.540 | of the experiences that they have,
00:57:44.680 | either from touch, from sight, from sound, from language.
00:57:51.600 | I hear sounds, but I don't know what they mean
00:57:54.200 | kind of thing.
00:57:55.180 | So as this illness progresses, it starts with
00:58:02.700 | the patient being unable to differentiate
00:58:07.040 | like similar breeds of dog or remember
00:58:14.680 | the lower frequency unfamiliar categories
00:58:19.120 | that they used to be able to remember.
00:58:21.960 | But as it progresses, it becomes more and more striking
00:58:26.960 | and the patient loses the ability to recognize
00:58:32.200 | things like pigs and goats and sheep
00:58:39.200 | and calls all middle-sized animals dogs
00:58:42.600 | and can't recognize rabbits and rodents anymore.
00:58:47.040 | They call all the little ones cats
00:58:49.740 | and they can't recognize hippopotamuses
00:58:52.520 | and cows where they call them all horses.
00:58:55.840 | So there was this one patient who went through
00:58:59.760 | this progression where at a certain point,
00:59:03.040 | any four-legged animal, he would call it either a horse
00:59:05.820 | or a dog or a cat.
00:59:08.120 | And if it was big, he would tend to call it a horse.
00:59:10.680 | If it was small, he'd tend to call it a cat.
00:59:12.840 | Middle-sized ones he called dogs.
00:59:14.520 | This is just a part of the syndrome though.
00:59:19.000 | The patient loses the ability to relate
00:59:22.620 | concepts to each other.
00:59:25.300 | So my collaborator in this work, Karalyn Patterson,
00:59:28.800 | developed a test called the pyramids and palm trees test.
00:59:33.320 | So you give the patient a picture of pyramids
00:59:39.520 | and they have a choice, which goes with the pyramids?
00:59:42.940 | Palm trees or pine trees?
00:59:46.580 | And she showed that this wasn't just a matter of language
00:59:50.980 | because the patient's loss of this ability shows up
00:59:55.740 | whether you present the material with words
00:59:58.300 | or with pictures.
01:00:00.220 | The pictures, they can't put the pictures together
01:00:03.780 | with each other properly anymore.
01:00:05.280 | They can't relate the pictures to the words either.
01:00:07.820 | They can't do word picture matching.
01:00:09.700 | But they've lost the conceptual grounding
01:00:12.500 | from either modality of input.
01:00:15.780 | And so that's why it's called semantic dementia.
01:00:19.780 | The very semantics is disintegrating.
01:00:23.140 | And we understand this in terms of our idea
01:00:28.140 | that distributed representation,
01:00:30.460 | a pattern of activation represents the concepts.
01:00:32.980 | Really similar ones, as you degrade them,
01:00:35.580 | they start being, you lose the differences.
01:00:39.220 | And then, so the difference between the dog and the goat
01:00:42.380 | sort of is no longer part of the pattern anymore.
01:00:44.580 | And since dog is really familiar,
01:00:47.460 | that's the thing that remains.
01:00:49.340 | And we understand that in the way the models work and learn.
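For readers who want the flavor of that account in concrete form, here is a toy sketch, with made-up binary feature patterns rather than the actual trained PDP model: similar concepts share most of their distributed pattern and differ in only a few features, so random degradation erases the fine dog-versus-goat distinction before it erases the coarse animal-versus-non-animal structure. All names, sizes, and damage rates below are illustrative assumptions.

```python
import numpy as np

# Toy distributed representations: a shared "animal" core plus a few distinctive features.
# (Illustrative sketch only; not the model used in the semantic dementia work.)
rng = np.random.default_rng(0)
D = 500
animal_core = rng.choice([0.0, 1.0], D)   # features shared by all the animals
chair = rng.choice([0.0, 1.0], D)         # an unrelated concept for the coarse contrast

def variant(base, n_distinctive=6):
    p = base.copy()
    idx = rng.choice(D, n_distinctive, replace=False)
    p[idx] = 1 - p[idx]                   # flip a handful of distinguishing features
    return p

dog, goat = variant(animal_core), variant(animal_core)

def degrade(pattern, rate):
    flips = rng.random(D) < rate          # crude stand-in for damage to the representation
    out = pattern.copy()
    out[flips] = 1 - out[flips]
    return out

def nearest(probe, prototypes):
    # name of the stored pattern with the smallest Hamming distance to the probe
    return min(prototypes, key=lambda name: np.sum(probe != prototypes[name]))

for rate in (0.1, 0.3, 0.45):
    fine = coarse = 0
    for _ in range(1000):
        probe = degrade(dog, rate)
        fine += nearest(probe, {"dog": dog, "goat": goat}) == "dog"
        coarse += nearest(probe, {"animal": animal_core, "chair": chair}) == "animal"
    # The fine distinction erodes toward chance sooner than the coarse one.
    print(f"damage={rate:.2f}  dog-vs-goat={fine/1000:.2f}  animal-vs-chair={coarse/1000:.2f}")
```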
01:00:52.500 | But Rumelhart underwent this condition.
01:00:57.300 | So on the one hand, it's a fascinating aspect
01:01:00.260 | of parallel distributed processing to me.
01:01:02.420 | And it reveals this sort of texture
01:01:07.180 | of distributed representation in a very nice way,
01:01:10.620 | I've always felt.
01:01:11.500 | But at the same time, it was extremely poignant
01:01:13.660 | because this is exactly the condition
01:01:16.780 | that Rumelhart was undergoing.
01:01:18.460 | And there was a period of time when he was,
01:01:20.900 | this man who had been the most focused,
01:01:25.480 | goal-directed, competitive, (laughs)
01:01:33.420 | thoughtful person who was willing to work for years
01:01:42.340 | to solve a hard problem, he starts to disappear.
01:01:47.340 | And there was a period of time when it was like,
01:01:53.780 | hard for any of us to really appreciate
01:01:59.180 | that he was sort of, in some sense,
01:02:01.420 | not fully there anymore.
01:02:04.500 | - Do you know if he was able to introspect
01:02:07.660 | this dissolution of the understanding mind?
01:02:12.660 | Was he, I mean, this is one of the big scientists
01:02:18.140 | that thinks about this.
01:02:20.100 | Was he able to look at himself
01:02:21.740 | and understand the fading mind?
01:02:23.480 | - You know, we can contrast Hawking
01:02:29.740 | and Rumelhart in this way.
01:02:31.460 | And I like to do that to honor Rumelhart
01:02:34.180 | because I think Rumelhart is sort of like
01:02:36.180 | the Hawking of cognitive science to me in some ways.
01:02:39.820 | Both of them suffered from a degenerative condition.
01:02:46.200 | In Hawking's case, it affected the motor system.
01:02:49.300 | In Rumelhart's case, it's affecting the semantics.
01:02:54.180 | And not just the pure object semantics,
01:03:00.300 | but maybe the self semantics as well.
01:03:05.340 | And we don't understand that.
01:03:07.140 | - Concepts broadly.
01:03:08.980 | - So I would say he didn't.
01:03:13.920 | And this was part of what, from the outside,
01:03:16.740 | was a profound tragedy.
01:03:18.660 | But on the other hand, at some level, he sort of did.
01:03:22.940 | Because there was a period of time
01:03:26.660 | when it finally was realized
01:03:28.620 | that he had really become profoundly impaired.
01:03:33.060 | This was clearly a biological condition
01:03:35.220 | and it wasn't just like he was distracted that day
01:03:38.300 | or something like that.
01:03:40.180 | So he retired from his professorship at Stanford
01:03:45.020 | and he lived with his brother for a couple years
01:03:50.020 | and then he moved into a facility
01:03:54.860 | for people with cognitive impairments.
01:04:01.580 | One that many elderly people end up in
01:04:05.180 | when they have cognitive impairments.
01:04:06.780 | And I would spend time with him during that period.
01:04:11.780 | This was in the late '90s, around 2000 even.
01:04:17.140 | And we would go bowling and he could still bowl.
01:04:22.140 | And after bowling, I took him to lunch
01:04:30.900 | and I said, "Where would you like to go?
01:04:34.340 | "You wanna go to Wendy's?"
01:04:35.460 | And he said, "Nah."
01:04:37.260 | And I said, "Okay, well, where do you wanna go?"
01:04:38.740 | And he just pointed.
01:04:40.260 | He said, "Turn here."
01:04:41.580 | So he still had a certain amount of spatial cognition
01:04:44.940 | and he could get me to the restaurant.
01:04:46.840 | And then when we got to the restaurant,
01:04:50.500 | I said, "What do you wanna order?"
01:04:53.020 | And he couldn't come up with any of the words
01:04:56.860 | but he knew where on the menu the thing was that he wanted.
01:05:00.660 | - That's so fascinating.
01:05:01.860 | - He couldn't say what it was
01:05:05.340 | but he knew that that's what he wanted to eat.
01:05:07.640 | It isn't monolithic at all.
01:05:15.260 | Our cognition is, first of all,
01:05:20.060 | graded in certain kinds of ways but also multi-partite.
01:05:24.140 | There's many elements to it and things,
01:05:27.980 | certain sort of partial competencies still exist
01:05:32.140 | in the absence of other aspects of these competencies.
01:05:36.040 | So this is what always fascinated me
01:05:39.340 | about what used to be called cognitive neuropsychology,
01:05:44.340 | the effects of brain damage on cognition.
01:05:50.060 | But in particular, this gradual disintegration part.
01:05:54.300 | I'm a big believer that the loss of a human being
01:05:58.700 | that you value is as powerful as first falling in love
01:06:02.340 | with that human being.
01:06:03.340 | I think it's all a celebration of the human being.
01:06:06.940 | So the disintegration itself too is a celebration in a way.
01:06:10.820 | - Yeah, yeah.
01:06:12.180 | But just to say something more about the scientist
01:06:17.740 | and the backpropagation idea that you mentioned.
01:06:22.580 | So in 1982,
01:06:25.820 | Hinton had been there as a postdoc
01:06:33.420 | and organized that conference.
01:06:34.860 | He'd actually gone away
01:06:35.980 | and gotten an assistant professorship
01:06:37.740 | and then there was this opportunity to bring him back.
01:06:41.660 | So Jeff Hinton was back on a sabbatical.
01:06:45.860 | - San Diego. - In San Diego.
01:06:47.560 | And Rumelhart and I had decided we wanted to do this;
01:06:53.180 | we thought it was really exciting.
01:06:54.900 | And the papers on the interactive activation model
01:06:59.420 | that I was telling you about had just been published.
01:07:01.580 | And we both sort of saw a huge potential for this work
01:07:05.220 | and Jeff was there.
01:07:07.460 | And so the three of us started a research group
01:07:12.460 | which we called the PDP Research Group.
01:07:15.020 | And several other people came.
01:07:19.220 | Francis Crick, who was at the Salk Institute
01:07:22.020 | heard about it from Jeff.
01:07:23.660 | 'Cause Jeff was known among Brits to be brilliant
01:07:28.340 | and Francis was well connected with his British friends.
01:07:31.140 | So Francis Crick came.
01:07:33.580 | - That's a heck of a group of people, wow.
01:07:35.620 | - And several, Paul Smolensky was one of the other postdocs.
01:07:40.620 | He was still there as a postdoc.
01:07:43.020 | And a few other people.
01:07:45.900 | But anyway,
01:07:51.380 | Hinton talked to us about learning
01:07:56.860 | and how we should think about
01:07:59.380 | how learning occurs in a neural network.
01:08:07.220 | And he said, "The problem
01:08:09.300 | "with the way you guys have been approaching this
01:08:13.220 | "is that you've been looking for inspiration from biology
01:08:17.780 | "to tell you what the rules should be
01:08:21.160 | "for how the synapses should change
01:08:23.020 | "the strengths of their connections,
01:08:24.520 | "how the connections should form."
01:08:26.400 | He said, "That's the wrong way to go about it.
01:08:31.120 | "What you should do is you should think in terms of
01:08:35.340 | "how you can adjust connection weights to solve a problem.
01:08:45.140 | "So you define your problem, and then you figure out
01:08:50.140 | "how the adjustment of the connection weights
01:08:52.160 | "will solve the problem."
01:08:54.260 | And Rumelhart heard that and said to himself,
01:09:00.380 | "Okay, so I'm gonna start thinking about it that way.
01:09:04.500 | "I'm going to essentially imagine
01:09:09.500 | "that I have some objective function,
01:09:12.060 | "some goal of the computation.
01:09:14.360 | "I want my machine to correctly classify
01:09:17.320 | "all of these images, and I can score that.
01:09:21.640 | "I can measure how well they're doing on each image,
01:09:24.460 | "and I get some measure of error or loss,
01:09:27.580 | "it's typically called in deep learning.
01:09:30.780 | "And I'm going to figure out
01:09:34.320 | "how to adjust the connection weights
01:09:35.960 | "so as to minimize my loss or reduce the error."
01:09:42.160 | And that's called gradient descent.
01:09:47.160 | And engineers were already familiar
01:09:51.040 | with the concept of gradient descent.
01:09:53.860 | And in fact, there was an algorithm called the Delta Rule
01:09:58.860 | that had been invented by a professor
01:10:04.380 | in the electrical engineering department at Stanford,
01:10:08.380 | Widrow, Bernie Widrow, and a collaborator named Hoff.
01:10:11.740 | I never met him.
01:10:13.260 | Anyway, so gradient descent in continuous neural networks
01:10:18.260 | with multiple neuron-like processing units
01:10:22.460 | was already understood
01:10:23.940 | for a single layer of connection weights.
01:10:30.020 | We have some inputs over a set of neurons.
01:10:32.940 | We want the output to produce a certain pattern.
01:10:36.200 | We can define the difference between our target
01:10:39.060 | and what the neural network is producing,
01:10:41.620 | and we can figure out how to change the connection weights
01:10:43.740 | to reduce that error.
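As a minimal sketch of that single-layer case, in the spirit of the delta rule (the data, learning rate, and variable names here are made up for illustration): compute the output pattern, take the difference from the target, and move the weights a small step down the error gradient.

```python
import numpy as np

# Single layer of connection weights trained by gradient descent on squared error
# (a sketch of the delta-rule idea; the linear target task below is invented).
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))      # 100 input patterns over 3 input units
true_W = np.array([[2.0], [-1.0], [0.5]])
T = X @ true_W                         # target output for each input pattern

W = np.zeros((3, 1))                   # weights onto a single linear output unit
lr = 0.05

for step in range(200):
    Y = X @ W                          # what the network currently produces
    error = T - Y                      # target minus actual output
    W += lr * X.T @ error / len(X)     # step down the gradient of the squared error

print(np.round(W.ravel(), 2))          # approaches [ 2.  -1.   0.5]
```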
01:10:45.260 | So what Rumelhart did was to generalize that
01:10:48.820 | so as to be able to change the connections
01:10:52.100 | from earlier layers of units to the ones at a hidden layer
01:10:56.660 | between the input and the output.
01:10:59.100 | And so he first called the algorithm
01:11:02.420 | the Generalized Delta Rule,
01:11:04.060 | because it's just an extension of the gradient descent idea.
01:11:08.540 | And interestingly enough, Hinton was thinking
01:11:12.500 | that this wasn't going to work very well.
01:11:15.500 | So Hinton had his own alternative algorithm at the time
01:11:20.500 | based on the concept of the Boltzmann machine
01:11:24.060 | that he was pursuing.
01:11:25.100 | So the paper on
01:11:27.760 | learning in Boltzmann machines came out in 1985,
01:11:30.520 | but it turned out that backprop worked better
01:11:35.820 | than the Boltzmann machine learning algorithm.
01:11:38.420 | - So this generalized delta algorithm
01:11:41.140 | ended up being called backpropagation,
01:11:43.700 | as you say, backprop.
01:11:44.900 | - Yeah.
01:11:46.180 | And probably that name is opaque to many people.
01:11:51.180 | What does that mean?
01:11:52.660 | What it meant was that in order to figure out
01:11:57.940 | what the changes you needed to make
01:12:00.140 | to the connections from the input to the hidden layer,
01:12:04.260 | you had to backpropagate the error signals
01:12:08.580 | from the output layer through the connections
01:12:12.820 | from the hidden layer to the output
01:12:14.900 | to get the signals that would be the error signals
01:12:19.480 | for the hidden layer.
01:12:21.140 | And that's how Rumelhart formulated it.
01:12:23.100 | It was like, well, we know what the error signals
01:12:25.180 | are at the output layer.
01:12:26.340 | Let's see if we can get a signal at the hidden layer
01:12:29.020 | that tells each hidden unit
01:12:30.460 | what its error signal is, essentially.
01:12:32.860 | So it's backpropagating through the connections
01:12:36.460 | from the hidden to the output to get the signals
01:12:40.540 | to tell the hidden units how to change their weights
01:12:43.100 | from the input, and that's why it's called backprop.
01:12:45.700 | Yeah, so it came from Hinton having introduced the concept
01:12:52.860 | of define your objective function,
01:12:57.060 | figure out how to take the derivative
01:12:59.860 | so that you can adjust the connections
01:13:02.580 | so that they make progress towards your goal.
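And here is a minimal sketch of the generalization to a hidden layer, written from the standard textbook form of backpropagation rather than from the original formulation; the XOR task, sigmoid units, layer sizes, and learning rate are all illustrative choices. The key step is the line that sends the output-layer error back through the hidden-to-output weights to obtain an error signal for each hidden unit.

```python
import numpy as np

# Backpropagation for one hidden layer (illustrative sketch on a toy XOR task).
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # targets (XOR)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # input -> hidden weights
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # hidden -> output weights
lr = 0.5

for epoch in range(20000):
    # Forward pass.
    H = sigmoid(X @ W1 + b1)                      # hidden activations
    Y = sigmoid(H @ W2 + b2)                      # outputs

    # Error signal (delta) at the output layer, for squared error.
    delta_out = (Y - T) * Y * (1 - Y)

    # Back-propagate: each hidden unit's error comes through the hidden->output weights.
    delta_hid = (delta_out @ W2.T) * H * (1 - H)

    # Gradient-descent weight changes.
    W2 -= lr * H.T @ delta_out;  b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid;  b1 -= lr * delta_hid.sum(axis=0)

print(np.round(Y, 2))   # typically approaches [[0], [1], [1], [0]]
```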
01:13:04.940 | - So stop thinking about biology for a second,
01:13:07.540 | and let's start to think about optimization
01:13:09.740 | and computation a little bit more.
01:13:12.980 | So what about Jeff Hinton?
01:13:15.820 | You've gotten a chance to work with him in that little,
01:13:20.460 | the set of people involved there, it's quite incredible,
01:13:24.940 | the small set of people under the PDP flag.
01:13:28.900 | It's just, given the amount of impact
01:13:31.460 | those ideas have had over the years,
01:13:33.180 | it's kind of incredible to think about.
01:13:34.660 | But just like you said, like yourself,
01:13:39.180 | Jeffrey Hinton is seen as one of the,
01:13:41.620 | not just like a seminal figure in AI,
01:13:43.740 | but just a brilliant person,
01:13:45.460 | just like the horsepower of the mind
01:13:48.140 | is pretty high up there for him
01:13:50.420 | 'cause he's just a great thinker.
01:13:52.500 | So what kind of ideas have you learned from him?
01:13:57.500 | Have you influenced each other on?
01:13:59.980 | Have you debated over,
01:14:01.220 | what stands out to you in the full space of ideas here
01:14:06.220 | at the intersection of computation and cognition?
01:14:08.860 | - Well, so Jeff has said many things to me
01:14:14.740 | that had a profound impact on my thinking.
01:14:16.860 | And he's written several articles
01:14:20.820 | which were way ahead of their time.
01:14:27.660 | He had two papers in 1981, just to give one example,
01:14:32.660 | one of which was essentially the idea of transformers,
01:14:43.460 | and another of which was a early paper
01:14:48.500 | on semantic cognition, which inspired him and Rumelhart
01:14:57.540 | and me throughout the '80s,
01:15:01.980 | and still I think sort of grounds my own thinking
01:15:06.980 | about the semantic aspects of cognition.
01:15:16.040 | He also, in a small paper that was never published
01:15:22.860 | that he wrote in 1977,
01:15:26.020 | before he actually arrived at UCSD,
01:15:27.900 | or maybe a couple of years even before that, I don't know,
01:15:30.740 | when he was a PhD student,
01:15:32.380 | he described how a neural network
01:15:36.900 | could do recursive computation.
01:15:40.080 | And it was a very clever idea
01:15:46.020 | that he's continued to explore over time,
01:15:48.920 | which was sort of the idea that
01:15:55.300 | when you call a subroutine,
01:15:56.660 | you need to save the state that you had when you called it
01:16:01.460 | so you can get back to where you were
01:16:03.180 | when you're finished with the subroutine.
01:16:04.940 | And the idea was that you would save the state
01:16:09.940 | of the calling routine
01:16:11.260 | by making fast changes to connection weights,
01:16:14.820 | and then when you finished with the subroutine call,
01:16:19.820 | those fast changes in the connection weights
01:16:22.320 | would allow you to go back to where you had been before
01:16:25.260 | and reinstate the previous context
01:16:27.660 | so that you could continue on
01:16:29.020 | with the top level of the computation.
01:16:33.040 | Anyway, that was part of the idea.
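A toy sketch of that flavor of idea, under my own simplifying assumptions rather than the unpublished paper's actual formulation: bind the caller's state to a retrieval cue with rapidly formed "fast" weights, let the subroutine overwrite the ordinary activation state, then use the cue to reinstate the saved context.

```python
import numpy as np

# Illustrative fast-weight sketch: a Hebbian outer product stores the calling context.
rng = np.random.default_rng(0)
d = 64

caller_state = rng.standard_normal(d)
caller_state /= np.linalg.norm(caller_state)   # the context we need to come back to

cue = rng.standard_normal(d)
cue /= np.linalg.norm(cue)                     # a cue associated with this particular call

# Fast weights: quickly formed, temporary connections (real proposals add decay
# and superimpose many such bindings; this sketch stores just one).
W_fast = np.outer(caller_state, cue)

# ... the subroutine now runs and overwrites the network's activation state ...

# On return, presenting the cue through the fast weights reinstates the saved context.
restored = W_fast @ cue
print(np.allclose(restored, caller_state))     # True: the calling context is recovered
```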
01:16:35.660 | And I always thought, okay, that's really, you know,
01:16:38.900 | he had extremely creative ideas
01:16:42.620 | that were quite a lot ahead of his time,
01:16:45.300 | and many of them in the 1970s and early 1980s.
01:16:50.720 | So another thing about Geoff Hinton's way of thinking,
01:16:55.720 | which has profoundly influenced my effort
01:17:02.360 | to understand human mathematical cognition
01:17:07.960 | is that he doesn't write too many equations.
01:17:12.960 | And people tell stories like,
01:17:15.680 | oh, in the Hinton lab meetings,
01:17:18.120 | you don't get up at the board and write equations
01:17:20.280 | like you do in everybody else's machine learning lab.
01:17:23.520 | What you do is you draw a picture. (laughs)
01:17:27.560 | And, you know, he explains aspects
01:17:32.440 | of the way deep learning works
01:17:34.480 | by putting his hands together
01:17:36.800 | and showing you the shape of a ravine
01:17:39.660 | and using that as a geometrical metaphor
01:17:44.680 | for what's happening as this gradient descent process.
01:17:48.440 | You're coming down the wall of a ravine.
01:17:50.520 | If you take too big a jump,
01:17:51.640 | you're gonna jump to the other side.
01:17:54.000 | And so that's why we have to turn down
01:17:56.720 | the learning rate, for example.
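That picture can be made concrete with a tiny numerical example of my own (the loss and step sizes are invented): a quadratic that is steep in one direction, the ravine wall, and shallow in the other, the ravine floor. A step size that is comfortable along the floor overshoots back and forth across the walls.

```python
# Gradient descent on loss = 0.5 * (100 * x**2 + y**2): steep in x, shallow in y.
def descend(lr, steps=20, x=1.0, y=1.0):
    for _ in range(steps):
        x -= lr * 100 * x   # gradient with respect to x is 100 * x (across the ravine)
        y -= lr * y         # gradient with respect to y is y (along the ravine floor)
    return x, y

print(descend(lr=0.021))    # x bounces from wall to wall and grows: the step is too big
print(descend(lr=0.009))    # turning the learning rate down lets x settle to the bottom
```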
01:17:59.760 | And it speaks to me of the fundamentally intuitive character
01:18:04.760 | of deep insight,
01:18:13.480 | together with a commitment to really understanding
01:18:18.480 | in a way that's absolutely, ultimately explicit and clear,
01:18:27.480 | but also intuitive.
01:18:31.800 | - Yeah, there's certain people like that.
01:18:33.720 | He's an example, some kind of weird mix of visual
01:18:38.520 | and intuitive and all those kinds of things.
01:18:40.640 | Feynman is another example, different style of thinking,
01:18:43.360 | but very unique.
01:18:44.640 | And when you're around those people,
01:18:46.740 | for me in the engineering realm,
01:18:48.680 | there's a guy named Jim Keller,
01:18:50.520 | who's a chip designer, engineer.
01:18:52.860 | Every time I talk to him,
01:18:55.600 | it doesn't matter what we're talking about.
01:18:58.260 | Just having experienced that unique way of thinking
01:19:02.000 | transforms you and makes your work much better.
01:19:05.360 | And that's the magic.
01:19:06.960 | You look at Daniel Kahneman,
01:19:08.780 | you look at the great collaborations
01:19:10.720 | throughout the history of science.
01:19:12.800 | That's the magic of that.
01:19:13.640 | It's not always the exact ideas that you talk about,
01:19:16.680 | but it's the process of generating those ideas,
01:19:19.700 | being around that, spending time with that human being,
01:19:22.820 | you can come up with some brilliant work,
01:19:25.100 | especially when it's cross-disciplinary
01:19:26.760 | as it was a little bit in your case with Jeff.
01:19:30.640 | - Yeah, Jeff is a descendant of the logician Boole.
01:19:38.640 | He comes from a long line of English academics.
01:19:43.640 | And together with the deeply intuitive thinking ability
01:19:49.080 | that he has, he also has, it's been clear.
01:19:56.040 | He's described this to me,
01:20:00.000 | and I think he's mentioned it from time to time
01:20:02.360 | in other interviews that he's had with people.
01:20:07.800 | He's wanted to be able to sort of think of himself
01:20:11.400 | as contributing to the understanding of reasoning itself,
01:20:16.400 | not just human reasoning.
01:20:22.880 | Like Boole is about logic, right?
01:20:25.720 | It's about what can we conclude from what else
01:20:29.200 | and how do we formalize that.
01:20:31.880 | And as a computer scientist, logician,
01:20:36.800 | philosopher, the goal is to understand
01:20:41.800 | how we derive truths from givens and things like this.
01:20:48.520 | And the work that Jeff was doing in the early to mid '80s
01:20:59.880 | on something called the Boltzmann machine
01:20:59.880 | was his way of connecting with that Boolean tradition
01:21:04.320 | and bringing it into the more continuous,
01:21:08.200 | probabilistic-graded constraint satisfaction realm.
01:21:11.540 | And it was beautiful, a set of ideas
01:21:17.760 | linked with theoretical physics,
01:21:21.360 | as well as with logic.
01:21:27.120 | And it's always been, I mean,
01:21:31.240 | I've always been inspired by the Boltzmann machine too.
01:21:33.480 | It's like, well, if the neurons are probabilistic
01:21:36.720 | rather than deterministic in their computations,
01:21:40.440 | then maybe this somehow is part of the serendipity
01:21:45.440 | or adventitiousness of the moment of insight, right?
01:21:54.240 | It might not have occurred at that particular instant.
01:21:57.480 | It might be sort of partially the result
01:21:59.400 | of a stochastic process.
01:22:01.520 | And that too is part of the magic of the emergence
01:22:05.400 | of some of these things.
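A minimal sketch of a stochastic, Boltzmann-style unit (the net input and temperatures below are toy numbers of my own choosing): instead of deterministically thresholding its net input, the unit turns on with a probability given by the logistic of the net input, and a temperature setting controls how much randomness, and hence exploration, there is.

```python
import math
import random

def stochastic_unit(net_input, temperature=1.0):
    # Probability of the unit turning on, as in a Boltzmann-machine-style update.
    p_on = 1.0 / (1.0 + math.exp(-net_input / temperature))
    return 1 if random.random() < p_on else 0

random.seed(0)
for T in (0.25, 1.0, 4.0):
    samples = [stochastic_unit(net_input=0.5, temperature=T) for _ in range(10000)]
    # Higher temperature pushes the unit toward 50/50, i.e. more random exploration.
    print(f"temperature={T}: fraction on = {sum(samples) / len(samples):.2f}")
```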
01:22:08.120 | - Well, you're right with the Boolean lineage
01:22:09.880 | and the dream of computer science is somehow,
01:22:14.880 | I mean, I certainly think of humans this way,
01:22:17.440 | that humans are one particular manifestation of intelligence,
01:22:21.480 | that there's something bigger going on,
01:22:23.400 | and you're hoping to figure that out.
01:22:26.200 | The mechanisms of intelligence, the mechanisms of cognition
01:22:29.440 | are much bigger than just humans.
01:22:31.120 | - Yeah, so I think of, I started using the phrase
01:22:36.120 | computational intelligence at some point
01:22:38.760 | as to characterize the field that I thought,
01:22:43.040 | people like Geoff Hinton and many of the people
01:22:48.040 | I know at DeepMind are working in,
01:22:53.080 | and where I feel like I'm,
01:23:00.600 | I'm a kind of a human-oriented
01:23:04.680 | computational intelligence researcher
01:23:06.960 | in that I'm actually kind of interested
01:23:09.200 | in the human solution.
01:23:11.000 | But at the same time, I feel like that's where
01:23:16.000 | a huge amount of the excitement of deep learning
01:23:22.640 | actually lies is in the idea that,
01:23:27.720 | we may be able to even go beyond what we can achieve
01:23:32.720 | with our own nervous systems when we build
01:23:36.360 | computational intelligences that are
01:23:40.200 | not limited in the ways that we are by our own biology.
01:23:47.000 | - Perhaps allowing us to scale the very mechanisms
01:23:50.780 | of human intelligence, just increase its power through scale.
01:23:55.560 | - Yes, and I think that that, obviously that's the,
01:24:00.360 | that's being played out massively at Google Brain,
01:24:07.240 | at OpenAI, and to some extent at DeepMind as well.
01:24:11.000 | I guess I shouldn't say to some extent.
01:24:14.040 | This massive scale of the computations that
01:24:19.960 | are used to succeed at games like Go
01:24:24.360 | or to solve the protein folding problems
01:24:26.400 | that they've been solving and so on.
01:24:28.240 | - Still not as many synapses and neurons as the human brain.
01:24:32.560 | So we still got, we're still beating them on that.
01:24:36.620 | We humans are beating the AIs,
01:24:38.960 | but they're catching up pretty quickly.
01:24:41.220 | You write about modeling of mathematical cognition.
01:24:46.360 | So let me first ask about mathematics in general.
01:24:49.140 | There's a paper titled
01:24:52.720 | Parallel Distributed Processing Approach
01:24:54.280 | to Mathematical Cognition, where in the introduction,
01:24:57.400 | there's some beautiful discussion of mathematics.
01:25:00.740 | And you reference there Tristan Needham,
01:25:04.600 | who criticizes a narrow formal view of mathematics
01:25:07.520 | by likening the study of mathematics
01:25:11.680 | as symbol manipulation to studying music
01:25:14.840 | without ever hearing a note.
01:25:17.160 | So from that perspective, what do you think is mathematics?
01:25:20.880 | What is this world of mathematics like?
01:25:23.720 | - Well, I think of mathematics as a set of tools
01:25:28.480 | for exploring idealized worlds
01:25:33.480 | that often turn out to be extremely relevant
01:25:42.640 | to the real world, but need not.
01:25:47.720 | But they're worlds in which
01:25:49.760 | objects exist with idealized properties,
01:25:57.980 | and in which the relationships among them
01:26:06.720 | can be characterized with precision
01:26:09.320 | so as to allow the implications of certain facts
01:26:16.320 | to then allow you to derive other facts with certainty.
01:26:21.320 | So, you know, if you have two triangles
01:26:28.160 | and you know that there is an angle in the first one
01:26:36.600 | that has the same measure as an angle in the second one,
01:26:43.040 | and you know that the lengths of the sides
01:26:45.560 | adjacent to that angle in each of the two triangles,
01:26:50.080 | the corresponding sides adjacent to that angle
01:26:55.600 | also have the same measure,
01:26:57.600 | then you can then conclude that the triangles are congruent.
01:27:02.240 | That is to say, they have all of their properties in common.
01:27:06.960 | And that is something about triangles.
01:27:12.280 | It's not a matter of formulas.
01:27:16.240 | These are idealized objects.
01:27:18.540 | In fact, you know, we built bridges out of triangles,
01:27:24.280 | and we understand how to measure the height
01:27:28.780 | of something we can't climb
01:27:30.940 | by extending these ideas about triangles a little further.
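For reference, the two geometric facts being described can be stated compactly (standard results; the notation here is mine):

```latex
% Side-angle-side congruence: an equal angle flanked by pairwise equal sides.
\[
\angle A = \angle A', \quad AB = A'B', \quad AC = A'C'
\;\Longrightarrow\; \triangle ABC \cong \triangle A'B'C'
\]
% Measuring a height you cannot climb, from a baseline distance d and an angle of elevation \theta.
\[
h = d \tan\theta
\]
```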
01:27:35.940 | And, you know,
01:27:42.960 | all of the ability to get a tiny speck of matter
01:27:47.960 | launched from the planet Earth
01:27:53.520 | to intersect with some tiny, tiny little body
01:27:57.120 | way out in, way beyond Pluto somewhere
01:28:01.280 | at exactly a predicted time and date
01:28:03.520 | is something that depends on these ideas, right?
01:28:07.840 | So, but, and it's actually happening
01:28:11.640 | in the real physical world
01:28:15.800 | that these ideas make contact with it
01:28:20.480 | in those kinds of instances.
01:28:22.180 | And so, but you know,
01:28:28.520 | there are these idealized objects,
01:28:31.060 | these triangles or these distances or these points,
01:28:34.020 | whatever they are, that allow for this set of tools
01:28:39.020 | to be created that then gives human beings the,
01:28:45.860 | it's this incredible leverage
01:28:48.500 | that they didn't have without these concepts.
01:28:50.780 | And I think this is actually already true
01:28:56.140 | when we think about just, you know,
01:29:02.420 | the natural numbers.
01:29:04.880 | I always like to include zero,
01:29:08.420 | so I'm gonna say the non-negative integers,
01:29:11.700 | but that's a place where some people
01:29:15.940 | prefer not to include zero, but--
01:29:18.060 | - Oh, we like zero here.
01:29:19.420 | Natural numbers, zero, one, two, three, four,
01:29:21.460 | five, six, seven, and so on.
01:29:23.100 | - Yeah, and you know, because they give you
01:29:26.500 | the ability to be exact about the things
01:29:31.500 | that you have, like how many sheep you have.
01:29:36.500 | Like, you know, I sent you out this morning,
01:29:40.060 | there were 23 sheep, you came back with only 22.
01:29:43.420 | What happened, right?
01:29:45.180 | - The fundamental problem of physics,
01:29:46.720 | how many sheep you have, yeah.
01:29:49.220 | - It's a fundamental problem of human society
01:29:53.100 | that you damn well better bring back
01:29:54.660 | the same number of sheep as you started with.
01:29:56.900 | And you know, it allows commerce,
01:30:00.300 | it allows contracts, it allows the establishment
01:30:04.260 | of records and so on to have systems
01:30:08.540 | that allow these things to be notated.
01:30:10.980 | But they have an inherent aboutness to them
01:30:20.100 | that's at one and the same time
01:30:20.100 | sort of abstract and idealized and generalizable
01:30:23.380 | while on the other hand,
01:30:26.860 | potentially very, very grounded and concrete.
01:30:30.100 | And one of the things that makes
01:30:35.980 | for the incredible achievements of the human mind
01:30:42.060 | is the fact that humans invented these idealized systems
01:30:47.060 | that leverage the power of human thought
01:30:53.500 | in such a way as to allow all this kind of thing to happen.
01:30:56.900 | And so that's what mathematics to me
01:31:01.820 | is the development of systems for thinking about
01:31:05.620 | the properties and relations among sets of idealized objects.
01:31:12.300 | And you know, the mathematical notation system
01:31:21.960 | that we unfortunately focus way too much on
01:31:26.960 | is just our way of expressing propositions
01:31:32.360 | about these properties.
01:31:37.080 | - Right, it's just like we're talking
01:31:38.840 | with Chomsky in language.
01:31:40.520 | It's the thing we've invented
01:31:42.040 | for the communication of those ideas.
01:31:43.880 | They're not necessarily the deep representation
01:31:47.520 | of those ideas.
01:31:48.640 | - Yeah.
01:31:49.480 | - What's a good way to model
01:31:53.400 | such powerful mathematical reasoning, would you say?
01:31:58.400 | What are some ideas you have for capturing this in a model?
01:32:02.480 | - The insights that human mathematicians have had
01:32:05.680 | is a combination of the kind of the intuitive
01:32:10.680 | kind of connectionist-like knowledge
01:32:17.400 | that makes it so that something is just like obviously true
01:32:22.400 | so that you don't have to think about why it's true.
01:32:31.040 | That then makes it possible to then take the next step
01:32:37.480 | and ponder and reason and figure out something
01:32:41.960 | that you previously didn't have that intuition about.
01:32:46.600 | It then ultimately becomes a part of the intuition
01:32:50.560 | that the next generation of mathematical thinkers
01:32:55.560 | have to ground their own thinking on
01:32:59.400 | so that they can extend the ideas even further.
01:33:02.160 | I came across this quotation from Henri Poincaré
01:33:08.040 | while I was walking in the woods with my wife
01:33:15.860 | in a state park in Northern California late last summer.
01:33:20.360 | And what it said on the bench was,
01:33:24.460 | it is by logic that we prove,
01:33:30.800 | but by intuition that we discover.
01:33:32.960 | And so what for me the essence of the project
01:33:37.960 | is to understand how to bring
01:33:41.280 | the intuitive connectionist resources to bear on
01:33:46.280 | letting the intuitive discovery arise
01:33:51.700 | from engagement in thinking with this formal system.
01:33:59.120 | So I think of the ability of somebody like Hinton
01:34:09.260 | or Newton or Einstein or Rumelhart or Poincaré.
01:34:14.260 | Archimedes is another example, right?
01:34:21.280 | So suddenly a flash of insight occurs.
01:34:25.800 | It's like the constellation of all of these
01:34:30.800 | simultaneous constraints that somehow or other
01:34:35.320 | causes the mind to settle into a novel state
01:34:38.960 | that it never did before and give rise to a new idea
01:34:43.560 | that then you can say, okay, well now,
01:34:49.880 | how can I prove this?
01:34:53.700 | How do I write down the steps of that theorem
01:34:56.440 | that allow me to make it rigorous and certain?
01:35:00.900 | And so I feel like the kinds of things
01:35:08.220 | that we're beginning to see deep learning systems do
01:35:13.220 | of their own accord kind of gives me this feeling
01:35:19.740 | of, I don't know, hope or encouragement
01:35:26.700 | that ultimately it'll all happen.
01:35:35.660 | - So in particular, as many people now
01:35:40.660 | have become really interested in thinking about,
01:35:45.660 | neural networks that have been trained
01:35:48.700 | with massive amounts of text can be given a prompt
01:35:53.700 | and they can then sort of generate some really interesting,
01:36:00.060 | fanciful, creative story from that prompt.
01:36:03.900 | And there's kind of like a sense that they've somehow
01:36:08.900 | synthesized something like novel out of the,
01:36:16.820 | all of the particulars of all of the billions
01:36:21.520 | and billions of experiences that went into the training data
01:36:25.440 | that gives rise to something like this sort of
01:36:28.440 | intuitive sense of what would be a fun
01:36:32.100 | and interesting little story to tell
01:36:34.040 | or something like that.
01:36:35.280 | It just sort of wells up out of the,
01:36:38.640 | letting the thing play out its own imagining
01:36:43.840 | of what somebody might say given this prompt
01:36:47.800 | as a input to get it to start to generate its own thoughts.
01:36:52.800 | And to me that sort of represents the potential
01:36:57.920 | of capturing the intuitive side of this.
01:37:02.060 | - Yeah, and there's other examples.
01:37:03.360 | I don't know if you find them as captivating as,
01:37:06.040 | you know, on the deep mind side with AlphaZero,
01:37:09.320 | if you study chess, the kind of solutions
01:37:12.000 | that has come up in terms of chess,
01:37:13.960 | there's novel ideas there.
01:37:17.680 | It feels very like there's brilliant moments of insight.
01:37:21.640 | And the mechanism they use,
01:37:25.380 | if you think of search as maybe more
01:37:30.080 | towards the good old fashioned AI,
01:37:31.660 | and then there's the connectionist network
01:37:34.500 | that has the intuition of looking at a board,
01:37:38.220 | looking at a set of patterns and saying,
01:37:40.580 | how good is this set of positions?
01:37:42.580 | And the next few positions, how good are those?
01:37:45.460 | And that's it.
01:37:46.460 | That's just an intuition.
01:37:47.860 | Grandmasters have this and understanding positionally,
01:37:52.300 | tactically, how good the situation is,
01:37:54.900 | how can it be improved without doing this full,
01:37:58.180 | like deep search, and then maybe doing a little bit
01:38:02.320 | of what human chess players call calculation,
01:38:05.760 | which is the search, taking a particular set of steps
01:38:08.740 | down the line to see how they unroll.
01:38:11.040 | But there is moments of genius in those systems too.
01:38:15.160 | So that's another hopeful illustration
01:38:18.900 | that from neural networks can emerge
01:38:21.580 | this novel creation of an idea.
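A toy sketch of that division of labor, entirely made up rather than AlphaZero's actual architecture: a crude evaluation function stands in for the network's positional intuition, and a shallow lookahead stands in for the calculation layered on top of it. The "game" here is just adding numbers toward a target so the code runs on its own.

```python
# Toy game: the running total starts somewhere and each move adds 1, 2, or 3;
# positions closer to 10 are "better". (Single-player for simplicity; real chess
# search would alternate between maximizing and minimizing players.)
def moves(state):
    return [1, 2, 3] if state < 10 else []

def value(state):
    # Stand-in for a learned evaluation network: score the position directly.
    return -abs(10 - state)

def search(state, depth):
    """Intuition at the leaves (value), calculation above them (shallow lookahead)."""
    if depth == 0 or not moves(state):
        return value(state), None
    best_score, best_move = float("-inf"), None
    for m in moves(state):
        score, _ = search(state + m, depth - 1)
        if score > best_score:
            best_score, best_move = score, m
    return best_score, best_move

print(search(state=6, depth=2))   # (0, 1): two moves of lookahead finds a path that reaches 10
```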
01:38:26.360 | Yes, and I think that, I think Demis Hassabis is,
01:38:30.080 | he's spoken about those things.
01:38:35.760 | I heard him describe a move that was made
01:38:40.680 | in one of the Go matches against Lee Sedol
01:38:44.400 | in a very similar way.
01:38:46.240 | And it caused me to become really excited
01:38:50.120 | to kind of collaborate with some of those guys at DeepMind.
01:38:54.920 | So I think though that what I like to really emphasize here
01:38:59.920 | is one part of what I like to emphasize
01:39:10.560 | about mathematical cognition at least
01:39:12.540 | is that philosophers and logicians going back
01:39:21.260 | three or even a little more than 3,000 years ago
01:39:25.380 | began to develop these formal systems.
01:39:30.060 | And gradually the whole idea about thinking formally
01:39:36.060 | got constructed.
01:39:42.940 | And it preceded Euclid,
01:39:50.500 | certainly present in the work of Thales and others.
01:39:54.140 | And I'm not the world's leading expert
01:39:56.540 | in all the details of that history.
01:39:58.740 | But Euclid's Elements were kind of the touch point
01:40:03.740 | of a coherent document that sort of laid out
01:40:09.660 | this idea of an actual formal system
01:40:15.240 | within which these objects were characterized
01:40:19.180 | and the system of inference that allowed new truths
01:40:24.180 | to be derived from others
01:40:32.420 | was sort of like established as a paradigm.
01:40:35.100 | And what I find interesting is the idea
01:40:44.920 | that the ability to become a person
01:40:49.920 | who is capable of thinking in this abstract formal way
01:40:54.920 | is a result of the same kind of immersion
01:41:01.720 | in experience thinking in that way
01:41:08.700 | that we now begin to think of our understanding
01:41:11.940 | of language as being.
01:41:13.180 | So we immerse ourselves in a particular language,
01:41:18.180 | in a particular world of objects and their relationships,
01:41:22.320 | and we learn to talk about that.
01:41:24.800 | And we develop intuitive understanding of the real world.
01:41:29.200 | In a similar way, we can think that what academia
01:41:34.200 | has created for us, what those early philosophers
01:41:39.480 | and their academies in Athens and Alexandria
01:41:44.340 | and other places allowed was the development
01:41:49.340 | of these schools of thought, modes of thought
01:41:54.340 | that then become deeply ingrained.
01:41:57.100 | And it becomes what it is that makes it so
01:42:02.100 | that somebody like Jerry Fodor would think
01:42:06.260 | that systematic thought is the essential characteristic
01:42:11.260 | of the human mind as opposed to a derived,
01:42:18.960 | an acquired characteristic that results
01:42:21.760 | from acculturation in a certain mode
01:42:25.360 | that's been invented by humans.
01:42:28.640 | - Would you say it's more fundamental than like language?
01:42:32.180 | If we start dancing, if we bring Chomsky
01:42:35.360 | back into the conversation, first of all,
01:42:39.520 | is it unfair to draw a line between mathematical cognition
01:42:44.280 | and language, linguistic cognition?
01:42:48.640 | - I think that's a very interesting question.
01:42:51.080 | And I think it's one of the ones
01:42:53.960 | that I'm actually very interested in right now.
01:42:56.320 | But I think the answer is, in important ways,
01:43:04.480 | it is important to draw that line,
01:43:07.920 | but then to come back and look at it again
01:43:09.800 | and see some of the subtleties
01:43:12.080 | and interesting aspects of the difference.
01:43:14.500 | So if we think about Chomsky himself,
01:43:23.440 | he was born into an academic family.
01:43:35.120 | His father was a professor of rabbinical studies
01:43:38.120 | at a small rabbinical college in Philadelphia.
01:43:42.080 | And he was deeply enculturated in
01:43:48.760 | a culture of thought and reason
01:43:55.480 | and brought to the effort to understand natural language
01:44:02.560 | this profound engagement with these formal systems.
01:44:07.560 | And I think that there was tremendous power in that
01:44:14.360 | and that Chomsky had some amazing insights
01:44:22.940 | into the structure of natural language,
01:44:25.360 | but that, and I'm gonna use the word but there,
01:44:32.120 | the actual intuitive knowledge of these things
01:44:35.920 | only goes so far and does not go as far
01:44:39.600 | as it does in people like Chomsky himself.
01:44:43.400 | And this was something that was discovered
01:44:46.320 | in the PhD dissertation of Lila Gleitman,
01:44:49.160 | who was actually trained in the same linguistics department
01:44:52.120 | with Chomsky.
01:44:53.020 | So what Lila discovered was that the intuitions
01:45:00.640 | that linguists had about even the meaning of a phrase,
01:45:05.640 | not just about its grammar,
01:45:11.120 | but about what they thought a phrase must mean
01:45:15.760 | were very different from the intuitions
01:45:18.800 | of an ordinary person who wasn't a formally trained thinker.
01:45:23.800 | And well, it recently has become much more salient.
01:45:29.100 | I happen to have learned about this when I myself
01:45:31.480 | was a PhD student at the University of Pennsylvania,
01:45:34.320 | but I never knew how to put it together
01:45:37.480 | with all of my other thinking about these things.
01:45:39.480 | So I actually currently have the hypothesis
01:45:44.480 | that formally trained linguists
01:45:47.400 | and other formally trained academics,
01:45:49.920 | whether it be linguistics, philosophy,
01:45:58.640 | cognitive science, computer science,
01:46:01.260 | machine learning, mathematics,
01:46:03.440 | have a mode of engagement with experience
01:46:12.320 | that is intuitively deeply structured
01:46:17.140 | to be more organized around the systematicity
01:46:26.780 | and ability to be conformant with the principles
01:46:31.780 | of a system than is actually true
01:46:38.560 | of the natural human mind without that immersion.
01:46:42.560 | - That's fascinating.
01:46:43.400 | And so the different fields and approaches
01:46:46.740 | with which you start to study the mind
01:46:48.700 | actually take you away from the natural operation
01:46:52.620 | of the mind.
01:46:53.540 | So it makes it very difficult for you
01:46:55.300 | to be somebody who introspects.
01:46:59.060 | - Yes.
01:46:59.900 | And this is where things about human belief
01:47:04.900 | and so-called knowledge that we consider private,
01:47:21.620 | not our business to manipulate in others.
01:47:26.140 | We are not entitled to tell somebody else
01:47:29.220 | what to believe about certain kinds of things.
01:47:33.020 | What are those beliefs?
01:47:39.420 | Well, they are the product of this sort of immersion
01:47:44.020 | and enculturation.
01:47:46.760 | That is what I believe.
01:47:51.440 | - So- - And that's limiting.
01:47:53.380 | - It's something to be aware of.
01:47:57.280 | - Does that limit you from having a good model
01:48:01.780 | of some of cognition?
01:48:04.480 | - It can.
01:48:05.900 | - So when you look at mathematical or linguistics,
01:48:08.300 | I mean, what is that line then?
01:48:10.680 | So is Chomsky unable to sneak up
01:48:14.780 | to the full picture of cognition?
01:48:16.240 | Are you, when you're focusing on mathematical thinking,
01:48:20.220 | are you also unable to do so?
01:48:23.060 | - I think you're right.
01:48:24.380 | I think that's a great way of characterizing it.
01:48:27.080 | And I also think that it's related
01:48:32.080 | to the concept of beginner's mind
01:48:38.460 | and another concept called the expert blind spot.
01:48:46.180 | So the expert blind spot is much more prosaic-seeming
01:48:51.180 | than this point that you were just making.
01:48:54.200 | But it's something that plagues experts
01:48:59.200 | when they try to communicate their understanding
01:49:02.380 | to non-experts.
01:49:03.580 | And that is that things are self-evident to them
01:49:12.300 | that they can't begin to even think about
01:49:17.300 | how they could explain it to somebody else
01:49:23.300 | because it's just so patently obvious
01:49:28.300 | that it must be true.
01:49:31.420 | And when Kronecker said,
01:49:42.060 | God made the natural numbers, all else is the work of man,
01:49:47.060 | he was expressing that intuition
01:49:50.380 | that somehow or other, the basic fundamentals
01:49:56.460 | of discrete quantities being countable and innumerable
01:50:01.900 | and indefinite in number was not something
01:50:09.740 | that had to be discovered, but he was wrong.
01:50:14.740 | It turns out that many cognitive scientists
01:50:21.180 | agreed with him for a time.
01:50:22.340 | There was a long period of time where the natural numbers
01:50:27.340 | were considered to be a part of the innate endowment
01:50:31.860 | of core knowledge or to use the kind of phrases
01:50:36.900 | that Spelke and Carey used to talk about
01:50:40.340 | what they believe are the innate primitives
01:50:42.760 | of the human mind.
01:50:44.180 | And they no longer believe that.
01:50:47.820 | It's actually been more or less accepted
01:50:52.340 | by almost everyone that the natural numbers
01:50:55.340 | are actually a cultural construction.
01:50:58.100 | And it's so interesting to go back
01:51:00.540 | and sort of like study those few people who still exist
01:51:03.380 | who don't have those systems.
01:51:06.140 | So this is just an example to me
01:51:08.940 | and where a certain mode of thinking about language itself
01:51:14.740 | or a certain mode of thinking about geometry
01:51:19.380 | and those kinds of relations,
01:51:21.540 | so it becomes so second nature that you don't know
01:51:24.060 | what it is that you need to teach.
01:51:25.900 | And in fact, we don't really teach it
01:51:32.500 | all that explicitly anyway.
01:51:35.340 | And it's, you take a math class,
01:51:40.340 | the professor sort of teaches it to you
01:51:44.000 | the way they understand it.
01:51:46.120 | Some of the students in the class sort of like,
01:51:49.100 | they get it, they start to get the way of thinking
01:51:51.540 | and they can actually do the problems
01:51:53.300 | that get put on the homework
01:51:55.380 | that the professor thinks are interesting
01:51:57.020 | and challenging ones.
01:51:58.020 | But most of the students who don't kind of engage
01:52:03.380 | as deeply don't ever get.
01:52:05.460 | And we think, oh, that man must be brilliant.
01:52:10.040 | He must have this special insight,
01:52:11.940 | but he must have some biological sort of bit
01:52:16.660 | that's different, right?
01:52:17.780 | That makes him so that he or she could have that insight.
01:52:21.660 | But I'm, although I don't wanna dismiss
01:52:26.660 | biological individual differences completely,
01:52:29.620 | I find it much more interesting to think
01:52:34.060 | about the possibility that,
01:52:35.740 | it was that difference in the dinner table conversation
01:52:41.980 | at the Chomsky house when he was growing up
01:52:44.820 | that made it so that he had that cast of mind.
01:52:48.020 | - Yeah, and there's a few topics we talked about
01:52:51.300 | that kind of interconnect.
01:52:53.740 | 'Cause I wonder, the better I get at certain things,
01:52:57.460 | we humans, the deeper we understand something,
01:53:01.060 | what are you starting to then miss
01:53:02.980 | about the rest of the world?
01:53:07.180 | We talked about David and his degenerative mind.
01:53:12.180 | And when you look in the mirror and wonder,
01:53:19.380 | how different am I cognitively
01:53:22.100 | from the man I was a month ago,
01:53:24.380 | from the man I was a year ago?
01:53:26.300 | Like what, if I can, having thought about language
01:53:31.300 | if I'm Chomsky for 10, 20 years,
01:53:35.900 | what am I no longer able to see?
01:53:38.180 | What is in my blind spot and how big is that?
01:53:41.140 | And then to somehow be able to leap back
01:53:44.220 | out of your deep structure that you formed for yourself
01:53:48.100 | about thinking about the world,
01:53:49.940 | leap back and look at the big picture again,
01:53:53.260 | or jump out of your current way of thinking.
01:53:55.960 | And to be able to introspect,
01:53:58.340 | like what are the limitations of your mind?
01:54:00.540 | How is your mind less powerful than it used to be
01:54:04.100 | or more powerful or different, powerful in different ways?
01:54:07.420 | So that seems to be a difficult thing to do
01:54:10.820 | 'cause we're living, we're looking at the world
01:54:13.780 | through the lens of our mind, right?
01:54:15.780 | To step outside and introspect is difficult,
01:54:18.460 | but it seems necessary if you want to make progress.
01:54:22.820 | - You know, one of the threads of psychological research
01:54:26.500 | that's always been very, I don't know,
01:54:31.500 | important to me to be aware of is the idea
01:54:35.980 | that our explanations of our own behavior
01:54:40.980 | aren't necessarily actually part of the process
01:54:49.820 | that caused that behavior to occur,
01:54:54.820 | or even valid observations of the set of constraints
01:55:01.900 | that led to the outcome.
01:55:04.620 | But they are post hoc rationalizations
01:55:07.180 | that we can give based on information at our disposal
01:55:11.900 | about what might have contributed to the result
01:55:15.740 | that we came to when asked.
01:55:19.080 | And so this is an idea that was introduced
01:55:23.340 | in a very important paper by Nisbett and Wilson
01:55:28.060 | about the limits on our ability to be aware of the factors
01:55:33.060 | that cause us to make the choices that we make.
01:55:38.680 | And I think it's something that's very important
01:55:45.240 | and it's something that we really ought to be
01:55:50.240 | much more cognizant of in general as human beings
01:55:55.760 | is that our own insight into exactly why
01:55:59.740 | we hold the beliefs that we do and we hold the attitudes
01:56:03.120 | and make the choices and feel the feelings that we do
01:56:07.060 | is not something that we totally control
01:56:13.600 | or totally observe.
01:56:15.740 | And it's subject to our culturally transmitted
01:56:20.740 | understanding of what it is that is the mode
01:56:28.180 | that we give to explain these things when asked to do so
01:56:33.180 | as much as it is about anything else.
01:56:37.060 | And so even our ability to introspect
01:56:40.820 | and think we have access to our own thoughts
01:56:43.440 | is a product of culture and belief, practice.
01:56:48.080 | - So let me ask you the big question of advice.
01:56:53.940 | So you've lived an incredible life in terms of the ideas
01:56:58.940 | you've put out into the world,
01:57:00.900 | in terms of the trajectory you've taken through your career,
01:57:03.420 | through your life.
01:57:04.420 | What advice would you give to young people today
01:57:07.220 | in high school and college about how to be more open
01:57:13.420 | how to have a career or how to have a life
01:57:15.980 | they can be proud of?
01:57:17.200 | - Finding the thing that you are intrinsically motivated
01:57:24.120 | to engage with and then celebrating that discovery
01:57:29.120 | is what it's all about.
01:57:34.760 | When I was in college, I struggled with that.
01:57:41.020 | I had thought I wanted to be a psychiatrist
01:57:46.020 | because I think I was interested in human psychology
01:57:50.780 | in high school.
01:57:51.620 | And at that time, the only sort of information I had
01:57:56.300 | that had anything to do with the psyche was,
01:57:58.380 | you know, Freud and Eric Fromm
01:58:00.260 | and sort of popular psychiatry kinds of things.
01:58:03.940 | And so, well, they were psychiatrists, right?
01:58:06.740 | So I had to be a psychiatrist.
01:58:08.820 | And that meant I had to go to medical school.
01:58:11.380 | And I got to college and I find myself taking, you know,
01:58:16.260 | the first semester of a three-quarter physics class
01:58:19.980 | and it was mechanics.
01:58:21.260 | And this was so far from what it was I was interested in,
01:58:24.420 | but it was also too early in the morning
01:58:26.100 | in the winter court semester.
01:58:28.180 | So I never made it to the physics class.
01:58:30.860 | But I wandered about the rest of my freshman year
01:58:35.660 | and most of my sophomore year
01:58:38.860 | until I found myself in the midst of this situation
01:58:44.860 | where around me,
01:58:46.460 | there was this big revolution happening.
01:58:50.380 | I was at Columbia University in 1968
01:58:53.340 | and the Vietnam War is going on.
01:58:56.020 | Columbia's building a gym in Morningside Heights,
01:58:59.060 | which is part of Harlem.
01:59:00.300 | And people are thinking, oh, the big, bad rich guys
01:59:03.620 | are stealing the parkland
01:59:06.460 | that belongs to the people of Harlem.
01:59:09.240 | And, you know, they're part of the military
01:59:12.660 | industrial complex, which is enslaving us
01:59:15.340 | and sending us all off to war in Vietnam.
01:59:17.780 | And so there was a big revolution
01:59:20.540 | that involved a confluence of black activism
01:59:23.700 | and, you know, SDS and social justice
01:59:26.860 | and the whole university blew up and got shut down.
01:59:30.020 | And I got a chance to sort of think about
01:59:33.860 | why people were behaving the way they were in this context.
01:59:38.660 | And I, you know, I happened to have taken
01:59:42.940 | mathematical statistics.
01:59:44.620 | I happened to have been taking psychology that quarter,
01:59:47.900 | just psych one.
01:59:49.140 | And somehow things in that space all ran together in my mind
01:59:53.540 | and got me really excited about asking questions
01:59:57.780 | about why people, what made certain people
02:00:00.300 | go into the buildings and not others and things like that.
02:00:03.460 | And so suddenly I had a path forward
02:00:06.140 | and I had just been wandering around aimlessly.
02:00:08.900 | And at the different points in my career, you know,
02:00:11.900 | when I think, okay, well, should I take this class
02:00:15.500 | or should I
02:00:19.620 | read that book about
02:00:25.300 | some idea that I wanna understand better, you know,
02:00:29.100 | or should I pursue the thing that excites me
02:00:32.700 | and interests me or should I, you know,
02:00:34.940 | meet some requirement, you know, that's,
02:00:38.180 | I always did the latter.
02:00:39.460 | So I ended up, my professors in psychology
02:00:43.180 | thought I was great.
02:00:46.180 | They wanted me to go to graduate school.
02:00:48.260 | They nominated me for Phi Beta Kappa
02:00:52.820 | and I went to the Phi Beta Kappa ceremony
02:00:55.580 | and this guy came up and he said,
02:00:56.660 | "Oh, are you Magna or Summa?"
02:00:58.580 | And I wasn't even getting honors based on my grades.
02:01:02.100 | They just happened to have thought I was interested enough
02:01:06.820 | in ideas to belong to Phi Beta Kappa.
02:01:09.660 | - I mean, would it be fair to say
02:01:11.060 | you kind of stumbled around a little bit
02:01:13.300 | through accidents of too early morning of classes
02:01:18.380 | in physics and so on until you discovered
02:01:21.060 | intrinsic motivation, as you mentioned,
02:01:23.100 | and then that's it, it hooked you
02:01:25.380 | and then you celebrate the fact that this happens
02:01:27.660 | to human beings and it's like--
02:01:29.780 | - Yeah, like, and what is it that made
02:01:32.880 | what I did intrinsically motivating to me?
02:01:36.360 | Well, that's interesting and I don't know
02:01:39.300 | all the answers to it and I don't think I wanna,
02:01:44.060 | I want anybody to think that you should be sort of
02:01:49.140 | in any way, I don't know, sanctimonious
02:01:52.900 | or anything about it.
02:01:53.740 | You know, it's like, I really enjoyed
02:01:58.140 | doing statistical analysis of data.
02:02:00.940 | I really enjoyed running my own experiment,
02:02:05.940 | which was what I got a chance to do
02:02:08.220 | in the psychology department. In chemistry
02:02:10.660 | and physics, I never imagined
02:02:13.460 | that mere mortals would ever do an experiment
02:02:15.740 | in those sciences, except one that was in the textbook
02:02:18.860 | that you were told to do in lab class.
02:02:21.460 | But in psychology, we were already,
02:02:23.740 | like even when I was taking Psych I,
02:02:25.500 | it turned out we had our own rat
02:02:27.700 | and, after two set experiments,
02:02:30.160 | we got to, okay, do something you think of,
02:02:32.900 | you know, with your rat, you know?
02:02:34.220 | So it's the opportunity to do it myself
02:02:39.060 | and to bring together a certain set of things
02:02:42.740 | that engaged me intrinsically.
02:02:46.580 | And I think it has something to do
02:02:48.700 | with why certain people turn out to be,
02:02:51.180 | you know, profoundly amazing musical geniuses, right?
02:02:56.180 | They get immersed in it at an early enough point
02:03:01.420 | and it just sort of gets into the fabric.
02:03:04.900 | So my little brother had intrinsic motivation for music
02:03:08.660 | as we witnessed when he discovered how
02:03:13.100 | to put records on the phonograph
02:03:15.900 | when he was like 13 months old
02:03:18.020 | and recognized which one he wanted to play,
02:03:20.740 | not 'cause he could read the labels,
02:03:22.400 | because he could sort of see which ones had which scratches,
02:03:25.540 | which were the different, you know,
02:03:27.420 | oh, that's Rapsodie Espagnole and that's--
02:03:29.900 | - Oh, wow.
02:03:30.740 | - You know, and--
02:03:31.580 | - And he enjoyed that, that connected with him somehow.
02:03:33.860 | - Yeah, and there was something that it fed into
02:03:36.340 | and you're extremely lucky if you have that
02:03:40.940 | and if you can nurture it and can let it grow
02:03:44.860 | and let it be an important part of your life.
02:03:47.580 | - Yeah, those are the two things,
02:03:49.140 | is like be attentive enough to feel it when it comes.
02:03:54.140 | Like this is something special.
02:03:56.580 | I mean, I don't know, for example,
02:03:59.140 | I really like tabular data, like Excel sheets.
02:04:04.140 | Like it brings me a deep joy.
02:04:07.820 | I don't know how useful that is for anything.
02:04:10.020 | - That's part of what I'm talking about.
02:04:12.260 | - Exactly.
02:04:13.180 | - So there's like a million, not a million,
02:04:15.620 | but there's a lot of things like that for me
02:04:18.460 | and you have to hear that for yourself.
02:04:20.220 | Like realize this is really joyful,
02:04:24.180 | but then the other part that you're mentioning,
02:04:25.900 | which is the nurture, is take time and stay with it,
02:04:29.140 | stay with it a while and see where that takes you in life.
02:04:33.380 | - Yeah, and I think the motivational engagement
02:04:38.380 | results in the immersion that then creates
02:04:42.040 | the opportunity to obtain the expertise.
02:04:45.340 | So we could call it the Mozart effect, right?
02:04:49.780 | I mean, when I think about Mozart,
02:04:51.620 | I think about the person who was born
02:04:55.860 | as the fourth member of the family string quartet, right?
02:04:59.220 | And they handed him the violin when he was six weeks old.
02:05:04.220 | All right, start playing.
02:05:09.720 | So the level of immersion there was amazingly profound,
02:05:14.720 | but hopefully he also had something,
02:05:20.040 | maybe this is where the more sort of the genetic part
02:05:28.000 | comes in sometimes, I think.
02:05:29.400 | Something in him resonated to the music
02:05:33.960 | so that the synergy of the combination
02:05:36.680 | of that was so powerful.
02:05:38.520 | So that's what I really consider to be the Mozart effect.
02:05:41.420 | It's sort of the synergy of something with experience
02:05:46.420 | that then results in the unique flowering
02:05:49.820 | of a particular mind.
02:05:52.080 | So I know my siblings and I are all very different
02:05:58.520 | from each other.
02:06:00.260 | We've all gone in our own different directions.
02:06:02.700 | And I mentioned my younger brother who was very musical.
02:06:07.340 | My other younger brother
02:06:08.980 | was this amazing, intuitive engineer.
02:06:11.940 | And one of my sisters was passionate about
02:06:18.740 | water conservation well before it was
02:06:25.000 | such a hugely important issue that it is today.
02:06:29.320 | So we all sort of somehow find a different thing.
02:06:35.560 | And I don't mean to say it isn't tied in
02:06:40.560 | with something about us biologically,
02:06:43.440 | but it's also when that happens, where you can find that,
02:06:47.040 | then you can do your thing and you can be excited about it.
02:06:50.800 | So people can be excited about fitting people on bicycles
02:06:53.640 | as well as excited about making neural networks
02:06:56.040 | achieve insights into human cognition, right?
02:06:58.480 | - Yeah, like for me personally,
02:07:00.440 | I've always been excited about love
02:07:03.760 | and friendship between humans.
02:07:06.080 | And just like the actual experience of it
02:07:09.920 | is since I was a child, just observing people around me
02:07:13.200 | and also been excited about robots.
02:07:16.280 | And there's something in me that thinks
02:07:18.680 | I really would love to explore
02:07:20.800 | how those two things combine.
02:07:22.240 | And it doesn't make any sense.
02:07:23.760 | A lot of it is also timing,
02:07:25.520 | just to think of your own career and your own life.
02:07:28.200 | You find yourself in certain
02:07:30.680 | places that happen to involve
02:07:32.960 | some of the greatest thinkers of our time.
02:07:35.280 | And so it just worked out that like,
02:07:37.040 | you guys developed those ideas
02:07:38.880 | and there may be a lot of other people similar to you
02:07:41.840 | and they were brilliant and they never found
02:07:44.360 | that right connection and place
02:07:46.100 | to where the ideas could flourish.
02:07:48.000 | So it's timing, it's place, it's people.
02:07:51.680 | And ultimately the whole ride, it's undirected.
02:07:55.680 | Can I ask you about something you mentioned
02:07:58.240 | in terms of psychiatry when you were younger?
02:08:00.760 | Because I had a similar experience
02:08:02.800 | of reading Freud and Carl Jung
02:08:08.080 | and just those kind of popular psychiatry ideas.
02:08:13.080 | And that was a dream for me early on in high school:
02:08:15.840 | I hoped to understand the human mind, and
02:08:20.000 | somehow psychiatry felt like the right discipline for that.
02:08:27.200 | Does that make you sad that psychiatry is not
02:08:30.520 | the mechanism by which you
02:08:34.640 | are able to explore the human mind?
02:08:36.160 | So for me, I was a little bit disillusioned
02:08:38.920 | because of how much prescription medication
02:08:43.920 | and biochemistry is involved in the discipline of psychiatry
02:08:48.000 | as opposed to the dream of the Freud,
02:08:50.600 | like use the mechanisms of language
02:08:53.800 | to explore the human mind.
02:08:56.600 | So that was a little disappointing.
02:08:58.760 | And that's why I kind of went to computer science
02:09:01.440 | and thinking like, maybe you can explore the human mind
02:09:04.160 | by trying to build the thing.
02:09:06.360 | - Yes, I wasn't exposed to the,
02:09:08.660 | sort of the biomedical/pharmacological aspects
02:09:14.780 | of psychiatry at that point because
02:09:18.600 | I dropped out of that whole idea of pre-med,
02:09:21.920 | so I never even found out about that until much later.
02:09:25.600 | But you're absolutely right.
02:09:27.920 | So I was actually a member of the
02:09:31.000 | National Advisory Mental Health Council.
02:09:37.360 | That is to say the board of scientists
02:09:40.480 | who advised the director of the National Institute
02:09:43.600 | of Mental Health.
02:09:45.120 | And that was around the year 2000.
02:09:47.600 | And in fact, at that time, the man who came in
02:09:53.040 | as the new director, I had been on this board for a year
02:09:56.360 | when he came in, said, okay,
02:10:01.280 | schizophrenia is a biological illness.
02:10:06.840 | It's a lot like cancer.
02:10:08.520 | We've made huge strides in curing cancer
02:10:11.160 | and that's what we're gonna do with schizophrenia.
02:10:13.280 | We're gonna find the medications
02:10:15.780 | that are gonna cure this disease.
02:10:18.240 | And we're not gonna listen to anybody's grandmother anymore.
02:10:21.200 | And good old behavioral psychology
02:10:26.200 | is not something we're going to support any further.
02:10:30.200 | And he completely alienated me from the institute
02:10:35.200 | and from all of its prior policies,
02:10:43.260 | which had been much more holistic,
02:10:45.280 | I think really at some level.
02:10:47.960 | And the other people on the board
02:10:51.400 | were like psychiatrists, right?
02:10:53.480 | Very biological psychiatrists.
02:10:57.240 | It didn't pan out, right?
02:10:58.560 | Nothing has changed in our ability
02:11:03.280 | to help people with mental illness.
02:11:07.080 | And so 20 years later, that particular path
02:11:12.080 | was a dead end as far as I can tell.
02:11:15.280 | - Well, there's some aspect to,
02:11:17.000 | and sorry to romanticize the whole philosophical
02:11:20.240 | conversation about the human mind,
02:11:22.260 | but to me, psychiatrists for a time
02:11:25.600 | held the flag of we're the deep thinkers.
02:11:30.080 | In the same way that physicists are the deep thinkers
02:11:32.560 | about the nature of reality,
02:11:34.420 | psychiatrists are the deep thinkers
02:11:36.080 | about the nature of the human mind.
02:11:37.920 | And I think that flag has been taken from them
02:11:40.680 | and carried by people like you.
02:11:43.040 | It's more in the cognitive psychology,
02:11:46.140 | especially when you have a foot
02:11:48.000 | in the computational view of the world,
02:11:50.120 | 'cause you can both build it,
02:11:51.280 | you can like intuit about the functioning of the mind
02:11:54.440 | by building little models
02:11:56.360 | and be able to say mathematical things
02:11:58.040 | and then deploying those models,
02:11:59.360 | especially in computers to say,
02:12:01.480 | does this actually work?
02:12:03.200 | And they do little experiments.
02:12:05.080 | And then some combination of neuroscience
02:12:08.360 | where you're starting to actually be able to observe,
02:12:12.000 | you know, do certain experiments on human beings
02:12:14.240 | and observe how the brain is actually functioning.
02:12:18.340 | And there, using intuition,
02:12:21.060 | you can start being the philosopher.
02:12:22.980 | Like Richard Feynman is the philosopher,
02:12:25.180 | a cognitive psychologist can become the philosopher
02:12:28.260 | and psychiatrists become much more like doctors.
02:12:30.800 | They're like very medical.
02:12:32.280 | They help people with medication,
02:12:34.220 | biochemistry and so on,
02:12:35.660 | but they are no longer the book writers
02:12:39.460 | and the philosophers,
02:12:40.300 | which of course I admire.
02:12:42.340 | I admire the Richard Feynman ability
02:12:44.840 | to do great low level mathematics and physics
02:12:49.680 | and the high level philosophy.
02:12:52.100 | - Yeah, I think it was Fromm and Jung more than Freud
02:12:57.100 | that sort of initially made me feel like,
02:13:02.140 | oh, this is really amazing and interesting
02:13:04.760 | and I wanna explore it further.
02:13:06.680 | I actually, when I got to college and I lost that thread,
02:13:10.640 | I found more of it in sociology and literature
02:13:15.640 | than I did in any place else.
02:13:20.520 | So I took quite a lot of both of those disciplines
02:13:24.600 | as an undergraduate.
02:13:25.820 | And, you know, I was actually deeply ambivalent
02:13:31.880 | about the psychology because I was doing experiments
02:13:36.000 | after the initial flurry of interest
02:13:39.440 | in why people would occupy buildings during an insurrection
02:13:44.240 | and consider, you know, be sort of like so over committed
02:13:49.240 | to their beliefs.
02:13:51.840 | But I ended up in the psychology laboratory
02:13:54.880 | running experiments on pigeons.
02:13:56.480 | And so I had this profound sort of dissonance
02:14:01.200 | between, okay, the kinds of issues that would be explored
02:14:05.240 | when I was thinking about what I read about
02:14:09.280 | in modern British literature versus what I could study
02:14:14.280 | with my pigeons in the laboratory.
02:14:17.480 | That got resolved when I went to graduate school
02:14:19.800 | and I discovered cognitive psychology.
02:14:22.120 | And so for me, that was the path out of this
02:14:27.120 | extremely ambivalent divergence
02:14:31.080 | between the interest in the human condition
02:14:33.080 | and the desire to do, you know,
02:14:37.080 | actual mechanistically oriented thinking about it.
02:14:40.100 | And I think we've come a long way in that regard
02:14:46.080 | and you're absolutely right that nowadays
02:14:50.520 | this is something that's accessible to people
02:14:53.080 | through the pathway in through computer science
02:14:57.640 | or the pathway in through neuroscience.
02:15:02.900 | You know, you can get derailed in neuroscience
02:15:05.980 | down to the bottom of the system
02:15:10.260 | where you might find the cures of various conditions,
02:15:14.180 | but you don't get a chance to think
02:15:16.780 | about the higher level stuff.
02:15:18.120 | So it's in the systems and cognitive neuroscience
02:15:21.060 | and computational intelligence,
02:15:24.020 | miasma up there at the top
02:15:25.640 | that I think these opportunities
02:15:28.580 | are richest right now.
02:15:30.720 | And so yes, I am indeed blessed
02:15:32.980 | by having had the opportunity to fall into that space.
02:15:37.980 | - So you mentioned the human condition,
02:15:41.260 | speaking of which, you happen to be a human being
02:15:44.940 | who's unfortunately not immortal.
02:15:48.300 | That seems to be a fundamental part of the human condition
02:15:52.980 | that this ride ends.
02:15:54.780 | Do you think about the fact that you're going to die one day?
02:16:00.140 | Are you afraid of death?
02:16:01.380 | - I would say that I am not as much afraid of death
02:16:09.660 | as I am of degeneration.
02:16:12.880 | And I say that in part for reasons of having, you know,
02:16:19.380 | seen some tragic degenerative situations unfold.
02:16:28.080 | It's exciting when you can continue to participate
02:16:33.080 | and feel like you're near the place
02:16:41.020 | where the wave is breaking on the shore, if you like.
02:16:46.160 | And I think about, you know, my own future potential
02:16:53.600 | if I were to undergo, begin to suffer from dementia,
02:16:58.600 | Alzheimer's disease or semantic dementia
02:17:06.920 | or some other condition, you know,
02:17:09.840 | I would sort of gradually lose the thread of that ability.
02:17:13.440 | And so one can live on
02:17:19.960 | for a decade after, you know, sort of having to retire
02:17:24.880 | because one no longer has these kinds of abilities to engage.
02:17:29.880 | And I think that's the thing that I fear the most.
02:17:35.560 | - The losing of that, like the breaking of the wave,
02:17:40.560 | the flourishing of the mind where you have these ideas
02:17:44.120 | and they're swimming around, you're able to play with them.
02:17:46.720 | - Yeah, and collaborate with other people who, you know,
02:17:51.000 | are themselves really helping to push these ideas forward.
02:17:56.000 | So, yeah.
02:17:58.920 | - What about the edge of the cliff?
02:18:01.400 | The end, I mean, the mystery of it, I mean.
02:18:05.200 | - My graded sort of conception of mind
02:18:09.520 | and, you know, sort of continuous sort of way
02:18:12.800 | of thinking about most things makes it so that,
02:18:16.640 | to me, the discreteness of that transition
02:18:21.640 | is less apparent than it seems to be to most people.
02:18:27.160 | - I see, I see, yeah.
02:18:29.780 | Yeah, I wonder, so I don't know if you know the work
02:18:33.880 | of Ernest Becker and so on, I wonder what role mortality
02:18:38.760 | and our ability to be cognizant of it,
02:18:42.160 | anticipate it and perhaps be afraid of it,
02:18:44.840 | what role that plays in our reasoning about the world.
02:18:49.840 | - I think that it can be motivating to people
02:18:53.160 | to think they have a limited period left.
02:18:55.400 | I think in my own case, you know,
02:18:59.080 | it's like seven or eight years ago now
02:19:01.160 | that I was sitting around doing experiments
02:19:06.160 | on decision making that were satisfying in a certain way
02:19:13.040 | because I could really get closure
02:19:16.480 | on whether the model fit the data perfectly or not.
02:19:21.480 | And I could see how one could test, you know,
02:19:25.000 | the predictions in monkeys as well as humans
02:19:27.800 | and really see what the neurons were doing.
02:19:30.880 | But I just realized, hey, wait a minute, you know,
02:19:34.240 | I may only have about 10 or 15 years left here
02:19:37.800 | and I don't feel like I'm getting towards the answers
02:19:41.400 | to the really interesting questions
02:19:43.360 | while I'm doing this particular level of work.
02:19:46.680 | And that's when I said to myself,
02:19:48.640 | okay, let's pick something that's hard, you know?
02:19:54.800 | So that's when I started working on mathematical cognition.
02:19:59.040 | And I think it was more in terms of,
02:20:03.300 | well, I got 15 more years possibly of useful life left,
02:20:06.880 | let's imagine that it's only 10.
02:20:09.960 | I'm actually getting close to the end of that now,
02:20:12.120 | maybe three or four more years.
02:20:13.680 | But I'm beginning to feel like,
02:20:15.980 | well, I probably have another five after that.
02:20:18.100 | So, okay, I'll give myself another six or eight.
02:20:21.260 | - But a deadline is looming and therefore--
02:20:23.960 | - It's not gonna go on forever.
02:20:25.560 | And so, yeah, I gotta keep thinking about the questions
02:20:30.560 | that I think are the interesting
02:20:32.560 | and important ones for sure.
02:20:34.120 | - What do you hope your legacy is?
02:20:37.480 | You've done some incredible work in your life
02:20:40.320 | as a man, as a scientist.
02:20:43.000 | When the human civilization is long gone
02:20:47.320 | and the aliens are reading the encyclopedia
02:20:49.800 | about the human species,
02:20:51.640 | what do you hope is the paragraph written about you?
02:20:54.340 | - I would want it to sort of highlight
02:21:03.640 | a couple things: that I was able to see
02:21:07.800 | one path that was more exciting to me
02:21:21.560 | than the one that seemed already to be there
02:21:25.680 | for a cognitive psychologist.
02:21:28.360 | But not for any super special reason
02:21:32.360 | other than that I'd had the right context prior to that,
02:21:35.140 | but that I had gone ahead and followed that lead.
02:21:38.960 | And then, I forget the exact wording,
02:21:41.800 | but I said in this preface that the joy of science
02:21:46.800 | is the moment in which a partially formed thought
02:21:59.020 | in the mind of one person gets crystallized
02:22:04.020 | a little better in the discourse
02:22:06.460 | and becomes the foundation of some exciting,
02:22:11.460 | concrete piece of actual scientific progress.
02:22:15.580 | And I feel like that moment happened
02:22:18.380 | when Rumelhart and I were doing
02:22:19.820 | the interactive activation model
02:22:22.020 | and when Rumelhart heard Hinton talk about gradient descent
02:22:26.140 | and having the objective function
02:22:29.620 | to guide the learning process.
02:22:31.420 | And it happened a lot in that period
02:22:36.380 | and I sort of seek that kind of thing
02:22:39.040 | in my collaborations with my students.
02:22:42.940 | So the idea that this is a person
02:22:47.940 | who contributed to science
02:22:51.100 | by finding exciting collaborative opportunities
02:22:53.920 | to engage with other people through
02:22:56.840 | is something that I certainly hope
02:22:59.080 | is part of the paragraph.
02:23:00.500 | - And like you said, taking a step
02:23:03.260 | maybe in directions that are non-obvious.
02:23:07.700 | So it's the old Robert Frost, road less taken.
02:23:11.640 | So maybe, 'cause you said like this incomplete initial idea,
02:23:16.640 | that step you take is a little bit off the beaten path.
02:23:22.320 | - If I could just say one more thing here.
02:23:24.640 | This was something that really contributed
02:23:29.080 | to energizing me in a way
02:23:30.720 | that I feel it would be useful to share.
02:23:35.220 | My PhD dissertation project
02:23:40.220 | was completely empirical experimental project.
02:23:44.320 | And I wrote a paper based on the two main experiments
02:23:49.320 | that were the core of my dissertation.
02:23:51.760 | And I submitted it to a journal.
02:23:53.920 | And at the end of the paper,
02:23:56.640 | I had a little section where I laid out
02:24:01.040 | the beginnings of my theory
02:24:05.160 | about what I thought was going on
02:24:08.080 | that would explain the data that I had collected.
02:24:11.160 | And I had submitted the paper
02:24:14.080 | to the Journal of Experimental Psychology.
02:24:16.480 | So I got back a letter from the editor saying,
02:24:20.600 | "Thank you very much, these are great experiments.
02:24:22.560 | "We'd love to publish them in the journal.
02:24:24.700 | "But what we'd like you to do
02:24:26.260 | "is to leave the theorizing to the theorists
02:24:28.840 | "and take that part out of the paper."
02:24:32.200 | And so I did, I took that part out of the paper.
02:24:36.300 | But I almost found myself labeled
02:24:42.400 | as a non-theorist by this.
02:24:45.520 | And I could have succumbed to that and said,
02:24:49.880 | "Okay, well, I guess my job is to just go on
02:24:51.980 | "and do experiments, right?"
02:24:54.200 | But that's not what I wanted to do.
02:24:59.200 | And so when I got to my assistant professorship,
02:25:03.600 | although I continued to do experiments
02:25:06.640 | because I knew I had to get some papers out,
02:25:09.320 | I also, at the end of my first year,
02:25:11.920 | submitted my first article to Psychological Review,
02:25:14.960 | which was the theoretical journal,
02:25:16.440 | where I took that section and elaborated it
02:25:19.240 | and wrote it up and submitted it to them.
02:25:21.640 | And they didn't accept that either,
02:25:24.320 | but they said, "Oh, this is interesting.
02:25:26.120 | "You should keep thinking about it this time."
02:25:28.320 | And then that was what got me going to think,
02:25:31.340 | "Okay, so it's not a superhuman thing
02:25:35.920 | "to contribute to the development of theory.
02:25:38.700 | "You don't have to be, you can do it as a mere mortal."
02:25:43.700 | - And the broader, I think, lesson is
02:25:47.600 | don't succumb to the labels of a particular reviewer.
02:25:50.680 | - Yeah, that's for sure.
02:25:51.520 | (laughing)
02:25:53.760 | Or anybody labeling you, right?
02:25:56.080 | - Exactly.
02:25:56.920 | Especially as you become successful,
02:26:01.080 | labels get assigned to you
02:26:04.220 | for the thing you're successful at.
02:26:05.920 | - Yeah, I'm a connectionist or a cognitive scientist
02:26:08.520 | and not a neuroscientist, whatever.
02:26:10.000 | - And then, well, those are just
02:26:12.040 | the stories of the past.
02:26:14.780 | You're, today, a new person
02:26:16.800 | who can completely revolutionize totally new areas.
02:26:20.280 | So don't let those labels hold you back.
02:26:24.160 | Well, let me ask the big question.
02:26:26.240 | When you look into the,
02:26:29.960 | you said it started with Columbia
02:26:31.680 | trying to observe these humans
02:26:33.120 | and they're doing weird stuff
02:26:34.920 | and you wanna know why are they doing this stuff.
02:26:37.360 | So let's zoom out even bigger.
02:26:39.680 | At the hundred plus billion people
02:26:42.400 | who've ever lived on Earth,
02:26:45.160 | why do you think we're all doing what we're doing?
02:26:48.880 | What do you think is the meaning of it all?
02:26:50.400 | The big why question.
02:26:51.740 | We seem to be very busy doing a bunch of stuff
02:26:54.480 | and we seem to be kind of directed towards somewhere,
02:26:58.240 | but why?
02:27:00.640 | - Well, I myself think that we make meaning for ourselves
02:27:08.360 | and that we find inspiration in the meaning
02:27:14.440 | that other people have made in the past.
02:27:16.520 | And the great religious thinkers
02:27:21.440 | of the first millennium BC
02:27:27.000 | and few that came in the early part
02:27:32.000 | of the second millennium
02:27:34.840 | laid down some important foundations for us.
02:27:43.720 | But I do believe that we are an emergent result
02:27:48.720 | of a process that happened naturally without guidance
02:27:56.000 | and that meaning is what we make of it
02:28:01.000 | and that the creation of efforts to reify meaning
02:28:09.560 | in religious traditions and so on
02:28:14.320 | is just a part of the expression of that goal
02:28:18.240 | that we have: not to find out what the meaning is,
02:28:23.240 | but to make it ourselves.
02:28:29.640 | So to me, it's something that's very personal,
02:28:38.160 | it's very individual, it's like meaning will come for you
02:28:43.160 | through the particular combination of synergistic elements
02:28:49.280 | that are your fabric and your experience
02:28:52.960 | and your context.
02:29:04.760 | It's all made in a certain kind of local context though.
02:29:08.840 | Here I am at UCSD with this brilliant man, Rumelhart,
02:29:13.980 | who's having these doubts about
02:29:20.320 | symbolic artificial intelligence
02:29:24.920 | that resonate with my desire
02:29:27.680 | to see it grounded in the biology
02:29:29.800 | and let's make the most of that.
02:29:34.800 | - Yeah, and so from that little pocket,
02:29:38.760 | there's some kind of peculiar little emergent process
02:29:43.600 | that then, which is basically each one of us,
02:29:47.200 | each one of us humans is a kind of,
02:29:49.720 | you think of cells and they come together
02:29:52.000 | and it's an emergent process
02:29:54.200 | that then tells fancy stories about itself
02:29:58.600 | and then gets, just like you said,
02:30:01.520 | just enjoys the beauty of the stories
02:30:03.480 | we tell about ourselves.
02:30:04.840 | It's an emergent process that lives for a time,
02:30:09.260 | is defined by its local pocket and context in time and space
02:30:14.260 | and then tells pretty stories
02:30:15.760 | and we write those stories down
02:30:17.040 | and then we celebrate how nice the stories are
02:30:19.480 | and then it continues
02:30:20.880 | 'cause we build stories on top of each other
02:30:23.260 | and eventually we'll colonize, hopefully, other planets,
02:30:28.080 | other solar systems, other galaxies
02:30:31.720 | and we'll tell even better stories.
02:30:34.000 | But it all starts here on Earth.
02:30:37.240 | Jay, you're speaking of peculiar emergent processes
02:30:42.240 | that lived one heck of a story.
02:30:45.240 | You're one of the great scientists
02:30:49.760 | of cognitive science, of psychology, of computation.
02:30:57.000 | It's a huge honor that you would talk to me today,
02:30:59.920 | that you spend your very valuable time.
02:31:02.320 | I really enjoy talking with you
02:31:03.640 | and thank you for all the work you've done.
02:31:05.160 | I can't wait to see what you do next.
02:31:07.160 | - Well, thank you so much
02:31:08.320 | and this has been an amazing opportunity for me
02:31:11.640 | to let ideas that I've never fully expressed before come out
02:31:16.300 | 'cause you ask such a wide range of the deeper questions
02:31:20.760 | that we've all been thinking about for so long.
02:31:23.560 | So thank you very much for that.
02:31:24.960 | - Thank you.
02:31:26.600 | - Thanks for listening to this conversation
02:31:28.280 | with Jay McClelland.
02:31:29.600 | To support this podcast,
02:31:31.000 | please check out our sponsors in the description.
02:31:33.640 | And now let me leave you with some words
02:31:35.360 | from Geoffrey Hinton.
02:31:36.920 | "In the long run, curiosity-driven research works best.
02:31:40.840 | "Real breakthroughs come from people focusing
02:31:43.420 | "on what they're excited about."
02:31:45.040 | Thanks for listening and hope to see you next time.
02:31:48.680 | (upbeat music)
02:31:51.260 | (upbeat music)