Matt Botvinick: Neuroscience, Psychology, and AI at DeepMind | Lex Fridman Podcast #106
Chapters
0:00 Introduction
3:29 How much of the brain do we understand?
14:26 Psychology
22:53 The paradox of the human brain
32:23 Cognition is a function of the environment
39:34 Prefrontal cortex
53:27 Information processing in the brain
60:11 Meta-reinforcement learning
75:18 Dopamine
79:01 Neuroscience and AI research
83:37 Human side of AI
99:56 Dopamine and reinforcement learning
113:07 Can we create an AI that a human can love?
The following is a conversation with Matt Botvinick, 00:00:03.440 |
Director of Neuroscience Research at DeepMind. 00:00:09.360 |
navigating effortlessly between cognitive psychology, 00:00:12.480 |
computational neuroscience, and artificial intelligence. 00:00:44.760 |
If you enjoy this podcast, subscribe on YouTube, 00:00:55.600 |
spelled surprisingly without the E, just F-R-I-D-M-A-N. 00:01:07.620 |
This episode is supported by The Jordan Harbinger Show. 00:01:19.380 |
on Apple Podcast, Spotify, and you know where to look. 00:01:30.280 |
I recently listened to his conversation with Jack Barsky, 00:01:50.800 |
that I need to do a deep dive into the Cold War era 00:01:53.680 |
to get a complete picture of Russia's recent history. 00:02:06.700 |
This episode is also supported by Magic Spoon, 00:02:10.280 |
low-carb, keto-friendly, super amazingly delicious cereal. 00:02:25.280 |
pull-up challenge I'm doing, including the running. 00:02:33.800 |
because most cereals have a crazy amount of sugar, 00:03:04.160 |
Click the magicspoon.com/lex link in the description 00:03:07.840 |
and use code LEX at checkout for free shipping 00:03:13.080 |
They've agreed to sponsor this podcast for a long time. 00:03:16.440 |
They're an amazing sponsor and an even better cereal. 00:03:24.680 |
And now, here's my conversation with Matt Botvinick. 00:03:28.460 |
How much of the human brain do you think we understand? 00:03:36.920 |
in the history of neuroscience in the sense that 00:03:40.040 |
I feel like we understand a lot about the brain 00:03:52.600 |
- When you say high level, what are you thinking? 00:04:00.760 |
You know, what kinds of computation does the brain do? 00:04:03.880 |
What kinds of behaviors would we have to explain 00:04:10.040 |
if we were gonna look down at the mechanistic level? 00:04:22.080 |
But it's almost like we're seeing it through a fog. 00:04:26.600 |
We don't really understand what the neuronal mechanisms are 00:04:34.600 |
what are the functions that the brain is computing 00:04:38.400 |
if we were gonna get down to the neuronal level? 00:04:45.520 |
in the last few years, incredible progress has been made 00:04:49.600 |
in terms of technologies that allow us to see, 00:05:02.640 |
And then there's this yawning gap in between. 00:05:16.040 |
the synaptic connections and all the dopamine, 00:05:21.560 |
- One blanket statement I should probably make is that, 00:05:30.200 |
to make a distinction between psychology and neuroscience. 00:05:49.960 |
Well, it seems to be for taking blood on one side 00:05:54.960 |
that has metabolites in it that shouldn't be there, 00:06:05.120 |
and then excreting that in the form of urine. 00:06:07.020 |
That's what the kidney is for, it's like obvious. 00:06:10.200 |
So the rest of the work is deciding how it does that. 00:06:19.080 |
The brain, as far as I can tell, is for producing behavior. 00:06:22.720 |
It's for going from perceptual inputs to behavioral outputs, 00:06:26.960 |
and the behavioral outputs should be adaptive. 00:06:33.600 |
It's about understanding the structure of that function. 00:06:35.880 |
And then the rest of neuroscience is about figuring out 00:06:38.880 |
how those operations are actually carried out 00:06:49.960 |
the gap between the electrical signal and behavior, 00:06:59.080 |
that touches behavior, how the brain generates behavior, 00:07:03.280 |
or how the brain converts raw visual information 00:07:12.520 |
psychology, and neuroscience as all one science. 00:07:22.920 |
So certainly you will be correct in your feeling 00:07:26.880 |
in some number of years, but that number of years 00:07:33.400 |
- Is that aspirational, or is that pragmatic engineering 00:07:39.360 |
- It's both in the sense that this is what I hope 00:07:44.360 |
and expect will bear fruit over the coming decades, 00:07:51.520 |
but it's also pragmatic in the sense that I'm not sure 00:07:57.560 |
what we're doing in either psychology or neuroscience 00:08:04.920 |
I don't know what it means to understand the brain 00:08:09.760 |
if part of the enterprise is not about understanding 00:08:20.040 |
- I mean, yeah, but I would compare it to maybe 00:08:24.240 |
astronomers looking at the movement of the planets 00:08:32.360 |
And I would argue that at least in the early days, 00:08:35.560 |
there's some value to just tracing the movement 00:08:37.760 |
of the planets and the stars without thinking 00:08:41.680 |
about the physics too much because it's such a big leap 00:08:54.760 |
- Well, right, and I think, so I thought about this 00:08:58.080 |
a lot when I was in grad school 'cause a lot of what 00:09:06.040 |
about what it meant to, it seems like what we were talking 00:09:10.080 |
about a lot of the time were virtual causal mechanisms. 00:09:14.720 |
Like, oh, well, attentional selection then selects 00:09:19.720 |
some object in the environment and that is then passed on 00:09:24.280 |
to the motor, information about that is passed on 00:09:26.840 |
to the motor system, but these are virtual mechanisms. 00:09:29.680 |
These are, they're metaphors, there's no reduction 00:09:34.680 |
going on in that conversation to some physical mechanism 00:09:47.240 |
But the causal mechanisms are definitely neurons 00:09:53.280 |
So in psychology, at least for me personally, 00:09:56.160 |
there was this strange insecurity about trafficking 00:10:00.080 |
in these metaphors, which were supposed to explain 00:10:05.640 |
If you can't ground them in physical mechanisms, 00:10:09.320 |
then what is the explanatory validity of these explanations? 00:10:14.320 |
And I managed to soothe my own nerves by thinking 00:10:32.400 |
on the history of this field, but I know enough to say 00:10:36.240 |
that Mendelian genetics preceded Watson and Crick. 00:10:41.240 |
And so there was a significant period of time 00:10:45.480 |
during which people were productively investigating 00:10:52.760 |
the structure of inheritance using what was essentially 00:11:02.520 |
They're sort of an explanatory thing that we made up 00:11:06.080 |
and we ascribed to them these causal properties. 00:11:12.800 |
And then later, there was a kind of blank there 00:11:17.440 |
that was filled in with a physical mechanism. 00:11:29.360 |
of what kind of causal mechanism we were looking for. 00:11:42.680 |
- No, no, the metaphors we use in cognitive psychology 00:11:47.680 |
are things like attention, the way that memory works. 00:12:08.940 |
But it's still worth having, that metaphorical level. 00:12:24.380 |
to the idea that that arises from interaction of neurons? 00:12:33.820 |
- Is the interaction of neurons also not a metaphor to you? 00:12:38.060 |
Or is it literally, like, that's no longer a metaphor. 00:12:42.340 |
That's already the lowest level of abstractions 00:12:53.780 |
what I wanna say could end up being controversial. 00:12:56.920 |
So what I wanna say is, yes, the interactions of neurons, 00:13:01.900 |
that's not metaphorical, that's a physical fact. 00:13:04.620 |
That's where the causal interactions actually occur. 00:13:14.740 |
you know, I don't wanna go down that rabbit hole. 00:13:17.260 |
- It's always turtles on top of turtles, yeah. 00:13:29.060 |
which has to do with neurotransmitter release. 00:13:48.380 |
I think remaining forever at the level of description 00:14:29.820 |
bridging of the gap between psychology and neuroscience, 00:14:47.100 |
before I discovered AI and even neuroscience, 00:14:52.860 |
And do you think it's possible to understand the mind 00:14:55.860 |
without getting into all the messy details of neuroscience? 00:15:01.380 |
to you it's appealing to try to understand the mechanisms 00:15:05.140 |
at the lowest level, but do you think that's needed, 00:15:07.580 |
that's required, to understand how the mind works? 00:15:10.260 |
- That's an important part of the whole picture, 00:15:23.500 |
renders psychology in its own right unproductive. 00:15:31.180 |
I am fond of saying that I have learned much more 00:15:34.980 |
from psychology than I have from neuroscience. 00:15:38.460 |
To me, psychology is a hugely important discipline. 00:15:54.020 |
that have been native to cognitive psychology 00:16:04.180 |
they're starting to become interesting to AI researchers 00:16:11.660 |
- Can you maybe talk a little bit about what you see as 00:16:21.900 |
I mean, maybe just start it off as a science, as a field. 00:16:25.620 |
- To me, it was when I understood what psychology is, 00:16:32.780 |
it was really disappointing to see two aspects. 00:16:39.180 |
how small the number of subjects is in the studies. 00:16:45.300 |
how controlled the entire thing was, how much it was in the lab. 00:16:52.660 |
There was no mechanism for studying humans in the wild. 00:16:54.980 |
So that's where I became a little bit disillusioned 00:17:05.740 |
data of human behavior on the internet becomes exciting 00:17:08.300 |
because the N grows and the "in the wild" aspect grows. 00:17:13.860 |
Like, do you have an optimistic or pessimistic, 00:17:22.740 |
it was early enough that there was still a thrill 00:17:32.820 |
there were ways of doing experimental science 00:17:35.660 |
that provided insight to the structure of the mind. 00:17:55.540 |
and trying to understand what the specific deficits were 00:18:00.540 |
that arose from a lesion in a particular part of the brain. 00:18:06.620 |
And the kind of experimentation that was done 00:18:08.940 |
and that's still being done to get answers in that context 00:18:21.340 |
An experiment answered one question but raised another 00:18:26.580 |
And you really felt like you were narrowing in on 00:18:42.220 |
I mean, the very detailed neuropsychological studies 00:19:04.340 |
of that kind of research that really made you feel 00:19:12.460 |
language processing is organized in the brain. 00:19:25.320 |
that the cost of doing highly controlled experiments 00:19:30.320 |
is that you, by construction, miss out on the richness 00:19:42.280 |
by what in those days was called connectionism, 00:19:56.400 |
They weren't yet really useful for industrial applications. 00:20:04.080 |
- Oh, neural networks were very concretely the thing 00:20:09.120 |
I was handed, are you familiar with the PDP books 00:20:39.120 |
- I actually was quite interested in surgery. 00:20:47.280 |
on the planet who was torn between those two fields. 00:20:52.720 |
And I said exactly that to my advisor in medical school, 00:21:01.920 |
And he said to me, "No, no, it's actually not so uncommon 00:21:05.120 |
"to be interested in surgery and psychiatry." 00:21:12.600 |
is that both fields are about going beneath the surface 00:21:20.640 |
as someone who was interested in psychoanalysis 00:21:38.120 |
that's inside everybody's abdomen and thorax. 00:21:40.560 |
- That's a very poetic way to connect the two disciplines 00:22:09.320 |
And I said, "Well, I've always been interested 00:22:13.020 |
"I'm pretty sure that nobody's doing scientific research 00:22:25.000 |
And he said, "Well, you know, I'm not sure that's true. 00:22:29.600 |
And he pulled down the PDP books from his shelf 00:22:29.600 |
He hadn't read them, but he handed them to me. 00:22:38.680 |
And that was, you know, I went back to my dorm room 00:22:41.480 |
and I just, you know, read them cover to cover. 00:22:46.520 |
which was one of the original names for deep learning. 00:22:50.840 |
- And so, I apologize for the romanticized question, 00:22:59.880 |
is to you the most beautiful, mysterious, surprising? 00:23:17.320 |
that the brain is so mysterious and seems so distant. 00:23:39.040 |
The brain is literally what makes everything obvious 00:23:48.120 |
- I used to teach, when I taught at Princeton, 00:23:50.520 |
I used to teach a cognitive neuroscience course. 00:23:53.000 |
And the very last thing I would say to the students was, 00:24:01.960 |
as scientific inspiration, the metaphor is often, 00:24:08.040 |
The stars will inspire you to wonder at the universe 00:24:11.600 |
and think about your place in it and how things work. 00:24:28.480 |
but from the extremely intimately close brain. 00:24:35.200 |
- There's something just endlessly fascinating 00:24:40.240 |
- Like you said, the one that's close and yet distant, 00:24:58.560 |
- I guess what I mean is the subjective nature 00:25:03.800 |
of the experience, if we can take a small tangent 00:25:15.520 |
about specifically the mechanism of cognition? 00:25:23.240 |
it's almost like paralyzing the beauty and the mystery 00:25:26.640 |
of the fact that it creates the entirety of the experience, 00:25:29.520 |
not just the reasoning capability, but the experience? 00:25:32.920 |
- Well, I definitely resonate with that latter thought. 00:25:37.920 |
And I often find discussions of artificial intelligence 00:25:49.120 |
Speaking as someone who has always had an interest in art, 00:25:57.360 |
'cause it sounds like somebody who has an interest in art. 00:26:11.960 |
"Oh, well, don't worry, we're talking about cognition, 00:26:17.080 |
- There's an incredible scope to what humans go through 00:26:25.360 |
And yes, so that's part of what fascinates me, 00:26:37.320 |
but at the same time, it's so mysterious to us, how? 00:26:44.720 |
- Like, we literally, our brains are literally 00:26:57.680 |
the actual explanation for that is so overwhelming. 00:27:05.600 |
Certain people have fixations on particular questions, 00:27:08.480 |
and that's always, that's just always been mine. 00:27:11.720 |
- Yeah, I would say the poetry of that is fascinating. 00:27:14.040 |
And I'm really interested in natural language as well. 00:27:16.760 |
And when you look at artificial intelligence community, 00:27:23.880 |
when you try to create a benchmark for the community 00:27:26.840 |
to gather around, how much of the magic of language 00:27:33.240 |
That there's something, we talk about experience, 00:27:43.800 |
the spirit of the Turing test is lost in these benchmarks. 00:27:51.920 |
The moment you try to do real good rigorous science, 00:28:01.600 |
it feels like you're losing some of the magic. 00:28:10.080 |
- Well, I agree with you, but at the same time, 00:28:18.100 |
about that first wave of deep learning models in cognition 00:28:23.020 |
was the fact that the people who were building these models 00:28:49.260 |
One said, well, the mind encodes certain rules, 00:29:10.420 |
who said, well, if you look carefully at the data, 00:29:13.860 |
if you actually look at corpora, like language corpora, 00:29:25.140 |
and you just tack on E-D, and then there are exceptions, 00:29:36.140 |
There are certain clues to which verbs should be exceptional, 00:29:41.140 |
and then there are exceptions to the exceptions, 00:29:44.100 |
and there was a word that was kind of deployed 00:29:47.720 |
in order to capture this, which was quasi-regular. 00:29:51.720 |
In other words, there are rules, but it's messy, 00:29:54.700 |
and there's structure even among the exceptions, 00:29:58.740 |
and it would be, yeah, you could try to write down 00:30:15.220 |
and see how it ends up representing all of this richness. 00:30:28.060 |
and that's something that I always found very compelling. 00:30:36.180 |
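To make "quasi-regular" concrete, here is a minimal toy sketch (my own construction, not Rumelhart and McClelland's actual past-tense model; the verb list, the bag-of-letters features, and the network size are all illustrative assumptions). A tiny network is trained to map a verb's spelling to an inflection pattern; because the "-ing/-ink" spelling acts as a clue to the ablaut exceptions, the network should generalize that sub-regularity to a novel verb like "sting":

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "quasi-regular" lexicon: pattern 0 = add -ed, 1 = i->a ablaut
# (sing->sang), 2 = no change (hit->hit).
verbs = [("walk", 0), ("talk", 0), ("jump", 0), ("play", 0), ("call", 0),
         ("need", 0), ("sing", 1), ("ring", 1), ("spring", 1), ("drink", 1),
         ("sink", 1), ("swim", 1), ("hit", 2), ("cut", 2), ("put", 2),
         ("set", 2), ("let", 2), ("shut", 2)]

chars = sorted({c for v, _ in verbs for c in v})
idx = {c: i for i, c in enumerate(chars)}

def feats(v):
    # Bag-of-letters features: crude, but enough to expose spelling "clues".
    x = torch.zeros(len(chars))
    for c in v:
        x[idx[c]] += 1
    return x

X = torch.stack([feats(v) for v, _ in verbs])
y = torch.tensor([p for _, p in verbs])

model = nn.Sequential(nn.Linear(len(chars), 16), nn.ReLU(), nn.Linear(16, 3))
opt = torch.optim.Adam(model.parameters(), lr=0.05)
for _ in range(300):
    loss = nn.functional.cross_entropy(model(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# A novel verb sharing the "-ing" clue: the network should generalize the
# sub-regularity among the exceptions and predict the ablaut pattern (1).
print(model(feats("sting").unsqueeze(0)).argmax().item())
```

The point is the one made above: the default rule, the exceptions, and the structure among the exceptions all fall out of a single learning process over the data, with nothing written down by hand.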
and profound to you in terms of our current deep learning 00:30:39.780 |
neural network, artificial neural network approaches, 00:30:46.300 |
about the biological neural networks in our brain? 00:31:04.500 |
in trying to create a human-level intelligence? 00:31:34.900 |
like which aspect of the neural networks in our brain 00:31:39.140 |
Is that closer to the cognitive science level of... 00:31:49.620 |
of neuroscience, cognitive science, and psychology, 00:31:56.380 |
by saying you're kind of seeing them as separate, 00:32:02.020 |
where is there something about the lowest layer 00:32:09.140 |
that is profound to you in terms of its difference 00:32:24.420 |
if you take an introductory computer science course 00:32:31.420 |
one way of articulating what the significance 00:32:36.420 |
of a Turing machine is, is that it's a machine emulator. 00:33:04.940 |
We're capacity limited, we're not Turing machines, 00:33:06.980 |
obviously, but we have the ability to adapt behaviors 00:33:10.980 |
that are very much unlike anything we've done before, 00:33:22.340 |
- But just on that point, you mentioned Turing machine, 00:33:24.580 |
but nevertheless, it's fundamentally our brains 00:33:32.060 |
It was a little bit unclear to this line you drew. 00:33:40.660 |
- I'm happy to think of it as just basic computation, 00:33:49.820 |
that are leading to the full richness of human cognition. 00:33:58.860 |
that allow people to do arithmetic or play chess. 00:34:14.940 |
to your search of understanding the human mind 00:34:31.100 |
especially looking at the reinforcement learning work 00:34:48.820 |
that's come out of my group or DeepMind lately, 00:35:01.540 |
The human behavior arises within communities. 00:35:16.060 |
That was like, if you look at, like, 2001: A Space Odyssey 00:35:16.060 |
of large numbers of humans to hold an idea, 00:35:26.660 |
like you said, shaking hands versus bumping elbows, 00:35:40.860 |
just kind of this like distributed convergence 00:35:43.380 |
towards an idea over a particular period of time 00:35:58.700 |
a clear objective function under which we operate, 00:36:01.340 |
but we all kind of converge towards one somehow. 00:36:16.540 |
The first step is try to understand the mind. 00:36:19.700 |
I mean, I think there's something to the argument 00:36:25.780 |
like strictly bottom-up approach is wrong-headed. 00:36:29.940 |
In other words, there are basic phenomena that, 00:36:36.860 |
that can only be understood in the context of groups. 00:36:44.700 |
I've never been particularly convinced by the notion 00:36:48.700 |
that we should consider intelligence to adhere 00:36:58.740 |
that the basic unit that we want to understand 00:37:11.300 |
I'm stubbornly, I stubbornly define it as something 00:37:18.820 |
That's just my, I don't know if that's my take. 00:37:20.180 |
- I'm with you, but that could be the reductionist dream 00:37:22.860 |
of a scientist because you can understand a single human. 00:37:26.420 |
It also is very possible that intelligence can only arise 00:37:32.860 |
When there's multiple sort of, it's a sad thing, 00:37:37.500 |
if that's true, because it's very difficult to study. 00:37:58.580 |
I think a serious effort to understand human intelligence 00:38:17.660 |
the cognizing system, whether it's a brain or an AI system. 00:38:22.200 |
That's one thing I took away actually from my early studies 00:38:26.780 |
with the pioneers of neural network research, 00:38:38.580 |
it's only partly a function of the, you know, 00:38:44.500 |
and the learning algorithms that it implements. 00:38:48.260 |
what really shapes it is the interaction of those things 00:38:58.300 |
that's made most clear in reinforcement learning 00:39:03.700 |
you can only learn as much as you can simulate, 00:39:05.820 |
and that's what made, what DeepMind made very clear 00:39:13.580 |
of the other agent of the competitive behavior, 00:39:16.840 |
which the other agent becomes the environment essentially. 00:39:19.980 |
And that's, I mean, one of the most exciting ideas in AI 00:39:28.760 |
There's a thing where competition is essential for learning, 00:39:35.040 |
So if we can step back into another sort of beautiful world, 00:39:44.680 |
is there something for people who might not know, 00:39:55.520 |
or just in general, what are the different parts 00:39:59.900 |
that you've studied, and that are just good to know about 00:40:49.360 |
- And when you say anatomical, sorry to interrupt, 00:40:51.960 |
so that's referring to sort of the geographic region, 00:40:56.960 |
as opposed to some kind of functional definition. 00:41:00.160 |
- Exactly, so this is kind of the coward's way out. 00:41:04.420 |
I'm telling you what the prefrontal cortex is 00:41:06.020 |
just in terms of what part of the real estate it occupies. 00:41:11.720 |
And in fact, the early history of neuroscientific research 00:41:26.400 |
it was really World War I that started people 00:41:35.520 |
what different parts of the brain, the human brain do 00:41:49.920 |
to try to identify the functions of different brain regions. 00:41:56.160 |
But one of the frustrations that neuropsychologists faced 00:42:05.000 |
to these most kind of frontal parts of the brain. 00:42:08.420 |
It was just a very difficult thing to pin down. 00:42:20.560 |
of clinical experience and close observation, 00:42:22.960 |
they started to put their finger on a syndrome 00:42:27.640 |
Actually, one of them was a Russian neuropsychologist 00:42:30.440 |
named Luria, who students of cognitive psychology still read. 00:42:40.200 |
was that the frontal cortex was somehow involved 00:42:57.560 |
or to change what they were doing in a very flexible way 00:43:10.900 |
- Yeah, what later helped bring this function 00:43:23.660 |
as habitual behavior versus goal-directed behavior. 00:43:28.100 |
So it's very, very clear that the human brain 00:43:42.420 |
so that they don't require you to concentrate too much. 00:43:49.780 |
Just think about the difference between driving 00:44:07.820 |
so that they can be habits, so that they can be automatic. 00:44:12.340 |
- That's kind of like the purest form of learning, 00:44:24.120 |
how artificial intelligence systems can learn. 00:44:27.380 |
Is that the way you think? - It's interesting. 00:44:34.620 |
in thinking about where we are in AI research. 00:44:38.680 |
But just to finish the kind of dissertation here, 00:44:54.620 |
sort of in contradistinction to that habitual domain. 00:45:07.360 |
"Whoa, whoa, what I usually do in this situation is X, 00:45:10.720 |
"but given the context, I probably should do Y." 00:45:14.160 |
I mean, the elbow bump is a great example, right? 00:45:22.520 |
and it's the prefrontal cortex that allows us 00:45:26.000 |
to bear in mind that there's something unusual 00:45:38.560 |
and he built tests for detecting these kinds of things, 00:45:51.080 |
would have a great deal of trouble with that. 00:45:53.520 |
Somebody proffering their hand would elicit a handshake. 00:45:57.760 |
The prefrontal cortex is what allows us to say, 00:46:09.180 |
"and to reason about what behavior is appropriate there." 00:46:25.420 |
If no, then how do they integrate new experiences? 00:46:35.920 |
because we have revolutionary new technologies 00:46:48.280 |
and also causally influencing neural behavior 00:47:11.320 |
sort of, for me at least, a very urgent question 00:47:14.040 |
whether the kinds of things that we wanna understand 00:47:27.880 |
You know, people who study fruit flies will often tell you, 00:47:32.880 |
"Hey, fruit flies are smarter than you think." 00:47:36.880 |
where fruit flies were able to learn new behaviors, 00:47:40.320 |
were able to generalize from one stimulus to another 00:47:44.180 |
in a way that suggests that they have abstractions 00:47:58.160 |
recounting some observation about mouse behavior, 00:48:03.160 |
where it seemed like mice were taking an awfully long time 00:48:09.040 |
to learn a task that for a human would be profoundly trivial. 00:48:16.440 |
that mice really don't have the cognitive flexibility 00:48:26.200 |
"because you asked a mouse to deal with stimuli 00:48:31.120 |
"and behaviors that were very unnatural for the mouse. 00:48:34.120 |
"If instead you kept the logic of the experiment the same, 00:48:44.280 |
"that aligns with what mice are used to dealing with 00:48:48.320 |
"you might find that a mouse actually has more intelligence 00:48:54.960 |
of mice doing things in their natural habitat, 00:49:00.040 |
dealing with, you know, physical problems, you know, 00:49:02.960 |
I have to drag this piece of food back to my, you know, 00:49:06.200 |
back to my lair, but there's something in my way 00:49:10.440 |
So I think these are open questions to put it, 00:49:15.440 |
- And then taking a small step back related to that 00:49:18.560 |
is you kind of mentioned we're taking a little shortcut 00:49:25.240 |
the prefrontal cortex is a region of the brain, 00:49:28.320 |
but if we, what's your sense in a bigger philosophical view, 00:49:36.300 |
Do you have a sense that it's a set of subsystems 00:49:46.360 |
or to what degree is it a giant interconnected mess 00:50:03.540 |
that all parts of the brain are doing the same thing. 00:50:07.180 |
This follows immediately from the kinds of studies 00:50:11.160 |
of brain damage that we were chatting about before. 00:50:24.600 |
Having said that, there are two other things to add, 00:50:45.840 |
at least this is my observation of the literature, 00:50:48.180 |
is that the differences between regions are graded 00:51:06.780 |
and that have clear channels of communication between them. 00:51:22.060 |
the functions of which are not clearly defined 00:51:27.380 |
and the borders of which seem to be quite vague. 00:51:30.800 |
And then there's another thing that's popping up 00:51:36.020 |
which involves application of these new features. 00:51:43.660 |
which there are a number of studies that suggest 00:51:59.100 |
that we wouldn't have thought would be there. 00:52:01.340 |
For example, looking in the primary visual cortex, 00:52:12.980 |
where are the edges in this scene that I'm viewing? 00:52:19.460 |
you can recover information from primary visual cortex 00:52:23.220 |
like what behavior the animal is engaged in right now 00:52:35.180 |
whose function is pretty well defined at a coarse grain 00:52:42.860 |
about information from very different domains. 00:52:47.060 |
So the history of neuroscience is sort of this oscillation 00:52:54.020 |
the kind of modular view and then the big mush view. 00:53:05.580 |
because there's something about our conceptual system 00:53:10.020 |
that finds it easy to think about a modularized system 00:53:12.740 |
and easy to think about a completely undifferentiated system, 00:53:15.480 |
but something that kind of lies in between is confusing, 00:53:19.980 |
but we're gonna have to get used to it, I think. 00:53:23.380 |
the lower level mechanism of neuronal communication. 00:53:26.740 |
- So on that topic, you kind of mentioned information. 00:53:31.880 |
that there's still mystery and disagreement on 00:53:34.620 |
is how does the brain carry information and signal? 00:53:38.060 |
Like what in your sense is the basic mechanism 00:53:53.300 |
in deep learning research to be a reasonable approximation 00:53:57.040 |
to the mechanisms that carry information in the brain. 00:54:02.040 |
So the usual way of articulating that is to say, 00:54:08.560 |
What matters is how quickly is an individual neuron spiking? 00:54:22.760 |
And that number is enough to capture what neurons are doing. 00:54:31.120 |
that's an adequate description of how information 00:54:39.900 |
There are studies that suggest that the precise timing 00:54:46.080 |
There are studies that suggest that there are computations 00:54:50.680 |
that go on within the dendritic tree, within a neuron 00:54:57.120 |
and that really don't equate to anything that we're doing 00:55:02.840 |
Having said that, I feel like we're getting somewhere 00:55:07.840 |
by sticking to this high level of abstraction. 00:55:16.200 |
I remember reading some vague paper somewhere recently 00:55:20.040 |
where the mechanical signal, like the vibrations 00:55:23.400 |
or something of the neurons also communicates information. 00:55:33.040 |
the electrical signal, this is in a Nature paper, 00:55:33.040 |
something like that, where the electrical signal 00:55:38.800 |
is actually a side effect of the mechanical signal. 00:55:49.040 |
that there could be a deeper, it's always like in physics 00:55:52.400 |
with quantum mechanics, there's always a deeper story 00:55:57.480 |
But you think it's basically the rate of spiking 00:56:00.560 |
that gets us, that's like the lowest hanging fruit 00:56:06.600 |
I mean, this is not, the only way in which this stance 00:56:14.000 |
there are members of the neuroscience community 00:56:32.800 |
the neuron has receptors for those transmitters. 00:56:35.880 |
The meeting of the transmitter with these receptors 00:56:48.640 |
And it's that spike that is conducted down the axon 00:56:56.800 |
This is like the way the brain is supposed to work. 00:56:59.280 |
Now, what we do when we build artificial neural networks 00:57:03.640 |
of the kind that are now popular in the AI community 00:57:06.760 |
is that we don't worry about those individual spikes. 00:57:22.280 |
And so the activity of units in a deep learning system 00:57:27.120 |
is broadly analogous to the spike rate of a neuron. 00:57:37.000 |
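As a toy illustration of the rate abstraction just described (my own sketch; all the numbers here are made up), a deep-learning "unit" collapses a neuron's spiking into a single activation that plays the role of a mean firing rate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Presynaptic activity (think: firing rates) and synaptic weights.
x = rng.uniform(0.0, 50.0, size=8)       # input rates, spikes/s
w = rng.uniform(0.0, 0.2, size=8)        # synaptic weights
b = 2.0

# A deep-learning "unit": a single number summarizing the neuron's output.
rate = max(0.0, float(w @ x) + b)        # ReLU activation ~ mean firing rate

# The spiking detail this abstracts away: a Poisson-like spike train whose
# average rate matches the unit's activation.
dt, duration = 0.001, 1.0                # 1 ms bins, 1 s of simulated time
spikes = rng.random(int(duration / dt)) < rate * dt
print(f"activation (rate): {rate:.1f} spikes/s, realized count: {spikes.sum()}")
```

The design choice being described is exactly this: keep the time-averaged quantity, and treat the individual spikes (and anything below them) as detail the model does not represent.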
that there are other forms of communication in the brain. 00:57:39.160 |
In fact, I've been involved in some research recently 00:57:49.680 |
that are sort of below the level of spike production 00:58:00.640 |
that I think that the things that we're building 00:58:08.760 |
- Let me ask just for fun a crazy question, 'cause I can. 00:58:14.680 |
Do you think it's possible we're completely wrong 00:58:24.120 |
in some very different kind of way in the brain? 00:58:29.960 |
if I didn't think there was any chance we were wrong. 00:58:41.200 |
of course, the vast majority of deep learning research 00:58:49.120 |
there's sort of an unbroken chain of research 00:58:55.000 |
which is, hey, let's train a deep learning system. 00:59:31.120 |
the learning algorithms that we have access to, 00:59:44.080 |
patterns of neuronal behavior in these artificial models 00:59:48.760 |
that look hauntingly similar to what you see in the brain. 00:59:59.720 |
such coincidence is unlikely to not be deeply meaningful. 01:00:11.640 |
So you have co-authored several recent papers 01:00:15.160 |
that sort of weave beautifully between the world 01:00:24.800 |
can we just try to dance around and talk about some of them, 01:00:38.960 |
"Prefrontal Cortex as a Meta-Reinforcement Learning System." 01:00:46.720 |
- Yeah, I mean, the key idea is about meta-learning. 01:00:58.680 |
a situation in which you have a learning algorithm, 01:01:06.160 |
and the learning algorithm operates in such a way 01:01:09.800 |
that it gives rise to another learning algorithm. 01:01:17.180 |
you had one learning algorithm sort of adjusting 01:01:20.360 |
the parameters on another learning algorithm. 01:01:23.080 |
But the case that we're interested in this paper 01:01:25.120 |
is one where you start with just one learning algorithm, 01:01:29.200 |
and then another learning algorithm kind of emerges 01:01:39.640 |
obscurantist, but that's the idea of meta-learning. 01:01:54.320 |
that make you better at learning something new. 01:01:57.320 |
Like a familiar example would be learning a foreign language. 01:02:02.880 |
it may be quite laborious and disorienting and novel. 01:02:07.880 |
But if, let's say you've learned two foreign languages, 01:02:20.200 |
You know, okay, I'm gonna have to learn how to conjugate. 01:02:23.920 |
That's a simple form of meta-learning, right? 01:02:26.360 |
In the sense that there's some slow learning mechanism 01:02:39.160 |
from the psychology world, from neuroscience, 01:02:50.000 |
that we can bring into the artificial intelligence world? 01:02:55.960 |
was in AI work that we were doing in my group. 01:03:06.280 |
using standard reinforcement learning algorithms. 01:03:10.200 |
But you train that network, not just in one task, 01:03:12.680 |
but you train it in a bunch of interrelated tasks. 01:03:29.360 |
a form of meta-learning spontaneously happens 01:03:37.680 |
a recurrent neural network has a kind of memory 01:03:47.480 |
that you have units that connect to other units 01:04:00.080 |
It's like actively holding something in mind. 01:04:03.000 |
And so that memory gives the recurrent neural network 01:04:13.080 |
The way that the activity pattern evolves over time 01:04:25.520 |
are shaped by the connectivity, by the synaptic weights. 01:04:45.840 |
the activation dynamics will become very interesting, right? 01:04:53.160 |
where you have to press one button or another, 01:05:04.160 |
and there's some probability I'll give you an M&M 01:05:07.560 |
And you have to figure out what those probabilities are 01:05:13.760 |
instead of just giving you one of these tasks, 01:05:22.120 |
two new buttons, you have to figure out which one's best. 01:05:37.320 |
when we first started kind of realizing what was going on. 01:05:45.480 |
those slow synaptic changes give rise to a network dynamics 01:05:52.960 |
the dynamics themselves turn into a learning algorithm. 01:05:56.840 |
So in other words, you can tell this is happening 01:05:59.040 |
by just freezing the synaptic weights, saying, 01:06:03.360 |
Here's a new box, figure out which button is best." 01:06:07.600 |
And the recurrent neural network will do this just fine. 01:06:09.600 |
There's no, like it figures out which button is best. 01:06:13.040 |
It kind of transitions from exploring the two buttons 01:06:21.680 |
It's happening because the activity dynamics of the network 01:06:25.840 |
have been shaped by the slow learning process 01:06:30.720 |
And so what's happened is that this slow learning algorithm 01:06:39.760 |
the activity dynamics into its own learning algorithm. 01:06:43.480 |
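To make this concrete, here is a minimal sketch in the spirit of the setup described (an illustrative reconstruction, not DeepMind's actual code; the network size, bandit statistics, and training details are all assumptions). An LSTM policy is trained with REINFORCE across many two-armed "button" bandits; after training, the weights are frozen, and the network still figures out which button is best within an episode, purely through its activity dynamics:

```python
import torch
import torch.nn as nn

class MetaRLAgent(nn.Module):
    def __init__(self, hidden=48):
        super().__init__()
        # Input: one-hot of previous action (2 dims) + previous reward (1 dim).
        self.core = nn.LSTMCell(3, hidden)
        self.policy = nn.Linear(hidden, 2)    # logits over the two buttons

    def forward(self, x, state):
        state = self.core(x, state)
        return self.policy(state[0]), state

def run_episode(agent, steps=50):
    """One bandit: reward probabilities are drawn fresh each episode, so the
    agent must (re)discover the better button from scratch every time."""
    probs = torch.tensor([0.8, 0.2])
    if torch.rand(1).item() < 0.5:
        probs = probs.flip(0)
    x, state = torch.zeros(1, 3), None
    logps, rewards = [], []
    for _ in range(steps):
        logits, state = agent(x, state)
        dist = torch.distributions.Categorical(logits=logits)
        a = dist.sample()
        r = torch.bernoulli(probs[a])
        logps.append(dist.log_prob(a))
        rewards.append(r)
        # Feeding back the last action and reward is what lets the recurrent
        # dynamics implement a fast, within-episode learning algorithm.
        x = torch.cat([nn.functional.one_hot(a, 2).float(), r.view(1, 1)], 1)
    return torch.cat(logps), torch.cat(rewards)

agent = MetaRLAgent()
opt = torch.optim.Adam(agent.parameters(), lr=1e-3)
baseline = 0.0
for episode in range(3000):              # the slow, weight-based learning loop
    logps, rewards = run_episode(agent)
    ret = rewards.sum()
    loss = -logps.sum() * (ret - baseline)    # REINFORCE with a crude baseline
    baseline = 0.95 * baseline + 0.05 * ret.item()
    opt.zero_grad()
    loss.backward()
    opt.step()

# "Freeze the synaptic weights": no more gradient steps, brand-new bandits.
with torch.no_grad():
    _, rewards = run_episode(agent)      # still adapts, via hidden state alone
    print("reward rate with frozen weights:", rewards.mean().item())
```

The key move, as described above, is that the slow weight-based learner never sees any explicit instruction to meta-learn; exposure to a family of interrelated tasks sculpts the recurrent dynamics into a fast learner on their own.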
And as we were kind of realizing that this is a thing, 01:06:49.160 |
it just so happened that the group that was working on this 01:06:56.040 |
And it started kind of ringing a bell for us, 01:07:07.560 |
synaptic memory and activity-based memory in the brain." 01:07:10.560 |
And it also reminded us of recurrent connectivity 01:07:15.920 |
that's very characteristic of prefrontal function. 01:07:18.400 |
So this is kind of why it's good to have people working 01:07:22.840 |
on AI that know a little bit about neuroscience 01:07:26.240 |
and vice versa, because we started thinking about 01:07:29.560 |
whether we could apply this principle to neuroscience. 01:07:42.920 |
to force something like an idea of a learning to learn, 01:07:50.880 |
as long as you keep varying the environment sufficiently. 01:08:00.720 |
"Okay, well, we know that the prefrontal cortex 01:08:05.000 |
We know that it's an important locus for working memory, 01:08:15.640 |
In other words, what is reinforcement learning? 01:08:19.320 |
You take an action, you see how much reward you got, 01:08:23.620 |
Maybe the prefrontal cortex is doing that sort of thing 01:08:28.520 |
It's keeping around a memory in its activity patterns 01:08:42.020 |
"Well, how did the prefrontal cortex get so smart?" 01:08:44.580 |
In other words, where did these activity dynamics come from? 01:08:50.780 |
in the recurrent dynamics of the prefrontal cortex arise? 01:08:54.460 |
And one answer that became evident in this work was, 01:09:12.460 |
but because this kind of several temporal classes 01:09:24.340 |
can you keep building stacks of learning-to-learn-to-learn, 01:09:29.340 |
learning-to-learn-to-learn-to-learn-to-learn, 01:09:31.660 |
because it keeps, I mean, basically abstractions 01:09:41.980 |
- Or is this overstretching this kind of mechanism? 01:10:19.300 |
that that kind of level of abstraction would be powerful, 01:10:23.940 |
This kind of, is it useful to think of learning 01:10:35.300 |
about this mechanism that we were starting to look at, 01:10:39.020 |
and other groups started talking about very similar things 01:10:43.220 |
at the same time, and then a kind of explosion of interest 01:10:47.020 |
in meta-learning happened in the AI community 01:10:50.580 |
I don't know if we had anything to do with that, 01:10:52.060 |
but I was gratified to see that a lot of people 01:10:57.780 |
One of the things that I like about the kind of flavor 01:11:33.160 |
well, gee, if we wanted meta-learning to happen, 01:11:37.080 |
but there's something about the kind of meta-learning 01:11:39.520 |
that we were studying that seemed to me special 01:11:45.000 |
It was just something that automatically happened 01:11:51.080 |
and it was trained with a reinforcement learning algorithm. 01:11:54.040 |
And in that sense, it can be as meta as it wants to be. 01:11:59.600 |
There's no limit on how abstract the meta-learning can get 01:12:04.600 |
because it's not reliant on a human engineering 01:12:08.040 |
a particular meta-learning algorithm to get there. 01:12:15.200 |
I guess I hope that that's relevant in the brain. 01:12:53.440 |
It actually doesn't have to be a recurrent neural network. 01:12:55.680 |
A paper that I was honored to be involved with even earlier 01:13:10.160 |
Meta-Learning with Memory-Augmented Neural Networks. 01:13:38.080 |
But this brings us back to something I was saying earlier 01:13:44.500 |
This will happen if the system is being trained 01:13:49.920 |
in a setting where there's a sequence of tasks 01:13:59.020 |
And that's something that's very obviously true 01:14:06.380 |
If you just kind of think about what you do every day, 01:14:17.640 |
But everything that you do has a family resemblance. 01:14:21.040 |
It shares a structure with something that you did before. 01:14:32.680 |
It's endless variety with endless redundancy. 01:14:40.560 |
- And it does seem like we're just so good at finding, 01:14:47.320 |
you described, we're really good at finding that redundancy, 01:14:50.040 |
finding those similarities, the family resemblance. 01:14:56.560 |
Melanie Mitchell was talking about analogies. 01:15:06.120 |
There's so many echoes here of psychology and neuroscience. 01:15:10.640 |
And obviously now with reinforcement learning 01:15:18.280 |
If we could talk a little bit about dopamine, 01:15:20.160 |
you have really, you're a part of co-authoring 01:15:31.040 |
Can you describe the key ideas of that paper? 01:15:37.760 |
is acknowledge my co-authors on actually both 01:15:43.080 |
- I'll just, I'll certainly post all their names. 01:15:46.560 |
'Cause I'm sort of abashed to be the spokesperson 01:15:50.360 |
for these papers when I had such amazing collaborators 01:15:58.600 |
- Yeah, there's an incredible team there, but yeah. 01:16:06.400 |
we also collaborated with Nao Uchida at Harvard, 01:16:09.040 |
who obviously the paper simply wouldn't have happened 01:16:12.680 |
But so you were asking for like a thumbnail sketch of-- 01:16:17.600 |
- Yes, a thumbnail sketch or key ideas or things, 01:16:21.320 |
the insights that continue on our kind of discussion here 01:16:41.460 |
which I think on the surface sounds like something 01:16:48.360 |
We see it also as a way of validating what we're doing 01:16:57.960 |
is using some technique that we've been trying out 01:17:11.540 |
that it'll interface well with other mechanisms. 01:17:17.660 |
- Just because a particular paper is a little bit focused 01:17:19.620 |
on from AI from neural networks to neuroscience, 01:17:33.180 |
- Yeah, I mean, we've talked about the notion 01:17:36.220 |
of a virtuous circle between AI and neuroscience. 01:17:39.260 |
And the way I see it, that's always been there 01:17:53.500 |
There are some phases when neuroscience was sort of ahead. 01:18:20.120 |
has been focusing on approaches to studying behavior 01:18:26.540 |
from this earlier era of cognitive psychology. 01:18:36.660 |
like how do we deal with large, complex environments? 01:18:54.600 |
with insights that may change the direction of our work. 01:19:06.340 |
but they're able to both think about neuroscience and AI. 01:19:10.300 |
You know, I don't often meet people like that. 01:19:19.780 |
Do you think a human being can be both good at AI 01:19:26.500 |
what kind of human can occupy these two realms? 01:19:34.740 |
or is that a very special few can kind of jump? 01:19:51.320 |
- I think it does take a special kind of person 01:19:58.660 |
to be truly world-class at both AI and neuroscience. 01:20:08.100 |
who's interest in neuroscience and psychology 01:20:11.380 |
involved using the kinds of modeling techniques 01:20:31.140 |
who I would consider pretty expert on both fronts, 01:21:14.320 |
whether it's programming or the kind of tools necessary 01:21:18.460 |
to distribute, to compute, all that kind of stuff. 01:21:24.760 |
- Exactly, especially with the recent explosion 01:21:32.380 |
I think the best scenario for both neuroscience and AI 01:21:52.860 |
to exclusively focus on the engineering side of AI. 01:21:56.480 |
But to have those people inhabiting a community 01:22:05.680 |
And I may be someone who's very close to the center 01:22:09.540 |
in the sense that I have one foot in the neuroscience world 01:22:22.460 |
from having true technical expertise in either domain. 01:22:33.300 |
who can kind of see the connections between these two. 01:22:37.220 |
- Yeah, the emergent intelligence of the community 01:22:45.180 |
So hopefully, I mean, I've seen that work out well 01:22:50.860 |
There are people who, I mean, even if you just focus 01:22:56.460 |
it's been a good thing to have some people around 01:23:00.140 |
doing that kind of work whose PhDs are in neuroscience 01:23:05.340 |
Every academic discipline has its kind of blind spots 01:23:18.340 |
And having some intellectual diversity is really healthy. 01:23:33.100 |
who bring some neuroscience background to the table 01:23:37.500 |
- So one of my, probably the deepest passion for me, 01:23:41.460 |
what I would say, maybe we kind of spoke off mic 01:23:48.420 |
is a blind spot for at least robotics and AI folks 01:23:51.420 |
is human-robot interaction, human-agent interaction. 01:23:55.620 |
Maybe, do you have thoughts about how we reduce 01:24:03.060 |
Do you also share the feeling that not enough folks 01:24:10.300 |
- Well, I'm actually pretty intensively interested 01:24:15.920 |
And there are people in my group who've actually pivoted 01:24:20.980 |
from doing more traditional cognitive psychology 01:24:24.220 |
and cognitive neuroscience to doing experimental work 01:24:32.540 |
that I'm pretty passionately interested in this. 01:24:35.540 |
One is, it's kind of the outcome of having thought 01:24:53.480 |
So what does it mean to make the world a better place? 01:25:00.520 |
And so how do you make life better for humans? 01:25:05.800 |
That's a proposition that when you look at it carefully 01:25:10.560 |
and honestly is rather horrendously complicated, 01:25:15.560 |
especially when the AI systems that you're building 01:25:25.240 |
They're not, you're not programming something 01:25:34.860 |
We're building systems that learn from experience. 01:25:39.720 |
So that typically leads to AI safety questions. 01:25:43.480 |
How do we keep these things from getting out of control? 01:25:45.440 |
How do we keep them from doing things that harm humans? 01:25:54.480 |
And there are large sectors of the research community 01:26:04.960 |
But there's, I guess I would say a positive side to this too 01:26:11.160 |
what would it mean to make human life better? 01:26:15.880 |
And how can we imagine learning systems doing that? 01:26:26.160 |
it's not sufficient to philosophize about that. 01:26:32.000 |
how humans actually work and what humans want 01:26:37.840 |
and the difficulties of knowing what humans want 01:26:58.060 |
in order to really address that issue in an adequate way, 01:27:04.040 |
you have to, I mean, psychology becomes part of the picture. 01:27:27.360 |
And when you create a learning system, just as you said, 01:27:31.400 |
that will eventually have to interact with humans, 01:27:45.120 |
you can't just watch humans to learn about humans. 01:27:52.280 |
And I mean, then questions arise that start imperceptibly, 01:27:57.280 |
but inevitably to slip beyond the realm of engineering. 01:28:08.980 |
under what conditions do you want that agent to do it? 01:28:13.820 |
So if I have a robot that can play Beethoven sonatas 01:28:18.820 |
better than any human, in the sense that the sensitivity, 01:28:30.760 |
the expression is just beyond what any human, 01:28:36.340 |
Do I wanna go to a concert and hear a robot play? 01:28:59.180 |
Probably the agents should interact with humans 01:29:07.800 |
And then when you start, I referred to this a moment ago, 01:29:14.340 |
like, what if two humans want different things 01:29:19.100 |
and you have only one agent that's able to interact with them 01:29:39.940 |
then it goes beyond questions of engineering and technology 01:29:44.940 |
and starts to shade in perceptibly into questions about 01:29:59.820 |
quite refreshed in my involvement in AI research. 01:30:03.020 |
It was almost like building this kind of stuff 01:30:16.700 |
And bringing in viewpoints from multiple sub-communities 01:30:26.340 |
It started making me feel like doing AI research 01:30:35.180 |
could potentially lead to a kind of cultural renewal. 01:30:42.860 |
- Yeah, it's the way to understand human beings 01:31:03.820 |
- And it might restore a certain, I don't know, 01:31:10.500 |
or even dare I say, spirituality to the world. 01:31:21.060 |
I think AI will be the philosophy of the 21st century, 01:31:28.980 |
I think a lot of AI researchers are afraid to open that door 01:31:35.620 |
of the human-agent interaction, human-AI interaction. 01:31:44.540 |
- One thing I often think about is the usual schema 01:32:00.540 |
And again, I hasten to say AI safety is hugely important. 01:32:06.500 |
about those risks, totally on board for that. 01:32:23.060 |
that might be relevant, which is when we think of humans 01:32:28.060 |
gaining more and more information about human life, 01:32:33.340 |
the narrative there is usually that they gain 01:32:37.020 |
more and more wisdom and they get closer to enlightenment 01:32:52.540 |
are just gonna, they're gonna figure out more and more 01:33:05.460 |
without some careful, setting things up very carefully. 01:33:14.180 |
I personally believe that the most trajectories, 01:33:19.180 |
natural human trajectories will lead us towards progress. 01:33:42.100 |
There's something appealing to our human mind 01:33:47.660 |
I mean, we don't wanna be eaten by the tiger, I guess. 01:33:55.700 |
from actually building out all the other trajectories 01:33:58.660 |
which are potentially leading to all the positive worlds, 01:34:17.380 |
You have to do the, not just the AI safety work 01:34:20.740 |
of the one worst case analysis, how do we prevent that? 01:34:31.340 |
that would lead to all the positive actions that can go. 01:34:38.340 |
we should be spending a lot of our time saying 01:34:42.860 |
I think it's harder to see that there's work to be done 01:34:47.860 |
to bring into focus the question of what it would look like 01:34:58.820 |
if we didn't have the sense there was huge potential. 01:35:05.940 |
We have a sense that AGI would be a major boon to humanity. 01:35:05.940 |
that are already gonna make the world a better place, 01:35:37.460 |
when we think about building fully intelligent agents 01:35:54.540 |
And I think we just need to start working on it. 01:36:00.100 |
but just intelligent agents that interact with us 01:36:02.500 |
and help us enrich our own existence on social networks, 01:36:06.460 |
for example, on recommender systems, various intelligent, 01:36:14.740 |
I mean, Twitter is struggling with this very idea, 01:36:21.460 |
that increase the quality and the health of a conversation? 01:36:25.260 |
- That's a beautiful, beautiful human psychology question. 01:36:42.060 |
And how do you make these choices in a democratic way? 01:36:58.540 |
who have the skillset to build these kinds of systems, 01:37:02.260 |
but what it means to make the world a better place 01:37:08.060 |
is something that we all have to be talking about. 01:37:11.180 |
- Yeah, the world that we're trying to make a better place 01:37:16.180 |
includes a huge variety of different kinds of people. 01:37:45.940 |
because it turns out that it all becomes relevant. 01:37:55.620 |
we've been trying not to write philosophy papers, right? 01:38:00.140 |
We've been trying not to write position papers. 01:38:00.140 |
for humans with all of their complexity and contradiction 01:38:25.500 |
- And often reinforcement learning frameworks 01:38:27.500 |
actually kind of allow you to do that machine learning. 01:38:33.540 |
is it allows you to reduce the unsolvable problem, 01:38:38.180 |
into something more concrete that you can get a hold of. 01:38:41.660 |
- Yeah, and it allows you to kind of define the problem 01:38:43.860 |
in some way that allows for growth in the system 01:38:51.100 |
you're not responsible for the details, right? 01:38:54.060 |
You say, this is generally what I want you to do. 01:39:04.060 |
But I think also some of these positive issues 01:39:09.140 |
to really come to understand what humans want? 01:39:12.660 |
And, you know, with all of the subtleties of that, right? 01:39:18.980 |
You know, humans want help with certain things. 01:39:24.740 |
But they don't want everything done for them, right? 01:39:27.500 |
There is part of the satisfaction that humans get from life 01:39:32.780 |
So if there were devices around that did everything for, 01:39:34.740 |
you know, I often think of the movie "Wall-E", right? 01:39:37.580 |
That's like dystopian in a totally different way. 01:39:39.420 |
It's like, the machines are doing everything for us. 01:39:43.860 |
You know, anyway, I just, I find this, you know, 01:39:52.860 |
- To me, it's one of the most exciting and it's wide open. 01:40:05.100 |
- Yeah, a quick summary of, what's the title of the paper? 01:40:10.100 |
- I think we called it a distributional code for value 01:40:19.820 |
So that's another project that grew out of pure AI research. 01:40:24.820 |
A number of people at DeepMind and a few other places 01:40:37.420 |
by taking something in traditional reinforcement learning 01:40:43.460 |
from traditional reinforcement learning was a value signal. 01:40:50.260 |
at least most algorithms, is some representation 01:40:58.300 |
And that's usually represented as a single number. 01:41:13.780 |
that situation would be represented as a single number, 01:41:20.060 |
And this new form of reinforcement learning said, 01:41:28.660 |
So now we think of the gambler as literally thinking, 01:41:38.380 |
And it had been observed through experiments, 01:41:44.180 |
that that kind of distributional representation 01:42:06.140 |
that this had been tried out in a kind of heuristic way. 01:42:09.820 |
People thought, well, gee, what would happen if we tried? 01:42:17.260 |
And it was only then that people started thinking, 01:42:26.140 |
just trying to figure out why it works, which is ongoing. 01:42:29.700 |
But one thing that's already clear from that research 01:42:34.300 |
is that it drives richer representation learning. 01:42:47.260 |
standard deep reinforcement learning algorithms 01:42:58.140 |
Because the thing that you're trying to represent, 01:43:16.880 |
but they have different distributions of value. 01:43:22.300 |
will maintain the distinction between these two things. 01:43:26.820 |
distributional learning can keep things separate 01:43:32.140 |
that might otherwise be conflated or squished together. 01:44:00.740 |
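A minimal sketch of that idea (my own illustration of the core mechanism, not the paper's actual algorithm; the reward distribution and learning rates are made up, and the temporal component of TD is dropped in favor of a one-shot prediction): maintain many value predictors, each updating asymmetrically to positive versus negative prediction errors, so pessimistic predictors settle near the bad outcomes and optimistic ones near the good outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)
taus = np.linspace(0.05, 0.95, 9)   # each predictor's optimism/pessimism
values = np.zeros_like(taus)        # many value predictions, not one number
lr = 0.5

# The gambler's situation: 70% chance of $100, 30% chance of nothing.
for _ in range(30000):
    r = 100.0 if rng.random() < 0.7 else 0.0
    delta = r - values                               # per-predictor errors
    scale = np.where(delta > 0.0, taus, 1.0 - taus)  # asymmetric scaling
    values += lr * scale * np.sign(delta)            # quantile-style update

# Pessimistic predictors settle near $0, optimistic ones near $100: together
# they represent the whole distribution, not just its mean ($70).
print(np.round(values, 1))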
But what do we know about dopamine in the human brain? 01:44:04.180 |
What is it, why is it useful, why is it interesting? 01:44:07.460 |
What does it have to do with the prefrontal cortex 01:44:15.300 |
where there's a huge amount of detail and debate. 01:44:24.660 |
is that the function of this neurotransmitter dopamine 01:44:33.460 |
of standard reinforcement learning algorithms, 01:44:46.880 |
Well, if you made some prediction about future reward, 01:44:51.820 |
and then you get more reward than you were expecting, 01:44:55.980 |
you wanna go back and increase the value representation 01:45:03.780 |
If you got less reward than you were expecting, 01:45:08.460 |
- And that's the process of temporal difference learning. 01:45:16.620 |
sort of the backbone of our armamentarium in RL. 01:45:22.540 |
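For reference, the standard temporal-difference update being described can be written as (textbook form, not anything specific to this paper; \(\gamma\) is a discount factor and \(\alpha\) a learning rate):

$$
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t), \qquad V(s_t) \leftarrow V(s_t) + \alpha\,\delta_t
$$

A positive \(\delta_t\) (more reward than expected) pushes the value estimate up, a negative one pushes it down, and \(\delta_t\) is the reward prediction error that dopamine is thought to report.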
between the reward prediction error and dopamine 01:45:45.100 |
was representing these reward prediction errors, 01:45:48.020 |
again, in this like kind of single number way, 01:45:52.860 |
representing your surprise with a single number. 01:45:56.740 |
And in distributional reinforcement learning, 01:45:58.540 |
this kind of new elaboration of the standard approach, 01:46:19.280 |
on distributional temporal difference learning, 01:46:22.340 |
talked to a guy in my group, Zeb Kurth-Nelson, 01:46:33.460 |
And they started looking at what was in the literature, 01:46:39.220 |
and we came up with some specific predictions about, 01:46:53.580 |
all of which ended up being fairly clearly confirmed, 01:46:56.440 |
and all of which leads to at least some initial indication 01:47:03.460 |
that dopamine might be representing surprise signals 01:47:06.820 |
in a way that is not just collapsing everything 01:47:10.020 |
to a single number, but instead is kind of respecting 01:47:12.220 |
the variety of future outcomes, if that makes sense. 01:47:16.660 |
- So yeah, so that's showing, suggesting possibly 01:47:19.620 |
that dopamine has a really interesting representation scheme 01:47:31.780 |
of AI revealing something nice about neuroscience. 01:47:36.300 |
- Well, you never know, so the minute you publish a paper 01:47:50.900 |
it has been a lot of fun for us to take these ideas from AI 01:47:58.940 |
- So we kind of talked about it a little bit, 01:48:01.300 |
but where do you see the field of neuroscience 01:48:12.540 |
that you can see breakthroughs in the next, let's get crazy, 01:48:16.340 |
not just three or five years, but next 10, 20, 30 years 01:48:20.020 |
that would make you excited and perhaps you'd be part of? 01:48:44.380 |
so neuroscience, especially the part of neuroscience 01:48:56.340 |
there's been this explosion in new technology. 01:49:08.980 |
have not involved a lot of interesting behavior. 01:49:18.740 |
some of these technologies, you actually have to, 01:49:21.060 |
if you're studying a mouse, you have to head fix the mouse. 01:49:23.660 |
In other words, you have to immobilize the mouse. 01:49:41.340 |
where the animal can kind of move a trackball 01:49:52.860 |
well, let's try to bring behavior into the picture. 01:49:58.260 |
which was supposed to be what this whole thing was about. 01:50:10.180 |
and the widespread interest in what's going on in AI, 01:50:15.020 |
will come together to kind of open a new chapter 01:50:18.540 |
in neuroscience research where there's a kind of 01:50:22.780 |
a rebirth of interest in the structure of behavior 01:50:47.980 |
Neuroscience is about studying the mechanisms 01:50:51.580 |
that underlie whatever it is the brain is for, 01:50:58.420 |
I feel like we could maybe take a step toward that now 01:51:15.220 |
So what about the engineering of intelligence systems? 01:51:28.700 |
is to build systems that have the kind of flexibility, 01:51:33.140 |
and the kind of flexibility that humans have in two senses. 01:51:38.580 |
One is that humans can be good at many things. 01:51:45.620 |
that they can switch between things very easily, 01:51:52.060 |
because they very ably see what a new task has in common 01:52:13.660 |
are simply wrong for getting that kind of flexibility. 01:52:28.580 |
of the AI community is starting to pivot to that question. 01:52:37.980 |
It's gonna lead to a focus on what in psychology 01:52:43.540 |
which is the ability to switch between tasks, 01:52:45.840 |
the ability to quickly put together a program of behavior 01:52:51.720 |
but you know makes sense for a particular set of demands. 01:52:55.260 |
It's very closely related to what the prefrontal cortex does 01:53:01.060 |
So I think it's gonna be an interesting new chapter. 01:53:05.380 |
- So that's the reasoning side and cognition side, 01:53:07.380 |
but let me ask the over romanticized question. 01:53:10.540 |
Do you think we'll ever engineer an AGI system 01:53:52.740 |
toward other humans into a sort of two-dimensional, 01:54:15.420 |
who's very skilled and capable, but is very cold. 01:54:31.980 |
who elicits in us or displays a lot of human warmth, 01:55:01.340 |
These are the two dimensions that people seem 01:55:02.860 |
to use, the ones along which people size other people up. 01:55:06.740 |
And in AI research, we really focus on this capability thing. 01:55:12.700 |
This thing can play Go at a superhuman level. 01:55:19.340 |
What would it mean for an AI system to be warm? 01:55:23.900 |
And I don't know, maybe there are easy solutions here. 01:55:33.060 |
But I think it also has to do with a pattern of behavior, 01:55:36.380 |
a pattern of, what would it mean for an AI system 01:55:43.380 |
in a way that actually made us feel like it was for real? 01:56:01.100 |
Is there an AI system that can not only convince us 01:56:38.900 |
and probably one of the most important questions 01:57:01.640 |
There are ways of communicating to other people 01:57:32.760 |
something that sets them out in the right direction 01:57:44.200 |
I mean, honestly, if that's not where we're headed, 01:57:52.240 |
- I think it's exciting as a scientific problem, 01:58:21.000 |
So it's good having the opportunity to do that. 01:58:57.280 |
If you enjoy this thing, subscribe on YouTube, 01:58:59.600 |
review it with five stars on Apple Podcasts, 01:59:25.120 |
imagine angels, contemplate the meaning of infinity 01:59:28.720 |
and even question its own place in the cosmos? 01:59:31.720 |
Especially awe-inspiring is the fact that any single brain, 01:59:45.480 |
These particles drifted for eons and light years 01:59:48.320 |
until gravity and change brought them together here now. 01:59:53.160 |
These atoms now form a conglomerate, your brain, 01:59:57.560 |
that can not only ponder the very stars that gave it birth, 02:00:00.840 |
but can also think about its own ability to think 02:00:07.800 |
With the arrival of humans, it has been said, 02:00:10.640 |
the universe has suddenly become conscious of itself. 02:00:19.960 |
Thank you for listening and hope to see you next time.