
Tomaso Poggio: Brains, Minds, and Machines | Lex Fridman Podcast #13


Chapters

0:00 Introduction
4:00 Is Time Travel Possible?
22:38 Parts of the Brain
26:28 Cortex
28:12 The Human Visual Cortex
41:19 Stochastic Gradient Descent
43:18 Bezout Theorem
44:53 The Universal Approximation Theorem
47:15 The Curse of Dimensionality
47:56 Challenges of Unsupervised Learning
55:08 Object Recognition Problem
56:31 The Existential Threat of AI
61:21 Levels of Understanding
63:37 Ethics of Neuroscience
65:42 The Hard Problem of Consciousness
70:07 Next Breakthrough
71:56 Virtual Reality Experiment
78:51 Is Intelligence a Gift or a Curse?

Whisper Transcript

00:00:00.000 | The following is a conversation with Tomaso Poggio.
00:00:02.960 | He's a professor at MIT and is a director
00:00:05.540 | of the Center for Brains, Minds and Machines.
00:00:08.360 | Cited over 100,000 times, his work has had a profound impact
00:00:13.360 | on our understanding of the nature of intelligence
00:00:16.160 | in both biological and artificial neural networks.
00:00:19.880 | He has been an advisor to many highly impactful researchers
00:00:23.840 | and entrepreneurs in AI,
00:00:25.680 | including Demis Hassabis of DeepMind,
00:00:28.000 | Amnon Shashua of Mobileye and Christof Koch
00:00:31.200 | of the Allen Institute for Brain Science.
00:00:34.120 | This conversation is part of the MIT course
00:00:36.400 | on artificial general intelligence
00:00:38.120 | and the Artificial Intelligence Podcast.
00:00:40.240 | If you enjoy it, subscribe on YouTube, iTunes,
00:00:42.760 | or simply connect with me on Twitter @lexfridman,
00:00:45.820 | spelled F-R-I-D.
00:00:47.960 | And now, here's my conversation with Tomaso Poggio.
00:00:52.480 | You've mentioned that in your childhood,
00:00:54.520 | you've developed a fascination with physics,
00:00:56.960 | especially the theory of relativity,
00:00:59.720 | and that Einstein was also a childhood hero to you.
00:01:03.600 | What aspect of Einstein's genius,
00:01:07.920 | the nature of his genius, do you think was essential
00:01:10.200 | for discovering the theory of relativity?
00:01:12.960 | - You know, Einstein was a hero to me
00:01:15.960 | and I'm sure to many people because he was able to make,
00:01:20.960 | of course, a major, major contribution to physics
00:01:25.180 | with simplifying a bit just a Gedanken experiment,
00:01:30.180 | a thought experiment.
00:01:33.600 | You know, imagining communication with lights
00:01:38.880 | between a stationary observer and somebody on a train.
00:01:43.320 | And I thought, you know, the fact that just with the force
00:01:48.320 | of his thought, of his thinking, of his mind,
00:01:52.720 | he could get to something so deep
00:01:55.660 | in terms of physical reality,
00:01:57.500 | how time depended on space and speed.
00:02:01.320 | It was something absolutely fascinating.
00:02:04.100 | It was the power of intelligence, the power of the mind.
00:02:08.420 | - Do you think the ability to imagine,
00:02:11.120 | to visualize as he did, as a lot of great physicists do,
00:02:15.180 | do you think that's in all of us, human beings,
00:02:18.620 | or is there something special
00:02:20.580 | to that one particular human being?
00:02:22.860 | - I think, you know, all of us can learn
00:02:27.100 | and have, in principle, similar breakthroughs.
00:02:32.100 | There are lessons to be learned from Einstein.
00:02:37.180 | He was one of five PhD students at ETA,
00:02:42.180 | the Eidgenossische Technische Hochschule in Zurich,
00:02:47.540 | in physics, and he was the worst of the five,
00:02:50.820 | the only one who did not get an academic position
00:02:54.460 | when he graduated, when he finished his PhD,
00:02:57.940 | and he went to work, as everybody knows,
00:03:01.100 | for the patent office.
00:03:02.500 | And so it's not so much that he worked
00:03:04.820 | for the patent office, but the fact that, obviously,
00:03:07.980 | he was smart, but he was not a top student,
00:03:11.260 | obviously, he was an anticonformist.
00:03:13.580 | He was not thinking in the traditional way
00:03:16.740 | that probably his teachers
00:03:18.260 | and the other students were doing.
00:03:19.980 | So there is a lot to be said about trying to be,
00:03:24.760 | to do the opposite or something quite different
00:03:28.700 | from what other people are doing.
00:03:31.060 | That's certainly true for the stock market.
00:03:32.860 | Never buy if everybody's buying. (laughs)
00:03:36.860 | - And also true for science.
00:03:38.580 | - Yes.
00:03:39.660 | - So you've also mentioned, staying on the theme of physics,
00:03:43.740 | that you were excited at a young age
00:03:47.620 | by the mysteries of the universe that physics could uncover.
00:03:50.820 | Such as, I saw mentioned, the possibility of time travel.
00:03:55.460 | So the most out-of-the-box question, I think,
00:03:59.460 | I'll get to ask today, do you think time travel is possible?
00:04:02.460 | - Well, it would be nice if it were possible right now.
00:04:06.620 | In science, you never say no.
00:04:12.860 | - But your understanding of the nature of time.
00:04:15.060 | - Yeah, it's very likely that it's not possible
00:04:19.780 | to travel in time.
00:04:21.220 | You may be able to travel forward in time
00:04:26.080 | if we can, for instance, freeze ourselves
00:04:29.820 | or go on some spacecraft traveling
00:04:34.340 | close to the speed of light.
00:04:35.940 | But in terms of actively traveling, for instance,
00:04:40.500 | back in time, I find probably very unlikely.
00:04:45.300 | - So do you still hold the underlying dream
00:04:49.180 | of the engineering intelligence that will build systems
00:04:53.340 | that are able to do such huge leaps,
00:04:56.820 | like discovering the kind of mechanism
00:05:00.700 | that would be required to travel through time?
00:05:02.620 | Do you still hold that dream,
00:05:04.180 | or echoes of it from your childhood?
00:05:07.060 | - Yeah.
00:05:08.660 | I don't think whether, there are certain problems
00:05:11.100 | that probably cannot be solved,
00:05:14.460 | depending what you believe about the physical reality.
00:05:18.340 | Like, maybe totally impossible to create energy
00:05:23.340 | from nothing or to travel back in time.
00:05:27.660 | But about making machines that can think
00:05:34.300 | as well as we do or better, or more likely,
00:05:38.140 | especially in the short and mid-term,
00:05:40.860 | help us think better, which is in a sense
00:05:43.220 | is happening already with the computers we have.
00:05:46.420 | And it will happen more and more.
00:05:48.420 | Well, that I certainly believe.
00:05:49.940 | And I don't see in principle why computers
00:05:53.260 | at some point could not become more intelligent
00:05:58.260 | than we are, although the word intelligence
00:06:02.340 | is a tricky one, and one we should discuss
00:06:05.860 | what I mean with that.
00:06:07.100 | - Yeah, intelligence, consciousness,
00:06:09.780 | words like love, all these are very,
00:06:14.780 | need to be disentangled.
00:06:16.740 | So you've mentioned also that you believe
00:06:18.740 | the problem of intelligence is the greatest problem
00:06:22.220 | in science, greater than the origin of life
00:06:24.460 | and the origin of the universe.
00:06:26.020 | You've also, in the talk I've listened to,
00:06:30.700 | said that you're open to arguments against you.
00:06:34.820 | So what do you think is the most captivating
00:06:39.820 | aspect of this problem of understanding
00:06:42.180 | the nature of intelligence?
00:06:43.280 | Why does it captivate you as it does?
00:06:46.500 | - Well, originally, I think one of the motivation
00:06:50.660 | that I had as, I guess, a teenager
00:06:54.980 | when I was infatuated with theory of relativity
00:06:58.540 | was really that I found that there was
00:07:02.740 | the problem of time and space and general relativity,
00:07:07.740 | but there were so many other problems
00:07:09.940 | of the same level of difficulty and importance
00:07:13.700 | that I could, even if I were Einstein,
00:07:16.620 | it was difficult to hope to solve all of them.
00:07:19.500 | So what about solving a problem whose solution
00:07:24.380 | allowed me to solve all the problems?
00:07:26.580 | And this was, what if we could find the key
00:07:31.060 | to an intelligence 10 times better or faster than Einstein?
00:07:35.900 | - So that's sort of seeing artificial intelligence
00:07:40.140 | as a tool to expand our capabilities,
00:07:43.220 | but is there just an inherent curiosity in you
00:07:47.900 | in just understanding what it is in here
00:07:52.140 | that makes it all work?
00:07:54.340 | - Yes, absolutely, you're right.
00:07:55.700 | So I started saying this was the motivation
00:07:59.060 | when I was a teenager, but soon after,
00:08:02.980 | I think the problem of human intelligence
00:08:07.140 | became a real focus of my science and my research
00:08:12.140 | because I think it's, for me,
00:08:20.140 | the most interesting problem is really asking
00:08:25.580 | who we are, right?
00:08:28.060 | Is asking not only a question about science,
00:08:31.700 | but even about the very tool we are using to do science,
00:08:35.780 | which is our brain.
00:08:36.780 | How does our brain work?
00:08:39.820 | From where does it come from?
00:08:42.060 | What are its limitation?
00:08:43.660 | Can we make it better?
00:08:45.020 | - And that, in many ways, is the ultimate question
00:08:48.740 | that underlies this whole effort of science.
00:08:54.340 | - So you've made significant contributions
00:08:56.380 | in both the science of intelligence
00:08:58.220 | and the engineering of intelligence.
00:09:00.220 | In a hypothetical way, let me ask,
00:09:04.980 | how far do you think we can get
00:09:06.580 | in creating intelligence systems
00:09:08.900 | without understanding the biological,
00:09:12.100 | the understanding how the human brain creates intelligence?
00:09:15.060 | Put another way, do you think we can build
00:09:17.660 | a strong AI system without really getting at the core,
00:09:21.060 | the understanding of the functional nature of the brain?
00:09:24.940 | - Well, this is a real difficult question.
00:09:28.060 | You know, we did solve problems like flying
00:09:33.620 | without really using too much our knowledge
00:09:39.980 | about how birds fly.
00:09:42.460 | It was important, I guess, to know that you could have
00:09:47.580 | things heavier than air being able to fly,
00:09:52.980 | like birds.
00:09:54.580 | But beyond that, probably we did not learn very much.
00:10:00.300 | You know, some.
00:10:01.140 | The Brothers Wright did learn a lot of observation
00:10:06.580 | about birds and designing their aircraft.
00:10:11.580 | But you can argue we did not use much of biology
00:10:15.780 | in that particular case.
00:10:17.740 | Now, in the case of intelligence,
00:10:20.500 | I think that it's a bit of a bet right now.
00:10:25.500 | If you ask, okay, we all agree we'll get at some point,
00:10:33.780 | maybe soon, maybe later, to a machine
00:10:38.860 | that is indistinguishable from my secretary,
00:10:42.180 | say, in terms of what I can ask the machine to do.
00:10:45.300 | I think we'll get there.
00:10:49.020 | And now the question is, you can ask people,
00:10:51.980 | do you think we'll get there without any knowledge
00:10:54.260 | about the human brain, or that the best way
00:10:59.260 | to get there is to understand better the human brain?
00:11:02.540 | Okay, this is, I think, an educated bet
00:11:05.980 | that different people with different background
00:11:09.100 | will decide in different ways.
00:11:11.780 | The recent history of the progress in AI
00:11:15.060 | in the last, I would say, five years or 10 years
00:11:18.620 | has been that the main breakthroughs,
00:11:23.620 | the main recent breakthroughs,
00:11:25.980 | are really start from neuroscience.
00:11:30.980 | I can mention reinforcement learning as one,
00:11:35.900 | is one of the algorithms at the core of AlphaGo,
00:11:40.900 | which is the system that beat
00:11:43.700 | the kind of an official world champion of Go,
00:11:46.700 | Lee Sedol, two, three years ago in Seoul.
00:11:50.820 | That's one, and that started really
00:11:55.180 | with the work of Pavlov in 1900,
00:11:57.180 | Marvin Minsky in the '60s,
00:12:03.260 | and many other neuroscientists later on.
00:12:06.700 | And deep learning started, which is at the core, again,
00:12:12.580 | of AlphaGo and systems like autonomous driving systems
00:12:17.580 | for cars, like the systems that Mobileye,
00:12:22.500 | which is a company started by one of my ex-post-docs,
00:12:25.620 | Amnon Shashua, that is at the core of those things.
00:12:30.620 | And deep learning, really the initial ideas
00:12:34.540 | in terms of the architecture
00:12:35.940 | of this layered hierarchical networks
00:12:39.100 | started with work of Torsten Wiesel
00:12:43.140 | and David Hubel at Harvard, up the river in the '60s.
00:12:47.780 | So recent history suggests that neuroscience
00:12:51.420 | played a big role in these breakthroughs.
00:12:54.340 | My personal bet is that there is a good chance
00:12:57.940 | they continue to play a big role,
00:12:59.900 | maybe not in all the future breakthroughs,
00:13:01.860 | but in some of them.
00:13:03.340 | - At least in inspiration, so--
00:13:05.020 | - At least in inspiration, absolutely, yes.
00:13:07.380 | - So you studied both artificial
00:13:10.740 | and biological neural networks.
00:13:12.180 | You said these mechanisms that underlie deep learning,
00:13:16.340 | deep and reinforcement learning,
00:13:18.460 | but there is nevertheless significant differences
00:13:23.920 | between biological and artificial neural networks
00:13:26.100 | as they stand now.
00:13:27.300 | So between the two, what do you find
00:13:30.820 | is the most interesting, mysterious,
00:13:32.580 | maybe even beautiful difference
00:13:34.220 | as it currently stands in our understanding?
00:13:37.900 | - I must confess that until recently,
00:13:41.380 | I found that the artificial networks,
00:13:44.900 | too simplistic relative to real neural networks.
00:13:48.780 | But recently I've been starting to think
00:13:52.860 | that yes, there are very big simplification
00:13:57.020 | of what you find in the brain.
00:13:59.060 | But on the other hand,
00:14:01.460 | the artificial networks are much closer
00:14:02.980 | in terms of the architecture to the brain
00:14:07.060 | than other models that we had,
00:14:09.660 | that computer science used as model of thinking,
00:14:13.060 | which were mathematical logics,
00:14:15.620 | Lisp, Prolog, and those kind of things.
00:14:19.460 | So in comparison to those,
00:14:21.340 | they're much closer to the brain.
00:14:23.340 | You have networks of neurons,
00:14:26.140 | which is what the brain is about.
00:14:29.060 | The artificial neurons in the models,
00:14:31.620 | as I said, caricature of the biological neurons.
00:14:35.500 | But they're still neurons, single units,
00:14:37.660 | communicating with other units,
00:14:40.060 | something that is absent in the traditional
00:14:44.140 | computer type models of mathematics,
00:14:48.540 | reasoning, and so on.
00:14:50.820 | - So what aspect would you like to see
00:14:53.140 | in artificial neural networks added over time
00:14:57.280 | as we try to figure out ways to improve them?
00:14:59.980 | - So one of the main differences
00:15:03.900 | and problems in terms of deep learning today,
00:15:09.780 | and it's not only deep learning,
00:15:12.820 | and the brain is the need for deep learning techniques
00:15:17.820 | to have a lot of labeled examples.
00:15:22.980 | For instance, for ImageNet,
00:15:24.700 | you have a training set which is one million images,
00:15:28.700 | each one labeled by some human
00:15:31.060 | in terms of which object is there.
00:15:34.700 | And it's clear that in biology,
00:15:39.700 | a baby may be able to see a million images
00:15:44.860 | in the first years of life,
00:15:46.980 | but will not have a million of labels
00:15:49.260 | given to him or her by parents
00:15:52.740 | or caretakers.
00:15:56.420 | So how do you solve that?
00:15:59.620 | I think that there is this interesting challenge
00:16:03.900 | that today deep learning and related techniques
00:16:08.100 | are all about big data,
00:16:09.500 | big data meaning a lot of examples labeled by humans.
00:16:18.820 | Whereas in nature you have,
00:16:22.740 | so this big data is n going to infinity,
00:16:26.260 | that's the best, n meaning labeled data.
00:16:30.260 | But I think the biological world is more n going to one.
00:16:33.880 | A child can learn--
00:16:36.820 | - It's a beautiful way to put it.
00:16:37.860 | - Very small number of labeled examples.
00:16:42.700 | Like you tell a child, this is a car,
00:16:44.900 | you don't need to say, like in ImageNet,
00:16:48.020 | this is a car, this is a car,
00:16:49.540 | this is not a car, this is not a car,
00:16:51.140 | one million times.
00:16:52.460 | - So and of course with AlphaGo
00:16:56.140 | or at least the AlphaZero variants,
00:16:58.460 | there's because the world of Go is so simplistic
00:17:02.980 | that you can actually learn by yourself
00:17:05.940 | through self play, you can play against each other.
00:17:08.500 | In the real world, I mean the visual system
00:17:10.620 | that you've studied extensively
00:17:12.500 | is a lot more complicated than the game of Go.
00:17:15.860 | So on the comment about children,
00:17:18.260 | which are fascinatingly good at learning new stuff,
00:17:23.020 | how much of it do you think is hardware
00:17:24.700 | and how much of it is software?
00:17:26.620 | - Yeah, that's a good deep question,
00:17:29.540 | is in a sense is the old question of nurture and nature,
00:17:33.020 | how much is in the gene
00:17:35.620 | and how much is in the experience of an individual.
00:17:39.400 | Obviously it's both that play a role
00:17:44.820 | and I believe that the way evolution
00:17:49.820 | gives, put prior information, so to speak, hardwired,
00:17:55.780 | it's not really hardwired,
00:17:58.500 | but that's essentially an hypothesis.
00:18:02.740 | I think what's going on is that evolution
00:18:05.300 | has almost necessarily, if you believe in Darwin,
00:18:12.060 | is very opportunistic.
00:18:14.860 | And think about our DNA and the DNA of Drosophila,
00:18:19.860 | our DNA does not have many more genes than Drosophila.
00:18:28.780 | - The fly.
00:18:29.700 | - The fly, the fruit fly.
00:18:32.500 | Now we know that the fruit fly does not learn very much
00:18:36.820 | during its individual existence.
00:18:39.700 | It looks like one of this machinery
00:18:42.300 | that it's really mostly, not 100%, but 95%,
00:18:47.180 | hard-coded by the genes.
00:18:49.820 | But since we don't have many more genes than Drosophila,
00:18:55.020 | evolution could encode in us
00:18:58.900 | a kind of general learning machinery
00:19:01.780 | and then had to give very weak priors.
00:19:09.860 | Like for instance, let me give a specific example
00:19:14.860 | which is recent work by a member of our
00:19:17.220 | Center for Brains, Minds, and Machines.
00:19:19.300 | We know because of work of other people in our group
00:19:24.420 | and other groups that there are cells
00:19:26.700 | in a part of our brain, neurons, that are tuned to faces.
00:19:31.220 | They seem to be involved in face recognition.
00:19:33.860 | Now this face area exists, seems to be present
00:19:39.540 | in young children and adults.
00:19:42.780 | And one question is, is there from the beginning,
00:19:48.380 | is hardwired by evolution,
00:19:51.780 | or somehow is learned very quickly?
00:19:55.020 | - So what's your, by the way,
00:19:56.260 | a lot of the questions I'm asking,
00:19:58.940 | the answer is we don't really know.
00:20:00.940 | But as a person who has contributed
00:20:04.500 | some profound ideas in these fields,
00:20:06.420 | you're a good person to guess at some of these.
00:20:08.380 | So of course there's a caveat
00:20:10.220 | before a lot of the stuff we talk about.
00:20:11.740 | But what is your hunch?
00:20:14.620 | Is the face, the part of the brain
00:20:16.420 | that seems to be concentrated on face recognition,
00:20:20.100 | are you born with that?
00:20:21.220 | Or you just, it's designed to learn that quickly?
00:20:25.140 | Like the face of the mother and so on.
00:20:26.900 | - My hunch, my bias was the second one,
00:20:31.140 | learned very quickly.
00:20:32.260 | And it turns out that Marge Livingstone at Harvard
00:20:37.260 | has done some amazing experiments
00:20:39.820 | in which she raised baby monkeys,
00:20:42.980 | depriving them of faces during the first weeks of life.
00:20:47.220 | So they see technicians, but the technicians have a mask.
00:20:53.020 | - Yes.
00:20:53.860 | - And so when they looked at the area in the brain
00:21:00.060 | of these monkeys, that where usually you find faces,
00:21:07.120 | they found no face preference.
00:21:09.140 | So my guess is that what evolution does in this case
00:21:15.820 | is there is a plastic, an area which is plastic,
00:21:20.380 | which is kind of predetermined to be imprinted very easily.
00:21:25.380 | But the command from the gene is not a detailed circuitry
00:21:30.180 | for a face template.
00:21:32.300 | Could be, but this will require probably a lot of bits.
00:21:36.220 | You have to specify a lot of connection
00:21:38.080 | of a lot of neurons.
00:21:39.700 | Instead, the command from the gene is something like imprint,
00:21:44.700 | memorize what you see most often
00:21:47.840 | in the first two weeks of life,
00:21:49.380 | especially in connection with food.
00:21:51.320 | And maybe nipples, I don't know.
00:21:53.840 | - Right, well, source of food.
00:21:56.040 | And so in that area is very plastic at first
00:21:58.880 | and then solidifies.
00:22:00.440 | It'd be interesting if a variant of that experiment
00:22:03.620 | would show a different kind of pattern
00:22:06.280 | associated with food than a face pattern,
00:22:08.960 | whether that could stick.
00:22:10.240 | - There are indications that during that experiment,
00:22:13.580 | what the monkeys saw quite often
00:22:18.000 | were the blue gloves of the technicians
00:22:22.240 | that were giving to the baby monkeys the milk.
00:22:25.560 | And some of the cells, instead of being face sensitive
00:22:29.400 | in that area, are hand sensitive.
00:22:33.580 | - Oh, that's fascinating.
00:22:34.840 | Can you talk about what are the different parts
00:22:39.240 | of the brain and in your view, sort of loosely,
00:22:43.960 | and how do they contribute to intelligence?
00:22:45.800 | Do you see the brain as a bunch of different modules
00:22:49.560 | and they together come in the human brain
00:22:52.480 | to create intelligence, or is it all one
00:22:56.160 | mush of the same kind of fundamental architecture?
00:23:02.100 | - Right.
00:23:02.940 | - Architecture.
00:23:03.760 | - Yeah, that's an important question.
00:23:08.860 | And there was a phase in neuroscience
00:23:13.460 | back in the 1950 or so in which it was believed
00:23:18.460 | for a while that the brain was equipotential,
00:23:21.940 | this was the term.
00:23:22.980 | You could cut out a piece and nothing special happened
00:23:28.500 | apart a little bit less performance.
00:23:32.400 | There was a surgeon, Lashley, who did a lot of experiments
00:23:37.400 | of this type with mice and rats and concluded
00:23:43.240 | that every part of the brain was essentially equivalent
00:23:47.560 | to any other one.
00:23:48.540 | It turns out that that's really not true.
00:23:56.120 | There are very specific modules in the brain, as you said,
00:24:00.600 | and people may lose the ability to speak
00:24:05.320 | if you have a stroke in a certain region,
00:24:07.520 | or may lose control of their legs in another region.
00:24:12.520 | So they're very specific.
00:24:14.540 | The brain is also quite flexible and redundant,
00:24:17.920 | so often it can correct things
00:24:21.040 | and kind of take control of things.
00:24:26.040 | It can take over functions from one part of the brain
00:24:29.240 | to the other, but really there are specific modules.
00:24:33.840 | So the answer that we know from this old work,
00:24:38.840 | which was basically based on lesions,
00:24:43.720 | either on animals or very often there were a mine of,
00:24:48.720 | well, there was a mine of very interesting data
00:24:53.000 | coming from the war, from different types of--
00:24:58.000 | - Injuries.
00:25:00.240 | - Injuries that soldiers had in the brain.
00:25:03.760 | And more recently, functional MRI,
00:25:08.760 | which allow you to check which part of the brain
00:25:13.840 | are active when you are doing different tasks,
00:25:19.680 | can replace some of this.
00:25:23.720 | You can see that certain parts of the brain
00:25:25.920 | are involved, are active in certain tasks.
00:25:29.760 | - Vision, language, yeah.
00:25:31.120 | - Yeah, that's right.
00:25:32.280 | - But sort of taking a step back to that part of the brain
00:25:36.520 | that discovers that specializes in the face
00:25:39.320 | and how that might be learned,
00:25:41.680 | what's your intuition behind,
00:25:45.320 | is it possible that sort of from a physicist's perspective,
00:25:48.880 | when you get lower and lower,
00:25:50.440 | it's all the same stuff and it just,
00:25:52.720 | when you're born, it's plastic and quickly figures out,
00:25:56.400 | this part is gonna be about vision,
00:25:58.020 | this is gonna be about language,
00:25:59.240 | this is about common sense reasoning.
00:26:02.000 | Do you have an intuition that that kind of learning
00:26:05.120 | is going on really quickly
00:26:06.280 | or is it really kind of solidified in hardware?
00:26:09.760 | - That's a great question.
00:26:11.320 | So there are parts of the brain
00:26:14.600 | like the cerebellum or the hippocampus
00:26:19.000 | that are quite different from each other.
00:26:21.600 | They clearly have different anatomy,
00:26:23.840 | different connectivity.
00:26:25.600 | Then there is the cortex,
00:26:31.200 | which is the most developed part of the brain in humans.
00:26:35.200 | And in the cortex, you have different regions of the cortex
00:26:40.880 | that are responsible for vision,
00:26:43.400 | for audition, for motor control, for language.
00:26:47.120 | Now one of the big puzzles of this
00:26:50.800 | is that the cortex
00:26:55.280 | looks like it is the same in terms of hardware,
00:27:00.000 | in terms of type of neurons and connectivity
00:27:05.160 | across these different modalities.
00:27:08.400 | So for the cortex,
00:27:11.360 | letting aside these other parts of the brain
00:27:14.280 | like spinal cord, hippocampus, cerebellum and so on,
00:27:16.840 | for the cortex, I think your question
00:27:20.440 | about hardware and software and learning and so on,
00:27:24.240 | it's, I think it's rather open.
00:27:28.520 | And I find very interesting
00:27:33.320 | for instance to think about an architecture,
00:27:35.600 | computer architecture that is good for vision
00:27:38.640 | and at the same time is good for language.
00:27:41.320 | It seems to be so different problem areas
00:27:46.320 | that you have to solve.
00:27:48.240 | - But the underlying mechanism might be the same
00:27:51.280 | and that's really instructive for--
00:27:52.920 | - It may be-- - Artificial neural networks.
00:27:55.200 | So you've done a lot of great work in vision,
00:27:58.000 | in human vision, computer vision.
00:28:00.620 | And you mentioned the problem of human vision
00:28:03.840 | is really as difficult as the problem
00:28:06.280 | of general intelligence.
00:28:07.480 | And maybe that connects to the cortex discussion.
00:28:11.460 | Can you describe the human visual cortex
00:28:15.340 | and how the humans begin to understand the world
00:28:20.300 | through the raw sensory information?
00:28:22.460 | What's, for folks who are not familiar,
00:28:27.460 | especially on the computer vision side,
00:28:30.120 | we don't often actually take a step back
00:28:32.660 | except saying with a sentence or two
00:28:34.380 | that one is inspired by the other.
00:28:36.620 | What is it that we know about the human visual cortex
00:28:40.060 | that's interesting?
00:28:40.900 | - So we know quite a bit,
00:28:41.740 | at the same time we don't know a lot.
00:28:43.500 | But the bit we know,
00:28:45.560 | in a sense we know a lot of the details
00:28:50.140 | and many we don't know
00:28:53.460 | and we know a lot of the top level,
00:28:57.460 | the answer to top level question
00:29:00.100 | but we don't know some basic ones.
00:29:02.260 | Even in terms of general neuroscience,
00:29:04.060 | forgetting vision, why do we sleep?
00:29:08.980 | It's such a basic question
00:29:12.000 | and we really don't have an answer to that.
00:29:14.620 | - Do you think, so taking a step back on that,
00:29:17.220 | so sleep for example is fascinating.
00:29:18.820 | Do you think that's a neuroscience question
00:29:22.060 | or if we talk about abstractions,
00:29:24.900 | what do you think is an interesting way
00:29:26.740 | to study intelligence or most effective
00:29:29.440 | on the levels of abstraction?
00:29:30.740 | Is it chemical, is it biological,
00:29:33.140 | is it electrophysical, mathematical
00:29:35.580 | as you've done a lot of excellent work on that side?
00:29:37.860 | Which psychology, which level of abstraction do you think?
00:29:42.860 | - Well in terms of levels of abstraction,
00:29:46.900 | I think we need all of them.
00:29:48.420 | - All of them.
00:29:49.260 | - It's like if you ask me
00:29:52.100 | what does it mean to understand a computer?
00:29:56.300 | That's much simpler but in a computer,
00:30:00.180 | I could say well I understand how to use PowerPoint.
00:30:03.940 | That's my level of understanding a computer.
00:30:07.260 | It has reasonable, it gives me some power
00:30:10.620 | to produce nice and beautiful slides.
00:30:13.500 | Now you can ask somebody else,
00:30:17.340 | he says well I know how the transistor work
00:30:19.860 | that are inside the computer.
00:30:21.420 | I can write the equation for transistor and diodes
00:30:25.940 | and circuits, logical circuits.
00:30:29.340 | And I can ask this guy, do you know how to operate PowerPoint?
00:30:32.460 | No idea.
00:30:34.060 | - So do you think if we discovered computers
00:30:37.860 | walking amongst us full of these transistors
00:30:41.580 | that are also operating under Windows and have PowerPoint,
00:30:45.580 | do you think it's digging in a little bit more,
00:30:49.980 | how useful is it to understand the transistor
00:30:52.900 | in order to be able to understand PowerPoint
00:30:57.900 | and these higher level intelligent processes?
00:31:00.340 | - So I think in the case of computers,
00:31:03.740 | because they were made by engineers by us,
00:31:06.980 | this different level of understanding
00:31:09.300 | are rather separate on purpose.
00:31:11.900 | They are separate modules so that the engineer
00:31:16.580 | that designed the circuit for the chips
00:31:19.220 | does not need to know what is inside PowerPoint.
00:31:23.580 | And somebody can write the software
00:31:26.820 | translating from one to the other.
00:31:30.260 | So in that case, I don't think understanding the transistor
00:31:35.260 | help you understand PowerPoint, or very little.
00:31:39.940 | If you want to understand the computer, this question,
00:31:43.940 | I would say you have to understand it at different levels.
00:31:46.820 | If you really want to build one, right?
00:31:51.580 | But for the brain, I think this levels of understanding,
00:31:57.300 | so the algorithms, which kind of computation,
00:32:00.820 | the equivalent PowerPoint, and the circuits,
00:32:04.620 | the transistors, I think they are much more intertwined
00:32:08.780 | with each other.
00:32:09.620 | There is not a neat level of the software
00:32:13.940 | separate from the hardware.
00:32:15.900 | And so that's why I think in the case of the brain,
00:32:20.100 | the problem is more difficult,
00:32:21.780 | more than for computers requires the interaction,
00:32:25.620 | the collaboration between different types of expertise.
00:32:29.460 | - So the brain is a big hierarchical mess,
00:32:32.100 | so you can't just disentangle levels.
00:32:35.180 | - I think you can, but it's much more difficult,
00:32:37.900 | and it's not completely obvious.
00:32:40.860 | And as I said, I personally think
00:32:44.700 | it's the greatest problem in science.
00:32:47.220 | So I think it's fair that it's difficult.
00:32:51.860 | - That's a difficult one.
00:32:53.300 | That said, you do talk about compositionality
00:32:56.900 | and why it might be useful.
00:32:58.300 | And when you discuss why these neural networks
00:33:01.740 | in artificial or biological sense learn anything,
00:33:05.180 | you talk about compositionality.
00:33:07.540 | See, there's a sense that nature can be disentangled.
00:33:12.540 | Well, all aspects of our cognition could be disentangled
00:33:20.500 | a little bit to some degree.
00:33:22.660 | So why do you think, first of all,
00:33:25.940 | how do you see compositionality,
00:33:27.740 | and why do you think it exists at all in nature?
00:33:31.660 | - I spoke about, I used the term compositionality
00:33:36.660 | when we looked at deep neural networks, multi-layers,
00:33:44.820 | and trying to understand when and why
00:33:48.420 | they're more powerful than more classical
00:33:53.260 | one-layer networks like linear classifier,
00:33:57.660 | kernel machines, so-called.
00:34:00.020 | And what we found is that in terms of approximating
00:34:06.580 | or learning or representing a function,
00:34:09.940 | a mapping from an input to an output,
00:34:12.220 | like from an image to the label in the image,
00:34:16.860 | if this function has a particular structure,
00:34:20.900 | then deep networks are much more powerful
00:34:25.020 | than shallow networks to approximate
00:34:27.060 | the underlying function.
00:34:28.900 | And the particular structure is a structure
00:34:32.020 | of compositionality.
00:34:33.940 | If the function is made up of functions of functions,
00:34:38.940 | so that you need to look on,
00:34:42.980 | when you are interpreting an image,
00:34:45.820 | classifying an image, you don't need to look
00:34:49.140 | at all pixels at once, but you can compute something
00:34:54.020 | from small groups of pixels,
00:34:57.180 | and then you can compute something on the output
00:35:00.740 | of this local computation, and so on.
00:35:03.660 | It is similar to what you do when you read a sentence.
00:35:07.300 | You don't need to read the first and the last letter,
00:35:11.340 | but you can read syllables, combine them in words,
00:35:15.980 | combine the words in sentences.
00:35:18.060 | So this is this kind of structure.
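
A concrete, hypothetical instance of the "functions of functions" structure described above, with eight inputs combined pairwise (pixels into local features, features into parts, parts into a whole); the particular symbols $h$ are illustrative, not from the conversation:

$$f(x_1,\dots,x_8)=h_3\Big(h_{21}\big(h_{11}(x_1,x_2),\,h_{12}(x_3,x_4)\big),\;h_{22}\big(h_{13}(x_5,x_6),\,h_{14}(x_7,x_8)\big)\Big)$$

Every constituent $h$ depends on only two variables. A deep network whose layers mirror this binary tree can spend its units on the low-dimensional constituents, while a one-hidden-layer network has to treat $f$ as an unstructured function of all eight inputs at once.
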
00:35:20.980 | - So that's just part of a discussion
00:35:22.580 | of why deep neural networks may be more effective
00:35:26.500 | than the shallow methods.
00:35:27.820 | And is your sense, for most things we can use
00:35:32.260 | neural networks for, those problems are going
00:35:37.780 | to be compositional in nature, like language, like vision?
00:35:44.220 | How far can we get in this kind of way?
00:35:47.820 | - So here is almost philosophy.
00:35:51.540 | - Well, let's go there.
00:35:53.060 | - Yeah, let's go there.
00:35:54.220 | So a friend of mine, Max Tegmark,
00:35:57.420 | who is a physicist at MIT.
00:36:00.260 | - I've talked to him on this thing, yeah.
00:36:01.860 | And he disagrees with you, right?
00:36:03.900 | A little bit.
00:36:04.740 | - Yeah, we agree on most,
00:36:07.060 | but the conclusion is a bit different.
00:36:10.180 | His conclusion is that, for images, for instance,
00:36:14.700 | the compositional structure of this function
00:36:19.460 | that we have to learn or to solve these problems
00:36:23.420 | comes from physics, comes from the fact
00:36:27.780 | that you have local interactions in physics
00:36:31.940 | between atoms and other atoms,
00:36:35.380 | between particle of matter and other particles,
00:36:39.780 | between planets and other planets,
00:36:42.380 | between stars and other, it's all local.
00:36:45.700 | And that's true, but you could push this argument
00:36:53.420 | a bit further, not this argument, actually,
00:36:57.620 | you could argue that, you know,
00:37:01.300 | maybe that's part of the truth,
00:37:02.820 | but maybe what happens is kind of the opposite,
00:37:06.820 | is that our brain is wired up as a deep network.
00:37:10.940 | So it can learn, understand, solve problems
00:37:16.900 | that have this compositional structure.
00:37:21.380 | And it cannot do, it cannot solve problems
00:37:26.500 | that don't have this compositional structure.
00:37:29.420 | So the problems we are accustomed to,
00:37:32.660 | we think about, we test our algorithms on,
00:37:36.860 | are this compositional structure
00:37:40.180 | because our brain is made up.
00:37:42.700 | - And that's, in a sense, an evolutionary perspective
00:37:45.380 | that we've, so the ones that didn't have,
00:37:48.260 | that weren't dealing with a compositional nature
00:37:51.660 | of reality died off?
00:37:55.260 | - Yes, but also could be, maybe the reason
00:38:00.340 | why we have this local connectivity in the brain,
00:38:05.340 | like simple cells in cortex looking only
00:38:09.020 | at the small part of the image, each one of them,
00:38:11.860 | and then other cells looking at the small number
00:38:14.620 | of these simple cells and so on.
00:38:16.340 | The reason for this may be purely that it was difficult
00:38:21.100 | to grow long-range connectivity.
00:38:24.260 | So suppose it's, you know, for biology,
00:38:28.660 | it's possible to grow short-range connectivity,
00:38:33.620 | but not long-range also because there is a limited
00:38:37.100 | number of long-range that you can.
00:38:39.740 | And so you have this limitation from the biology.
00:38:44.020 | And this means you build a deep convolutional network.
00:38:50.060 | This would be something like a deep convolutional network.
00:38:53.660 | And this is great for solving certain class of problems.
00:38:57.780 | These are the ones we find easy and important for our life.
00:39:02.780 | And yes, they were enough for us to survive.
00:39:05.900 | - And you can start a successful business
00:39:10.820 | on solving those problems, right?
00:39:13.300 | With Mobileye, driving is a compositional problem.
00:39:17.420 | So on the learning task, I mean, we don't know much
00:39:21.940 | about how the brain learns in terms of optimization,
00:39:25.720 | but so the thing that's stochastic gradient descent
00:39:29.100 | is what artificial neural networks use for the most part
00:39:33.820 | to adjust the parameters in such a way that it's able
00:39:37.780 | to deal based on the label data,
00:39:40.740 | it's able to solve the problem.
00:39:42.580 | So what's your intuition about why it works at all,
00:39:47.580 | how hard of a problem it is to optimize a neural network,
00:39:54.500 | artificial neural network?
00:39:56.420 | Is there other alternatives?
00:39:58.740 | Just in general, your intuition is behind
00:40:02.040 | this very simplistic algorithm
00:40:03.820 | that seems to do pretty good, surprising.
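
For readers who want to see the "very simplistic algorithm" being referred to, below is a minimal sketch of stochastic gradient descent on a toy least-squares problem. The data, model, and step size are illustrative placeholders, not anything specific to the systems discussed in this conversation.

```python
import numpy as np

# Toy labeled data: 1000 examples, 10 features, generated by a known linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=1000)

w = np.zeros(10)   # parameters to be adjusted
lr = 0.01          # learning rate (step size)

for epoch in range(20):
    for i in rng.permutation(len(X)):   # visit one randomly chosen example at a time
        err = X[i] @ w - y[i]           # prediction error on that example
        grad = err * X[i]               # gradient of 0.5 * err**2 with respect to w
        w -= lr * grad                  # take a small step against the gradient

print(np.linalg.norm(w - w_true))       # distance to the generating weights shrinks
```
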
00:40:05.940 | - Yes, yes.
00:40:07.820 | So I find neuroscience, the architecture of cortex
00:40:12.820 | is really similar to the architecture of deep networks.
00:40:17.320 | So there is a nice correspondence there
00:40:20.380 | between the biology and this kind of local connectivity,
00:40:25.380 | hierarchical architecture.
00:40:28.220 | The stochastic gradient descent, as you said,
00:40:30.980 | is a very simple technique.
00:40:34.360 | It seems pretty unlikely that biology could do that
00:40:40.820 | from what we know right now about cortex
00:40:45.260 | and neurons and synapses.
00:40:50.260 | So it's a big question open whether there are other
00:40:54.020 | optimization learning algorithms
00:40:58.820 | that can replace stochastic gradient descent.
00:41:02.060 | And my guess is yes,
00:41:04.780 | but nobody has found yet a real answer.
00:41:11.740 | I mean, people are trying, still trying,
00:41:13.900 | and there are some interesting ideas.
00:41:18.360 | The fact that stochastic gradient descent is so successful,
00:41:23.360 | this, it has become clear, is not so mysterious.
00:41:27.760 | And the reason is that it's an interesting fact
00:41:32.760 | that is a change in a sense
00:41:35.560 | in how people think about statistics.
00:41:39.360 | And this is the following, is that typically
00:41:44.960 | when you had data and you had, say, a model with parameters,
00:41:49.960 | you are trying to fit the model to the data,
00:41:54.520 | to fit the parameter.
00:41:56.000 | Typically, the kind of crowd wisdom type idea
00:42:01.000 | was you should have at least twice the number of data
00:42:09.480 | than the number of parameters.
00:42:11.640 | Maybe 10 times is better.
00:42:15.560 | Now, the way you train neural network these days
00:42:19.600 | is that they have 10 or 100 times more parameters than data.
00:42:24.320 | Exactly the opposite.
00:42:25.440 | And which, you know, it has been one of the puzzles
00:42:31.840 | about neural networks.
00:42:34.080 | How can you get something that really works
00:42:37.120 | when you have so much freedom in--
00:42:40.680 | - From that little data, it can generalize somehow.
00:42:43.040 | - Right, exactly.
00:42:44.240 | - Do you think the stochastic nature of it
00:42:46.440 | is essential, the randomness?
00:42:48.200 | - So I think we have some initial understanding
00:42:50.640 | why this happens.
00:42:52.280 | But one nice side effect of having this over-parameterization
00:42:57.280 | more parameters than data,
00:43:00.040 | is that when you look for the minima of a loss function,
00:43:04.720 | like stochastic gradient descent is doing,
00:43:07.200 | you find, I made some calculations based on
00:43:14.200 | some old basic theorem of algebra called the Bezout theorem
00:43:19.200 | that gives you an estimate of the number of solution
00:43:23.840 | of a system of polynomial equation.
00:43:26.000 | Anyway, the bottom line is that there are probably
00:43:30.560 | more minima for a typical deep networks
00:43:35.040 | than atoms in the universe.
00:43:38.400 | Just to say, there are a lot.
00:43:41.600 | Because of the over-parameterization.
00:43:43.600 | - Yes.
00:43:44.880 | - Are more of them global minima, zero minima, good minima?
00:43:49.000 | So it's not too--
00:43:50.600 | - More global minima.
00:43:51.640 | - Yeah, a lot of them.
00:43:53.200 | So you have a lot of solutions.
00:43:54.520 | So it's not so surprising that you can find them
00:43:57.960 | relatively easily.
00:43:59.240 | And this is because of the over-parameterization.
00:44:04.280 | - The over-parameterization sprinkles that entire space
00:44:07.960 | with solutions that are pretty good.
00:44:09.480 | - Yeah, it's not so surprising, right?
00:44:11.240 | It's like, if you have a system of linear equation
00:44:14.360 | and you have more unknowns than equations,
00:44:17.880 | then you have, we know,
00:44:19.000 | you have an infinite number of solutions.
00:44:22.000 | And the question is to pick one, that's another story.
00:44:25.400 | But you have an infinite number of solutions.
00:44:27.400 | So there are a lot of value of your unknowns
00:44:31.000 | that satisfy the equations.
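
The linear analogy in the exchange above can be made concrete. The sketch below (toy sizes chosen arbitrarily) sets up a system with many more unknowns than equations and exhibits two very different parameter vectors that both satisfy it exactly, which is the linear counterpart of the many zero-loss minima being described.

```python
import numpy as np

# Over-parameterized linear system: 5 equations, 50 unknowns.
rng = np.random.default_rng(1)
A = rng.normal(size=(5, 50))   # the "data" constraints
b = rng.normal(size=5)         # the targets

# One particular exact solution: the minimum-norm least-squares solution.
w0, *_ = np.linalg.lstsq(A, b, rcond=None)

# The null space of A gives 45 directions along which the fit does not change.
null_basis = np.linalg.svd(A)[2][5:]
w1 = w0 + 3.0 * null_basis[0]   # a very different, equally exact solution

print(np.allclose(A @ w0, b), np.allclose(A @ w1, b))   # both constraints satisfied
```
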
00:44:33.120 | - But it's possible that there's a lot of those solutions
00:44:36.080 | that aren't very good.
00:44:37.360 | What's surprising is that they're pretty good.
00:44:38.200 | - So that's a separate question.
00:44:39.440 | Why can you pick one that generalizes one?
00:44:42.840 | - Yeah, exactly.
00:44:43.680 | - But that's a separate question with separate answers.
00:44:46.320 | Yeah.
00:44:47.160 | - One theorem that people like to talk about
00:44:50.640 | that kind of inspires imagination
00:44:52.680 | of the power of neural networks is the universality,
00:44:55.640 | universal approximation theorem,
00:44:57.840 | that you can approximate any computable function
00:45:00.960 | with just a finite number of neurons
00:45:02.840 | in a single hidden layer.
00:45:04.360 | Do you find this theorem one surprising?
00:45:07.680 | Do you find it useful, interesting, inspiring?
00:45:11.560 | - No, this one, you know, I never found it very surprising.
00:45:16.440 | It was known since the '80s, since I entered the field,
00:45:21.440 | because it's basically the same as Weierstrass theorem,
00:45:27.160 | which says that I can approximate any continuous function
00:45:32.000 | with a polynomial of sufficiently,
00:45:34.600 | with a sufficient number of terms, monomials.
00:45:37.480 | - Yeah.
00:45:38.320 | - It's basically the same, and the proofs are very similar.
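
For reference, one standard way to state the parallel drawn here; the attributions are the usual ones in the literature (Cybenko; Hornik and colleagues), not something said in the conversation. Weierstrass: for any continuous $f$ on a closed interval and any $\varepsilon > 0$ there is a polynomial $p$ with $\sup_x |f(x) - p(x)| < \varepsilon$. The universal approximation theorem replaces the polynomial with a one-hidden-layer network:

$$\Big|\, f(x) - \sum_{i=1}^{N} c_i\,\sigma\!\left(w_i^{\top}x + b_i\right) \Big| < \varepsilon \quad \text{for all } x \text{ in a compact set,}$$

for a suitable non-polynomial activation $\sigma$ and a large enough number of units $N$. Neither statement says how large $N$ has to be, which is the point taken up next.
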
00:45:41.680 | - So your intuition was there was never any doubt
00:45:44.200 | that neural networks in theory could be
00:45:47.000 | very strong approximators.
00:45:48.040 | - Right, the question, the interesting question is that
00:45:51.680 | if this theorem says you can approximate, fine,
00:45:58.280 | but when you ask how many neurons, for instance,
00:46:03.160 | or in the case of polynomial, how many monomials,
00:46:06.400 | I need to get a good approximation.
00:46:09.440 | Then it turns out that that depends on the dimensionality
00:46:16.360 | of your function, how many variables you have.
00:46:20.560 | But it depends on the dimensionality of your function
00:46:22.840 | in a bad way.
00:46:25.080 | It's, for instance, suppose you want an error
00:46:28.680 | which is no worse than 10% in your approximation.
00:46:35.080 | You come up with a network that approximates your function
00:46:38.120 | within 10%.
00:46:40.480 | Then it turns out that the number of units you need
00:46:44.520 | are in the order of 10 to the dimensionality, D,
00:46:48.360 | how many variables.
00:46:50.080 | So if you have two variables, D is two,
00:46:54.680 | and you have 100 units, and okay.
00:46:57.320 | But if you have, say, 200 by 200 pixel images,
00:47:02.920 | now this is 40,000, whatever.
00:47:06.880 | - We again go to the size of the universe pretty quickly.
00:47:09.800 | - Exactly, 10 to the 40,000 or something.
00:47:12.500 | And so this is called the curse of dimensionality.
00:47:18.680 | Not quite appropriately.
00:47:20.720 | - And the hope is with the extra layers
00:47:24.200 | you can remove the curse.
00:47:28.040 | - What we proved is that if you have deep layers,
00:47:32.280 | hierarchical architecture with local connectivity
00:47:36.240 | of the type of convolutional deep learning,
00:47:39.160 | and if you're dealing with a function
00:47:41.760 | that has this kind of hierarchical architecture,
00:47:46.760 | then you avoid completely the curse.
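
A compressed restatement of the counting argument above, with the caveat that the exact exponents depend on smoothness assumptions not spelled out in the conversation. For a generic $d$-variable function and target accuracy $\varepsilon$, a shallow network needs on the order of

$$N_{\text{shallow}} \sim \varepsilon^{-d}$$

units, so a 10% error target ($\varepsilon = 0.1$) gives roughly $10^{d}$ units: $10^{2} = 100$ for two variables, but $10^{40{,}000}$ for a $200 \times 200$-pixel input. For compositional functions built from low-dimensional constituents, the result referred to here is, roughly, that a deep network matching the hierarchy needs only on the order of $N_{\text{deep}} \sim d\,\varepsilon^{-2}$ units (for two-variable constituents), removing the exponential dependence on $d$.
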
00:47:50.760 | - You've spoken a lot about supervised deep learning.
00:47:53.680 | - Yeah.
00:47:54.520 | - What are your thoughts, hopes, views,
00:47:56.480 | on the challenges of unsupervised learning
00:48:01.240 | with GANs, with generative adversarial networks?
00:48:05.680 | Do you see those as distinct, the power of GANs,
00:48:09.200 | do you see those as distinct from supervised methods
00:48:12.360 | in neural networks, or are they really all
00:48:14.160 | in the same representation ballpark?
00:48:16.640 | - GANs is one way to get estimation
00:48:21.280 | of probability densities, which is a somewhat new way
00:48:26.280 | that people have not done before.
00:48:30.360 | I don't know whether this will really play
00:48:34.680 | an important role in intelligence.
00:48:39.120 | It's interesting, I'm less enthusiastic about it
00:48:46.320 | than many people in the field.
00:48:48.680 | I have the feeling that many people in the field
00:48:50.880 | are really impressed by the ability
00:48:53.600 | of producing realistic-looking images
00:48:58.960 | in this generative way.
00:49:01.160 | - Which describes the popularity of the methods,
00:49:03.080 | but you're saying that while that's exciting and cool
00:49:06.360 | to look at, it may not be the tool that's useful for--
00:49:10.320 | - Yeah.
00:49:11.240 | - So you describe it kind of beautifully.
00:49:13.600 | Current supervised methods go N to infinity
00:49:16.360 | in terms of number of labeled points,
00:49:18.160 | and we really have to figure out how to go to N to one.
00:49:20.960 | - Yeah.
00:49:21.800 | - And you're thinking GANs might help,
00:49:23.240 | but they might not be the right--
00:49:25.160 | - I don't think for that problem,
00:49:27.680 | which I really think is important.
00:49:29.360 | I think they may help, they certainly have applications,
00:49:33.680 | for instance, in computer graphics.
00:49:35.880 | I did work long ago, which was a little bit similar
00:49:42.480 | in terms of saying, okay, I have a network
00:49:47.000 | and I present images, and I can,
00:49:51.800 | so input is images and output is, for instance,
00:49:55.360 | the pose of the image, a face, how much is smiling,
00:49:59.520 | is rotated 45 degrees or not.
00:50:02.760 | What about having a network that I train
00:50:06.360 | with the same data set, but now I invert input and output.
00:50:10.640 | Now the input is the pose or the expression,
00:50:14.960 | a number, set of numbers, and the output is the image,
00:50:18.320 | and I train it.
00:50:19.160 | And we did pretty good, interesting results
00:50:22.520 | in terms of producing very realistic looking images.
00:50:27.520 | It was a less sophisticated mechanism,
00:50:31.960 | less sophisticated than GANs,
00:50:35.320 | but the output was pretty much of the same quality.
00:50:38.840 | So I think for a computer graphics type application,
00:50:43.360 | yeah, definitely GANs can be quite useful,
00:50:46.240 | and not only for that, but for,
00:50:52.040 | you know, helping, for instance,
00:50:54.600 | on this problem of unsupervised example
00:50:58.200 | of reducing the number of labeled examples.
00:51:00.960 | I think people, it's like they think
00:51:06.480 | they can get out more than they put in.
00:51:09.440 | - There's no free lunches, you said.
00:51:13.120 | - Right.
00:51:13.960 | - So what do you think, what's your intuition?
00:51:16.300 | How can we slow the growth of N to infinity
00:51:21.320 | in supervised learning?
00:51:25.080 | So, for example, Mobileye has very successfully,
00:51:29.880 | I mean, essentially, annotated large amounts of data
00:51:33.040 | to be able to drive a car.
00:51:34.680 | Now, one thought is, so we're trying to teach machines,
00:51:39.280 | school of AI, and we're trying to,
00:51:42.600 | so how can we become better teachers, maybe?
00:51:46.040 | That's one way.
00:51:47.360 | - No, you gotta, you know,
00:51:50.120 | well, I like that, because one,
00:51:52.360 | again, one caricature of the history of computer science,
00:51:58.200 | you could say, is, it begins with programmers,
00:52:03.200 | expensive.
00:52:04.720 | - Yeah.
00:52:05.560 | - Continuous labelers, cheap.
00:52:08.160 | - Yeah.
00:52:09.600 | - And the future will be schools,
00:52:12.880 | like we have for kids.
00:52:14.600 | - Yeah. (laughs)
00:52:16.320 | - Currently, the labeling methods,
00:52:19.360 | we're not selective about which examples
00:52:23.800 | we teach networks with, so,
00:52:27.040 | I think the focus of making one,
00:52:29.960 | networks that learn much faster
00:52:31.320 | is often on the architecture side,
00:52:33.600 | but how can we pick better examples with which to learn?
00:52:37.760 | Do you have intuitions about that?
00:52:39.440 | - Well, that's part of the problem,
00:52:42.400 | but the other one is,
00:52:45.280 | if we look at biology,
00:52:50.280 | a reasonable assumption, I think, is,
00:52:53.560 | in the same spirit that I said,
00:52:58.160 | evolution is opportunistic and has weak priors.
00:53:03.160 | You know, the way I think the intelligence of a child,
00:53:08.200 | a baby may develop, is,
00:53:12.120 | by bootstrapping weak priors from evolution.
00:53:17.120 | For instance,
00:53:18.560 | you can assume that you have in most organisms,
00:53:26.520 | including human babies,
00:53:29.000 | built in some basic machinery
00:53:32.640 | to detect motion and relative motion.
00:53:36.980 | And in fact, there is, you know,
00:53:39.920 | we know all insects from fruit flies
00:53:42.960 | to other animals, they have this.
00:53:46.400 | Even in the retinas, in the very peripheral part,
00:53:53.120 | it's very conserved across species,
00:53:55.760 | something that evolution discovered early.
00:53:58.200 | It may be the reason why babies tend to look,
00:54:02.440 | in the first few days, to moving objects,
00:54:06.240 | and not to not moving objects.
00:54:08.400 | Now, moving objects means, okay,
00:54:10.040 | they are attracted by motion,
00:54:12.280 | but motion also means that motion gives
00:54:15.840 | automatic segmentation from the background.
00:54:19.440 | So because of motion boundaries,
00:54:23.640 | you know, either the object is moving,
00:54:26.720 | or the eye of the baby is tracking the moving object,
00:54:30.600 | and the background is moving, right?
00:54:32.900 | - Yeah, so just purely on the visual characteristics
00:54:36.080 | of the scene, that seems to be the most useful.
00:54:37.880 | - Right, so it's like looking at an object
00:54:41.680 | without background.
00:54:42.680 | It's ideal for learning the object,
00:54:45.720 | otherwise it's really difficult,
00:54:47.480 | because you have so much stuff.
00:54:50.400 | So suppose you do this at the beginning, first weeks,
00:54:53.960 | then after that, you can recognize object.
00:54:58.500 | Now, they are imprinted a number of,
00:55:00.360 | even in the background, even without motion.
00:55:05.760 | - So that's the, by the way, I just wanna ask,
00:55:08.600 | on the object recognition problem,
00:55:10.880 | so there is this being responsive to movement,
00:55:13.920 | and doing edge detection, essentially.
00:55:16.720 | What's the gap between being effective
00:55:21.040 | at visually recognizing stuff,
00:55:22.640 | detecting where it is, and understanding the scene?
00:55:27.440 | Is this a huge gap in many layers,
00:55:30.320 | or is it close?
00:55:32.920 | - No, I think that's a huge gap.
00:55:35.080 | I think present algorithm, with all the success
00:55:40.040 | that we have, and the fact that there are a lot
00:55:43.160 | of very useful, I think we are in a golden age
00:55:47.120 | for applications of low-level vision,
00:55:51.720 | and low-level speech recognition, and so on,
00:55:54.280 | you know, Alexa, and so on.
00:55:56.840 | There are many more things of similar level to be done,
00:55:59.920 | including medical diagnosis, and so on,
00:56:02.020 | but we are far from what we call understanding
00:56:05.540 | of a scene, of language, of actions, of people.
00:56:10.540 | That is, despite the claims, that's, I think, very far.
00:56:16.940 | - We're a little bit off.
00:56:19.460 | So, in popular culture, and among many researchers,
00:56:23.140 | some of which I've spoken with,
00:56:24.780 | the Stuart Russell, and Elon Musk,
00:56:27.480 | in and out of the AI field, there's a concern
00:56:31.660 | about the existential threat of AI.
00:56:33.460 | And how do you think about this concern?
00:56:37.900 | And is it valuable to think about large-scale,
00:56:44.820 | long-term, unintended consequences
00:56:47.340 | of intelligent systems we try to build?
00:56:51.440 | - I always think it's better to worry first,
00:56:55.140 | you know, early, rather than late.
00:56:58.660 | - So worry is good.
00:56:59.620 | - Yeah, I'm not against worrying at all.
00:57:03.060 | Personally, I think that, you know,
00:57:08.060 | it will take a long time before there is
00:57:12.140 | real reason to be worried.
00:57:14.480 | But, as I said, I think it's good to put in place
00:57:19.460 | and think about possible safety against.
00:57:23.400 | What I find a bit misleading are things like,
00:57:28.480 | that have been said by people I know,
00:57:30.300 | like Elon Musk, and what is, Bostrom, in particular?
00:57:35.300 | What is his first name?
00:57:36.780 | - Nick Bostrom.
00:57:37.620 | - Nick Bostrom, right.
00:57:38.540 | You know, and a couple of other people, who have said,
00:57:41.680 | for instance, that AI is more dangerous than nuclear weapons.
00:57:46.100 | - Right.
00:57:46.940 | - I think that's really wrong.
00:57:49.540 | That can be, it's misleading, right?
00:57:52.740 | Because in terms of priority, we should still
00:57:56.460 | be more worried about nuclear weapons,
00:57:59.520 | and, you know, what people are doing about it,
00:58:02.400 | and so on, than AI.
00:58:04.320 | - And you've spoken about Demis Hassabis,
00:58:09.960 | and yourself saying that you think it'll be
00:58:13.160 | about 100 years out before we have
00:58:17.520 | a general intelligence system that's on par
00:58:19.360 | with a human being.
00:58:20.600 | Do you have any updates for those predictions?
00:58:22.560 | - Well, I think he said--
00:58:24.080 | - He said 20, I think.
00:58:25.120 | - He said 20, right.
00:58:26.240 | This was a couple of years ago.
00:58:27.700 | I have not asked him again, so should I?
00:58:31.500 | - Your own prediction.
00:58:32.880 | What's your prediction about when you'll be truly surprised?
00:58:38.860 | And what's the confidence interval on that?
00:58:41.860 | - You know, it's so difficult to predict the future,
00:58:45.220 | and even the present sometimes.
00:58:47.260 | - It's pretty hard to predict, yeah.
00:58:48.100 | - Right, but I would be, as I said,
00:58:51.420 | this is completely, I would be more like
00:58:55.580 | Rod Brooks, I think he says about 200 years.
00:58:59.040 | - 200 years.
00:58:59.880 | When we have this kind of AGI system,
00:59:04.860 | artificial general intelligence system,
00:59:06.920 | you're sitting in a room with her, him, it,
00:59:11.240 | do you think it will be, the underlying design
00:59:16.360 | of such a system is something we'll be able to understand?
00:59:19.080 | It'll be simple?
00:59:20.460 | Do you think it'll be explainable?
00:59:24.900 | Understandable by us?
00:59:27.580 | Your intuition, again, we're in the realm
00:59:30.060 | of philosophy a little bit.
00:59:32.100 | - Well, probably no, but again,
00:59:37.100 | it depends what you really mean for understanding.
00:59:42.020 | So I think, you know, we don't understand
00:59:47.020 | what, how deep networks work.
00:59:53.340 | I think we're beginning to have a theory now,
00:59:56.500 | but in the case of deep networks,
00:59:59.220 | or even in the case of the simple,
01:00:01.540 | simpler kernel machines or linear classifiers,
01:00:06.340 | we really don't understand the individual units also.
01:00:11.340 | But we understand, you know, what the computation
01:00:15.980 | and the limitations and the properties of it are.
01:00:19.280 | It's similar to many things.
01:00:22.500 | You know, we, what does it mean to understand
01:00:25.940 | how a fusion bomb works?
01:00:29.620 | How many of us, you know, many of us understand
01:00:33.780 | the basic principle, and some of us
01:00:37.520 | may understand deeper details.
01:00:40.660 | - In that sense, understanding is, as a community,
01:00:43.460 | as a civilization, can we build another copy of it?
01:00:46.660 | - Okay.
01:00:47.500 | - And in that sense, do you think there'll be,
01:00:50.740 | there'll need to be some evolutionary component
01:00:53.780 | where it runs away from our understanding?
01:00:56.260 | Or do you think it could be engineered from the ground up?
01:00:59.340 | The same way you go from the transistor to PowerPoint.
01:01:02.300 | - Right, so many years ago, this was actually,
01:01:06.140 | let me see, 40, 41 years ago,
01:01:09.260 | I wrote a paper with David Marr,
01:01:13.420 | who was one of the founding fathers of computer vision,
01:01:18.020 | computational vision.
01:01:20.580 | I wrote a paper about levels of understanding,
01:01:23.820 | which is related to the question we discussed earlier
01:01:26.180 | about understanding PowerPoint,
01:01:28.700 | understanding transistors and so on.
01:01:30.840 | And, you know, in that kind of framework,
01:01:36.420 | we had a level of the hardware
01:01:38.820 | and the top level of the algorithms.
01:01:41.160 | We did not have learning.
01:01:45.060 | Recently, I updated it, adding levels,
01:01:48.300 | and one level I added to those three was learning.
01:01:53.060 | So, and you can imagine, you could have a good understanding
01:01:59.340 | of how you construct a learning machine, like we do.
01:02:03.420 | But being unable to describe in detail
01:02:08.820 | what the learning machines will discover, right?
01:02:13.740 | Now, that would be still a powerful understanding
01:02:17.100 | if I can build a learning machine,
01:02:19.420 | even if I don't understand in detail
01:02:23.020 | every time it learns something.
01:02:25.300 | - Just like our children, if they start listening
01:02:28.860 | to a certain type of music, I don't know,
01:02:32.020 | Miley Cyrus or something, you don't understand
01:02:34.820 | why they came to that particular preference,
01:02:37.640 | but you understand the learning process.
01:02:39.420 | That's very interesting.
01:02:40.460 | - Yeah, yeah.
01:02:41.460 | - So, on learning for some people,
01:02:46.740 | for systems to be part of our world,
01:02:50.420 | it has a certain, one of the challenging things
01:02:53.460 | that you've spoken about is learning ethics,
01:02:56.940 | learning morals, and how hard do you think
01:03:01.140 | is the problem of, first of all,
01:03:04.540 | humans understanding our ethics?
01:03:06.860 | What is the origin of ethics at the neural, low level?
01:03:10.620 | What is it at the higher level?
01:03:12.460 | Is it something that's learnable
01:03:14.420 | from machines in your intuition?
01:03:16.280 | - I think, yeah, ethics is learnable, very likely.
01:03:22.960 | I think it's one of these problems where
01:03:27.180 | we need to think about understanding the neuroscience of ethics.
01:03:34.740 | You know, people discuss there is an ethics of neuroscience.
01:03:39.660 | (laughing)
01:03:41.500 | - Yeah, yes.
01:03:42.540 | - You know, how a neuroscientist should or should not behave.
01:03:45.900 | You can think of a neurosurgeon and the ethics
01:03:49.980 | that he or she has to follow.
01:03:53.940 | But I'm more interested on the neuroscience of--
01:03:57.660 | - You're blowing my mind right now,
01:03:58.820 | the neuroscience of ethics, it's very meta.
01:04:01.100 | - Yeah, and I think that would be important to understand
01:04:04.780 | also for being able to design machines
01:04:09.420 | that are ethical machines in our sense of ethics.
01:04:14.420 | - And you think there is something in neuroscience,
01:04:18.540 | there's patterns, tools in neuroscience
01:04:21.540 | that could help us shed some light on ethics?
01:04:25.340 | Or is it more so in psychology, sociology,
01:04:28.980 | at a higher level?
01:04:29.860 | - No, there is psychology, but there is also,
01:04:32.300 | in the meantime, there is evidence,
01:04:36.820 | fMRI of specific areas of the brain
01:04:41.140 | that are involved in certain ethical judgment.
01:04:44.500 | And not only this, you can stimulate those areas
01:04:47.620 | with magnetic fields and change the ethical decisions.
01:04:52.620 | - Yeah, wow.
01:04:56.380 | - So that's work by a colleague of mine, Rebecca Saxe,
01:05:00.780 | and there are other researchers doing similar work.
01:05:05.300 | And I think this is the beginning,
01:05:08.260 | but ideally at some point we'll have an understanding
01:05:12.980 | of how this works and why it evolved, right?
01:05:17.420 | - The big why question, yeah, it must have some purpose.
01:05:21.980 | - Yeah, obviously it has some social purposes, probably.
01:05:26.980 | - If neuroscience holds the key to at least illuminate
01:05:33.600 | some aspect of ethics, that means it could be
01:05:35.700 | a learnable problem.
01:05:37.100 | - Yeah, exactly.
01:05:38.860 | - And as we're getting into harder and harder questions,
01:05:42.020 | let's go to the hard problem of consciousness.
01:05:45.460 | Is this an important problem for us to think about
01:05:49.020 | and solve on the engineering of intelligence side
01:05:52.620 | of your work, of our dream?
01:05:55.780 | - You know, it's unclear.
01:05:57.460 | So, you know, again, this is a deep problem,
01:06:02.700 | partly because it's very difficult to define consciousness.
01:06:06.880 | And there is a debate among neuroscientists
01:06:11.880 | and philosophers, of course,
01:06:21.200 | about whether consciousness is something
01:06:25.800 | that requires flesh and blood, so to speak,
01:06:31.480 | or could be, you know, that we could have silicon devices
01:06:36.480 | that are conscious, or up to statement like everything
01:06:43.360 | has some degree of consciousness and some more than others.
01:06:48.520 | This is like Giulio Tononi and Phi.
01:06:53.520 | - We just recently talked to Christoph Koch.
01:06:56.280 | - Okay, yeah, Christoph was my first graduate student.
01:07:00.680 | - Do you think it's important to illuminate aspects
01:07:04.960 | of consciousness in order to engineer intelligence systems?
01:07:09.960 | Do you think an intelligence system
01:07:12.440 | would ultimately have consciousness?
01:07:14.480 | Are they interlinked?
01:07:16.260 | - You know, most of the people working
01:07:21.000 | in artificial intelligence, I think, would answer
01:07:24.440 | we don't strictly need consciousness
01:07:27.280 | to have an intelligence system.
01:07:30.040 | - That's sort of the easier question,
01:07:31.840 | because it's a very engineering answer to the question.
01:07:35.680 | - Yes.
01:07:36.520 | - Pass the Turing test, we don't need consciousness.
01:07:38.160 | But if you were to go, do you think it's possible
01:07:42.600 | that we need to have that kind of self-awareness?
01:07:47.600 | - We may, yes.
01:07:49.920 | So for instance, I personally think that when we test
01:07:56.720 | a machine or a person in a Turing test,
01:08:00.520 | in an extended Turing test, I think consciousness
01:08:05.520 | is part of what we require in that test,
01:08:11.120 | implicitly, to say that this is intelligent.
01:08:15.080 | Christoph disagrees.
01:08:17.160 | - Yes, he does.
01:08:18.560 | Despite many other romantic notions he holds,
01:08:23.480 | he disagrees with that one.
01:08:24.760 | - Yes, that's right.
01:08:26.520 | So, you know, we'll see.
01:08:28.360 | - Do you think, as a quick question,
01:08:33.000 | Ernest Becker of fear of death,
01:08:36.840 | do you think mortality and those kinds of things
01:08:41.920 | are important for, well, for consciousness
01:08:46.920 | and for intelligence, the finiteness of life,
01:08:51.500 | finiteness of existence, or is that just a side effect
01:08:55.760 | of evolutionary side effect that's useful
01:08:58.600 | for natural selection?
01:09:01.120 | Do you think this kind of thing that,
01:09:03.280 | this interview is gonna run out of time soon,
01:09:05.680 | our life will run out of time soon,
01:09:08.080 | do you think that's needed to make this conversation good
01:09:10.560 | and life good?
01:09:12.040 | - You know, I never thought about it.
01:09:13.480 | It's a very interesting question.
01:09:15.920 | I think Steve Jobs in his commencement speech
01:09:21.240 | at Stanford argued that having a finite life
01:09:26.240 | was important for stimulating achievements.
01:09:30.280 | So it was a different--
01:09:31.680 | - You live every day like it's your last, right?
01:09:33.760 | - Yeah, yeah.
01:09:34.840 | So, rationally, I don't think strictly
01:09:39.400 | you need mortality for consciousness, but--
01:09:44.120 | - Who knows?
01:09:46.200 | They seem to go together in our biological system, right?
01:09:48.760 | - Yeah, yeah.
01:09:50.840 | - You've mentioned before,
01:09:53.360 | and students are associated with,
01:09:56.160 | AlphaGo and Mobileye, the big recent success stories in AI.
01:10:01.320 | I think it's captivated the entire world
01:10:04.080 | of what AI can do.
01:10:06.060 | So what do you think will be the next breakthrough?
01:10:10.680 | What's your intuition about the next breakthrough?
01:10:13.720 | - Of course, I don't know where the next breakthrough is.
01:10:16.800 | I think that there is a good chance, as I said before,
01:10:21.440 | that the next breakthrough would also be inspired by,
01:10:25.020 | you know, neuroscience.
01:10:26.260 | But which one, I don't know.
01:10:32.320 | - And there's, so MIT has this Quest for Intelligence.
01:10:35.840 | And there's a few moonshots, which, in that spirit,
01:10:39.240 | which ones are you excited about?
01:10:41.360 | Which projects kind of--
01:10:44.120 | - Well, of course I'm excited about one of the moonshots,
01:10:48.760 | which is our Center for Brains, Minds, and Machines,
01:10:51.760 | which is the one which is fully funded by NSF.
01:10:56.760 | And it is about visual intelligence.
01:11:02.680 | - And that one is particularly about understanding.
01:11:06.240 | - Visual intelligence, so the visual cortex,
01:11:09.360 | and visual intelligence in the sense of
01:11:14.360 | how we look around ourselves
01:11:17.360 | and understand the world around ourselves,
01:11:22.000 | you know, meaning what is going on,
01:11:25.480 | how we could go from here to there
01:11:29.080 | without hitting obstacles.
01:11:30.900 | You know, whether there are other agents,
01:11:34.400 | people in the environment.
01:11:36.760 | These are all things that we perceive very quickly.
01:11:41.240 | And it's something actually quite close to being conscious,
01:11:46.240 | not quite, but there is this interesting experiment
01:11:50.360 | that was run at Google X, which is, in a sense,
01:11:54.800 | just a virtual reality experiment,
01:11:58.860 | but in which they had subjects sitting, say, in a chair,
01:12:03.880 | with goggles, like Oculus and so on,
01:12:07.140 | earphones, and they were seeing through the eyes
01:12:14.040 | of a robot nearby, two cameras,
01:12:17.540 | microphones for receiving.
01:12:19.900 | So their sensory system was there, right?
01:12:23.760 | And the impression of all the subjects,
01:12:27.200 | very strong, they could not shake it off,
01:12:30.280 | was that they were where the robot was.
01:12:33.620 | They could look at themselves from the robot
01:12:38.580 | and still feel they were where the robot is.
01:12:42.880 | They were looking at their body.
01:12:44.480 | Their self had moved.
01:12:48.480 | - So some aspect of scene understanding
01:12:50.400 | has to have ability to place yourself,
01:12:53.600 | have a self-awareness about your position in the world
01:12:57.640 | and what the world is.
01:12:59.560 | So we may have to solve the hard problem
01:13:03.280 | of consciousness to solve it.
01:13:04.840 | - On their way, yes.
01:13:05.960 | - It's quite a moonshot.
01:13:07.760 | So you've been an advisor to some incredible minds,
01:13:12.400 | including Demis Hassabis, Christophe Koch,
01:13:14.880 | Amnon Shashua, like you said,
01:13:17.320 | all went on to become seminal figures
01:13:20.120 | in their respective fields.
01:13:21.960 | From your own success as a researcher
01:13:24.240 | and from your perspective as a mentor of these researchers,
01:13:29.320 | having guided them,
01:13:30.560 | in the way of advice,
01:13:34.200 | what does it take to be successful in science
01:13:36.400 | and engineering careers?
01:13:37.880 | Whether you're talking to somebody in their teens,
01:13:43.320 | 20s and 30s, what does that path look like?
01:13:46.340 | - It's curiosity and having fun.
01:13:51.520 | And I think it's important also having fun
01:13:58.280 | with other curious minds.
01:14:01.500 | - It's the people you surround with too.
01:14:04.520 | So fun and curiosity.
01:14:06.800 | You mentioned Steve Jobs.
01:14:09.960 | Is there also an underlying ambition
01:14:13.160 | that's unique that you saw
01:14:14.720 | or does it really boil down
01:14:16.440 | to insatiable curiosity and fun?
01:14:18.800 | - Well, of course, it's being curious
01:14:22.220 | in an active and ambitious way, yes.
01:14:26.480 | And definitely.
01:14:29.640 | But I think sometime in science,
01:14:33.840 | there are friends of mine who are like this.
01:14:37.020 | You know, there are some of the scientists
01:14:40.680 | like to work by themselves
01:14:42.880 | and kind of communicate only when they
01:14:49.080 | complete their work or discover something.
01:14:54.320 | I think I always found the actual process
01:14:58.720 | of discovering something is more fun
01:15:03.720 | if it's together with other intelligent
01:15:07.280 | and curious and fun people.
01:15:09.240 | - So if you see the fun in that process,
01:15:11.320 | the side effect of that process
01:15:13.200 | would be that you'll actually end up discovering
01:15:14.920 | some interesting things.
01:15:16.320 | So as you've led many incredible efforts here,
01:15:23.280 | what's the secret to being a good advisor,
01:15:25.480 | mentor, leader in a research setting?
01:15:28.320 | Is it a similar spirit?
01:15:30.200 | Or yeah, what advice could you give to people,
01:15:34.000 | young faculty and so on?
01:15:35.920 | - It's partly repeating what I said
01:15:38.280 | about an environment that should be friendly
01:15:41.240 | and fun and ambitious.
01:15:44.400 | And I think I learned a lot
01:15:49.240 | from some of my advisors and friends
01:15:52.840 | and some were physicists.
01:15:55.240 | And there was, for instance,
01:15:57.440 | this behavior that was encouraged
01:16:02.440 | of when somebody comes with a new idea in the group,
01:16:06.680 | you are, unless it's really stupid,
01:16:09.040 | but you are always enthusiastic.
01:16:10.920 | And then, and you're enthusiastic
01:16:13.520 | for a few minutes, for a few hours.
01:16:15.080 | Then you start, you know,
01:16:18.640 | asking critically a few questions,
01:16:21.360 | testing this.
01:16:23.000 | But, you know, this is a process that is,
01:16:26.240 | I think it's very good.
01:16:28.240 | You have to be enthusiastic.
01:16:30.480 | Sometimes people are very critical from the beginning.
01:16:33.880 | That's not--
01:16:36.240 | - Yes, you have to give it a chance
01:16:37.560 | for that seed to grow.
01:16:39.360 | That said, with some of your ideas,
01:16:41.600 | which are quite revolutionary,
01:16:42.760 | so there's, I've witnessed,
01:16:44.280 | especially in the human vision side
01:16:45.800 | and neuroscience side,
01:16:47.280 | there could be some pretty heated arguments.
01:16:49.960 | Do you enjoy these?
01:16:51.120 | Is that a part of science and academic pursuits
01:16:54.440 | that you enjoy?
01:16:55.400 | - Yeah.
01:16:56.240 | - Is it, is that something that happens
01:16:59.200 | in your group as well?
01:17:00.960 | - Yeah, absolutely.
01:17:02.360 | I also spent some time in Germany.
01:17:04.280 | Again, there is this tradition
01:17:05.800 | in which people are more forthright,
01:17:10.800 | less kind than here.
01:17:14.120 | So, you know, in the US,
01:17:18.520 | when you write a bad letter,
01:17:20.080 | you still say, "This guy's nice."
01:17:22.560 | - Yes, yes.
01:17:24.080 | Yeah, here in America, it's degrees of nice.
01:17:28.800 | - Yes.
01:17:29.640 | - It's all just degrees of nice, yeah.
01:17:31.000 | - Right, right.
01:17:31.840 | So, as long as this does not become personal,
01:17:34.920 | and it's really like, you know,
01:17:39.720 | a football game with its rules,
01:17:42.680 | that's great.
01:17:43.520 | - That's fun.
01:17:45.480 | So, if you somehow find yourself in a position
01:17:49.240 | to ask one question of an oracle,
01:17:51.800 | like a genie, maybe a god,
01:17:54.200 | and you're guaranteed to get a clear answer,
01:17:57.680 | what kind of question would you ask?
01:18:01.280 | What would be the question you would ask?
01:18:03.600 | - In the spirit of our discussion,
01:18:06.000 | it could be, how could I become 10 times more intelligent?
01:18:10.160 | (laughing)
01:18:11.760 | - And so, but see, you only get a clear, short answer.
01:18:16.200 | So, do you think there's a clear, short answer to that?
01:18:18.640 | - No.
01:18:19.480 | (laughing)
01:18:20.720 | - And that's the answer you'll get.
01:18:22.520 | Okay.
01:18:23.640 | So, you've mentioned "Flowers for Algernon."
01:18:26.880 | - Oh, yeah.
01:18:27.960 | - As a story that inspired you in your childhood,
01:18:31.760 | as this story of a mouse and
01:18:37.120 | a human achieving genius-level intelligence,
01:18:39.360 | and then understanding what was happening
01:18:41.480 | while slowly becoming not intelligent again
01:18:44.200 | in this tragedy of gaining intelligence
01:18:46.560 | and losing intelligence.
01:18:48.600 | Do you think, in that spirit, in that story,
01:18:51.440 | do you think intelligence is a gift or a curse,
01:18:55.360 | from the perspective of happiness and meaning of life?
01:19:00.160 | You try to create an intelligent system
01:19:02.200 | that understands the universe,
01:19:03.880 | but on an individual level, the meaning of life,
01:19:06.480 | do you think intelligence is a gift?
01:19:08.440 | - It's a good question.
01:19:17.120 | I don't know.
01:19:17.960 | - As one of the,
01:19:23.680 | as one of the people considered the smartest in the world,
01:19:28.680 | in some dimension, at the very least,
01:19:31.280 | what do you think?
01:19:33.280 | - I don't know, the meaning of life may be invariant to intelligence,
01:19:37.560 | let alone the degree of happiness.
01:19:39.680 | Would be nice if it were.
01:19:41.200 | - That's the hope.
01:19:44.720 | - Yeah.
01:19:46.160 | - You could be smart and happy and clueless and happy.
01:19:50.160 | - Yeah.
01:19:51.800 | - As always, on the discussion of the meaning of life,
01:19:54.480 | it's probably a good place to end.
01:19:57.320 | Tomasso, thank you so much for talking today.
01:19:59.240 | - Thank you, this was great.
01:20:00.640 | (upbeat music)