
Wojciech Zaremba: OpenAI Codex, GPT-3, Robotics, and the Future of AI | Lex Fridman Podcast #215


Chapters

0:00 Introduction
1:18 The Fermi paradox
8:20 Systems of government
10:57 Life, intelligence, and consciousness
18:10 GPT language model
20:23 Engineering consciousness
24:40 Is there an algorithm for intelligence?
32:02 Neural networks and deep learning
44:53 Human reward functions
49:47 Love is part of the human condition
52:14 Expanding our circle of empathy
56:19 Psychedelics and meditation
67:46 Ilya Sutskever
75:03 How does GPT work?
84:56 AI safety
91:42 OpenAI Codex
105:15 Robotics
114:32 Developing self-driving cars and robots
125:23 What is the benchmark for intelligence?
128:25 Will we spend more time in virtual reality?
130:39 AI Friendships
140:09 Sleep
142:43 Generating good ideas
149:08 Advice for young people
153:52 Getting started with machine learning
157:05 What is beauty?
160:56 Death
167:44 Meaning of life


00:00:00.000 | The following is a conversation with Wojciech Zaremba,
00:00:03.480 | co-founder of OpenAI,
00:00:05.520 | which is one of the top organizations in the world
00:00:08.460 | doing artificial intelligence research and development.
00:00:12.340 | Wojciech is the head of the language and code generation teams,
00:00:16.820 | building and doing research on GitHub Copilot,
00:00:20.240 | OpenAI Codex, and GPT-3,
00:00:24.040 | and who knows, 4, 5, 6, N,
00:00:28.140 | and N+1, and he also previously led OpenAI's robotic efforts.
00:00:33.140 | These are incredibly exciting projects to me
00:00:37.620 | that deeply challenge and expand our understanding
00:00:41.400 | of the structure and nature of intelligence.
00:00:44.080 | The 21st century, I think, may very well be remembered
00:00:48.320 | for a handful of revolutionary AI systems
00:00:51.480 | and their implementations.
00:00:53.240 | GPT, Codex, and applications of language models
00:00:56.880 | and transformers in general to the language
00:00:59.320 | and visual domains may very well be
00:01:02.740 | at the core of these AI systems.
00:01:05.820 | To support this podcast, please check out our sponsors.
00:01:09.400 | They're listed in the description.
00:01:11.760 | This is the Lex Fridman Podcast,
00:01:13.980 | and here is my conversation with Wojciech Zaremba.
00:01:17.380 | You mentioned that Sam Altman asked
00:01:21.280 | about the Fermi Paradox,
00:01:24.200 | and the people at OpenAI had really sophisticated,
00:01:27.280 | interesting answers, so that's when you knew
00:01:29.720 | this is the right team to be working with.
00:01:31.780 | So let me ask you about the Fermi Paradox, about aliens.
00:01:35.580 | Why have we not found overwhelming evidence
00:01:39.400 | for aliens visiting Earth?
00:01:41.520 | - I don't have a conviction in the answer,
00:01:43.120 | but rather kind of probabilistic perspective
00:01:45.620 | on what might be a, let's say, possible answer.
00:01:47.960 | It's also interesting that the question itself
00:01:50.600 | even can touch on your typical question
00:01:54.680 | of what's the meaning of life,
00:01:55.880 | because if you assume that we don't see aliens
00:01:58.400 | because they destroy themselves,
00:01:59.960 | that kind of upweights the focus
00:02:03.840 | on making sure that we won't destroy ourselves.
00:02:06.780 | At the moment, the place where I am actually
00:02:10.720 | with my belief, and these things also change over the time,
00:02:14.160 | is I think that we might be alone in the universe,
00:02:18.040 | which actually makes life more,
00:02:20.440 | or let's say conscious life, more kind of valuable,
00:02:23.500 | and that means that we should more appreciate it.
00:02:26.560 | - Have we always been alone?
00:02:27.760 | So what's your intuition about our galaxy, our universe?
00:02:30.840 | Is it just sprinkled with graveyards
00:02:34.120 | of intelligent civilizations, or are we truly,
00:02:37.100 | is life, intelligent life, truly unique?
00:02:40.200 | - At the moment, my belief is that it is unique,
00:02:42.680 | but I would say I could also, you know,
00:02:45.200 | there was some footage released with UFO objects,
00:02:49.360 | which makes me actually doubt my own belief.
00:02:51.840 | - Yes.
00:02:53.200 | - Yeah, I can tell you one crazy answer that I have heard.
00:02:56.240 | - Yes.
00:02:57.520 | - So apparently, when you look actually
00:03:00.440 | at the limits of computation, you can compute more
00:03:04.360 | if the temperature of the universe would drop down.
00:03:08.560 | So one of the things that aliens might want to do
00:03:13.220 | if they are truly optimizing to maximize amount of compute,
00:03:16.560 | which, you know, maybe can lead to,
00:03:18.440 | or let's say simulations or so,
00:03:20.440 | is, instead of wasting the current entropy of the universe,
00:03:24.600 | because, you know, we, by living,
00:03:25.880 | we are actually somewhat wasting entropy,
00:03:28.880 | then you can wait for the universe to cool down
00:03:31.640 | such that you have more computation.
00:03:33.320 | So that's kind of a funny answer.
00:03:34.520 | I'm not sure if I believe in it,
00:03:36.360 | but that would be one of the reasons
00:03:37.720 | why you don't see aliens.
00:03:39.760 | It's also possible, see, some people say
00:03:42.520 | that maybe there is not that much point
00:03:44.960 | in actually going to other galaxies if you can go inwards.
00:03:49.440 | So there is no limits of what could be an experience
00:03:53.360 | if we could, you know, connect machines to our brains,
00:03:57.320 | while there are still some limits
00:03:58.620 | if we want to explore universe.
00:04:00.280 | - Yeah, there could be a lot of ways to go inwards too.
00:04:05.000 | Once you figure out some aspect of physics
00:04:07.560 | we haven't figured out yet,
00:04:09.000 | maybe you can travel to different dimensions.
00:04:11.020 | I mean, travel in three-dimensional space
00:04:16.000 | may not be the most fun kind of travel.
00:04:19.120 | There may be like just a huge amount
00:04:21.000 | of different ways to travel.
00:04:22.520 | And it doesn't require a spaceship going slowly
00:04:26.280 | in 3D space through space-time.
00:04:28.360 | - It also feels, you know, one of the problems
00:04:31.000 | is that speed of light is low and the universe is vast.
00:04:34.880 | And it seems that actually most likely
00:04:37.920 | if we want to travel very far,
00:04:40.320 | then we would instead of actually sending spaceships
00:04:45.320 | with humans that weigh a lot,
00:04:46.980 | we would send something similar
00:04:49.020 | to what Yuri Milner is working on.
00:04:51.060 | These are like a huge sail, which is at first powered
00:04:55.580 | by a laser shot from Earth,
00:04:57.740 | and it can propel it to a quarter of the speed of light.
00:05:01.300 | And sail itself contains a few grams of equipment.
00:05:06.300 | And that might be the way to actually
00:05:10.020 | transport matter through universe.
00:05:12.380 | But then when you think what would it mean for humans,
00:05:14.780 | it means that we would need to actually
00:05:16.780 | put there a 3D printer and, you know,
00:05:18.900 | 3D print a human on another planet.
00:05:20.840 | I don't know, play them YouTube or let's say,
00:05:23.460 | or 3D print like a huge human right away,
00:05:26.080 | or maybe a womb or so.
00:05:27.420 | - With our current techniques of archeology,
00:05:32.680 | if a civilization was born and died
00:05:35.580 | long enough ago on earth,
00:05:38.940 | we wouldn't be able to tell.
00:05:40.540 | And so that makes me really sad.
00:05:43.380 | And so I think about earth in that same way.
00:05:45.620 | How can we leave some remnants if we do destroy ourselves?
00:05:50.180 | How can we leave remnants for aliens
00:05:52.100 | in the future to discover?
00:05:54.100 | Like, here's some nice stuff we've done.
00:05:56.580 | Like Wikipedia and YouTube,
00:05:58.300 | do we have it like in a satellite orbiting earth
00:06:02.180 | with a hard drive?
00:06:03.620 | Like how do we say, how do we back up human civilization
00:06:08.500 | for the good parts or all of it is good parts
00:06:12.100 | so that it can be preserved longer than our bodies can?
00:06:16.620 | That's kind of a, it's a difficult question.
00:06:20.380 | It also requires the difficult acceptance
00:06:22.900 | of the fact that we may die.
00:06:24.700 | And if we die, we may die suddenly as a civilization.
00:06:29.340 | - So let's see.
00:06:30.540 | I think it kind of depends on the cataclysm.
00:06:33.180 | We have observed in other parts of the universe
00:06:35.260 | that bursts of gamma rays,
00:06:37.820 | these are high-energy rays of light
00:06:41.940 | that actually can apparently kill an entire galaxy.
00:06:45.300 | So there might be actually nothing even to,
00:06:49.340 | nothing to protect us from it.
00:06:51.220 | I'm also looking actually at past civilizations,
00:06:53.820 | like the Aztecs or so,
00:06:55.900 | they disappeared from the surface of the earth.
00:06:59.140 | And one can ask, why is it the case?
00:07:02.540 | And the way I'm thinking about it is,
00:07:06.340 | you know, that definitely they had some problem
00:07:09.220 | that they couldn't solve.
00:07:11.020 | And maybe there was a flood
00:07:12.700 | and all of a sudden they couldn't drink,
00:07:14.500 | there was no potable water and they all died.
00:07:17.500 | And I think that so far,
00:07:22.100 | the best solution to such problems is I guess technology.
00:07:27.700 | So, I mean, if they would know that you can just boil water
00:07:30.580 | and then drink it after,
00:07:32.140 | then that would save their civilization.
00:07:34.220 | And even now, when we look actually at the current pandemic,
00:07:37.860 | it seems that once again, actually science comes to rescue
00:07:41.420 | and somehow science increases size of the action space.
00:07:45.260 | And I think that's a good thing.
00:07:47.500 | - Yeah, but nature has a vastly larger action space.
00:07:52.500 | - But still it might be a good thing for us
00:07:54.740 | to keep on increasing action space.
00:07:56.540 | (laughing)
00:07:58.380 | - Okay, looking at past civilizations, yes.
00:08:01.940 | But looking at the destruction of human civilization,
00:08:05.860 | perhaps expanding the action space will add actions
00:08:09.860 | that are easily acted upon, easily executed,
00:08:15.860 | and as a result, destroy us.
00:08:18.980 | - So let's see.
00:08:20.460 | I was pondering why actually even we have negative impact
00:08:25.460 | on the globe, because, you know,
00:08:27.900 | if you ask every single individual,
00:08:30.460 | they would like to have clean air,
00:08:32.780 | they would like healthy planet,
00:08:34.060 | but somehow, as a collective,
00:08:36.300 | we are not going in this direction.
00:08:39.300 | I think that there exists very powerful system
00:08:43.140 | to describe what we value, that's capitalism.
00:08:45.340 | It assigns actually monetary values to various activities.
00:08:49.140 | At the moment, the problem in the current system
00:08:51.820 | is that there are some things which we value,
00:08:54.140 | there is no cost assigned to it.
00:08:55.820 | So even though we value clean air,
00:08:58.340 | or maybe we also value lack of distraction
00:09:03.340 | on, let's say, the internet or so,
00:09:05.300 | at the moment, these quantities, you know,
00:09:08.380 | companies, corporations can pollute them for free.
00:09:11.900 | So in some sense, I wish, or like,
00:09:17.780 | that's I guess purpose of politics
00:09:21.060 | to align the incentive systems.
00:09:23.980 | And we are kind of maybe even moving in this direction.
00:09:26.700 | The first issue is even to be able to measure
00:09:28.900 | the things that we value,
00:09:30.660 | then we can actually assign the monetary value to them.
00:09:34.340 | - Yeah, and that's, so it's getting the data
00:09:36.820 | and also probably through technology,
00:09:39.580 | enabling people to vote and to move money around
00:09:44.580 | in a way that is aligned with their values.
00:09:46.980 | And that's very much a technology question.
00:09:50.300 | So like having one president and Congress and voting
00:09:55.900 | that happens every four years or something like that,
00:09:59.060 | that's a very outdated idea.
00:10:00.660 | There could be some technological improvements
00:10:02.620 | to that kind of idea.
00:10:03.660 | - So I'm thinking from time to time about these topics,
00:10:06.740 | but it's also feels to me that it's a little bit like,
00:10:10.380 | it's hard for me to actually make correct predictions
00:10:12.740 | what is the appropriate thing to do.
00:10:14.740 | I extremely trust Sam Altman, our CEO on these topics.
00:10:19.740 | He, I'm more on the side of being, I guess,
00:10:24.900 | naive hippie that, yeah.
00:10:27.500 | (laughing)
00:10:29.940 | - That's your life philosophy.
00:10:31.620 | Well, I think naive implies self-doubt, and I think hippie implies optimism.
00:10:37.700 | Those two things are a pretty good way to operate.
00:10:43.260 | - I mean, still it is hard for me to actually understand
00:10:47.860 | how the politics works or like how this,
00:10:51.260 | like exactly how the things would play out.
00:10:54.740 | And Sam is really excellent with it.
00:10:57.420 | - What do you think is rarest in the universe?
00:10:59.820 | You said we might be alone.
00:11:01.820 | What's hardest to build is another engineering way
00:11:04.380 | to ask that.
00:11:05.580 | Life, intelligence or consciousness.
00:11:08.980 | So like you said that we might be alone,
00:11:12.060 | which is the thing that's hardest to get to?
00:11:14.980 | Is it just the origin of life?
00:11:16.940 | Is it the origin of intelligence?
00:11:18.940 | Is it the origin of consciousness?
00:11:22.700 | - So let me at first explain you my kind of mental model,
00:11:27.220 | what I think is needed for life to appear.
00:11:29.580 | So I imagine that at some point
00:11:33.660 | there was this primordial soup of amino acids
00:11:38.340 | and maybe some proteins in the ocean.
00:11:41.340 | And some proteins were turning into some other proteins
00:11:44.860 | through reaction.
00:11:46.340 | And you can almost think about this cycle
00:11:50.540 | of what turns into what as there is a graph
00:11:53.220 | essentially describing which substance
00:11:55.020 | turns into some other substance.
00:11:57.140 | And essentially life means that all of a sudden
00:11:59.500 | a cycle has been created in the graph,
00:12:02.060 | such that the same thing keeps on happening
00:12:04.340 | over and over again.
00:12:05.460 | That's what is needed for life to happen.
00:12:07.220 | And in some sense, you can think almost
00:12:09.220 | that you have this gigantic graph
00:12:11.020 | and it needs like a sufficient number of edges
00:12:14.100 | for the cycle to appear.
00:12:17.300 | Then from perspective of intelligence and consciousness,
00:12:21.580 | my current intuition is that they might be quite intertwined.
00:12:26.340 | First of all, it might not be that it's like a binary thing
00:12:28.940 | that you have intelligence or consciousness.
00:12:30.780 | It seems to be more like a continuous component.
00:12:35.780 | Let's see, if we look, for instance,
00:12:38.460 | even at the networks recognizing images,
00:12:42.500 | people are able to show that the activations
00:12:45.300 | of these networks correlate very strongly
00:12:48.740 | with activations in visual cortex of some monkeys.
00:12:52.860 | The same seems to be true about language models.
00:12:56.620 | Also, if you for instance, look,
00:13:01.620 | if you train agent in 3D world,
00:13:04.900 | at first, it barely recognizes what is going on.
00:13:10.500 | Over the time it kind of recognizes foreground
00:13:12.980 | from background, over the time it kind of knows
00:13:15.580 | where there is a foot and it just follows it.
00:13:19.580 | Over the time, it actually starts having a 3D perception.
00:13:22.940 | So it is possible for instance,
00:13:24.540 | to look inside of the head of an agent
00:13:27.100 | and ask what would it see if it looks to the right.
00:13:29.860 | And the crazy thing is, initially when the agents
00:13:33.020 | are barely trained, that these predictions are pretty bad.
00:13:35.900 | Over the time, they become better and better.
00:13:38.580 | You can still see that if you ask what happens
00:13:42.540 | when the head is turned by 360 degrees,
00:13:45.220 | for some time, they think that the different thing appears.
00:13:48.340 | And then at some stage, they understand actually
00:13:51.060 | that the same thing supposed to appear.
00:13:52.780 | So they get like a understanding of 3D structure.
00:13:55.900 | It's also very likely that they have inside some level
00:14:00.340 | of like a symbolic reasoning,
00:14:03.420 | like there are, in particular, symbols for other agents.
00:14:06.860 | So when you look at Dota agents, they collaborate together.
00:14:11.780 | And they have some anticipation of if they would win battle,
00:14:16.780 | they have some expectations with respect to other agents.
00:14:22.500 | I might be too much anthropomorphizing
00:14:24.820 | how the things look for me.
00:14:28.580 | But then the fact that they have a symbol for other agents
00:14:33.580 | makes me believe that at some stage
00:14:36.660 | as they are optimizing for skills,
00:14:40.180 | they would have also symbol to describe themselves.
00:14:42.740 | This is like a very useful symbol to have.
00:14:46.540 | And this particularity, I would call it
00:14:48.220 | like a self-consciousness or self-awareness.
00:14:51.140 | And still it might be different from the consciousness.
00:14:55.420 | So I guess the way how I'm understanding
00:14:58.980 | the word consciousness, let's say the experience
00:15:01.260 | of drinking a coffee, or let's say experience of being a bat,
00:15:04.260 | that's the meaning of the word consciousness.
00:15:06.860 | It doesn't mean to be awake.
00:15:09.860 | Yeah, it feels, it might be also somewhat related
00:15:13.180 | to memory and recurrent connections.
00:15:15.460 | So it's kind of, okay, if you look at anesthetic drugs,
00:15:19.780 | they might be, like they essentially,
00:15:24.740 | they disturb brain waves such that
00:15:29.500 | maybe memory is not formed.
00:15:34.020 | - And so there's a lessening of consciousness
00:15:36.460 | when you do that.
00:15:37.340 | - Correct.
00:15:38.180 | - And so that's one way to intuit
00:15:39.700 | what is consciousness.
00:15:41.060 | There's also kind of another element here.
00:15:45.340 | It could be that it's this kind of self-awareness module
00:15:50.300 | that you described, plus the actual subjective experience
00:15:55.300 | is a storytelling module that tells us a story
00:16:00.620 | about what we're experiencing.
00:16:05.220 | - The crazy thing, so let's say, I mean, in meditation,
00:16:08.580 | they teach people not to speak story inside of the head.
00:16:12.820 | And there is also some fraction of population
00:16:15.460 | who doesn't have actually narrator.
00:16:18.060 | I know people who don't have a narrator,
00:16:21.100 | and they have to use external people
00:16:23.860 | in order to kind of solve tasks
00:16:27.580 | that require internal narrator.
00:16:30.780 | So it seems that it's possible to have the experience
00:16:34.300 | without the talk.
00:16:37.500 | - What are we talking about
00:16:38.860 | when we talk about the internal narrator?
00:16:41.140 | Is that the voice when you're reading a book?
00:16:42.340 | - Yeah, I thought that that's what you are referring to.
00:16:45.100 | - Well, I was referring more on the not an actual voice.
00:16:50.100 | I meant there's some kind of subjective experience
00:16:56.180 | feels like it's fundamentally about storytelling
00:17:02.020 | to ourselves.
00:17:04.580 | It feels like the feeling is a story
00:17:09.580 | that is much simpler abstraction
00:17:15.060 | than the raw sensory information.
00:17:17.460 | So there feels like it's a very high level abstraction
00:17:21.020 | that is useful for me to feel like entity in this world.
00:17:26.020 | Most useful aspect of it is that because I'm conscious,
00:17:33.180 | I think there's an intricate connection to me
00:17:35.420 | not wanting to die.
00:17:37.620 | So it's a useful hack to really prioritize not dying.
00:17:43.860 | Those seem to be somewhat connected.
00:17:46.860 | So I'm telling the story of it's richly feels
00:17:50.180 | like something to be me,
00:17:51.860 | and the fact that me exists in this world,
00:17:54.260 | I wanna preserve me.
00:17:56.020 | So that makes it a useful agent hack.
00:17:58.500 | - So I will just refer maybe
00:18:01.020 | to the first part,
00:18:03.780 | as you said about that kind of story
00:18:05.940 | of describing who you are.
00:18:07.260 | I was thinking about that even,
00:18:12.580 | so obviously I like thinking about consciousness.
00:18:17.580 | I like thinking about AI as well,
00:18:19.700 | and I'm trying to see analogies of these things in AI,
00:18:22.700 | what would it correspond to?
00:18:24.580 | So, OpenAI trained
00:18:30.380 | a model called GPT,
00:18:35.020 | which can generate pretty amusing text on arbitrary topic.
00:18:40.020 | And one way to control GPT
00:18:45.820 | is by putting into prefix at the beginning of the text,
00:18:50.580 | some information, what would be the story about?
00:18:52.980 | You can have even chat with GPT
00:18:58.180 | by saying that the chat is with Lex or Elon Musk or so,
00:19:01.860 | and GPT would just pretend to be you or Elon Musk or so.
00:19:06.860 | And it almost feels that this story
00:19:12.220 | that we give ourselves to describe our life,
00:19:15.580 | it's almost like things that you put into context of GPT.
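
As a rough illustration of this "control by prefix" idea, here is a minimal sketch, assuming the Hugging Face transformers package and the small public gpt2 checkpoint as a stand-in for the much larger GPT-3 (neither is specified in the conversation): the same model, conditioned on two different contexts, continues in two different voices without any change to its weights.

```python
# Minimal sketch: steering a language model purely through its prefix/context.
# Assumes the Hugging Face `transformers` package and the public gpt2 checkpoint
# as a small stand-in for GPT-3.
from transformers import pipeline, set_seed

set_seed(0)
generator = pipeline("text-generation", model="gpt2")

prefix_interview = ("The following is a conversation between Lex and an AI assistant.\n"
                    "Lex: What is consciousness?\nAI:")
prefix_diary = "The following is a pirate captain's diary.\nDay 1:"

for prefix in (prefix_interview, prefix_diary):
    out = generator(prefix, max_length=60, do_sample=True, top_p=0.9)
    print(out[0]["generated_text"], "\n---")
```

The prefix plays the role of the "story" described above: the weights stay fixed, and only the context decides which persona the model acts out.
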
00:19:18.980 | - Yeah, and it generates the,
00:19:21.300 | but the context we provide to GPT is multimodal.
00:19:27.180 | - It's more so, GPT itself is multimodal.
00:19:29.460 | GPT itself hasn't learned actually
00:19:32.500 | from experience of single human,
00:19:33.980 | but from the experience of humanity, it's a chameleon.
00:19:37.340 | You can turn it into anything.
00:19:39.180 | And in some sense, by providing context,
00:19:41.940 | it behaves as the thing that you wanted it to be.
00:19:47.300 | It's interesting that people have stories of who they are.
00:19:52.300 | And as you said, these stories,
00:19:54.220 | they help them to operate in the world.
00:19:57.340 | But it's also interesting,
00:20:00.220 | I guess various people find it out through meditation or so,
00:20:03.100 | that there might be some patterns that you have learned
00:20:07.500 | when you were a kid
00:20:08.340 | that actually are not serving you anymore.
00:20:10.820 | And you also might be thinking that that's who you are,
00:20:14.140 | and that's actually just a story.
00:20:15.780 | - Yeah, so it's a useful hack,
00:20:18.660 | but sometimes it gets us into trouble.
00:20:20.540 | It's a local optima.
00:20:21.660 | - It's a local optima.
00:20:23.060 | - You wrote that Stephen Hawking,
00:20:25.060 | he tweeted, Stephen Hawking asked,
00:20:27.540 | what breathes fire into equations?
00:20:30.020 | Which meant what makes given mathematical equations
00:20:32.860 | realize the physics of a universe?
00:20:35.980 | Similarly, I wonder what breathes fire into computation?
00:20:40.460 | What makes given computation conscious?
00:20:43.540 | Okay, so how do we engineer consciousness?
00:20:47.380 | How do you breathe fire and magic into the machine?
00:20:51.900 | - So it seems clear to me
00:20:54.500 | that not every computation is conscious.
00:20:57.340 | I mean, you can, let's say,
00:20:58.700 | just keep on multiplying one matrix over and over again,
00:21:02.340 | and it might be gigantic matrix,
00:21:03.980 | you can put a lot of computation,
00:21:05.580 | I don't think it would be conscious.
00:21:07.180 | So in some sense, the question is,
00:21:09.620 | what are the computations which could be conscious?
00:21:14.500 | I mean, so one assumption is
00:21:17.180 | that it has to do purely with computation,
00:21:19.060 | that you can abstract away matter,
00:21:20.740 | and the other possibility is
00:21:22.620 | that what's very important is the realization of the computation,
00:21:25.020 | that it has to do with some force fields or so,
00:21:28.980 | and they bring consciousness.
00:21:30.620 | At the moment, my intuition is
00:21:31.900 | that it can be fully abstracted away.
00:21:33.780 | So in case of computation, you can ask yourself,
00:21:36.460 | what are the mathematical objects or so
00:21:39.860 | that could bring such properties?
00:21:41.860 | So for instance, if we think about the models, AI models,
00:21:46.860 | that what they truly try to do,
00:21:51.620 | or like models like GPT,
00:21:53.620 | is they try to predict next word or so,
00:21:58.620 | and this turns out to be equivalent to compressing text.
00:22:05.860 | And because in some sense,
00:22:09.140 | compression means that you learn the model of reality,
00:22:12.860 | and you have just to remember where are your mistakes,
00:22:17.060 | the better you are in predicting the...
00:22:19.140 | And in some sense, when we look at our experience,
00:22:22.740 | also when you look, for instance, at the car driving,
00:22:24.700 | you know in which direction it will go,
00:22:26.420 | you are good like in prediction.
00:22:28.420 | And it might be the case
00:22:31.100 | that the consciousness is intertwined with compression.
00:22:34.980 | It might be also the case that self-consciousness
00:22:37.860 | has to do with compressor trying to compress itself.
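
To make the prediction-compression link above concrete, here is a toy sketch of my own (assuming arithmetic coding as the bridge): a symbol the model assigns probability p costs about -log2(p) bits, so the total compressed size is just the model's cross-entropy on the text, and a better next-symbol predictor literally means a smaller file.

```python
# Toy illustration of "prediction = compression" via ideal code lengths.
import math

text = "abababababababab"

def compressed_bits(text, prob_of):
    # Sum of -log2 p(symbol | history): the ideal code length under arithmetic coding.
    return sum(-math.log2(prob_of(text[:i], text[i])) for i in range(len(text)))

# Model 1 knows nothing: it assigns 1/2 to each of the two symbols.
uniform = lambda history, symbol: 0.5

# Model 2 has learned the alternation, so it predicts the next symbol almost perfectly.
def learned(history, symbol):
    if not history:
        return 0.5
    predicted = "b" if history[-1] == "a" else "a"
    return 0.98 if symbol == predicted else 0.02

print(compressed_bits(text, uniform))  # 16.0 bits
print(compressed_bits(text, learned))  # roughly 1.4 bits
```
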
00:22:41.980 | So, okay, I was just wondering,
00:22:45.140 | what are the objects in mathematics
00:22:48.180 | or computer science which are mysterious
00:22:50.820 | that could have to do with consciousness?
00:22:54.220 | And then I thought, you know, you see in mathematics,
00:22:59.220 | there is something called Gadiel theorem,
00:23:02.380 | which means if you have sufficiently
00:23:05.500 | complicated mathematical system,
00:23:07.060 | it is possible to point the mathematical system
00:23:10.100 | back on itself.
00:23:11.420 | In computer science,
00:23:12.340 | there is something called the halting problem.
00:23:15.220 | It's somewhat similar construction.
00:23:17.460 | So I thought that, you know, if we believe
00:23:19.820 | that under assumption that consciousness
00:23:24.660 | has to do with compression,
00:23:26.980 | then you could imagine that
00:23:30.740 | as you keep on compressing things,
00:23:33.420 | then at some point it actually makes sense
00:23:35.700 | for the compressor to compress itself.
00:23:37.500 | - Meta compression. - Yeah.
00:23:39.220 | - Consciousness is meta compression.
00:23:41.620 | - That's an idea.
00:23:45.220 | And in some sense, you know, the crazy--
00:23:46.900 | - I love it.
00:23:47.740 | - Thank you.
00:23:48.580 | - So, but do you think, if we think of a Turing machine,
00:23:52.260 | a universal Turing machine,
00:23:54.380 | can that achieve consciousness?
00:23:57.340 | So is there something beyond our traditional definition
00:24:01.700 | of computation that's required?
00:24:03.620 | - So it's a specific computation.
00:24:05.500 | And I said, this computation has to do with compression.
00:24:08.580 | And the compression itself,
00:24:10.980 | maybe other way of putting it is like,
00:24:13.180 | you're internally creating the model of reality
00:24:16.300 | in order, it's like you try inside to simplify reality
00:24:20.100 | in order to predict what's gonna happen.
00:24:22.540 | And that also feels somewhat similar
00:24:24.940 | to how I think actually about my own conscious experience.
00:24:28.380 | So clearly I don't have access to reality.
00:24:31.260 | The only access to reality is through, you know,
00:24:33.500 | cable going to my brain.
00:24:35.060 | And my brain is creating a simulation of reality,
00:24:38.020 | and I have access to the simulation of reality.
00:24:40.740 | - Are you by any chance aware of the Hutter Prize,
00:24:45.020 | Marcus Hutter?
00:24:46.540 | He made this prize for compression of Wikipedia pages.
00:24:51.540 | And there's a few qualities to it.
00:24:56.220 | One, I think has to be perfect compression, which makes,
00:24:59.900 | I think that little quirk makes it much less applicable
00:25:04.060 | to the general task of intelligence.
00:25:07.100 | 'Cause it feels like intelligence
00:25:08.380 | is always going to be messy.
00:25:11.620 | Like perfect compression feels like it's not the right goal,
00:25:16.620 | but it's nevertheless a very interesting goal.
00:25:19.300 | So for him, intelligence equals compression.
00:25:22.700 | And so the smaller you make the file,
00:25:26.140 | given a large Wikipedia page,
00:25:29.140 | the more intelligent the system has to be.
00:25:31.220 | - Yeah, that makes sense.
00:25:32.060 | So you can make perfect compression if you store errors.
00:25:34.900 | And I think that actually what he meant
00:25:36.380 | is you have algorithm plus errors.
00:25:38.500 | By the way, Hutter is a,
00:25:41.100 | he was the PhD advisor of Shane Legg,
00:25:45.020 | who is a DeepMind co-founder.
00:25:48.620 | - Yeah, yeah.
00:25:49.460 | So there's an interesting, and now he's at DeepMind,
00:25:53.340 | there's an interesting network of people.
00:25:55.780 | He's one of the people that I think seriously
00:26:00.740 | took on the task of what would an AGI system look like?
00:26:06.140 | I think for a longest time,
00:26:08.620 | the question of AGI was not taken seriously
00:26:13.380 | or rather rigorously.
00:26:15.660 | And he did just that.
00:26:17.760 | Like mathematically speaking,
00:26:19.620 | what would the model look like?
00:26:21.100 | If you remove the constraints of it having to be,
00:26:23.620 | having to have a reasonable amount of memory,
00:26:30.700 | reasonable amount of running time complexity,
00:26:34.500 | computation time, what would it look like?
00:26:37.100 | And essentially it's a half math,
00:26:40.140 | half philosophical discussion of how would it,
00:26:43.260 | like a reinforcement learning type of framework
00:26:46.160 | look like for an AGI?
00:26:47.580 | - Yeah, so he developed the framework
00:26:49.580 | even to describe what's optimal
00:26:51.540 | with respect to reinforcement learning.
00:26:53.260 | Like there is a theoretical framework,
00:26:54.860 | which is, as you said, under assumption,
00:26:57.060 | there is infinite amount of memory and compute.
00:26:59.900 | There was actually one person before,
00:27:02.180 | his name is Solomonoff.
00:27:03.580 | Hutter extended Solomonoff's work to reinforcement learning,
00:27:07.820 | but there exists a theoretical algorithm,
00:27:11.500 | which is optimal algorithm to build intelligence.
00:27:14.940 | And I can actually explain the algorithm.
00:27:16.940 | - Yes.
00:27:17.780 | Let's go, let's go, let's go.
00:27:20.260 | - So the task itself, you can-
00:27:21.980 | - Can I just pause how absurd it is
00:27:25.860 | for brain in a skull trying to explain
00:27:29.100 | the algorithm for intelligence?
00:27:30.540 | Just go ahead.
00:27:31.460 | - It is pretty crazy.
00:27:32.500 | It is pretty crazy that the brain itself
00:27:34.460 | is actually so small and it can ponder.
00:27:37.140 | - How to design algorithms
00:27:40.420 | that optimally solve the problem of intelligence.
00:27:42.900 | Okay.
00:27:43.740 | All right, so what's the algorithm?
00:27:45.300 | - So let's see.
00:27:46.460 | So first of all, the task itself is described as
00:27:50.940 | you have infinite sequence of zeros and ones, okay?
00:27:54.620 | You read n bits and you're about to predict n plus one bit.
00:27:59.180 | So that's the task.
00:28:00.220 | And you could imagine that every task
00:28:02.580 | could be casted as such a task.
00:28:04.540 | So if for instance, you have images and labels,
00:28:07.580 | you can just turn every image into a sequence
00:28:09.540 | of zeros and ones, then the label, you concatenate the labels,
00:28:12.860 | and that's actually it,
00:28:14.820 | and you could start by having training data first,
00:28:18.500 | and then afterwards you have test data.
00:28:21.260 | So theoretically, any problem could be casted
00:28:24.340 | as a problem of predicting zeros and ones
00:28:27.260 | on this infinite tape.
00:28:29.060 | So let's say you read already n bits
00:28:34.060 | and you want to predict n plus one bit.
00:28:37.140 | And I will ask you to write every possible program
00:28:41.980 | that generates these n bits, okay?
00:28:44.500 | So, and you can have, you choose programming language.
00:28:48.620 | It can be Python or C++.
00:28:50.780 | And the difference between programming languages
00:28:53.460 | might be, there is a difference by constant,
00:28:56.180 | asymptotically your predictions will be equivalent.
00:28:58.940 | So you read n bits, you enumerate all the programs
00:29:04.260 | that produce these n bits in their output.
00:29:07.700 | And then in order to predict n plus one bit,
00:29:10.780 | you actually weight the programs according to their length.
00:29:15.780 | And there is like a specific formula how you weight them.
00:29:19.940 | And then the n plus one bit prediction
00:29:23.460 | is the prediction from each of these program
00:29:26.540 | according to that weight.
00:29:28.540 | - Like statistically.
00:29:29.860 | - Statistically, yeah.
00:29:30.700 | - You pick, so the smaller the program,
00:29:32.620 | the more likely you are to pick its output.
00:29:36.940 | So that algorithm is grounded in the hope
00:29:41.940 | or the intuition that the simple answer is the right one.
00:29:45.980 | - It's a formalization of it.
00:29:47.700 | - Yeah.
00:29:48.660 | - It also means like if you would ask the question
00:29:51.900 | after how many years would Sun explode?
00:29:56.900 | You can say, it's more likely the answer is two
00:30:01.020 | to some power, because there's a shorter program
00:30:04.140 | - Yeah.
00:30:06.660 | - Than others.
00:30:06.660 | - Well, I don't have a good intuition
00:30:08.420 | about how different the space of short programs
00:30:11.740 | are from the space of large programs.
00:30:14.780 | Like what is the universe where short programs
00:30:18.260 | like run things?
00:30:21.500 | - So as I said, the things have to agree with n bits.
00:30:25.020 | So even if you have, you need to start,
00:30:28.020 | okay, if you have very short program
00:30:30.300 | and there are like still some,
00:30:32.540 | if it's not a perfect prediction of the n bits,
00:30:34.660 | you have to store errors.
00:30:36.500 | What are the errors?
00:30:37.340 | And that gives you the full program that agrees on n bits.
00:30:40.100 | - Oh, so you don't agree perfectly with the n bits
00:30:43.580 | and you store errors.
00:30:44.860 | - That's like a longer program, slightly longer program
00:30:48.340 | because it contains these extra bits of errors.
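
Here is a toy, fully computable caricature of the predictor described above (my own sketch; the real Solomonoff/Hutter construction enumerates all programs and is uncomputable): keep only the candidate "programs" that reproduce the observed bits, weight each one by two to the minus its length, and mix their votes for the next bit.

```python
# Toy Solomonoff-style predictor over a hand-written "program space".
observed = [0, 1, 0, 1, 0, 1]  # the n bits read so far

# Each "program" maps a position to a bit; "length" is its made-up description length in bits.
programs = [
    {"name": "all zeros", "length": 3,  "f": lambda i: 0},
    {"name": "alternate", "length": 5,  "f": lambda i: i % 2},
    {"name": "memorized", "length": 20, "f": lambda i: ([0, 1, 0, 1, 0, 1] + [1])[i]},
]

# Keep only the programs that agree with the observed prefix.
consistent = [p for p in programs if all(p["f"](i) == b for i, b in enumerate(observed))]

# Weight each surviving program by 2^(-length) and mix their predictions for bit n+1.
n = len(observed)
total = sum(2.0 ** -p["length"] for p in consistent)
prob_one = sum(2.0 ** -p["length"] for p in consistent if p["f"](n) == 1) / total
print(f"P(next bit = 1) = {prob_one:.5f}")  # dominated by the shortest consistent program
```

The "memorized" entry stands in for a program that simply stores the data (or a short rule plus its error bits): it always survives the consistency filter, but its weight is negligible next to the genuinely short rule that fits.
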
00:30:50.900 | - That's fascinating.
00:30:51.980 | What's your intuition about the programs
00:30:56.980 | that are able to do cool stuff
00:30:59.700 | like intelligence and consciousness?
00:31:01.700 | Are they perfectly, like, are there if-then statements
00:31:06.700 | in them?
00:31:09.300 | So like, are there a lot of exceptions that they're storing?
00:31:11.740 | - So you could imagine if there would be tremendous amount
00:31:15.140 | of if statements, then they wouldn't be that short
00:31:17.940 | in case of neural networks.
00:31:19.900 | You could imagine that what happens is they,
00:31:23.900 | when you start with an uninitialized neural network,
00:31:29.380 | it stores internally many possibilities
00:31:32.340 | how the problem can be solved.
00:31:35.300 | And SGD is kind of magnifying some paths
00:31:38.300 | which are slightly similar to the correct answer.
00:31:44.460 | So it's kind of magnifying correct programs.
00:31:46.740 | And in some sense, SGD is a search algorithm
00:31:49.820 | in the program space.
00:31:51.260 | And the program space is represented by, you know,
00:31:55.020 | kind of the wiring inside of the neural network.
00:31:57.780 | And there's like an insane number of ways
00:32:00.260 | how the features can be computed.
00:32:02.700 | - Let me ask you the high level basic question
00:32:05.380 | that's not so basic.
00:32:07.220 | What is deep learning?
00:32:10.020 | Is there a way you'd like to think of it
00:32:12.340 | that is different than like a generic textbook definition?
00:32:15.940 | - The thing that I hinted just a second ago
00:32:18.300 | is maybe the closest to how I'm thinking these days
00:32:21.540 | about deep learning.
00:32:23.260 | So the statement is neural networks
00:32:28.260 | can represent some programs.
00:32:30.940 | Seems that various modules that we are actually adding up
00:32:35.180 | to like, you know, we want networks to be deep
00:32:38.260 | because we want multiple steps of the computation.
00:32:42.820 | And deep learning provides the way
00:32:47.340 | to represent space of programs, which is searchable.
00:32:50.340 | And it's searchable with stochastic gradient descent.
00:32:53.300 | So we have an algorithm to search over
00:32:55.660 | humongous number of programs
00:32:58.580 | and gradient descent kind of bubbles up
00:33:00.980 | the things that tend to give correct answers.
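
As a concrete, if tiny, illustration of that search, here is a plain NumPy sketch of my own (not anything specific to OpenAI's systems): a two-layer network starts from random weights, and full-batch gradient descent, standing in for SGD, gradually amplifies the wiring that computes XOR correctly.

```python
# Gradient descent as a search over the "programs" a tiny network can represent.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # output layer
lr = 0.5

for step in range(5000):
    h = np.tanh(X @ W1 + b1)                     # forward pass: hidden activations
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))         # sigmoid output probabilities
    grad_out = (p - y) / len(X)                  # cross-entropy gradient wrt pre-sigmoid
    grad_W2, grad_b2 = h.T @ grad_out, grad_out.sum(0)
    grad_h = grad_out @ W2.T * (1 - h ** 2)      # backpropagate through tanh
    grad_W1, grad_b1 = X.T @ grad_h, grad_h.sum(0)
    W1 -= lr * grad_W1; b1 -= lr * grad_b1       # descend: nudge the wiring toward
    W2 -= lr * grad_W2; b2 -= lr * grad_b2       #   a program that fits the data

print(np.round(p.ravel(), 2))  # should approach [0, 1, 1, 0]
```
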
00:33:04.380 | - So a neural network with fixed weights,
00:33:09.380 | that's optimized.
00:33:11.860 | Do you think of that as a single program?
00:33:14.300 | - So there is work by Christopher Olah,
00:33:18.380 | where he, so he works on interpretability
00:33:21.900 | of neural networks.
00:33:23.100 | And he was able to identify inside of the neural network,
00:33:28.100 | for instance, a detector of a wheel for a car
00:33:31.820 | or the detector of a window for a car.
00:33:33.940 | And then he was able to separate them out
00:33:36.140 | and assemble them together using a simple program
00:33:40.860 | for the detector, for a car detector.
00:33:43.300 | - That's like, if you think of traditionally
00:33:46.060 | defined programs, that's like a function within a program
00:33:49.300 | that this particular neural network was able to find.
00:33:52.260 | And you can tear that out, just like you can copy
00:33:54.940 | and paste from Stack Overflow.
00:33:56.500 | So any program is a composition of smaller programs.
00:34:03.340 | - Yeah, I mean, the nice thing about the neural networks
00:34:05.820 | is that it allows the things to be more fuzzy
00:34:08.620 | than in case of programs.
00:34:10.460 | In case of programs, you have this like a branching
00:34:13.180 | this way or that way.
00:34:14.460 | And the neural networks, they have an easier way
00:34:17.300 | to be somewhere in between or to share things.
00:34:21.020 | - What to you is the most beautiful or surprising idea
00:34:25.100 | in deep learning, in the utilization
00:34:27.860 | of these neural networks, which by the way,
00:34:30.140 | for people who are not familiar,
00:34:32.300 | neural networks is a bunch of, what would you say?
00:34:36.180 | It's inspired by the human brain.
00:34:37.900 | There's neurons, there's connection between those neurons.
00:34:40.380 | There's inputs and there's outputs and there's millions
00:34:43.780 | or billions of those neurons.
00:34:45.500 | And the learning happens by adjusting the weights
00:34:50.500 | on the edges that connect these neurons.
00:34:54.180 | - Thank you for giving the definition.
00:34:56.380 | I was supposed to do it, but I guess you have enough empathy
00:34:59.060 | to listeners to actually know that that might be useful.
00:35:02.740 | - No, that's like, so I'm asking Plato of like,
00:35:06.220 | what is the meaning of life?
00:35:07.460 | He's not gonna answer.
00:35:09.340 | You're being philosophical and deep and quite profound
00:35:12.380 | talking about the space of programs,
00:35:13.940 | which is very interesting, but also for people
00:35:17.100 | who are just not familiar, what the hell we're talking about
00:35:19.500 | when we talk about deep learning.
00:35:20.940 | Anyway, sorry, what is the most beautiful
00:35:23.740 | or surprising idea to you in all the time
00:35:27.580 | you've worked at deep learning?
00:35:28.540 | And you worked on a lot of fascinating projects,
00:35:31.980 | applications of neural networks.
00:35:35.260 | It doesn't have to be big and profound.
00:35:36.940 | It can be a cool trick.
00:35:38.260 | - Yeah, I mean, I'm thinking about the trick,
00:35:40.300 | but like it's still amusing to me that it works at all.
00:35:44.700 | That let's say that the extremely simple algorithm,
00:35:47.380 | stochastic gradient descent, which is something
00:35:49.620 | that I would be able to derive on a piece of paper
00:35:53.420 | to a high school student, when put at the scale
00:35:58.420 | of thousands of machines actually can create the behaviors
00:36:04.980 | which we called kind of human like behaviors.
00:36:07.980 | - So in general, any application of stochastic gradient
00:36:11.380 | descent in neural networks is amazing to you.
00:36:14.660 | So, or is there a particular application
00:36:18.260 | in natural language, reinforcement learning?
00:36:21.820 | And also, what do you attribute that success to?
00:36:29.020 | Is it just scale?
00:36:31.340 | What profound insight can we take from the fact that
00:36:34.780 | the thing works for gigantic sets of variables?
00:36:39.780 | - I mean, the interesting thing is these algorithms,
00:36:42.780 | they were invented decades ago
00:36:46.340 | and people actually gave up on the idea.
00:36:51.340 | And back then they thought that we need
00:36:55.700 | profoundly different algorithms
00:36:57.820 | and they spent a lot of cycles on very different algorithms.
00:37:01.300 | And I believe that we have seen that various innovations
00:37:06.100 | that say like transformer or dropout
00:37:09.820 | or so they can vastly help.
00:37:13.220 | But it's also remarkable to me that this algorithm
00:37:16.820 | from '60s or so, or I mean, you can even say
00:37:20.180 | that the gradient descent was invented by Leibniz
00:37:23.340 | in I guess 18th century or so.
00:37:25.460 | That actually is the core of learning.
00:37:29.500 | In the past, it's almost like,
00:37:32.740 | maybe out of ego, people were saying
00:37:35.860 | that it cannot be the case that such a simple algorithm
00:37:39.060 | could solve complicated problems.
00:37:44.060 | So they were in search for the other algorithms.
00:37:48.780 | And as I'm saying, like, I believe that actually
00:37:50.620 | we are in the game where there is,
00:37:52.580 | there are actually frankly three levers.
00:37:54.260 | There is compute, there are algorithms and there is data.
00:37:57.820 | And if we want to build intelligent systems,
00:38:00.220 | we have to pull all three levers
00:38:03.620 | and they are actually multiplicative.
00:38:05.500 | It's also interesting, so you ask, is it only compute?
00:38:09.500 | People internally, they did the studies
00:38:12.980 | to determine how much gains they were coming
00:38:15.540 | from different levers.
00:38:16.940 | And so far we have seen that more gains came
00:38:19.700 | from compute than algorithms.
00:38:21.140 | But also we are in the world that in case of compute,
00:38:23.820 | there is a kind of exponential increase in funding.
00:38:27.020 | And at some point it's impossible to invest more.
00:38:29.820 | It's impossible to invest $10 trillion.
00:38:31.820 | We are speaking about, let's say all taxes in US.
00:38:37.060 | - But you're talking about money
00:38:39.900 | that could be innovation in the compute.
00:38:42.940 | - That's true as well.
00:38:44.780 | So, I mean, there are like a few pieces.
00:38:46.620 | So one piece is human brain is an incredible supercomputer.
00:38:50.460 | And it has something like
00:38:57.420 | a hundred trillion parameters.
00:38:58.940 | Or like if you try to count various quantities in the brain,
00:39:03.380 | like neurons and synapses,
00:39:05.700 | there is a smaller number of neurons and a lot of synapses.
00:39:09.020 | It's unclear even how to map synapses
00:39:13.060 | to parameters of neural networks,
00:39:16.060 | but it's clear that there are many more.
00:39:18.380 | So it might be the case that our networks
00:39:22.140 | are still somewhat small.
00:39:25.380 | It also might be the case that they are more efficient
00:39:27.380 | than brain or less efficient by some huge factor.
00:39:31.100 | I also believe that there will be,
00:39:33.940 | you know, at the moment we are at the stage
00:39:35.540 | that these neural networks, they require a thousand X
00:39:39.500 | or like a huge factor of more data than humans do.
00:39:43.220 | And it will be a matter of,
00:39:44.820 | there will be algorithms that vastly
00:39:49.460 | decrease sample complexity, I believe so.
00:39:51.780 | But the place where we are heading today
00:39:54.020 | is there are domains which contains million X more data.
00:39:59.020 | And even though computers might be
00:40:01.420 | 1000 times slower than humans in learning,
00:40:03.940 | that's not the problem.
00:40:04.940 | Like for instance, I believe that it should be possible
00:40:09.940 | to create superhuman therapies by,
00:40:14.260 | and there are like even simple steps of doing it.
00:40:20.460 | And you know, the core reason is there is just,
00:40:24.060 | machine will be able to read way more
00:40:26.500 | transcripts of therapies,
00:40:27.700 | and then it should be able to speak simultaneously
00:40:30.180 | with many more people.
00:40:31.220 | And it should be possible to optimize it all in parallel.
00:40:35.060 | - Now you're touching on something I deeply care about
00:40:39.540 | and think is way harder than we imagine.
00:40:41.940 | What's the goal of a therapist?
00:40:45.300 | - What's the goal of therapist?
00:40:47.620 | So, okay, so one goal, now this is terrifying to me,
00:40:51.540 | but there's a lot of people that contemplate suicide,
00:40:55.260 | self-affirmed depression,
00:40:56.740 | and they could significantly be helped with therapy.
00:41:02.620 | And the idea that an AI algorithm
00:41:05.740 | might be in charge of that,
00:41:08.180 | it's like a life and death task.
00:41:10.340 | The stakes are high.
00:41:14.060 | So one goal for a therapist, whether human or AI,
00:41:19.060 | is to prevent suicide ideation, to prevent suicide.
00:41:23.940 | How do you achieve that?
00:41:25.780 | - So let's see.
00:41:27.900 | So to be clear, I don't think that the current models
00:41:31.780 | are good enough for such a task
00:41:33.460 | because it requires insane amount of understanding, empathy,
00:41:36.660 | and the models are far from this place, but it's-
00:41:40.380 | - But do you think that understanding empathy,
00:41:43.220 | that signal is in the data?
00:41:45.740 | - I think there is some signal in the data, yes.
00:41:47.660 | I mean, there are plenty of transcripts of conversations
00:41:50.980 | and it is possible from it to understand personalities.
00:41:55.980 | It is possible from it to understand
00:41:59.460 | if conversation is friendly, amicable, antagonistic.
00:42:04.460 | It is, I believe that, you know,
00:42:07.820 | given the fact that the models that we train now,
00:42:11.500 | they can have, they are chameleons,
00:42:16.460 | that they can have any personality.
00:42:18.180 | They might turn out to be better in understanding
00:42:20.700 | personality of other people than anyone else.
00:42:24.260 | And they should- - Be empathetic.
00:42:25.660 | - To be empathetic.
00:42:26.580 | - Yeah, interesting.
00:42:28.940 | But I wonder if there's some level
00:42:33.100 | of multiple modalities required
00:42:37.740 | to be able to be empathetic of the human experience,
00:42:42.060 | whether language is not enough, to understand death,
00:42:45.060 | to understand fear, to understand childhood trauma,
00:42:49.900 | to understand wit and humor required
00:42:54.500 | when you're dancing with a person
00:42:56.300 | who might be depressed or suffering,
00:42:58.860 | both humor and hope and love and all those kinds of things.
00:43:02.980 | So there's another underlying question,
00:43:05.100 | which is self-supervised versus supervised.
00:43:08.660 | So can you get that from the data
00:43:13.180 | by just reading a huge number of transcripts?
00:43:16.500 | - I actually, so I think that reading
00:43:18.580 | huge number of transcripts is a step one.
00:43:21.020 | It's like the same way as you cannot learn to dance
00:43:23.940 | if just from YouTube.
00:43:25.620 | By watching it, you have to actually try it out yourself.
00:43:29.100 | So I think that here, that's a similar situation.
00:43:32.140 | I also wouldn't deploy the system
00:43:33.980 | in the high stakes situations right away,
00:43:36.740 | but kind of see gradually where it goes.
00:43:40.180 | And obviously, initially, it would have to go
00:43:44.220 | hand in hand with humans.
00:43:46.540 | But at the moment, we are in the situation
00:43:49.060 | that actually there is many more people
00:43:51.860 | who actually would like to have a therapy
00:43:54.340 | or speak with someone than there are therapists out there.
00:43:57.940 | So fundamentally, I was thinking,
00:44:02.740 | what are the things that can vastly
00:44:06.420 | increase people's well-being?
00:44:08.740 | Therapy is one of them.
00:44:10.300 | I think meditation is other one.
00:44:12.220 | I guess maybe human connection is a third one.
00:44:14.380 | And I guess pharmacologically, it's also possible.
00:44:17.460 | Maybe direct brain stimulation or something like that.
00:44:19.900 | But these are pretty much options out there.
00:44:22.140 | Then let's say the way I'm thinking about the AGI endeavor
00:44:25.780 | is by default, that's an endeavor
00:44:27.580 | to increase amount of wealth.
00:44:30.700 | And I believe that we can vastly increase
00:44:32.660 | amount of wealth for everyone.
00:44:35.180 | And simultaneously, so I mean,
00:44:37.420 | these are like two endeavors that make sense to me.
00:44:40.020 | One is like essentially increase amount of wealth.
00:44:42.700 | And second one is increase overall human well-being.
00:44:47.140 | - And those are coupled together.
00:44:48.500 | - And they can, I would say, these are different topics.
00:44:51.980 | One can help another.
00:44:53.180 | - And therapist is a funny word
00:44:57.980 | 'cause I see friendship and love as therapy.
00:45:00.420 | I mean, so therapist broadly defined
00:45:02.900 | is just friendship, is a friend.
00:45:05.580 | So like therapist has a very kind of clinical sense to it.
00:45:10.060 | But what is human connection?
00:45:12.340 | Not to get all Camus and Dostoevsky on you,
00:45:18.420 | but life is suffering and we draw,
00:45:22.420 | we seek connection with other humans
00:45:25.340 | as we desperately try to make sense of this world
00:45:29.740 | in a deep, overwhelming loneliness that we feel inside.
00:45:34.700 | - So I think connection has to do with understanding.
00:45:37.980 | And I think that almost like a lack of understanding
00:45:40.700 | causes suffering.
00:45:41.540 | If you speak with someone and you feel ignored,
00:45:44.820 | that actually causes pain.
00:45:46.860 | If you are feeling deeply understood,
00:45:49.940 | that actually they might not even tell you
00:45:52.740 | what to do in life, but like a pure understanding.
00:45:56.100 | - Or just being heard.
00:45:57.620 | Understanding is a kind of, it's a lot,
00:46:00.780 | just being heard, feel like you're being heard.
00:46:03.860 | Like somehow that's a alleviation temporarily
00:46:08.180 | of the loneliness.
00:46:09.860 | That if somebody knows you're here
00:46:13.340 | with their body language, with the way they are,
00:46:17.340 | with the way they look at you, with the way they talk,
00:46:20.180 | you feel less alone for a brief moment.
00:46:22.380 | - Yeah, very much, I agree.
00:46:25.180 | So I thought in the past about a somewhat similar question
00:46:29.460 | to yours, which is what is love?
00:46:31.980 | Rather, what is connection?
00:46:33.700 | And obviously I think about these things
00:46:37.140 | from AI perspective, what would it mean?
00:46:39.140 | So I said that the intelligence has to do
00:46:43.860 | with some compression, which is more or less,
00:46:45.780 | like I can say, almost understanding
00:46:47.780 | of what is going around.
00:46:49.220 | It seems to me that other aspect is there seem
00:46:52.740 | to be reward functions and you can have a reward
00:46:57.060 | for food, for maybe human connection,
00:47:00.780 | for let's say warmth, sex and so on.
00:47:05.780 | And it turns out that the various people
00:47:10.860 | might be optimizing slightly different reward functions.
00:47:13.620 | They essentially might care about different things.
00:47:16.420 | And in case of love, at least the love between two people,
00:47:22.540 | you can say that the boundary between people dissolves
00:47:26.460 | to such extent that they end up optimizing
00:47:29.140 | each other's reward functions.
00:47:30.620 | - Yeah, that's interesting.
00:47:35.660 | Celebrate the success of each other.
00:47:39.900 | - In some sense, I would say love means helping others
00:47:43.860 | to optimize their reward functions,
00:47:45.860 | not your reward functions, not the things
00:47:47.460 | that you think are important,
00:47:48.980 | but the things that the person cares about.
00:47:51.220 | You try to help them to optimize it.
00:47:55.060 | - So love is, if you think of two reward functions,
00:47:58.660 | you just, it's like addition.
00:48:00.340 | You combine them together.
00:48:01.260 | - Yeah, pretty much.
00:48:02.100 | - Maybe like with a weight and it depends
00:48:04.020 | like the dynamic of the relationship.
00:48:06.140 | - Yeah, I mean, you could imagine that
00:48:07.500 | if you are fully optimizing someone's reward function
00:48:10.180 | without yours, then maybe are creating codependency
00:48:13.180 | or something like that.
00:48:15.060 | I'm not sure what's the appropriate weight.
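
Purely as a playful sketch of this "weighted sum of reward functions" picture (my own toy framing, with invented rewards and weights), the mixing weight is what separates a shared objective from erasing your own:

```python
# Toy framing of "love = a weighted combination of two reward functions".
def combined_reward(mine, partners, weight_on_partner):
    """Mix two reward functions; weight_on_partner is in [0, 1]."""
    def reward(state):
        return (1 - weight_on_partner) * mine(state) + weight_on_partner * partners(state)
    return reward

# Hypothetical preferences: one person values coffee, the other values quiet.
likes_coffee = lambda s: s["coffee"]
likes_quiet = lambda s: -s["noise"]

shared = combined_reward(likes_coffee, likes_quiet, weight_on_partner=0.5)        # boundaries dissolve
self_erasure = combined_reward(likes_coffee, likes_quiet, weight_on_partner=1.0)  # only the partner counts

state = {"coffee": 1.0, "noise": 0.4}
print(shared(state), self_erasure(state))  # 0.3 -0.4
```
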
00:48:17.100 | But the interesting thing is I even think
00:48:20.180 | that the individual person,
00:48:22.700 | we ourselves, we are actually less unified inside.
00:48:29.620 | So for instance, if you look at the donut,
00:48:32.140 | on the one level, you might think,
00:48:33.460 | "Oh, this looks tasty, I would like to eat it."
00:48:35.780 | On other level, you might tell yourself,
00:48:37.860 | "I shouldn't be doing it because I want to gain muscles."
00:48:41.980 | So, and you might do it regardless,
00:48:44.700 | kind of against yourself.
00:48:46.100 | So it seems that even within ourselves,
00:48:48.580 | they're almost like a kind of intertwined personas.
00:48:51.780 | And I believe that self-love means love
00:48:56.780 | between all these personas,
00:48:58.580 | which also means being able to love yourself
00:49:03.540 | when you are angry or stressed or so.
00:49:06.260 | - Combining all those reward functions
00:49:07.980 | of the different selves you have.
00:49:09.660 | - Yeah, and accepting that they are there.
00:49:11.340 | Like, you know, often people,
00:49:12.540 | they have a negative self-talk or they say,
00:49:15.180 | "I don't like when I'm angry."
00:49:16.900 | And like, I try to imagine if there would be
00:49:21.900 | like a small baby Lex, like five years old.
00:49:26.500 | - Who's angry.
00:49:27.420 | - Angry, and then you're like, "You shouldn't be angry.
00:49:30.260 | "Like, stop being angry."
00:49:31.940 | But like, instead actually you want Lex to come over,
00:49:35.060 | give him a hug and just like say, "It's fine.
00:49:37.700 | "Okay, you can be angry as long as you want."
00:49:40.340 | And then he would stop.
00:49:41.540 | - Or maybe not.
00:49:44.060 | - Or maybe not, but you cannot expect it even.
00:49:46.260 | - Yeah, but still that doesn't explain the why of love.
00:49:49.740 | Like, why is love part of the human condition?
00:49:52.180 | Why is it useful to combine the reward functions?
00:49:56.620 | It seems like that doesn't, I mean,
00:50:00.260 | I don't think reinforcement learning frameworks
00:50:02.100 | can give us answers to why.
00:50:04.020 | Even the Hutter framework has an objective function
00:50:08.460 | that's static.
00:50:09.460 | - So we came to existence as a consequence
00:50:12.860 | of evolutionary process.
00:50:14.540 | And in some sense, the purpose of evolution is survival.
And then this complicated optimization objective
got baked into us, let's say compression,
which might help us operate in the real world,
and it baked into us various reward functions.
00:50:30.340 | - Yeah.
- Then to be clear, at the moment we are operating
in a regime which is somewhat out of distribution
from where evolution optimized us.
00:50:38.860 | - It's almost like love is a consequence of cooperation
00:50:42.460 | that we've discovered is useful.
00:50:44.020 | - Correct.
00:50:44.860 | In some way, it's even the case if you--
00:50:47.100 | - I just love the idea that love
00:50:48.660 | is like the out of distribution.
00:50:51.380 | - Or it's not out of distribution.
00:50:52.540 | It's like, as you said, it evolved for cooperation.
00:50:55.420 | - Yes.
00:50:56.260 | - And I believe that the, in some sense,
00:50:58.420 | cooperation ends up helping each of us individually.
00:51:00.700 | So it makes sense, evolutionary.
00:51:02.820 | And there is, in some sense, and you know,
00:51:05.580 | love means there is this dissolution of boundaries
00:51:08.180 | that you have a shared reward function.
00:51:10.340 | And we evolved to actually identify ourselves
00:51:13.100 | with larger groups.
00:51:14.300 | So we can identify ourselves, you know, with a family.
00:51:18.420 | We can identify ourselves with a country
00:51:20.660 | to such extent that people are willing
00:51:22.500 | to give away their life for country.
So we are actually wired even for love.
00:51:30.060 | And at the moment, I guess,
00:51:32.100 | maybe it would be somewhat more beneficial
00:51:37.220 | if we would identify ourselves
00:51:39.580 | with all the humanity as a whole.
00:51:41.620 | So you can clearly see when people travel around the world,
00:51:44.620 | when they run into person from the same country,
00:51:47.180 | they say, "Oh, we see TR."
00:51:48.900 | And all of a sudden they find all these similarities.
00:51:52.060 | They find some, they befriend those folks
00:51:55.300 | earlier than others.
00:51:56.780 | So there is like a sense, some sense of the belonging.
And I would say, I think it would be an overall
good thing for the world for people to move toward,
I think it's even called open individualism,
the mindset of identifying with larger and larger groups.
00:52:13.900 | - So the challenge there, that's a beautiful vision
00:52:17.580 | and I share it to expand that circle of empathy,
00:52:21.060 | that circle of love towards the entirety of humanity.
00:52:24.180 | But then you start to ask, well, where do you draw the line?
00:52:27.340 | Because why not expand it to other conscious beings?
00:52:30.660 | And then finally for our discussion,
00:52:34.220 | something I think about is why not expand it to AI systems?
00:52:39.460 | Like we start respecting each other when the person,
00:52:43.340 | the entity on the other side has the capacity to suffer,
00:52:47.580 | 'cause then we develop a capacity to sort of empathize.
00:52:52.340 | And so I could see AI systems that are interacting
00:52:55.460 | with humans more and more having conscious-like displays.
00:52:59.980 | So like they display consciousness through language
00:53:04.380 | and through other means.
00:53:05.900 | And so then the question is like,
00:53:07.740 | well, is that consciousness?
00:53:09.820 | Because they're acting conscious.
00:53:12.020 | And so the reason we don't like torturing animals
00:53:16.780 | is because they look like they're suffering
00:53:20.660 | when they're tortured.
00:53:22.460 | And if AI looks like it's suffering when it's tortured,
00:53:27.460 | how is that not requiring of the same kind of empathy
00:53:34.460 | from us and respect and rights that animals do
00:53:38.340 | and other humans do?
00:53:39.180 | - I think it requires empathy as well.
I mean, I would like us, or humanity,
to make progress in understanding what consciousness is,
because I don't want just to be speaking
about philosophy,
but rather to actually make it scientific.
00:53:53.500 | There was a time that people thought
00:53:57.700 | that there is a force of life
00:54:01.300 | and the things that have this force, they are alive.
00:54:05.820 | And I think that there is actually a path
00:54:10.340 | to understand exactly what consciousness is.
And in some sense, it might require essentially
putting probes inside of a human brain,
which is what Neuralink does.
00:54:23.820 | - So the goal there, I mean, there's several things
00:54:25.660 | with consciousness that make it a real discipline,
00:54:28.340 | which is one is rigorous measurement of consciousness.
00:54:32.460 | And then the other is the engineering of consciousness,
00:54:34.700 | which may or may not be related.
00:54:36.500 | I mean, you could also run into trouble,
00:54:38.900 | like for example, in the United States,
the DOT, the Department of Transportation,
00:54:45.020 | and a lot of different places put a value on human life.
00:54:48.740 | I think DOT's value is $9 million per person.
00:54:53.220 | So in that same way, you can get into trouble
00:54:57.860 | if you put a number on how conscious a being is,
00:55:01.140 | 'cause then you can start making policy.
00:55:03.500 | If a cow is 0.1 or like 10% as conscious as a human,
00:55:08.500 | then you can start making calculations
00:55:14.100 | and it might get you into trouble.
00:55:15.380 | But then again, that might be a very good way to do it.
00:55:18.940 | - I would like to move to that place
00:55:22.220 | that actually we have scientific understanding
00:55:23.900 | what consciousness is.
00:55:25.180 | And then we'll be able to actually assign value
00:55:27.700 | and I believe that there is even the path
00:55:30.100 | for the experimentation in it.
00:55:32.780 | So we said that you could put the probes
00:55:37.780 | inside of the brain.
There are actually a few other things that you could do
00:55:42.700 | with devices like Neuralink.
So you could imagine that one way to even measure
if an AI system is conscious
is by literally just plugging it into the brain.
I mean, that assumes that it's kind of easy,
but plugging it into the brain and asking the person
if they feel that their consciousness has expanded.
00:56:01.060 | This direction, of course, has some issues.
00:56:03.140 | You can say, if someone takes a psychedelic drug,
00:56:05.820 | they might feel that their consciousness expanded
00:56:08.220 | even though that drug itself is not conscious.
00:56:11.100 | - Right, so like you can't fully trust the self-report
00:56:14.180 | of a person saying their consciousness is expanded or not.
00:56:18.460 | Let me ask you a little bit about psychedelics
00:56:22.660 | 'cause there've been a lot of excellent research
00:56:24.900 | on different psychedelics, psilocybin, MDMA, even DMT,
00:56:29.780 | drugs in general, marijuana too.
00:56:32.580 | What do you think psychedelics do to the human mind?
00:56:37.060 | It seems they take the human mind
00:56:39.380 | to some interesting places.
00:56:41.980 | Is that just a little hack, a visual hack,
00:56:46.220 | or is there some profound expansion of the mind?
00:56:49.380 | - So let's see.
00:56:51.020 | I don't believe in magic.
00:56:53.380 | I believe in science, in causality.
00:56:58.300 | Still, let's say, and then as I said,
00:57:02.060 | I think that the brain,
00:57:03.900 | our subjective experience of reality is,
00:57:08.500 | we live in the simulation run by our brain,
00:57:12.820 | and the simulation that our brain runs,
00:57:15.300 | they can be very pleasant or very hellish.
00:57:18.020 | Drugs, they are changing some hyperparameters
00:57:21.940 | of the simulation.
00:57:23.180 | It is possible, thanks to change of these hyperparameters,
00:57:26.580 | to actually look back on your experience
00:57:28.820 | and even see that the given things
00:57:30.780 | that we took for granted, they are changeable.
So they allow you to have an amazing perspective.
00:57:39.260 | There is also, for instance, the fact that after DMT,
00:57:42.980 | people can see the full movie inside of their head,
00:57:47.620 | gives me further belief that the brain can generate
00:57:51.940 | that full movie, that the brain is actually learning
00:57:55.980 | the model of reality to such extent
00:57:58.100 | that it tries to predict what's gonna happen next.
00:58:00.500 | - Yeah, very high resolution,
00:58:01.980 | so it can replay reality, essentially.
00:58:03.820 | - Extremely high resolution.
00:58:06.060 | Yeah, and it's also kind of interesting to me
00:58:08.380 | that somehow there seems to be some similarity
00:58:11.460 | between these drugs and meditation itself.
00:58:16.820 | And I actually started even these days
00:58:19.060 | to think about meditation as a psychedelic.
00:58:21.340 | - Do you practice meditation?
00:58:24.620 | - I practice meditation.
00:58:26.540 | I mean, I went a few times on the retreats
00:58:29.780 | and it feels after, like after second or third day
00:58:33.340 | of meditation, there is almost like a sense of tripping.
00:58:38.340 | - What does a meditation retreat entail?
00:58:44.740 | - So you wake up early in the morning
00:58:49.260 | and you meditate for extended period of time.
00:58:52.500 | - Alone?
Sorry, go ahead, you were talking.
- Yeah, so even though there are other people,
it's optimized for isolation.
00:59:00.060 | So you don't speak with anyone,
00:59:01.460 | you don't actually look into other people's eyes.
00:59:04.540 | And you sit on the chair.
00:59:07.620 | So Vipassana meditation tells you to focus on the breath.
00:59:13.700 | So you try to put all attention into breathing in
00:59:18.700 | and breathing out.
00:59:20.940 | And the crazy thing is that as you focus attention like that
after some time, things start coming back,
like memories that you had completely forgotten.
It almost feels like you have a mailbox,
and you are just archiving emails
one by one.
And at some point there is an amazing feeling
00:59:48.700 | of getting to mailbox zero, zero emails.
00:59:51.620 | And it's very pleasant.
00:59:53.700 | It's kind of, it's crazy to me that once you resolve
00:59:58.700 | these inner stories or like inner traumas,
01:00:08.220 | then once there is nothing left,
01:00:12.020 | the default state of human mind is extremely peaceful
01:00:16.460 | and happy.
Extremely so. In some sense, it feels, at least to me,
the way it did when I was a child:
I can look at any object and it's very beautiful.
01:00:31.540 | I have a lot of curiosity about the simple things.
01:00:34.500 | And that's where usually meditation takes me.
01:00:38.060 | - Are you, what are you experiencing?
01:00:40.580 | Are you just taking in simple sensory information
01:00:44.620 | and are just enjoying the rawness
01:00:46.900 | of that sensory information?
01:00:48.700 | So there's no memories or all that kind of stuff.
01:00:52.540 | You're just enjoying being?
01:00:55.540 | - Yeah, pretty much.
01:00:56.460 | I mean, still there is, thoughts are slowing down.
01:01:01.420 | Sometimes they pop up, but it's also somehow
extended meditation takes you to a space
where they are way more friendly, way more positive.
There is also this thing:
it almost feels that we are constantly getting
a little bit of a reward,
and we are just spreading this reward
over various activities.
But if you stay still for an extended period of time,
it kind of accumulates, accumulates, accumulates.
And there is a sense
that at some point it passes some threshold,
and it feels as if a drop is falling into an ocean
of love and bliss.
And that's very pleasant.
That's why I'm saying
it corresponds to a subjective experience.
01:01:58.140 | Some people, I guess, in spiritual community,
01:02:02.860 | they describe it that that's the reality.
01:02:05.700 | And I would say, I believe that there are like
01:02:07.460 | all sorts of subjective experience that one can have.
01:02:10.540 | And I believe that for instance, meditation might take you
01:02:14.580 | to the subjective experiences,
01:02:16.060 | which are very pleasant, collaborative.
And I would like the world to move toward
a more collaborative place.
01:02:23.380 | Yeah, I would say that's very pleasant
01:02:26.700 | and I enjoy doing stuff like that.
01:02:28.300 | - I wonder how that maps to your mathematical model of love
01:02:33.300 | with the reward function, combining a bunch of things.
01:02:38.020 | It seems like our life then is we're just,
01:02:42.660 | we have this reward function and we're accumulating
01:02:45.300 | a bunch of stuff in it with weights.
01:02:48.180 | It's like multi-objective.
01:02:53.140 | And what meditation is, is you just remove them,
01:02:56.940 | and remove them until the weight on one
01:03:00.820 | or just a few is very high.
01:03:03.500 | And that's where the pleasure comes from.
01:03:05.300 | - Yeah, so something similar to how I'm thinking about this.
01:03:08.300 | So I told you that there is this like,
01:03:10.420 | there is a story of who you are.
And I almost think about it as a text prepended to GPT.
01:03:19.220 | (laughing)
01:03:20.660 | - Yeah.
01:03:21.500 | - And some people refer to it as ego.
01:03:24.260 | Okay, there's like a story.
01:03:25.900 | Who you are, okay?
01:03:28.020 | - So ego is the prompt for GPT-3?
01:03:30.580 | - Yeah. - Or GPT?
- Yes, yes, and that's the description of you.
01:03:32.980 | And then with meditation, you can get to the point
01:03:35.180 | that actually you experience things without the prompt.
01:03:38.220 | And you experience things as they are.
You are not biased by the description
of how they are supposed to be.
01:03:45.420 | That's very pleasant.
01:03:47.540 | And then with respect to the reward function,
01:03:50.060 | it's possible to get to the point
that there is a dissolution of self.
And therefore you can say that
your brain attempts to simulate
the reward function of everyone else, or of everything.
There is this love
which feels like a oneness with everything.
01:04:07.940 | And that's also, you know, very beautiful, very pleasant.
01:04:11.500 | At some point, you might have a lot of altruistic thoughts
01:04:15.260 | during that moment, and then the self always comes back.
01:04:19.300 | - How would you recommend,
01:04:21.020 | if somebody is interested in meditation,
01:04:22.980 | like a big thing to take on as a project,
01:04:25.620 | would you recommend a meditation retreat?
01:04:27.460 | How many days?
01:04:28.620 | What kind of thing would you recommend?
01:04:30.220 | - I think that actually retreat is the way to go.
01:04:33.300 | It almost feels that, as I said,
01:04:37.180 | like meditation is a psychedelic,
01:04:39.340 | but when you take it in the small dose,
01:04:42.300 | you might barely feel it.
01:04:43.580 | Once you get the high dose, actually you're gonna feel it.
01:04:46.620 | - So even cold turkey,
01:04:49.700 | if you haven't really seriously meditated
01:04:51.580 | for a prolonged period of time, just go to a retreat.
01:04:54.260 | - Yeah. - How many days?
01:04:55.380 | How many days? - I would start weekend one.
01:04:57.460 | - Weekend, so like two, three days.
01:04:59.740 | - And it's like, it's interesting that first or second day,
01:05:03.780 | it's hard, and at some point it becomes easy.
01:05:06.140 | - There's a lot of seconds in a day.
01:05:09.540 | How hard is the meditation retreat?
01:05:11.500 | Just sitting there in a chair.
01:05:13.980 | - So the thing is actually,
01:05:17.340 | it literally just depends on your own framing.
01:05:22.340 | Like if you are in the mindset
01:05:23.980 | that you are waiting for it to be over,
01:05:26.340 | or you are waiting for a nirvana to happen,
01:05:29.220 | it will be very unpleasant.
01:05:30.900 | And in some sense, even the difficulty,
01:05:33.300 | it's not even in the lack of being able
01:05:36.940 | to speak with others.
01:05:37.860 | Like you are sitting there,
01:05:39.460 | your legs will hurt from sitting.
01:05:41.780 | - In terms of like the practical things,
01:05:44.660 | do you experience kind of discomfort,
01:05:46.860 | like physical discomfort of just sitting,
01:05:49.140 | like your butt being numb,
01:05:51.020 | your legs being sore, all that kind of stuff?
01:05:54.340 | - Yes, you experience it,
01:05:55.540 | and then they teach you to observe it.
01:05:58.820 | The crazy thing is, you at first might have a feeling
01:06:03.460 | toward trying to escape it,
01:06:05.460 | and that becomes very apparent
01:06:07.300 | that that's extremely unpleasant.
01:06:09.260 | And then you just observe it,
01:06:11.980 | and at some point it just becomes, it just is.
01:06:16.300 | It's like, I remember Ilya told me some time ago
01:06:20.860 | that he takes a cold shower,
01:06:23.460 | and his mindset of taking a cold shower
01:06:25.940 | was to embrace suffering.
01:06:28.540 | - Yeah, excellent.
01:06:29.860 | I do the same.
01:06:30.700 | - Is this your style?
01:06:31.540 | - Yeah, it's my style.
01:06:33.100 | I like this.
01:06:34.420 | - So my style is actually,
01:06:36.620 | I also sometimes take cold showers.
01:06:39.140 | It is purely observing how the water goes through my body,
01:06:42.580 | like purely being present,
01:06:43.860 | not trying to escape from there.
01:06:46.140 | - Yeah.
01:06:46.980 | - And I would say then it actually becomes pleasant.
01:06:49.620 | It's not like, "Ugh!"
01:06:53.380 | - Well, that's interesting.
01:06:55.020 | I'm also, that's the way to deal
01:06:59.740 | with anything really difficult,
01:07:00.900 | especially in the physical space,
01:07:02.260 | is to observe it.
01:07:04.660 | To say it's pleasant,
01:07:09.060 | I would use a different word.
01:07:10.700 | You're accepting of the full beauty of reality,
I would say, rather than pleasant.
01:07:20.540 | But yeah, in some sense it is pleasant.
01:07:22.740 | That's the only way to deal with a cold shower
01:07:25.740 | is to become an observer and to find joy in it.
01:07:30.340 | Same with really difficult physical exercise,
01:07:35.340 | or running for a really long time, endurance events,
01:07:38.620 | just any time you're exhausted, any kind of pain.
01:07:41.180 | I think the only way to survive it is not to resist it,
01:07:43.980 | it's to observe it.
01:07:44.940 | You mentioned Ilya.
01:07:47.540 | Ilya Sutskever.
- He's our chief scientist,
but also a very close friend of mine.
01:07:53.780 | - He co-founded OpenAI with you.
01:07:56.420 | I've spoken with him a few times.
01:07:58.620 | He's brilliant, I really enjoy talking to him.
01:08:00.900 | His mind, just like yours, works in fascinating ways.
01:08:07.420 | Both of you are not able to define deep learning simply.
01:08:10.340 | What's it like having him as somebody
01:08:15.460 | you have technical discussions with
in the space of machine learning, deep learning, AI,
01:08:21.900 | but also life?
01:08:23.500 | What's it like when these two agents
01:08:26.300 | get into a self-play situation in a room?
01:08:31.260 | What's it like collaborating with him?
- So I believe that we have extreme respect for each other.
01:08:38.220 | So I love Ilya's insight,
01:08:43.220 | both like, I guess, about consciousness, life, AI.
01:08:48.300 | - But in terms of the, it's interesting to me
01:08:52.620 | 'cause you're a brilliant thinker
01:08:56.100 | in the space of machine learning, like intuition,
01:09:00.140 | digging deep in what works, what doesn't,
01:09:04.860 | why it works, why it doesn't, and so is Ilya.
01:09:07.980 | I'm wondering if there's interesting,
01:09:10.580 | deep discussions you've had with him in the past
01:09:12.940 | or disagreements that were very productive.
- So I can say, I also understood over time
where my strengths are.
01:09:20.940 | So obviously we have plenty of AI discussions.
01:09:23.740 | And I myself have plenty of ideas,
01:09:29.340 | but I consider Ilya one of the most prolific AI scientists
01:09:34.180 | in the entire world.
01:09:35.740 | And I think that I realized that maybe my super skill
01:09:40.740 | is being able to bring people to collaborate together,
01:09:45.140 | that I have some level of empathy
01:09:47.300 | that is unique in AI world.
01:09:49.380 | And that might come from either meditation, psychedelics,
01:09:53.380 | or let's say I read just hundreds of books on this topic.
And I also went through a journey of
developing all sorts of algorithms.
01:10:00.820 | So I think that maybe I can,
01:10:04.820 | that's my superhuman skill.
01:10:07.620 | Ilya is one of the best AI scientists,
01:10:12.100 | but then I'm pretty good in assembling teams.
01:10:15.500 | And I'm also not holding to people,
01:10:17.380 | like I'm growing people
01:10:18.460 | and then people become managers at OpenAI.
I grew many of them into research managers.
01:10:23.660 | - So you find places where you're excellent
01:10:28.420 | and he finds like his deep scientific insights
01:10:32.460 | is where he is.
01:10:33.300 | And you find ways you can, the puzzle pieces fit together.
01:10:36.620 | - Correct.
01:10:37.460 | Like, ultimately, for instance, let's say Ilya,
01:10:40.180 | he doesn't manage people.
01:10:41.580 | That's not what he likes or so.
01:10:45.020 | I like hanging out with people.
01:10:48.780 | By default, I'm an extrovert and I care about people.
01:10:51.420 | - Oh, interesting.
01:10:52.260 | Okay.
01:10:53.260 | Okay, cool.
01:10:54.100 | So that fits perfectly together.
01:10:56.380 | But I mean, I also just like your intuition
01:10:59.180 | about various problems in machine learning.
01:11:01.260 | He's definitely one I really enjoy.
01:11:04.700 | I remember talking to him
01:11:06.060 | about something I was struggling with,
01:11:09.740 | which is coming up with a good model for pedestrians,
01:11:14.740 | for human beings that cross the street
01:11:16.900 | in the context of autonomous vehicles.
01:11:18.820 | And he immediately started to like formulate a framework
01:11:23.820 | within which you can evolve a model for pedestrians,
01:11:27.380 | like through self-play, all that kind of mechanisms.
01:11:29.980 | The depth of thought on a particular problem,
01:11:34.340 | especially problems he doesn't know anything about,
01:11:36.740 | is fascinating to watch.
01:11:38.620 | It makes you realize like,
yeah, the human intellect might be limitless.
01:11:47.580 | Or it's just impressive to see a descendant of ape
01:11:51.300 | come up with clever ideas.
01:11:52.740 | - Yeah, I mean, so even in the space of deep learning,
01:11:55.820 | when you look at various people,
01:11:57.500 | there are people who invented some breakthroughs once,
01:12:02.500 | but there are very few people who did it multiple times.
01:12:06.340 | And you can think if someone invented it once,
01:12:08.820 | that might be just a sheer luck.
01:12:11.740 | And if someone invented it multiple times,
01:12:14.180 | if a probability of inventing it once is one over a million,
01:12:17.340 | then probability of inventing it twice or three times
01:12:19.660 | would be one over a million squared or to the power of three,
01:12:23.260 | which would be just impossible.
So it literally means that it's not just luck.
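To spell out the arithmetic with the illustrative one-in-a-million figure used above:

\[
P(\text{one lucky breakthrough}) = p = 10^{-6}, \qquad
P(k \text{ independent lucky breakthroughs}) = p^{k},
\]

so \(p^{2} = 10^{-12}\) and \(p^{3} = 10^{-18}\), which is effectively impossible by luck alone.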
01:12:30.100 | And Ilya is one of these few people
01:12:33.180 | who have a lot of these inventions in his arsenal.
01:12:38.180 | It also feels that, for instance,
01:12:41.940 | if you think about folks like Gauss or Euler,
01:12:44.300 | at first they read a lot of books
and then they did the thinking, and then they figured out the math.
01:12:52.620 | And that's how it feels with Ilya.
01:12:56.260 | You know, at first he read stuff
01:12:57.900 | and then he spent his thinking cycles.
01:13:00.500 | - That's a really good way to put it.
01:13:04.700 | When I talk to him, I see thinking.
01:13:10.700 | He's actually thinking.
01:13:14.500 | Like he makes me realize that there's like deep thinking
01:13:17.460 | that the human mind can do.
01:13:19.220 | Like most of us are not thinking deeply.
01:13:21.580 | Like you really have to put a lot of effort to think deeply.
01:13:25.700 | Like I have to really put myself in a place
01:13:28.740 | where I think deeply about a problem.
01:13:30.380 | It takes a lot of effort.
01:13:31.940 | It's like an airplane taking off or something.
01:13:34.620 | You have to achieve deep focus.
01:13:36.580 | He's just, what is it?
01:13:39.980 | His brain is like a vertical takeoff
01:13:42.220 | in terms of airplane analogy.
01:13:45.380 | So it's interesting.
01:13:46.820 | But I mean, Cal Newport talks about this
01:13:49.780 | as ideas of deep work.
01:13:51.300 | Most of us don't work much at all
01:13:55.060 | in terms of like deeply think about particular problems,
01:13:58.980 | whether it's a math, engineering, all that kind of stuff.
01:14:02.020 | You want to go to that place often
01:14:05.100 | and that's real hard work.
01:14:07.020 | And some of us are better than others at that.
01:14:09.260 | - So I think that the big piece has to do
01:14:11.780 | with actually even engineering your environment
01:14:14.660 | such that it's conducive to that.
01:14:16.580 | So see, both Ilya and I, on the frequent basis,
01:14:21.580 | we kind of disconnect ourselves from the world
01:14:24.180 | in order to be able to do extensive amount of thinking.
So Ilya usually just leaves, iPad at hand.
01:14:33.060 | He loves his iPad.
01:14:35.100 | And for me, I'm even sometimes,
01:14:39.980 | just going for a few days to different location, to Airbnb,
01:14:43.500 | I'm turning off my phone
01:14:45.380 | and there is no access to me.
01:14:47.060 | And that's extremely important for me
01:14:50.780 | to be able to actually just formulate new thoughts,
01:14:54.140 | to do deep work rather than to be reactive.
And the older I get,
the more of these random tasks are at hand.
01:15:02.340 | - Before I go on to that thread,
01:15:05.620 | let me return to our friend, GPT.
01:15:09.380 | Let me ask you another ridiculously big question.
01:15:12.380 | Can you give an overview of what GPT-3 is?
01:15:15.940 | Or like you say in your Twitter bio, GPT-N+1,
01:15:19.140 | how it works and why it works?
01:15:24.020 | - So GPT-3 is a humongous neural network.
Let's assume that we know what a neural network is;
we gave the definition earlier.
01:15:33.340 | And it is trained on the entire internet
01:15:36.580 | just to predict next word.
01:15:39.460 | So let's say it sees part of the article
and the only task that it has at hand
is to say: what would be the next word?
01:15:48.380 | What would be the next word?
01:15:50.220 | And it becomes really exceptional at the task
01:15:54.220 | of figuring out what's the next word.
01:15:56.300 | So you might ask, why would this be an important task?
01:16:01.300 | Why would it be important to predict what's the next word?
01:16:05.060 | And it turns out that a lot of problems can be formulated
01:16:09.500 | as a text completion problem.
01:16:12.380 | So GPT is purely learning to complete the text.
01:16:16.740 | And you could imagine, for instance,
01:16:18.100 | if you are asking a question,
01:16:20.260 | who is the president of United States?
01:16:22.820 | Then GPT can give you an answer to it.
01:16:25.980 | It turns out that many more things can be formulated
this way. You can format text
in a way that you have a sentence in English,
and you make it even look like some content of a website
somewhere that teaches people
how to translate things between languages.
So it would be EN colon, text in English, FR colon.
And then you ask the model to continue.
01:16:51.780 | And it turns out that such a model
01:16:54.540 | is predicting translation from English to French.
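To make that prompt format concrete, here is a minimal sketch (the example sentences are invented for illustration; only the "EN: ... FR: ..." framing comes from the conversation):

```python
# Minimal sketch of the "EN: ... FR: ..." completion framing described above.
# A GPT-3-style model is simply asked to continue this text, and the natural
# continuation after the final "FR:" is the French translation.

prompt = (
    "Translate English to French.\n"
    "EN: Where is the train station?\n"
    "FR: Où est la gare ?\n"
    "EN: I would like a coffee, please.\n"
    "FR: Je voudrais un café, s'il vous plaît.\n"
    "EN: The weather is nice today.\n"
    "FR:"
)
print(prompt)
# Feeding `prompt` to a text-completion model and reading what it appends
# after the last "FR:" yields the translation.
```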
01:16:57.140 | The crazy thing is that this model can be used
01:17:02.140 | for way more sophisticated tasks.
01:17:04.340 | So you can format text such that it looks
01:17:06.900 | like a conversation between two people.
01:17:09.180 | And that might be a conversation between you and Elon Musk.
01:17:12.820 | And because the model read all the texts about Elon Musk,
it will be able to predict Elon Musk's words
as if it were Elon Musk.
01:17:20.300 | It will speak about colonization of Mars,
01:17:23.860 | about sustainable future and so on.
01:17:26.420 | And it's also possible to even give
an arbitrary personality to the model.
You can say: here is a conversation
with a friendly AI bot.
01:17:36.620 | And the model will complete the text as a friendly AI bot.
01:17:42.660 | - So, I mean, how do I express how amazing this is?
01:17:47.660 | So just to clarify, a conversation,
01:17:53.100 | generating a conversation between me and Elon Musk,
01:17:56.100 | it wouldn't just generate good examples
01:18:00.260 | of what Elon would say.
01:18:02.020 | It would get the syntax all correct.
01:18:04.140 | So like interview style,
01:18:06.060 | it would say like Elon colon, Lex colon.
01:18:09.220 | It's not just like inklings of semantic correctness.
01:18:14.220 | It's like the whole thing, grammatical, syntactic, semantic.
01:18:22.660 | It's just really, really impressive generalization.
01:18:29.980 | - Yeah, I mean, I also want to provide some caveats.
So it can generate a few paragraphs of coherent text,
01:18:36.580 | but as you go to longer pieces,
01:18:38.940 | it actually goes off the rails.
01:18:41.380 | Okay, if you try to write a book,
01:18:43.780 | it won't work out this way.
01:18:45.700 | - What way does it go off the rails, by the way?
01:18:47.860 | Is there interesting ways in which it goes off the rails?
01:18:50.580 | Like what falls apart first?
01:18:54.060 | - So the model is trained on the old existing data
01:18:56.780 | that is out there,
01:18:58.980 | which means that it is not trained on its own mistakes.
So for instance, if it makes a mistake,
it keeps it. So, to give you an example.
So let's say I have a conversation
with a model pretending that it is Elon Musk.
And then I start actually making up things
which are not factual.
01:19:20.700 | I would say- - Sounds like Twitter.
01:19:23.500 | (laughing)
01:19:25.140 | But I got you, sorry, yeah.
01:19:27.580 | - Like, I don't know, I would say that Elon is my wife
01:19:31.100 | and the model will just keep on carrying it on.
01:19:36.020 | - As if it's true.
01:19:37.220 | - Yes, and in some sense,
01:19:39.060 | if you would have a normal conversation with Elon,
01:19:41.780 | he would be, "What the fuck?"
01:19:43.260 | - Yeah, there would be some feedback between.
01:19:46.500 | So the model is trained on things that humans have written,
01:19:50.260 | but through the generation process,
01:19:52.180 | there's no human in the loop feedback.
01:19:54.300 | - Correct.
01:19:55.460 | - That's fascinating.
01:19:56.580 | - Makes sense?
01:19:57.420 | - It magnifies, like the errors get magnified and magnified.
01:20:00.580 | - Correct.
01:20:01.420 | - And it's also interesting.
01:20:03.420 | I mean, first of all, humans have the same problem.
01:20:06.860 | It's just that we will make fewer errors
01:20:11.860 | and magnify the errors slower.
01:20:13.860 | - I think that actually what happens with humans is
01:20:16.460 | if you have a wrong belief about the world as a kid,
01:20:19.780 | then very quickly you will learn that it's not correct
01:20:22.780 | because you are grounded in reality
01:20:24.660 | and you are learning from your new experience.
01:20:26.540 | - Yes.
01:20:27.700 | But do you think the model can correct itself too?
01:20:31.100 | Won't it through the power of the representation,
01:20:34.980 | and so the absence of Elon Musk being your wife,
01:20:39.860 | information on the internet, won't it correct itself?
01:20:43.860 | - There won't be examples like that.
01:20:45.900 | - So the errors will be subtle at first.
01:20:48.460 | - Subtle at first, and in some sense,
01:20:50.620 | you can also say that the data that is not out there
01:20:54.420 | is the data which would represent how the human learns.
And if a model were trained on such data,
then it would be better off.
01:21:03.620 | - How intelligent is GPT-3, do you think?
01:21:06.540 | Like when you think about the nature of intelligence,
01:21:10.020 | it seems exceptionally impressive.
01:21:12.580 | But then if you think about the big AGI problem,
01:21:17.500 | is this footsteps along the way to AGI?
01:21:20.980 | - So let's see.
It seems that intelligence itself has multiple axes.
01:21:26.780 | And I would expect that the systems that we are building,
they might end up being superhuman on some axes,
and subhuman on some other axes.
It would be surprising to me if, on all axes simultaneously,
they would become superhuman.
01:21:42.540 | Of course, people ask this question,
01:21:45.500 | is GPT a spaceship that would take us to the moon,
01:21:50.220 | or are we building a ladder to heaven,
that is, just building a bigger and bigger ladder?
01:21:55.540 | And we don't know in some sense,
01:21:58.460 | which one of these two--
01:22:00.020 | - Which one is better?
01:22:01.300 | I'm trying to, I like "Stairway to Heaven",
01:22:05.180 | it's a good song,
01:22:06.020 | so I'm not exactly sure which one is better.
01:22:08.020 | But you're saying like the spaceship to the moon
01:22:10.100 | is actually effective?
01:22:11.660 | - Correct, so people who criticize GPT,
01:22:15.020 | they say, "You're guys just building a taller ladder."
01:22:20.580 | And it will never reach the moon.
01:22:22.220 | And at the moment, I would say the way I'm thinking is,
01:22:27.820 | this is like a scientific question.
And at heart, I'm also a builder, a creator.
01:22:33.620 | And like, I'm thinking, let's try out,
01:22:35.820 | let's see how far it goes.
01:22:37.900 | And so far we see constantly that there is a progress.
01:22:41.860 | - Yeah, so do you think GPT-4,
01:22:47.500 | GPT-5, GPT-N+1,
01:22:50.420 | there'll be a phase shift,
01:22:54.260 | like a transition to a place
01:22:56.180 | where we'll be truly surprised?
01:22:58.420 | Then again, like GPT-3 is already very, truly surprising.
01:23:02.300 | The people that criticize GPT-3 as a, what is it?
01:23:06.100 | Ladder to heaven,
01:23:07.620 | I think too quickly get accustomed to how impressive it is,
01:23:11.060 | that the prediction of the next word
01:23:12.860 | can achieve such depth of semantics,
01:23:15.700 | with accuracy of syntax, grammar, and semantics.
01:23:19.780 | Do you think GPT-4 and 5 and 6
01:23:25.460 | will continue to surprise us?
01:23:26.900 | - I mean, definitely there will be more impressive models.
01:23:31.020 | There is a question, of course,
01:23:32.100 | if there will be a phase shift.
01:23:34.980 | And also even the way I'm thinking about these models
01:23:39.980 | is that when we build these models,
01:23:44.060 | you know, we see some level of the capabilities,
01:23:46.340 | but we don't even fully understand everything
01:23:49.060 | that the model can do.
01:23:50.300 | And actually one of the best things to do
01:23:52.540 | is to allow other people to probe the model,
01:23:56.260 | to even see what is possible.
- Hence using GPT as an API
01:24:03.820 | and opening it up to the world.
01:24:05.380 | - Yeah, I mean, so when I'm thinking from perspective of,
01:24:08.740 | obviously various people have concerns about AGI,
01:24:13.460 | including myself.
01:24:14.500 | And then when I'm thinking from perspective,
01:24:17.340 | what's the strategy even to deploy these things to the world?
01:24:20.820 | Then the one strategy that I have seen many times working
01:24:24.940 | is the iterative deployment,
01:24:26.740 | that you deploy slightly better versions
01:24:30.180 | and you allow other people to criticize you.
01:24:32.700 | So you actually, or try it out,
01:24:34.700 | you see where are there fundamental issues.
And you don't want to be in the situation
where you are holding onto a powerful system
with a huge capability overhang,
and then you deploy it and it has a random, chaotic impact
on the world.
01:24:51.540 | So you actually want to be in the situation
01:24:53.500 | that you are gradually deploying systems.
01:24:55.860 | - I asked this question of Ilya,
01:24:58.260 | let me ask you this question.
01:25:01.580 | I've been reading a lot about Stalin and power.
01:25:10.180 | If you're in possession of a system that's like AGI,
01:25:14.380 | that's exceptionally powerful,
01:25:16.060 | do you think your character and integrity
01:25:20.060 | might become corrupted?
01:25:21.820 | Like famously power corrupts
01:25:23.420 | and absolute power corrupts absolutely.
01:25:25.900 | - So I believe that you want at some point
01:25:29.340 | to work toward distributing the power.
01:25:33.020 | I think that you want to be in the situation
01:25:36.180 | that actually AGI is not controlled
01:25:38.540 | by a small number of people,
01:25:40.780 | but essentially by a larger collective.
01:25:45.140 | - So the thing is that requires
01:25:46.780 | a George Washington style move.
01:25:49.300 | In the ascent to power,
01:25:52.420 | there's always a moment when somebody gets a lot of power
01:25:56.500 | and they have to have the integrity
01:25:58.340 | and the moral compass to give away that power.
01:26:03.620 | That humans have been good and bad throughout history
01:26:07.060 | at this particular step.
And I wonder whether we blind ourselves,
for example, between nations,
in an AI race between nations.
01:26:20.540 | We might blind ourselves and justify to ourselves
01:26:24.420 | the development of AI without distributing the power
01:26:28.140 | because we want to defend ourselves against China,
01:26:30.500 | against Russia, that kind of logic.
01:26:34.220 | And I wonder how we design governance mechanisms
01:26:39.220 | that prevent us from becoming power hungry
01:26:45.500 | and in the process destroying ourselves.
01:26:48.340 | - So let's see.
01:26:49.620 | I have been thinking about this topic quite a bit,
01:26:51.980 | but I also want to admit that once again,
01:26:55.100 | I actually want to rely way more on Sam Altman on it.
01:26:57.940 | He wrote an excellent blog
01:27:01.300 | on how even to distribute wealth.
01:27:04.140 | And he proposed in his blog to tax equity
01:27:09.140 | of the companies rather than profit and to distribute it.
And this is an example of a Washington-style move.
01:27:18.740 | I guess I personally have insane trust in Sam.
01:27:24.780 | He already spent plenty of money
01:27:27.260 | running a universal basic income project.
01:27:31.900 | That gives me, I guess, maybe some level of trust to him,
01:27:36.100 | but I also, I guess, love him as a friend.
01:27:41.100 | - Yeah.
01:27:42.380 | I wonder, because we're sort of summoning
01:27:44.500 | a new set of technologies,
01:27:46.780 | I wonder if we'll be cognizant.
01:27:51.340 | Like you're describing the process of open AI,
01:27:54.180 | but it could also be at other places
01:27:56.180 | like in the US government, right?
01:27:59.460 | Both China and the US are now full steam ahead
01:28:03.900 | on autonomous weapons systems development.
01:28:06.300 | And that's really worrying to me
01:28:08.860 | because in the framework of something being
01:28:13.540 | a national security danger or a military danger,
01:28:17.100 | you can do a lot of pretty dark things
01:28:20.020 | that blind our moral compass.
01:28:23.420 | And I think AI will be one of those things.
01:28:26.620 | In some sense, the mission and the work you're doing
01:28:30.180 | in open AI is like the counterbalance to that.
01:28:33.620 | So you want to have more open AI
01:28:35.460 | and less autonomous weapons systems.
01:28:38.060 | - I like these statements.
01:28:39.900 | Like to be clear, this is interesting
01:28:41.620 | and I'm thinking about it myself,
01:28:43.540 | but this is a place that I put my trust
01:28:48.540 | actually in Sam's hands because it's extremely hard
01:28:51.740 | for me to reason about it.
01:28:53.420 | - Yeah.
01:28:54.260 | - One important statement to make is
01:28:56.300 | it's good to think about this.
01:28:59.260 | - Yeah, no question about it.
01:29:00.580 | No question about it.
- Even like a low-level, quote unquote, engineer.
I remember I programmed a car,
an RC car.
It went really fast, like 30, 40 miles an hour.
01:29:18.380 | And I remember I was like sleep deprived.
01:29:20.980 | So I programmed it pretty crappily
01:29:23.860 | and it like, the code froze.
01:29:26.380 | So it's doing some basic computer vision
01:29:28.540 | and it's going around on track, but it's going full speed.
01:29:31.700 | And there was a bug in the code that the car just went,
01:29:37.020 | it didn't turn, it went straight full speed
01:29:40.780 | and smashed into the wall.
01:29:42.540 | And I remember thinking the seriousness
01:29:46.220 | with which you need to approach the design
01:29:49.940 | of artificial intelligence systems
01:29:51.300 | and the programming of artificial intelligence systems
01:29:54.340 | is high because the consequences are high.
01:29:57.660 | Like that little car smashing into the wall,
01:30:00.860 | for some reason I immediately thought of like an algorithm
01:30:03.900 | that controls nuclear weapons, having the same kind of bug.
01:30:07.340 | And so like the lowest level engineer
01:30:10.020 | and the CEO of a company all need to have the seriousness
01:30:13.980 | in approaching this problem
01:30:15.100 | and thinking about the worst case consequences.
01:30:17.260 | - So I think that is true.
01:30:19.060 | I mean, what I also recognize in myself
01:30:23.340 | and others even asking this question
01:30:25.380 | is that it evokes a lot of fear
01:30:27.980 | and fear itself ends up being actually quite debilitating.
01:30:32.220 | The place where I arrived at the moment
01:30:35.700 | might sound cheesy or so,
01:30:39.220 | but it's almost to build things out of love
01:30:43.420 | rather than fear.
01:30:44.460 | - Yeah.
01:30:45.820 | - Like a focus on how I can maximize the value,
01:30:50.820 | how the systems that I'm building might be useful.
01:30:55.900 | I'm not saying that the fear doesn't exist out there
01:31:00.060 | and like it totally makes sense to minimize it,
01:31:03.140 | but I don't want to be working because I'm scared.
01:31:06.820 | I want to be working out of passion, out of curiosity,
01:31:10.500 | out of the looking forward for the positive future.
01:31:14.860 | - With the definition of love arising
01:31:18.260 | from a rigorous practice of empathy.
01:31:20.900 | So not just like your own conception
01:31:22.700 | of what is good for the world,
01:31:24.180 | but always listening to others.
01:31:26.260 | - Correct, like the love where I'm considering
01:31:28.660 | reward functions of others.
01:31:30.580 | - Others.
01:31:32.020 | To limit to infinity is like a sum,
01:31:34.620 | like one to N where N is 7 billion or whatever it is.
01:31:38.180 | - Not projecting my reward functions on others.
01:31:40.220 | - Yeah, exactly.
01:31:41.780 | - Okay, can we just take a step back
01:31:44.060 | to something else super cool, which is OpenAI Codex?
01:31:48.500 | Can you give an overview of what OpenAI Codex
01:31:50.940 | and GitHub Copilot is, how it works,
01:31:54.940 | and why the hell it works so well?
- So with GPT-3, we noticed that the system,
a system trained on all the language out there,
started having some rudimentary coding capabilities.
So we were able to ask it to implement an addition function
01:32:13.260 | between two numbers,
01:32:14.260 | and indeed it can write Python or JavaScript code for that.
01:32:18.100 | And then we thought we might as well just go full steam ahead
01:32:22.780 | and try to create a system that is actually good
01:32:25.780 | at what we are doing every day ourselves,
01:32:28.620 | which is programming.
01:32:30.180 | We optimize models for proficiency in coding.
01:32:34.540 | We actually even created models
01:32:36.940 | that both have a comprehension of language and code.
01:32:41.660 | And Codex is API for these models.
01:32:45.540 | - So it's first pre-trained on language,
01:32:48.540 | and then I don't know if you can say fine-tuned,
01:32:52.820 | 'cause there's a lot of code, but it's language and code.
01:32:56.340 | - It's language and code.
01:32:58.260 | It's also optimized for various things,
01:33:00.140 | like let's say low latency and so on.
01:33:02.540 | Codex is the API, the similar to GPT-3.
01:33:05.980 | We expect that there will be proliferation
01:33:07.900 | of the potential products that can use coding capabilities,
01:33:11.500 | and I can speak about it in a second.
01:33:14.900 | Copilot is a first product developed by GitHub.
01:33:18.180 | So as we're building models,
01:33:20.820 | we wanted to make sure that these models are useful.
01:33:23.660 | And we worked together with GitHub
01:33:25.220 | on building the first product.
01:33:27.300 | Copilot is actually as you code,
01:33:29.620 | it suggests you code completions.
And we have seen in the past
that there are various tools that can suggest
the next few characters of code or a line of code.
01:33:41.020 | The thing about Copilot is it can generate 10 lines of code.
The way it often works is
you write in a comment what you want to happen,
because in comments, people describe what happens next.
01:33:54.340 | So these days when I code,
01:33:56.900 | instead of going to Google to search
01:33:59.380 | for the appropriate code to solve my problem,
01:34:02.780 | I say, "Oh, for this array, could you smooth it?"
01:34:07.500 | And then it imports some appropriate libraries
and, say, it uses NumPy convolution,
which I was not even aware existed,
and it does the appropriate thing.
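A hedged sketch of the flow just described: the comment is the part a person writes, and the lines after it are the kind of completion a Copilot-style tool might produce, here a simple moving average via NumPy convolution (the array values are invented):

```python
import numpy as np

signal = np.array([1.0, 4.0, 2.0, 8.0, 3.0, 7.0, 5.0])

# smooth this array
window = np.ones(5) / 5                      # 5-point averaging kernel
smoothed = np.convolve(signal, window, mode="same")
print(smoothed)
```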
01:34:16.740 | - So you write a comment, maybe the header of a function,
01:34:21.700 | and it completes the function.
01:34:23.620 | Of course, you don't know what is the space
01:34:25.540 | of all the possible small programs it can generate.
01:34:29.100 | What are the failure cases?
01:34:30.620 | How many edge cases?
01:34:32.100 | How many subtle errors there are?
01:34:34.220 | How many big errors there are?
01:34:35.900 | It's hard to know, but the fact that it works at all
01:34:38.500 | in a large number of cases is incredible.
01:34:41.260 | It's like a kind of search engine
01:34:45.260 | into code that's been written on the internet.
01:34:47.980 | - Correct, so for instance, when you search things online,
01:34:51.860 | then usually you get to some particular case.
01:34:55.940 | Like if you go to Stack Overflow,
01:34:57.740 | people describe that one particular situation,
01:35:01.860 | and then they seek for a solution.
01:35:03.260 | But in case of Copilot, it's aware of your entire context,
01:35:07.980 | and in context is, "Oh, these are the libraries
01:35:09.940 | "that they are using."
01:35:10.780 | That's the set of the variables that is initialized,
01:35:14.380 | and on the spot, it can actually tell you what to do.
01:35:17.540 | So the interesting thing is,
01:35:19.420 | and we think that Copilot is one possible product
01:35:22.540 | using Codex, but there is a place for many more.
01:35:25.300 | So internally, we tried out to create other fun products.
01:35:29.980 | So it turns out that a lot of tools out there,
01:35:33.340 | let's say Google Calendar or Microsoft Word or so,
01:35:36.380 | they all have internal API to build plugins around them.
01:35:41.380 | So there is a way, a sophisticated way,
01:35:44.460 | to control Calendar or Microsoft Word.
01:35:47.740 | Today, if you want more complicated behaviors
01:35:51.100 | from these programs, you have to add a new button
01:35:53.300 | for every behavior.
01:35:54.380 | But it is possible to use Codex
01:35:57.260 | and tell, for instance, to Calendar,
01:35:59.860 | could you schedule an appointment with Lex
01:36:04.380 | next week after 2 p.m.?
01:36:06.460 | And it writes corresponding piece of code.
01:36:08.580 | And that's the thing that actually you want.
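A hypothetical sketch of that idea follows; none of the names below come from a real product integration. The point is only that a Codex-like model can turn a natural-language request into code against whatever plugin API the application exposes:

```python
# Hypothetical sketch: natural language -> code for an application API.
# `calendar_api` and `create_event` are made-up names for illustration.

instruction = "Schedule an appointment with Lex next week after 2 p.m."

# Given the instruction plus a description of the (made-up) calendar API,
# the model might emit code along these lines:
generated_code = """
from datetime import datetime, timedelta

start = datetime(2021, 9, 6, 14, 0)  # example slot: next Monday, 2 p.m.
calendar_api.create_event(
    title="Appointment with Lex",
    start=start,
    end=start + timedelta(hours=1),
)
"""
print(generated_code)
```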
01:36:11.460 | - So interesting.
01:36:12.300 | So what you figure out is there's a lot of programs
01:36:15.340 | with which you can interact through code.
01:36:17.740 | And so there, you can generate that code
01:36:21.300 | from natural language.
01:36:23.140 | That's fascinating.
- And that's also somewhat close
to what the promise of Siri or Alexa was.
01:36:29.980 | So previously, all these behaviors,
01:36:31.620 | they were hard-coded.
01:36:33.740 | And it seems that Codex on the fly can pick up the API
01:36:38.180 | of, let's say, a given software.
01:36:40.420 | And then it can turn language into use of this API.
01:36:43.380 | - So without hard-coding, it can find,
01:36:45.260 | it can translate to machine language.
01:36:47.780 | - Correct.
01:36:49.060 | - So for example, this would be really exciting for me,
01:36:51.340 | like for Adobe products like Photoshop,
01:36:54.740 | which I think ActionScript,
01:36:58.060 | I think there's a scripting language
01:36:59.340 | that communicates with them.
01:37:00.220 | Same with Premiere.
- And you could imagine that that even allows
coding by voice on your phone?
01:37:07.620 | So for instance, in the past, as of today,
01:37:11.100 | I'm not editing Word documents on my phone
01:37:14.260 | because it's just the keyboard is too small.
01:37:16.580 | But if I would be able to tell to my phone,
01:37:21.300 | make the header large, then move the paragraphs around,
01:37:24.340 | and that's actually what I want.
01:37:26.980 | So I can tell you one more cool thing,
01:37:28.780 | or even how I'm thinking about Codex.
01:37:31.700 | So if you look actually at the evolution of computers,
01:37:36.700 | we started with very primitive interfaces,
01:37:40.340 | which is a punch card.
01:37:41.340 | And punch card, essentially, you make holes
01:37:44.860 | in the plastic card to indicate zeros and ones.
01:37:49.020 | And during that time, there was a small number
01:37:51.940 | of specialists who were able to use computers.
01:37:54.260 | And by the way, people even suspected
01:37:55.940 | that there is no need for many more people to use computers.
01:37:58.940 | But then we moved from punch cards to,
01:38:03.220 | at first, assembly and C.
01:38:05.460 | And these programming languages,
01:38:07.580 | they were slightly higher level.
01:38:09.380 | They allowed many more people to code.
01:38:11.740 | And they also led to more of a proliferation of technology.
01:38:16.300 | And further on, there was a jump to,
01:38:19.860 | say, from C++ to Java and Python.
01:38:22.420 | And every time it has happened,
01:38:24.420 | more people are able to code,
01:38:26.020 | and we build more technology.
01:38:28.620 | And it's even hard to imagine now
01:38:31.860 | if someone will tell you that you should write code
01:38:34.740 | in assembly instead of, let's say, Python or Java or JavaScript.
01:38:39.540 | And Codex is yet another step
01:38:41.700 | toward kind of bringing computers closer to humans,
01:38:44.620 | such that you communicate with a computer
01:38:47.660 | with your own language,
01:38:49.300 | rather than with a specialized language.
01:38:51.540 | And I think that it will lead to
01:38:55.100 | an increase of number of people who can code.
01:38:57.820 | - Yeah, and the kind of technologies
01:38:59.660 | that those people will create is, it's innumerable.
01:39:03.420 | It could be a huge number of technologies
01:39:06.340 | we're not predicting at all.
'Cause there's less and less requirement
of having a technical mind, a programming mind.
You're now opening it to the world of
other kinds of minds, creative minds, artistic minds,
01:39:21.300 | all that kind of stuff.
01:39:22.140 | - I would like, for instance, biologists who work on DNA
01:39:25.020 | to be able to program
01:39:26.540 | and not to need to spend a lot of time learning it.
01:39:29.500 | And I believe that's a good thing to the world.
01:39:31.940 | And I would actually add, I would add,
01:39:33.620 | so at the moment, I'm managing Codex team
01:39:37.060 | and also language team.
01:39:38.380 | And I believe that there is like
01:39:40.060 | plenty of brilliant people out there,
01:39:42.380 | and they should apply.
01:39:44.380 | - Oh, okay, yeah, awesome.
So there's the language team and the Codex team?
Those are kind of overlapping teams.
01:39:50.740 | So it's like GPT, the raw language,
01:39:52.860 | and then the Codex is like applied to programming.
01:39:57.140 | - Correct, and they are quite intertwined.
01:39:59.980 | There are many more teams involved
01:40:01.540 | making these models extremely efficient and deployable.
01:40:06.540 | For instance, there are people who are working to
01:40:09.420 | make our data centers amazing,
01:40:12.500 | or there are people who work on
01:40:14.460 | putting these models into production,
01:40:16.580 | or even pushing it at the very limit of the scale.
01:40:20.300 | - So all aspects from the infrastructure
01:40:24.340 | to the actual machine learning.
01:40:25.180 | - So I'm just saying there are multiple teams.
But the teams working on Codex and language,
I guess I'm directly managing them.
01:40:33.580 | I would love to hire more.
01:40:35.180 | - Yeah, if you're interested in machine learning,
01:40:38.180 | this is probably one of the most exciting problems
01:40:41.700 | and systems to be working on.
01:40:43.780 | 'Cause it's actually, it's pretty cool.
01:40:46.140 | Like what the program synthesis,
01:40:48.660 | like generating a program is very interesting,
01:40:51.300 | very interesting problem that has echoes of reasoning
01:40:55.860 | and intelligence in it.
01:40:58.060 | And I think there's a lot of fundamental questions
01:41:00.700 | that you might be able to sneak up to
01:41:04.740 | by generating programs.
01:41:06.180 | - Yeah, one more exciting thing about the programs is that,
01:41:09.740 | so I said that the, you know, in case of language,
01:41:13.300 | that one of the troubles is even evaluating language.
01:41:15.940 | So when the things are made up,
01:41:17.380 | you need somehow either a human to say
01:41:22.100 | that this doesn't make sense.
01:41:23.900 | Or so in case of program, there is this one extra lever
01:41:26.780 | that we can actually execute programs
01:41:28.420 | and see what they evaluate to.
01:41:30.460 | So that process might be somewhat more automated
01:41:35.140 | in order to improve the qualities of generations.
01:41:39.380 | - Oh, that's fascinating.
01:41:40.220 | So like the, wow, that's really interesting.
01:41:43.100 | - So for the language, the, you know,
01:41:45.260 | the simulation to actually execute it, that's a human mind.
01:41:48.340 | - Yeah.
01:41:49.180 | - For programs, there is a computer
01:41:52.180 | on which you can evaluate it.
01:41:53.620 | - Wow, that's a brilliant little insight
01:41:59.780 | that the thing compiles and runs.
01:42:02.700 | That's first.
01:42:04.100 | And second, you can evaluate on a,
01:42:06.580 | like do automated unit testing.
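
As a rough sketch of that extra lever, the snippet below shows only the execution side: candidate programs (here hard-coded stand-ins for model samples) are run against unit tests, and only the ones that pass survive. The candidates, tests, and helper names are illustrative assumptions, not OpenAI's actual pipeline.

```python
# A sketch of execution-based filtering: run each generated candidate and keep
# the first one that passes all the unit tests.
from typing import Callable, List, Optional

def passes_tests(source: str, tests: List[Callable[[dict], bool]]) -> bool:
    """Execute a candidate program and return True only if every test passes."""
    namespace: dict = {}
    try:
        exec(source, namespace)                    # run the candidate (sandbox this in practice)
        return all(test(namespace) for test in tests)
    except Exception:
        return False                               # crashes and wrong answers both count as failures

def select_working_program(candidates: List[str],
                           tests: List[Callable[[dict], bool]]) -> Optional[str]:
    """Return the first generated candidate that passes all unit tests."""
    for source in candidates:
        if passes_tests(source, tests):
            return source
    return None

# Toy usage: two hypothetical model samples for an `add` function, one buggy.
candidates = [
    "def add(a, b):\n    return a - b",            # wrong
    "def add(a, b):\n    return a + b",            # right
]
tests = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["add"](-1, 1) == 0,
]
print(select_working_program(candidates, tests))   # prints the correct candidate
```
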
01:42:09.540 | And in some sense, it seems to me
01:42:11.620 | that we'll be able to make a tremendous progress.
01:42:13.940 | You know, we are in the paradigm
01:42:16.060 | that there is way more data.
01:42:18.340 | There is like a transcription of millions
01:42:21.700 | of software engineers.
01:42:24.620 | - Yeah.
01:42:25.620 | Yeah.
01:42:27.220 | So, I mean, you just mean,
01:42:29.340 | 'cause I was gonna ask you about reliability.
01:42:31.540 | The thing about programs is you don't know
01:42:33.780 | if they're gonna, like a program
01:42:37.540 | that's controlling a nuclear power plant
01:42:39.700 | has to be very reliable.
01:42:41.460 | - So I wouldn't start with controlling nuclear power plant.
01:42:44.700 | Maybe one day, but that's not actually,
01:42:46.980 | that's not on the current roadmap.
01:42:48.740 | That's not the step one.
01:42:51.100 | - You know, it's the Russian thing.
01:42:52.780 | You just wanna go to the most powerful,
01:42:54.380 | destructive thing right away, run by JavaScript.
01:42:57.700 | But I got you.
01:42:58.540 | So this is a lower impact, but nevertheless,
01:43:00.460 | what you're making me realize,
01:43:02.020 | it is possible to achieve some levels of reliability
01:43:04.460 | by doing testing.
01:43:06.620 | And you could imagine that, you know,
01:43:08.820 | maybe there are ways for a model to write even code
01:43:11.780 | for testing itself and so on.
01:43:14.020 | And there exist ways to create the feedback loops
01:43:17.580 | that the model could keep on improving.
01:43:19.540 | - By writing programs that generate tests.
01:43:23.940 | - For instance.
01:43:24.780 | For instance.
01:43:25.620 | - And that's how we get consciousness
01:43:28.940 | because it's meta compression.
01:43:30.740 | That's what you're gonna write.
01:43:31.580 | That's the comment.
01:43:32.540 | That's the prompt that generates consciousness.
01:43:34.940 | - Compressor of compressors.
01:43:36.820 | You just write that.
01:43:38.540 | Do you think the code that generates consciousness
01:43:40.660 | would be simple?
01:43:42.340 | - So let's see.
01:43:44.140 | I mean, ultimately the core idea behind will be simple,
01:43:48.300 | but there will be also decent amount of engineering involved.
01:43:51.300 | Like in some sense, it seems that, you know,
01:43:57.140 | spreading these models on many machines,
01:44:00.980 | it's not that trivial.
01:44:02.100 | - Yeah.
01:44:02.940 | - And we find all sorts of innovations
01:44:06.100 | that make our models more efficient.
01:44:08.660 | I believe that the first models that are, I guess, conscious
01:44:13.500 | or truly intelligent,
01:44:14.860 | They will have all sorts of tricks.
01:44:17.580 | - But then again, there's a certain argument
01:44:22.620 | that maybe the tricks are temporary things.
01:44:25.340 | - Yeah, they might be temporary things.
01:44:26.820 | And in some sense, it's also even important
01:44:29.700 | to know the cost of a trick.
01:44:34.220 | So sometimes people are eager to put the trick
01:44:37.620 | while forgetting that there is a cost of maintenance.
01:44:41.380 | - Or like a long-term cost.
01:44:43.060 | - Long-term cost or maintenance,
01:44:44.620 | or maybe even flexibility of code
01:44:48.220 | to actually implement new ideas.
01:44:49.620 | So even if you have something that gives you two X,
01:44:52.100 | but it requires, you know, 1000 lines of code,
01:44:55.140 | I'm not sure if it's actually worth it.
01:44:57.020 | So in some sense, you know,
01:44:58.380 | if it's five lines of code and two X, I would take it.
01:45:01.820 | And we see many of this, but also, you know,
01:45:06.380 | that requires some level of, I guess,
01:45:10.060 | lack of attachment to code that we are willing to remove it.
01:45:13.820 | - Yeah.
01:45:14.660 | So you led the OpenAI robotics team.
01:45:18.860 | Can you give an overview of the cool things
01:45:21.260 | you were able to accomplish?
01:45:22.380 | What are you most proud of?
01:45:24.100 | - So when we started robotics,
01:45:25.460 | we knew that actually reinforcement learning works
01:45:27.420 | and it is possible to solve fairly complicated problems.
01:45:31.540 | Like for instance, AlphaGo is an evidence
01:45:33.660 | that it is possible to build superhuman Go players.
01:45:38.660 | Dota2 is an evidence that it's possible
01:45:41.780 | to build superhuman agents playing Dota.
01:45:46.780 | So I asked myself a question, you know,
01:45:49.140 | what about robots out there?
01:45:50.660 | Could we train machines to solve arbitrary tasks
01:45:53.900 | in the physical world?
01:45:55.780 | Our approach was, I guess,
01:45:57.580 | let's pick a complicated problem that if we would solve it,
01:46:01.940 | that means that we made some significant progress,
01:46:05.420 | in the domain, and then we went after the problem.
01:46:08.300 | So we noticed that actually the robots out there,
01:46:12.620 | they are kind of at the moment, optimized per task.
01:46:15.060 | So you can have a robot that it's like,
01:46:17.340 | if you have a robot opening a bottle,
01:46:19.780 | it's very likely that the end effector is a bottle opener.
01:46:23.740 | And in some sense, that's a hack to be able to solve a task,
01:46:27.820 | which makes any task easier.
01:46:29.780 | And I asked myself, so what would be a robot
01:46:32.980 | that can actually solve many tasks?
01:46:35.300 | And we concluded that like human hands have such a quality
01:46:40.300 | that indeed they are, you know,
01:46:43.140 | you have five kind of tiny arms attached.
01:46:47.260 | Individually, they can manipulate
01:46:49.940 | pretty broad spectrum of objects.
01:46:51.900 | So we went after a single hand,
01:46:54.580 | like trying to solve Rubik's cube single-handed.
01:46:57.420 | We picked this task because we thought
01:46:59.380 | that there is no way to hard-code it.
01:47:02.100 | And it's also, we picked a robot
01:47:03.620 | on which it would be hard to hard-code it.
01:47:06.060 | And we went after the solution such that
01:47:09.580 | it could generalize to other problems.
01:47:11.540 | - And just to clarify,
01:47:12.700 | it's one robotic hand solving the Rubik's cube.
01:47:16.620 | The hard part is in the solution to the Rubik's cube
01:47:18.860 | is the manipulation of the, of like having it not fall
01:47:23.100 | out of the hand, having it use the five baby arms to,
01:47:28.100 | what is it, like rotate different parts of the Rubik's cube
01:47:31.980 | to achieve the solution.
01:47:33.500 | - Correct.
01:47:34.340 | - Yeah.
01:47:35.180 | So what was the hardest part about that?
01:47:38.740 | What was the approach taken there?
01:47:40.540 | What are you most proud of?
01:47:41.820 | - Obviously we have like a strong belief
01:47:43.620 | in reinforcement learning.
01:47:45.300 | And, you know, one path is to do reinforcement learning
01:47:49.620 | in the real world.
01:47:51.100 | The other path is through simulation.
01:47:54.140 | In some sense, the tricky part about the real world
01:47:57.340 | is at the moment, our models, they require a lot of data.
01:48:00.620 | There is essentially no data.
01:48:02.460 | And indeed, we decided to go down the path
01:48:06.340 | of simulation.
01:48:07.300 | And in simulation, you can have infinite amount of data.
01:48:10.060 | The tricky part is the fidelity of the simulation.
01:48:12.980 | And also can you in simulation represent everything
01:48:16.220 | that you represent otherwise in the real world.
01:48:19.180 | And, you know, it turned out that, you know,
01:48:22.500 | because there is lack of fidelity, it is possible to,
01:48:25.860 | what we arrived at is training a model
01:48:30.420 | that doesn't solve one simulation,
01:48:32.260 | but it actually solves the entire range of simulations,
01:48:35.900 | which vary in terms of like, what's the,
01:48:39.900 | exactly the friction of the cube or weight or so.
01:48:43.700 | And the single AI that can solve all of them
01:48:47.020 | ends up working well with the reality.
01:48:49.420 | - How do you generate the different simulations?
01:48:51.540 | - So, you know, there's plenty of parameters out there.
01:48:54.500 | We just pick them randomly.
01:48:56.020 | And in simulation, the model just goes for thousands of years
01:49:01.020 | and keeps on solving Rubik's cube in each of them.
01:49:03.980 | And the thing is that neural network that we used,
01:49:07.140 | it has a memory and as it presses, for instance,
01:49:11.900 | the side of the cube, it can sense,
01:49:15.820 | oh, that's actually this side was difficult to press.
01:49:20.140 | I should press it stronger.
01:49:21.540 | And throughout this process it kind of learns
01:49:24.740 | even how to solve this particular instance
01:49:28.100 | of the Rubik's cube with its given mass.
01:49:30.100 | It's kind of like, you know, sometimes when you go to a gym
01:49:34.260 | and after bench press, you try to lift the glass
01:49:39.260 | and you kind of forgot and your hand goes like,
01:49:46.420 | up right away because you kind of got used
01:49:48.980 | to maybe different weight and it takes a second to adjust.
01:49:52.100 | - Yeah.
01:49:53.100 | - And this kind of memory, the model gained
01:49:56.100 | through that process of interacting
01:49:58.020 | with the cube in the simulation.
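
A toy sketch of the domain randomization idea described above, under stated assumptions: the parameter ranges are made up, ToySimulator stands in for a real physics engine of the hand and cube, and a crude running estimate stands in for the memory-equipped (recurrent) policy and its reinforcement learning update.

```python
import random

def sample_physics_params():
    """Each call yields a new 'world'; the ranges are made up for illustration."""
    return {
        "friction": random.uniform(0.5, 1.5),
        "cube_mass": random.uniform(0.05, 0.15),
    }

class ToySimulator:
    """Stands in for a full physics simulation of the hand and cube."""
    def __init__(self, params):
        self.params = params
        self.steps_left = 10

    def reset(self):
        self.steps_left = 10
        return {"cube_pose": 0.0}

    def step(self, action):
        self.steps_left -= 1
        # Reward is higher when the applied force matches this world's friction.
        reward = -abs(action - self.params["friction"])
        done = self.steps_left == 0
        return {"cube_pose": action}, reward, done

def train(num_worlds=1000):
    """One 'policy' trained across many randomized worlds (here: a running estimate)."""
    estimate = 0.0   # stands in for whatever a real recurrent policy would learn
    for i in range(1, num_worlds + 1):
        sim = ToySimulator(sample_physics_params())   # a fresh randomized world
        obs, done = sim.reset(), False
        while not done:
            action = estimate                          # act using what was learned so far
            obs, reward, done = sim.step(action)
        # Crude running average in place of a real RL update such as PPO.
        estimate += (sim.params["friction"] - estimate) / i
    return estimate

print(round(train(), 2))   # converges near the mean friction of the distribution (~1.0)
```
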
01:50:00.020 | - I appreciate you speaking to the audience
01:50:02.100 | with the bench press, all the bros in the audience
01:50:05.020 | probably working out right now.
01:50:06.260 | There's probably somebody listening to this
01:50:07.820 | actually doing bench press.
01:50:09.220 | So maybe put the bar down and pick up the water bottle
01:50:14.020 | and you'll know exactly what Wojciech is talking about.
01:50:17.580 | Okay, so what was the hardest part
01:50:22.580 | of getting the whole thing to work?
01:50:24.860 | - So the hardest part is at the moment
01:50:28.100 | when it comes to physical work,
01:50:31.300 | when it comes to robots, they require maintenance.
01:50:35.020 | It's hard to replicate a million times.
01:50:37.940 | It's also, it's hard to replay things exactly.
01:50:42.140 | I remember this situation that one guy at our company,
01:50:47.140 | he had like a model that performs way better
01:50:50.100 | than other models in solving Rubik's Cube.
01:50:53.100 | And, you know, we kind of didn't know what's going on,
01:50:57.380 | why is that?
01:50:59.020 | And it turned out that, you know,
01:51:02.780 | he was running it from his laptop that had better CPU
01:51:06.620 | or better, maybe local GPU as well.
01:51:10.460 | And because of that, there was less of a latency
01:51:13.780 | and the model was the same.
01:51:15.700 | And that actually made solving Rubik's Cube more reliable.
01:51:19.660 | So in some sense, there might be some subtle bugs like that
01:51:22.500 | when it comes to running things in the real world.
01:51:25.420 | Even hinting on that, you could imagine
01:51:28.500 | that the initial models, you would like to have models
01:51:31.140 | which are insanely huge neural networks
01:51:33.620 | and you would like to give them even more time for thinking.
01:51:37.780 | And when you have these real time systems,
01:51:40.500 | then you might be constrained actually
01:51:43.700 | by the amount of latency.
01:51:46.020 | And ultimately I would like to build a system
01:51:49.100 | that it is worth for you to wait five minutes
01:51:52.980 | because it gives you the answer
01:51:55.060 | that you are willing to wait for five minutes.
01:51:57.700 | - So latency is a very unpleasant constraint
01:52:00.060 | under which to operate.
01:52:01.260 | - Correct.
01:52:02.100 | And also there is actually one more thing
01:52:04.100 | which is tricky about robots.
01:52:05.740 | There is actually not much data.
01:52:09.980 | So the data that I'm speaking about would be a data
01:52:12.900 | of first person experience from the robot
01:52:17.020 | and like gigabytes of data like that.
01:52:19.100 | If we would have gigabytes of data like that
01:52:21.340 | of robots solving various problems,
01:52:23.460 | it would be very easy to make a progress on robotics.
01:52:26.420 | And you can see that in case of text or code,
01:52:29.340 | there is a lot of data, like a first person perspective
01:52:32.220 | data on the writing code.
01:52:34.860 | - Yeah, so you had this,
01:52:36.220 | you mentioned this really interesting idea
01:52:38.620 | that if you were to build like a successful robotics company
01:52:43.260 | so OpenAI's mission is much bigger than robotics.
01:52:45.860 | This is one of the things you've worked on.
01:52:49.060 | But if it was a robotics company,
01:52:51.420 | that you wouldn't so quickly dismiss supervised learning.
01:52:55.140 | - Correct.
01:52:55.980 | - You would build a robot that was perhaps what,
01:53:00.980 | like an empty shell, like dumb,
01:53:04.700 | and it would operate under teleoperation.
01:53:07.780 | So you would invest, that's just one way to do it.
01:53:11.340 | Invest in human supervision,
01:53:12.780 | like direct human control of the robots as it's learning.
01:53:16.460 | And over time, add more and more automation.
01:53:19.420 | - That's correct.
01:53:20.260 | So let's say that's how I would build
01:53:21.980 | a robotics company today.
01:53:23.820 | If I would be building a robotics company,
01:53:25.540 | I would, you know, spend $10 million or so
01:53:28.780 | recording human trajectories, controlling a robot.
01:53:32.220 | - After you find a thing that the robot should be doing
01:53:36.780 | that there's a market fit for,
01:53:38.540 | like that you can make a lot of money with that product.
01:53:40.500 | - Correct, correct.
01:53:41.340 | - Yeah.
01:53:42.180 | - So I would record data
01:53:44.380 | and then I would essentially train supervised
01:53:46.820 | learning model on it.
01:53:48.260 | That might be the path today.
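
A minimal behavior-cloning sketch of that path, assuming synthetic data in place of recorded human teleoperation and a plain least-squares fit in place of a real supervised model; it only shows the shape of "record (observation, action) pairs, then imitate them."

```python
# Toy sketch: fit a policy to imitate recorded human teleoperation.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these came from hours of humans driving the robot by teleoperation:
# each row is an observation vector, and actions are what the operator commanded.
observations = rng.normal(size=(5000, 8))            # e.g. joint angles, gripper pose
true_mapping = rng.normal(size=(8, 3))               # hidden "skill" of the operator
actions = observations @ true_mapping + 0.01 * rng.normal(size=(5000, 3))

# Supervised learning: fit a policy that maps observations to actions.
policy_weights, *_ = np.linalg.lstsq(observations, actions, rcond=None)

def policy(obs: np.ndarray) -> np.ndarray:
    """Imitate the human operator on a new observation."""
    return obs @ policy_weights

new_obs = rng.normal(size=(1, 8))
print(policy(new_obs))   # the robot's imitated action for this observation
```
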
01:53:50.420 | Long term, I think that actually what is needed
01:53:53.260 | is to train powerful models over video.
01:53:57.660 | So you have maybe seen models
01:54:01.060 | that can generate images like DALL-E.
01:54:03.020 | And people are looking into models generating videos.
01:54:07.260 | There are these algorithmic questions
01:54:09.420 | of even how to do it.
01:54:10.660 | And it's unclear if there is enough compute
01:54:12.460 | for this purpose.
01:54:13.780 | But I suspect that the models that,
01:54:17.620 | which would have a level of understanding of video,
01:54:22.020 | same as GPT has a level of understanding of text,
01:54:25.180 | could be used to train robots to solve tasks.
01:54:29.420 | They would have a lot of common sense.
01:54:31.340 | - If one day, I'm pretty sure one day,
01:54:35.660 | there will be a robotics company,
01:54:37.540 | by robotics company I mean the primary source of income
01:54:42.460 | is from robots that is worth over $1 trillion.
01:54:47.260 | What do you think that company will do?
01:54:50.900 | I think self-driving cars now.
01:54:52.420 | - It's interesting 'cause my mind went to personal robotics,
01:54:56.300 | robots in the home.
01:54:57.860 | It seems like there's much more market opportunity there.
01:55:01.020 | I think it's very difficult to achieve.
01:55:04.220 | I mean, this might speak to something important,
01:55:09.180 | which is I understand self-driving much better
01:55:11.300 | than I understand robotics in the home.
01:55:13.300 | So I understand how difficult it is
01:55:14.940 | to actually solve self-driving.
01:55:17.900 | To a level, not just the actual computer vision
01:55:21.260 | and the control problem
01:55:22.420 | and just the basic problems of driving,
01:55:24.420 | but creating a product that would undeniably be,
01:55:29.420 | that will cost less money,
01:55:32.420 | like it will save you a lot of money,
01:55:33.580 | like orders of magnitude less money
01:55:35.340 | that could replace Uber drivers, for example.
01:55:37.860 | So car sharing that's autonomous,
01:55:39.740 | that creates a similar or better experience
01:55:43.340 | in terms of how quickly you get from A to B
01:55:46.020 | or just whatever.
01:55:47.260 | The pleasantness of the experience,
01:55:49.580 | the efficiency of the experience,
01:55:51.020 | the value of the experience,
01:55:52.500 | and at the same time, the car itself costs cheaper.
01:55:56.500 | I think that's very difficult to achieve.
01:55:58.540 | I think there's a lot more low-hanging fruit in the home.
01:56:03.540 | - That could be.
01:56:06.220 | I also want to give you a perspective
01:56:08.020 | on how challenging it would be at home
01:56:11.460 | or maybe kind of depends on the exact problem
01:56:14.540 | that you'd be solving.
01:56:16.100 | If we're speaking about these robotic arms
01:56:19.580 | and hands, these things,
01:56:21.460 | they cost tens of thousands of dollars or maybe 100K.
01:56:25.820 | And maybe obviously,
01:56:29.740 | maybe there would be economy of scale,
01:56:31.380 | these things would be cheaper,
01:56:33.180 | but actually for any household to buy it,
01:56:35.820 | the price would have to go down to maybe a thousand bucks.
01:56:39.180 | - Yeah.
01:56:40.020 | I personally think that,
01:56:42.100 | so self-driving car provides a clear service.
01:56:46.020 | I don't think robots in the home,
01:56:48.140 | there'll be a trillion dollar company
01:56:49.660 | will just be all about service.
01:56:52.140 | Meaning it will not necessarily be about like a robotic arm
01:56:55.940 | that helps you, I don't know, open a bottle
01:56:59.420 | or wash the dishes or any of that kind of stuff.
01:57:04.100 | It has to be able to take care of that whole,
01:57:06.260 | the therapist thing you mentioned.
01:57:08.540 | I think that's, of course,
01:57:11.020 | there's a line between what is a robot and what is not.
01:57:14.420 | Like, does it really need a body?
01:57:16.000 | But some AI system with some embodiment, I think.
01:57:21.000 | - So the tricky part,
01:57:22.980 | when you think actually what's the difficult part is
01:57:25.580 | when the robot has,
01:57:28.980 | like when there is a diversity of the environment
01:57:31.260 | with which the robot has to interact, that becomes hard.
01:57:33.400 | So on one spectrum, you have industrial robots,
01:57:37.820 | as they are doing over and over the same thing,
01:57:39.900 | it is possible to some extent to prescribe the movements
01:57:43.980 | and with very small amount of intelligence,
01:57:46.560 | the movement can be repeated millions of times.
01:57:49.840 | There are also various pieces of industrial robots
01:57:54.280 | where it becomes harder and harder.
01:57:56.080 | Like for instance, in case of Tesla,
01:57:58.520 | might be a matter of putting a rug inside of a car.
01:58:02.960 | And because the rug kind of moves around,
01:58:05.600 | it's not that easy, it's not exactly the same every time.
01:58:09.800 | There's a big chance
01:58:10.640 | that you actually need humans to do it.
01:58:13.560 | While welding cars together, it's a very repetitive process.
01:58:17.400 | Then in case of self-driving itself,
01:58:20.780 | the difficulty has to do with the diversity
01:58:25.700 | of the environment, but still the car itself,
01:58:29.040 | the problem that you are solving
01:58:30.620 | is you try to avoid even interacting with things.
01:58:34.500 | You are not touching anything around
01:58:36.500 | 'cause touching itself is hard.
01:58:38.060 | And then if you would have in the home robot
01:58:41.340 | that has to touch things,
01:58:43.020 | and if these things, they change the shape,
01:58:45.100 | if there is a huge variety of things to be touched,
01:58:47.700 | then that's difficult.
01:58:48.780 | If you are speaking about the robot,
01:58:50.340 | which has a head that is smiling in some way
01:58:53.380 | with cameras that doesn't touch things,
01:58:56.600 | that's relatively simple.
01:58:57.980 | - Okay, so to both agree and to push back.
01:59:02.900 | So you're referring to touch like soft robotics,
01:59:07.140 | like the actual touch.
01:59:09.500 | But I would argue that you could formulate
01:59:13.100 | just basic interaction between like non-contact interaction
01:59:18.100 | is also a kind of touch.
01:59:20.220 | And that might be very difficult to solve.
01:59:21.860 | That's the basic, not disagreement,
01:59:23.940 | but that's the basic open question to me
01:59:26.860 | with self-driving cars and disagreement with Elon,
01:59:30.300 | which is how much interaction is required
01:59:32.820 | to solve self-driving cars?
01:59:34.200 | How much touch is required?
01:59:36.060 | You said that in your intuition, touch is not required.
01:59:40.380 | In my intuition to create a product
01:59:42.780 | that's compelling to use,
01:59:44.340 | you're going to have to interact with pedestrians,
01:59:48.080 | not just avoid pedestrians, but interact with them.
01:59:51.580 | When we drive around in major cities,
01:59:54.420 | we're constantly threatening everybody's life
01:59:56.900 | with our movements.
01:59:58.100 | And that's how they respect us.
02:00:00.740 | There's a game-theoretic game going on with pedestrians.
02:00:03.660 | And I'm afraid you can't just formulate autonomous driving
02:00:08.660 | as a collision avoidance problem.
02:00:11.780 | - So I think it goes beyond,
02:00:13.740 | like a collision avoidance is the first order approximation,
02:00:17.540 | but then at least in case of Tesla,
02:00:20.220 | they are gathering data from people driving their cars.
02:00:23.540 | And I believe that's an example of supervised learning data
02:00:26.380 | that they can train their models on,
02:00:29.100 | and they are doing it,
02:00:31.380 | which can give a model this like another level of behavior
02:00:36.380 | that is needed to actually interact with the real world.
02:00:41.180 | - Yeah, it's interesting how much data
02:00:43.620 | is required to achieve that.
02:00:45.500 | What do you think of the whole Tesla autopilot approach,
02:00:49.900 | the computer vision based approach with multiple cameras,
02:00:53.280 | and as a data engine,
02:00:54.440 | it's a multitask, multi-headed neural network,
02:00:57.260 | and is this fascinating process of,
02:01:00.240 | similar to what you're talking about
02:01:02.140 | with the robotics approach,
02:01:04.980 | which is, you deploy neural network
02:01:07.460 | and then there's humans that use it,
02:01:09.900 | and then it runs into trouble in a bunch of places
02:01:12.460 | and that stuff is sent back.
02:01:13.760 | So like the deployment discovers a bunch of edge cases
02:01:18.020 | and those edge cases are sent back for supervised annotation
02:01:21.900 | thereby improving the neural network.
02:01:23.460 | And that's deployed again,
02:01:25.180 | it goes over and over until the network becomes really good
02:01:29.140 | at the task of driving, becomes safer and safer.
02:01:32.320 | What do you think of that kind of approach to robotics?
02:01:35.360 | - I believe that's the way to go.
02:01:36.820 | So in some sense, even when I was speaking about,
02:01:39.780 | collecting trajectories from humans,
02:01:41.720 | that's like a first step,
02:01:43.460 | and then you deploy the system
02:01:44.780 | and then you have humans revising all the issues.
02:01:47.740 | And in some sense,
02:01:48.760 | like this approach converges to a system
02:01:52.120 | that doesn't make mistakes
02:01:53.140 | because for the cases where there are mistakes,
02:01:55.320 | you get the data on how to fix them,
02:01:57.220 | and the system will keep on improving.
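
A toy skeleton of that deploy-collect-retrain loop; the "world", the dictionary "model", and the annotation step below are trivial stand-ins invented to show the shape of the data engine, not any real fleet, labeling pipeline, or training system.

```python
# Toy data engine: deploy, collect failures, have humans label them, retrain, repeat.
import random

random.seed(0)
WORLD = {x: x % 7 for x in range(1000)}             # ground truth we are trying to learn

def deploy_and_collect_failures(model, n=200):
    """Run the model 'in the field' and keep the inputs it got wrong."""
    batch = random.sample(list(WORLD), n)
    return [x for x in batch if model.get(x) != WORLD[x]]

def human_annotate(cases):
    """Humans supply the correct answer for each hard case."""
    return {x: WORLD[x] for x in cases}

def retrain(model, labeled):
    """'Training' here is just memorizing the labeled edge cases."""
    model.update(labeled)
    return model

model = {}                                           # starts out knowing nothing
for round_id in range(5):
    failures = deploy_and_collect_failures(model)
    model = retrain(model, human_annotate(failures))
    print(f"round {round_id}: fixed {len(failures)} edge cases, "
          f"model now covers {len(model)} inputs")
```
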
02:01:59.260 | - So there's a very, to me, difficult question
02:02:02.020 | of how hard that, how long that converging takes,
02:02:05.100 | how hard it is.
02:02:06.180 | The other aspect of autonomous vehicles,
02:02:09.860 | which probably applies to certain robotics applications,
02:02:13.060 | is society, right?
02:02:14.900 | So, as the quality of the system converges,
02:02:23.300 | one, there's a human factors perspective of psychology
02:02:23.300 | of humans being able to supervise those,
02:02:25.780 | even with teleoperation, those robots.
02:02:27.700 | And the other is society being willing to accept robots.
02:02:31.280 | Currently society is much harsher on self-driving cars
02:02:33.840 | than it is on human driven cars
02:02:35.780 | in terms of the expectation of safety.
02:02:37.820 | So the bar is set much higher than for humans.
02:02:41.220 | And so if there's a death in an autonomous vehicle,
02:02:44.460 | that's seen as a much more,
02:02:46.220 | much more dramatic than a death
02:02:51.220 | in a human driven vehicle.
02:02:53.100 | Part of the success of deployment of robots
02:02:55.260 | is figuring out how to make robots part of society,
02:02:58.780 | both on the, just the human side,
02:03:01.680 | on the media journalist side,
02:03:03.580 | and also on the policy government side.
02:03:05.780 | And that seems to be,
02:03:07.620 | maybe you can put that into the objective function
02:03:09.620 | to optimize, but that is definitely a tricky one.
02:03:14.300 | And I wonder if that is actually the trickiest part
02:03:18.440 | for self-driving cars or any system that's safety critical.
02:03:22.420 | It's not the algorithm,
02:03:23.860 | it's the society accepting it.
02:03:25.760 | - Yeah, I would say,
02:03:29.380 | I believe that the part of the process of deployment
02:03:33.060 | is actually showing people
02:03:34.960 | that they're given things can be trusted.
02:03:36.980 | - Yeah.
02:03:37.820 | - And, you know, trust is also like a glass
02:03:40.260 | that is actually really easy to crack.
02:03:43.460 | - Yeah.
02:03:44.300 | - And damage it.
02:03:45.940 | And I think that's actually very common
02:03:49.820 | with innovation,
02:03:54.100 | that there is some resistance toward it.
02:03:56.220 | - Yeah.
02:03:57.060 | - And it's just a natural progression.
02:03:59.140 | So in some sense, people will have to keep on proving
02:04:01.860 | that indeed the systems are worth being used.
02:04:05.260 | And I would say,
02:04:06.100 | I also found out that often the best way to convince people
02:04:11.840 | is by letting them experience it.
02:04:14.100 | - Yeah, absolutely.
02:04:14.980 | That's the case with Tesla Autopilot, for example.
02:04:17.740 | That's the case with,
02:04:19.660 | yeah, with basically robots in general.
02:04:21.620 | It's kind of funny to hear people talk about robots.
02:04:24.380 | Like there's a lot of fear,
02:04:27.340 | even with like legged robots.
02:04:29.540 | But when they actually interact with them,
02:04:32.380 | there's joy.
02:04:33.980 | I love interacting with them.
02:04:35.260 | And the same with the car.
02:04:36.700 | With the robot,
02:04:38.740 | if it starts being useful,
02:04:41.060 | I think people immediately understand.
02:04:42.980 | And if the product is designed well, they fall in love.
02:04:45.920 | You're right.
02:04:46.940 | - It's actually even similar
02:04:48.100 | when I'm thinking about Copilot,
02:04:50.140 | the GitHub Copilot.
02:04:51.300 | There was a spectrum of responses that people had.
02:04:54.460 | And ultimately,
02:04:56.340 | the important piece was to let people try it out.
02:05:00.140 | And then many people just loved it.
02:05:02.660 | - Especially like programmers.
02:05:05.060 | - Yeah, programmers.
02:05:05.900 | But like some of them, they came with a fear.
02:05:08.380 | - Yeah.
02:05:09.220 | - But then you try it out
02:05:10.040 | and you think, actually, that's cool.
02:05:11.900 | And you can try to resist the same way as,
02:05:14.940 | you could resist moving from punch cards
02:05:17.620 | to let's say, C++ or so.
02:05:20.900 | And it's a little bit futile.
02:05:23.060 | - So we talked about generation of program,
02:05:26.540 | generation of language,
02:05:28.140 | even self-supervised learning in the visual space
02:05:32.660 | for robotics and then reinforcement learning.
02:05:35.220 | What do you, in like this whole beautiful spectrum of AI,
02:05:40.100 | do you think is a good benchmark,
02:05:42.540 | a good test to strive for
02:05:46.100 | to achieve intelligence?
02:05:47.780 | That's a strong test of intelligence.
02:05:49.900 | You know, it started with Alan Turing and the Turing test.
02:05:53.300 | Maybe you think natural language conversation
02:05:56.060 | is a good test?
02:05:57.180 | - So, you know, it would be nice if, for instance,
02:06:00.020 | a machine would be able to solve the Riemann hypothesis in math.
02:06:03.400 | That would be, I think that would be very impressive.
02:06:07.700 | - So theorem proving, is that to you,
02:06:10.620 | proving theorems is a good,
02:06:12.540 | oh, like one thing that the machine did,
02:06:14.900 | you would say, damn.
02:06:16.660 | - Exactly.
02:06:17.500 | - Okay.
02:06:19.660 | - That would be quite impressive.
02:06:22.740 | I mean, the tricky part about the benchmarks is,
02:06:25.600 | you know, as we are getting closer with them,
02:06:28.220 | we have to invent new benchmarks.
02:06:29.540 | There is actually no ultimate benchmark out there.
02:06:31.860 | - Yeah, see, my thought with the Riemann hypothesis
02:06:34.500 | would be the moment the machine proves it,
02:06:37.500 | we would say, okay, well, then the problem was easy.
02:06:40.220 | - That's what happens.
02:06:42.260 | And I mean, in some sense,
02:06:43.820 | that's actually what happens over the years in AI,
02:06:46.740 | that like, we get used to things very quickly.
02:06:50.300 | - You know something, I talked to Rodney Brooks,
02:06:52.540 | I don't know if you know who that is.
02:06:54.620 | He called AlphaZero a homework problem.
02:06:57.260 | 'Cause he was saying like, there's nothing special about it.
02:06:59.900 | It's not a big leap.
02:07:00.980 | And I didn't, well, he's coming from one of the aspects
02:07:04.740 | that we referred to is,
02:07:06.060 | he was part of the founding of iRobot,
02:07:08.660 | which has now deployed tens of millions of robots in the home.
02:07:12.060 | So if you see robots that are actually in the homes of people
02:07:17.060 | as the legitimate instantiation of artificial intelligence,
02:07:21.900 | then yes, maybe an AI that plays a silly game
02:07:24.580 | like Go and chess is not a real accomplishment,
02:07:26.660 | but to me, it's a fundamental leap.
02:07:29.420 | But I think we as humans then say, okay,
02:07:31.620 | well, then that game of chess or Go wasn't that difficult
02:07:36.020 | compared to the thing that's currently unsolved.
02:07:38.380 | So my intuition is that from perspective of the evolution
02:07:43.180 | of these AI systems,
02:07:45.820 | we'll at first see the tremendous progress in digital space.
02:07:50.020 | And the main thing about digital space
02:07:52.180 | is also that you can, everything is,
02:07:54.180 | that there is a lot of recorded data,
02:07:56.500 | plus you can very rapidly deploy things
02:07:58.660 | to billions of people.
02:08:00.100 | While in case of physical space,
02:08:03.380 | the deployment part takes multiple years.
02:08:05.740 | You have to manufacture things
02:08:07.580 | and delivering it to actual people is very hard.
02:08:12.300 | So I'm expecting that the progress will come first in the digital space,
02:08:17.020 | and that prices of digital goods
02:08:20.500 | would go down to, let's say, marginal cost, or to zero.
02:08:25.260 | - And also the question is how much of our life
02:08:27.180 | will be in digital?
02:08:28.380 | Because it seems like we're heading towards
02:08:30.780 | more and more of our lives being in the digital space.
02:08:33.500 | So like innovation in the physical space
02:08:36.260 | might become less and less significant.
02:08:38.220 | Like, why do you need to drive anywhere
02:08:41.140 | if most of your life is spent in virtual reality?
02:08:44.260 | - I still would like to, at least at the moment,
02:08:47.500 | my impression is that I would like to have a physical contact
02:08:50.540 | with other people and that's very important to me.
02:08:53.100 | We don't have a way to replicate it in the computer.
02:08:55.340 | It might be the case that over the time it will change.
02:08:58.260 | - Like in 10 years from now,
02:09:00.060 | why not have like an arbitrary infinite number of people
02:09:02.940 | you can interact with?
02:09:04.180 | Some of them are real, some are not
02:09:06.820 | with arbitrary characteristics that you can define
02:09:11.180 | based on your own preferences.
02:09:12.700 | - I think that's maybe where we are heading
02:09:14.580 | and maybe I'm resisting the future.
02:09:16.620 | - Yeah.
02:09:17.460 | I'm telling you, if I got to choose,
02:09:22.660 | if I could live in Elder Scrolls Skyrim
02:09:26.900 | versus the real world,
02:09:28.620 | I'm not so sure I would stay with the real world.
02:09:31.620 | - Yeah, I mean, the question is,
02:09:33.300 | will VR be sufficient to get us there
02:09:36.060 | or do you need to plug electrodes in the brain?
02:09:39.260 | And it would be nice if these electrodes
02:09:42.660 | wouldn't be invasive.
02:09:44.140 | - Yeah.
02:09:45.180 | Or at least like provably non-destructive.
02:09:47.860 | But in the digital space,
02:09:51.140 | do you think we'll be able to solve the Turing test,
02:09:54.380 | the spirit of the Turing test,
02:09:55.740 | which is, do you think we'll be able to achieve
02:10:00.260 | compelling natural language conversation between people?
02:10:02.940 | Like have friends that are AI systems on the internet.
02:10:07.180 | - I totally think it's doable.
02:10:08.900 | - Do you think the current approach of GPT
02:10:11.860 | will take us there? - Yes.
02:10:12.700 | So there is the part of at first
02:10:15.780 | learning all the content out there.
02:10:17.380 | And I think that still system should keep on learning
02:10:19.940 | as it speaks with you.
02:10:21.300 | - Yeah.
02:10:22.420 | - And I think that should work.
02:10:24.100 | The question is how exactly to do it.
02:10:25.740 | And obviously we have people at OpenAI
02:10:29.260 | asking these questions and kind of at first pre-training
02:10:33.860 | on all existing content is like a backbone
02:10:36.620 | and is a decent backbone.
02:10:37.940 | - Do you think AI needs a body
02:10:42.980 | connecting to our robotics question
02:10:45.380 | to truly connect with humans?
02:10:47.340 | Or can most of the connection be in the digital space?
02:10:50.420 | - So let's see.
02:10:52.420 | We know that there are people who met each other online
02:10:55.500 | and they fell in love.
02:10:56.620 | - Yeah.
02:10:58.820 | - So it seems that it's conceivable to establish connection
02:11:03.140 | which is purely through internet.
02:11:06.340 | Of course, it might be more compelling
02:11:09.180 | the more modalities you add.
02:11:10.940 | - So it would be like you're proposing like a Tinder
02:11:14.700 | but for AI.
02:11:15.700 | You like swipe right and left
02:11:18.060 | and half the systems are AI and the other is humans
02:11:21.660 | and you don't know which is which.
02:11:23.380 | - That would be our formulation of Turing test.
02:11:27.780 | The moment AI is able to achieve more swipe right or left
02:11:32.260 | or whatever, the moment it's able to be more attractive
02:11:35.340 | than other humans, it passes the Turing test.
02:11:38.180 | - Then you would pass the Turing test in attractiveness.
02:11:40.780 | - That's right.
02:11:41.620 | Well, no, like attractiveness just to clarify.
02:11:43.300 | - That would be conversation.
02:11:44.140 | - Not just visual, right, right, right.
02:11:45.380 | It's also attractiveness with wit and humor
02:11:49.300 | and whatever makes conversations pleasant for humans.
02:11:53.740 | Okay, all right.
02:11:58.460 | So you're saying it's possible to achieve
02:12:01.300 | in a digital space.
02:12:02.260 | - In some sense, I would almost ask the question,
02:12:04.780 | why wouldn't that be possible?
02:12:06.900 | - Right.
02:12:07.740 | Well, I have this argument with my dad all the time.
02:12:10.860 | He thinks that touch and smell are really important.
02:12:14.100 | - So they can be very important
02:12:16.420 | and I'm saying the initial systems, they won't have it.
02:12:19.140 | Still, I wouldn't, like there are people being born
02:12:24.260 | without these senses and I believe that they can still
02:12:29.260 | fall in love and have meaningful life.
02:12:32.260 | - Yeah, I wonder if it's possible to go close
02:12:35.980 | to all the way by just training on transcripts
02:12:38.980 | of conversations.
02:12:40.580 | Like I wonder how far that takes us.
02:12:42.340 | - So I think that actually still you want images.
02:12:45.180 | Like I would like, so I don't have kids,
02:12:47.060 | but like I could imagine having AI tutor,
02:12:50.820 | it has to see kids drawing some pictures on the paper.
02:12:55.820 | - And also facial expressions, all that kind of stuff.
02:12:58.620 | We use, dogs and humans use their eyes
02:13:01.700 | to communicate with each other.
02:13:04.220 | I think that's a really powerful mechanism
02:13:06.980 | of communication, body language too,
02:13:09.180 | that words are much lower bandwidth.
02:13:12.740 | - And for body language, we still,
02:13:14.500 | we can have a system that displays an image
02:13:17.460 | of its face or facial expression on the computer.
02:13:20.140 | It doesn't have to move, you know, mechanical pieces or so.
02:13:23.540 | So I think that there is like kind of a progression.
02:13:27.660 | You can imagine that text might be the simplest to tackle,
02:13:31.820 | but this is not a complete human experience at all.
02:13:36.820 | You expand it to, let's say, images,
02:13:39.620 | both for input and output.
02:13:41.380 | And what you describe is actually the final,
02:13:44.700 | I guess, frontier, what makes us human,
02:13:47.300 | the fact that we can touch each other or smell or so.
02:13:50.180 | And it's the hardest from perspective
02:13:52.460 | of data and deployment.
02:13:54.260 | And I believe that these things might happen gradually.
02:13:58.620 | - Are you excited by that possibility,
02:14:01.420 | this particular application of human to AI
02:14:06.340 | friendship and interaction?
02:14:07.860 | - So let's see.
02:14:09.820 | - Like, would you, do you look forward to a world,
02:14:12.260 | you said you're living with a few folks
02:14:14.140 | and you're very close friends with them.
02:14:16.100 | Do you look forward to a day where one or two
02:14:18.060 | of those friends are AI systems?
02:14:19.660 | - So if the system would be truly wishing me well,
02:14:23.500 | rather than being in the situation that it optimizes
02:14:26.460 | for my time to interact with the system.
02:14:29.460 | - The line between those is, it's a gray area.
02:14:34.460 | - I think that's the distinction between love and possession.
02:14:39.180 | And these things, they might be often correlated
02:14:43.180 | for humans, but it's like, you might find that
02:14:47.180 | there are some friends with whom you haven't spoke
02:14:49.180 | for months.
02:14:50.020 | And then, you pick up the phone,
02:14:52.900 | it's as if the time hasn't passed.
02:14:55.300 | They are not holding onto you.
02:14:57.820 | And I wouldn't like to have AI system that,
02:15:00.860 | you know, it's trying to convince me to spend time with it.
02:15:05.380 | I would like the system to optimize for what I care about
02:15:09.980 | and help me in achieving my own goals.
02:15:12.980 | - But there's some, I mean, I don't know.
02:15:17.140 | There's some manipulation, there's some possessiveness,
02:15:19.780 | there's some insecurities, there's fragility.
02:15:22.220 | All those things are necessary to form a close friendship
02:15:25.980 | over time, to go through some dark shit together,
02:15:28.380 | some bliss and happiness together.
02:15:31.380 | I feel like there's a lot of greedy self-centered behavior
02:15:34.700 | within that process.
02:15:35.940 | - My intuition, but I might be wrong,
02:15:38.860 | is that human-computer interaction
02:15:42.380 | doesn't have to go through computer being greedy,
02:15:46.860 | possessive, and so on.
02:15:47.980 | It is possible to train systems,
02:15:50.260 | maybe that they actually, you know,
02:15:53.940 | they are, I guess, prompted or fine-tuned or so,
02:15:57.500 | to truly optimize for what you care about.
02:16:00.100 | And you could imagine that, you know,
02:16:02.540 | that the way how the process would look like
02:16:04.660 | is at some point, we as humans,
02:16:08.660 | we look at the transcript of the conversation
02:16:11.260 | or like an entire interaction, and we say,
02:16:13.940 | "Actually, here, there was more loving way to go about it."
02:16:17.740 | And we supervise system toward being more loving.
02:16:21.500 | Or maybe we train the system such that
02:16:23.820 | it has a reward function toward being more loving.
02:16:26.180 | - Yeah, or maybe the possibility of the system
02:16:29.100 | being an asshole and manipulative and possessive
02:16:32.980 | every once in a while is a feature, not a bug.
02:16:36.660 | Because some of the happiness that we experience
02:16:41.460 | when two souls meet each other,
02:16:43.140 | when two humans meet each other,
02:16:44.980 | is a kind of break from the assholes in the world.
02:16:48.460 | And so you need assholes in AI as well,
02:16:52.020 | because like, it'll be like a breath of fresh air
02:16:54.980 | to discover an AI that the three previous AIs you had-
02:16:59.780 | - Are too friendly?
02:17:01.260 | - Are, no, or cruel or whatever.
02:17:04.420 | It's like some kind of mix.
02:17:06.060 | And then this one is just right.
02:17:08.220 | But you need to experience the full spectrum.
02:17:10.460 | Like, I think you need to be able to engineer assholes.
02:17:13.980 | - So, let's see.
02:17:15.620 | - Because there's some level to us being appreciated,
02:17:21.220 | to appreciate the human experience,
02:17:24.020 | we need the dark and the light.
02:17:27.180 | - So that kind of reminds me.
02:17:29.980 | I met a while ago at the meditation retreat,
02:17:32.900 | one woman, beautiful, beautiful woman.
02:17:39.260 | And she had a crutch, okay?
02:17:42.220 | She had trouble walking on one leg.
02:17:45.060 | I asked her what has happened.
02:17:47.500 | And she said that five years ago,
02:17:50.780 | she was in Maui, Hawaii,
02:17:53.460 | and she was eating a salad
02:17:55.340 | and some snail fell into the salad.
02:17:58.140 | And apparently there are neurotoxic snails over there.
02:18:02.500 | And she got into coma for a year.
02:18:04.340 | And apparently there is a high chance of even just dying,
02:18:09.780 | but she was in the coma.
02:18:10.980 | At some point, she regained partially consciousness.
02:18:14.980 | She was able to hear people in the room.
02:18:17.100 | People behaved as if she wasn't there.
02:18:20.500 | At some point, she started being able to speak,
02:18:24.500 | but she was mumbling,
02:18:25.420 | like barely able to express herself.
02:18:28.700 | At some point, she got into wheelchair.
02:18:31.460 | Then at some point,
02:18:32.300 | she actually noticed that she can move her toe.
02:18:37.300 | And then she knew that she will be able to walk.
02:18:40.940 | And then, you know, that's where she was five years after.
02:18:43.340 | And she said that since then,
02:18:45.420 | she appreciates the fact that she can move her toe.
02:18:47.980 | And I was thinking,
02:18:50.780 | do I need to go through such experience
02:18:53.020 | to appreciate that I can move my toe?
02:18:55.740 | - Wow, that's really good story.
02:18:57.340 | A really deep example, yeah.
02:18:59.900 | - And in some sense,
02:19:00.780 | it might be the case that we don't see light
02:19:05.020 | if we haven't gone through the darkness.
02:19:07.500 | But I wouldn't say that we should.
02:19:09.900 | - We shouldn't assume that that's the case.
02:19:12.100 | We may be able to engineer shortcuts.
02:19:15.620 | - Yeah, Ilya had this, you know,
02:19:18.580 | belief that maybe one has to go for a week or six months
02:19:22.740 | to some challenging camp.
02:19:24.860 | - Yeah.
02:19:25.780 | - To just experience, you know, a lot of difficulties.
02:19:29.180 | And then comes back and actually everything is bright.
02:19:32.460 | Everything is beautiful.
02:19:33.660 | - I'm with Ilya on this.
02:19:34.620 | It must be a Russian thing.
02:19:35.620 | Where are you from originally?
02:19:37.060 | - I'm Polish.
02:19:38.020 | - Polish, okay.
02:19:40.340 | I'm tempted to say that explains a lot,
02:19:43.620 | but yeah, there's something about the Russian,
02:19:46.100 | the necessity of suffering.
02:19:47.860 | I believe suffering or rather struggle is necessary.
02:19:52.860 | - I believe that struggle is necessary.
02:19:54.420 | I mean, in some sense,
02:19:55.940 | you even look at the story of any superhero, the movie.
02:20:00.500 | It's not that it was like everything goes easy, easy, easy.
02:20:03.460 | - I like how that's your ground truth
02:20:05.260 | is the story of superheroes.
02:20:07.860 | Okay, you mentioned that you used to do research at night
02:20:11.860 | and go to bed at like 6 a.m. or 7 a.m.
02:20:15.220 | I still do that often.
02:20:17.620 | What sleep schedules have you tried
02:20:21.340 | to make for a productive and happy life?
02:20:23.260 | Like, is there some interesting wild sleeping patterns
02:20:28.100 | that you engaged that you found
02:20:29.900 | that works really well for you?
02:20:31.540 | - I tried at some point decreasing the number of hours of sleep,
02:20:34.700 | like gradually, like half an hour every few days or so.
02:20:39.220 | You know, I was hoping to just save time.
02:20:42.100 | That clearly didn't work for me.
02:20:43.580 | Like at some point there's like a phase shift
02:20:45.980 | and I felt tired all the time.
02:20:48.580 | You know, there was a time that I used to work
02:20:52.660 | during the nights.
02:20:54.060 | The nice thing about the nights is that no one disturbs you.
02:20:57.780 | And even I remember when I was meeting for the first time
02:21:02.780 | with Greg Brockman, he's CTO and chairman of OpenAI.
02:21:07.060 | Our meeting was scheduled to 5 p.m.
02:21:09.700 | and I overslept for the meeting.
02:21:12.620 | - Overslept for the meeting at 5 p.m., yeah.
02:21:15.380 | Now you sound like me, that's hilarious, okay, yeah.
02:21:18.420 | - And at that moment, in some sense,
02:21:20.620 | my sleeping schedule also has to do with the fact
02:21:24.420 | that I'm interacting with people.
02:21:27.500 | I sleep without an alarm.
02:21:29.340 | - So, yeah, the team thing you mentioned,
02:21:33.260 | the extrovert thing, because most humans operate
02:21:36.500 | during a certain set of hours,
02:21:39.300 | you're forced to then operate at the same set of hours.
02:21:42.900 | But I'm not quite there yet.
02:21:47.100 | I found a lot of joy, just like you said,
02:21:49.220 | working through the night, because it's quiet,
02:21:52.540 | because the world doesn't disturb you.
02:21:54.260 | And there's some aspect,
02:21:56.140 | counter to everything you're saying,
02:21:58.260 | there's some joyful aspect to sleeping
02:22:00.380 | through the mess of the day,
02:22:02.700 | because people are having meetings and sending emails
02:22:05.900 | and there's drama, meetings.
02:22:08.220 | I can sleep through all the meetings.
02:22:10.140 | - You know, I have meetings every day
02:22:11.540 | and they prevent me from having sufficient amount of time
02:22:14.460 | for focused work.
02:22:16.820 | And then I modified my calendar
02:22:21.580 | and I said that I'm out of office Wednesday, Thursday
02:21:24.060 | and Friday every week,
02:22:25.380 | and I'm having meetings only Monday and Tuesday.
02:22:28.260 | And that vastly, positively influenced my mood,
02:22:31.820 | that I have literally like three days for fully focused work.
02:22:35.260 | - Yeah, so there's better solutions to this problem
02:22:38.500 | than staying awake all night.
02:22:40.700 | Okay, you've been part of development
02:22:43.580 | of some of the greatest ideas in artificial intelligence.
02:22:46.220 | What would you say is your process
02:22:47.660 | for developing good novel ideas?
02:22:50.540 | - You have to be aware that clearly
02:22:52.660 | there are many other brilliant people around.
02:22:55.300 | So you have to ask yourself a question,
02:22:58.820 | why the given idea, let's say,
02:23:02.580 | wasn't tried by someone else.
02:23:06.260 | And in some sense, it has to do with,
02:23:10.060 | you know, kind of simple, it might sound simple,
02:23:12.500 | but like thinking outside of the box.
02:23:14.540 | And what do I mean here?
02:23:16.180 | So for instance, for a while, people in academia,
02:23:20.020 | they assumed that you have a fixed data set
02:23:25.020 | and then you optimize the algorithms
02:23:28.580 | in order to get the best performance.
02:23:31.540 | And that was so ingrained assumption
02:23:35.460 | that no one thought about training models on the entire internet.
02:23:40.460 | Or like that maybe some people thought about it,
02:23:44.500 | but it felt to many as unfair.
02:23:48.780 | And in some sense, that's almost like,
02:23:52.180 | it's not my idea or so,
02:23:53.540 | but that's an example of breaking a typical assumption.
02:23:56.900 | So you want to be in the paradigm
02:23:58.940 | that you are breaking a typical assumption.
02:24:01.820 | - In the context of the AI community,
02:24:04.500 | getting to pick your data set is cheating.
02:24:07.980 | - Correct, and in some sense,
02:24:09.620 | so that was assumption that many people had out there.
02:24:13.380 | And then if you free yourself from assumptions,
02:24:17.580 | then you are likely to achieve something
02:24:21.140 | that others cannot do.
02:24:22.420 | And in some sense, if you are trying to do
02:24:23.860 | exactly the same things as others,
02:24:26.300 | it's very likely that you're going to have the same results.
02:24:28.940 | - Yeah, but there's also that kind of tension,
02:24:32.020 | which is asking yourself the question,
02:24:35.100 | why haven't others done this?
02:24:38.500 | Because, I mean, I get a lot of good ideas,
02:24:43.500 | but I think probably most of them suck
02:24:46.220 | when they meet reality.
02:24:48.900 | - So actually, I think the other big piece
02:24:52.580 | is getting into habit of generating ideas,
02:24:56.260 | training your brain towards generating ideas,
02:24:58.660 | and even suspending judgment of the ideas.
02:25:03.660 | So in some sense, I noticed myself
02:25:06.900 | that even if I'm in the process of generating ideas,
02:25:09.780 | if I tell myself, oh, that was a bad idea,
02:25:13.180 | then that actually interrupts the process,
02:25:16.260 | and I cannot generate more ideas,
02:25:18.260 | because I'm actually focused on the negative part,
02:25:20.420 | why it won't work.
02:25:22.300 | But I also created an environment in a way
02:25:25.260 | that it's very easy for me to store new ideas.
02:25:28.380 | So for instance, next to my bed, I have a voice recorder,
02:25:33.380 | and it happens to me often,
02:25:35.420 | like I wake up during the night, and I have some idea.
02:25:38.540 | In the past, I was writing them down on my phone,
02:25:41.620 | but that means turning on the screen,
02:25:44.500 | and that wakes me up, or like pulling a paper,
02:25:47.260 | which requires turning on the light.
02:25:51.100 | These days, I just start recording it.
02:25:53.820 | - What do you think, I don't know if you know
02:25:55.340 | who Jim Keller is.
02:25:56.460 | - I know Jim Keller.
02:25:57.420 | - He's a big proponent of thinking harder
02:26:00.740 | on a problem right before sleep,
02:26:02.820 | so that he can sleep through it, and solve it in his sleep,
02:26:06.540 | or like come up with radical stuff in his sleep.
02:26:09.540 | He was trying to get me to do this.
02:26:11.180 | - So it happened from my experience perspective,
02:26:16.180 | it happened to me many times during the high school days
02:26:19.740 | when I was doing mathematics,
02:26:22.340 | that I had the solution to my problem as I woke up.
02:26:25.460 | At the moment, regarding thinking hard
02:26:30.100 | about the given problem is,
02:26:31.580 | I'm trying to actually devote substantial amount of time
02:26:35.420 | to think about important problems,
02:26:36.940 | not just before the sleep.
02:26:38.980 | I'm organizing huge chunks of time,
02:26:42.220 | such that I'm not constantly working on the urgent problems,
02:26:45.580 | but I actually have time to think about the important ones.
02:26:48.380 | - So you do it naturally.
02:26:49.940 | But his idea is that you kind of prime your brain
02:26:54.500 | to make sure that that's the focus.
02:26:56.260 | Oftentimes people have other worries in their life
02:26:58.500 | that's not fundamentally deep problems.
02:27:00.820 | They're like, I don't know,
02:27:02.660 | just stupid drama in your life,
02:27:04.740 | and even at work, all that kind of stuff.
02:27:06.980 | He wants to kind of pick the most important problem
02:27:10.940 | that you're thinking about, and go to bed on that.
02:27:14.060 | - I think that's why, I mean,
02:27:15.460 | the other thing that comes to my mind is also,
02:27:18.060 | I feel the most fresh in the morning.
02:27:20.580 | So during the morning,
02:27:21.620 | I try to work on the most important things,
02:27:24.300 | rather than just being pulled by urgent things
02:27:27.460 | or checking email or so.
02:27:28.900 | - What do you do with the,
02:27:30.860 | 'cause I've been doing the voice recorder thing too,
02:27:33.020 | but I end up recording so many messages,
02:27:35.740 | it's hard to organize.
02:27:37.460 | - I have the same problem.
02:27:38.620 | Now I have heard that Google Pixel
02:27:41.420 | is really good in transcribing text,
02:27:43.740 | and I might get a Google Pixel
02:27:45.380 | just for the sake of transcribing text.
02:27:47.220 | - Yeah, people listening to this,
02:27:48.300 | if you have a good voice recorder suggestion
02:27:50.300 | that transcribes, please let me know.
02:27:52.860 | Some of it is, this has to do with OpenAI Codex too.
02:27:57.860 | Some of it is simply the friction.
02:28:02.060 | I need apps that remove that friction
02:28:05.620 | between voice and the organization
02:28:08.420 | of the resulting transcripts and all that kind of stuff.
02:28:11.220 | But yes, you're right, absolutely.
02:28:13.700 | During, for me, it's walking, sleep too,
02:28:16.780 | but walking and running, especially running,
02:28:21.580 | get a lot of thoughts during running,
02:28:23.220 | and there's no good mechanism for recording thoughts.
02:28:26.140 | - So one more thing that I do,
02:28:27.740 | I have a separate phone, which has no apps.
02:28:32.740 | Maybe it has like Audible or let's say Kindle.
02:28:37.380 | No one has this phone number,
02:28:38.500 | this kind of my meditation phone.
02:28:40.740 | And I try to expand the amount of time
02:28:44.460 | that that's the phone that I'm having.
02:28:47.380 | It has also Google Maps if I need to go somewhere.
02:28:49.980 | And I also use this phone to write down ideas.
02:28:52.980 | - Ah, that's a really good idea.
02:28:55.900 | That's a really good idea.
02:28:57.140 | - Often actually what I end up doing
02:28:58.620 | is even sending a message from that phone to the other phone.
02:29:02.500 | So that's actually my way of recording messages
02:29:05.100 | or I just put them into notes.
02:29:06.980 | - I love it.
02:29:07.820 | What advice would you give to a young person,
02:29:11.540 | high school, college, about how to be successful?
02:29:16.540 | You've done a lot of incredible things in the past decade.
02:29:20.660 | So maybe you have some--
02:29:22.500 | - Something, there might be something.
02:29:23.980 | - There might be something.
02:29:25.340 | - I mean, it might sound simplistic or so,
02:29:30.260 | but I would say literally just follow your passion,
02:29:34.820 | double down on it.
02:29:35.660 | And if you don't know what's your passion,
02:29:37.220 | just figure out what could be a passion.
02:29:40.740 | So the step might be an exploration.
02:29:43.460 | When I was in elementary school, it was math and chemistry.
02:29:47.980 | And I remember for some time I gave up on math
02:29:51.140 | because my school teacher, she told me that I'm dumb.
02:29:54.940 | And I guess maybe an advice would be just ignore people
02:30:00.660 | if they tell you that you're dumb.
02:30:02.060 | - You're dumb.
02:30:03.180 | You mentioned something offline about chemistry
02:30:05.620 | and explosives.
02:30:06.860 | What was that about?
02:30:09.820 | - So let's see.
02:30:10.860 | (laughing)
02:30:12.020 | So a story goes like that.
02:30:13.700 | I got into chemistry maybe when I was in second grade
02:30:21.020 | of my elementary school, or third grade.
02:30:23.460 | I started going to chemistry classes.
02:30:26.380 | I really love building stuff.
02:30:31.100 | And I did all the experiments that they describe
02:30:34.140 | in the book, like how to create oxygen with vinegar
02:30:38.900 | and baking soda or so.
02:30:40.860 | So I did all the experiments.
02:30:44.100 | And at some point I was, so what's next?
02:30:46.780 | What can I do?
02:30:48.380 | And explosives, they also give you a clear
02:30:52.820 | reward signal, whether the thing worked or not.
02:30:55.380 | So I remember at first I got interested
02:31:01.500 | in producing hydrogen.
02:31:03.140 | That was kind of funny experiment from school.
02:31:05.540 | You can just burn it.
02:31:06.820 | And then I moved to nitroglycerin.
02:31:09.820 | So that's also relatively easy to synthesize.
02:31:13.540 | I started producing essentially dynamite
02:31:16.540 | and detonating it with a friend.
02:31:18.900 | I remember there was a, there was at first like maybe
02:31:21.980 | two attempts that I went with a friend to detonate
02:31:24.700 | what we built and it didn't work out.
02:31:27.100 | And the third time he was like, ah, it won't work,
02:31:29.860 | let's not waste time.
02:31:32.460 | And now I was carrying this,
02:31:36.860 | you know, that tube with dynamite, I don't know,
02:31:41.380 | a pound or so of dynamite, in my backpack.
02:31:44.620 | We're like riding on the bike to the edges of the city.
02:31:48.220 | (laughing)
02:31:50.620 | - Yeah.
02:31:51.460 | - And--
02:31:52.380 | - Attempt number three.
02:31:54.220 | This would be attempt number three.
02:31:56.180 | - Attempt number three.
02:31:57.620 | And now we dug a hole to put it inside.
02:32:02.220 | It actually had, you know, an electrical detonator.
02:32:07.020 | We drew a cable behind the tree.
02:32:10.620 | I hadn't ever seen an explosion before.
02:32:14.980 | So I thought that there will be a lot of sound.
02:32:18.180 | But you know, we were like laying down
02:32:19.740 | and I'm holding the cable and the battery.
02:32:22.540 | At some point, you know, we kind of counted three, two, one,
02:32:25.300 | and I just connected it, and I felt
02:32:29.020 | the ground shake.
02:32:30.580 | It was more like a shake than a sound.
02:32:33.020 | And then the soil started kind of lifting up
02:32:36.500 | and started falling on us.
02:32:37.820 | - Yeah, wow.
02:32:39.380 | - And then the other friend said,
02:32:42.140 | let's make sure the next time we have helmets.
02:32:44.140 | (laughing)
02:32:45.820 | But also, you know, I'm happy that nothing happened to me.
02:32:52.420 | It could have been the case that I lost a limb or so.
02:32:52.420 | - Yeah, but that's childhood of an engineering mind
02:32:59.140 | with a strong reward signal of an explosion.
02:33:02.860 | I love it.
02:33:04.620 | And there's some aspect of chemists,
02:33:08.020 | the chemists I know, like my dad,
02:33:10.700 | with plasma chemistry, plasma physics,
02:33:12.500 | he was very much into explosives too.
02:33:14.860 | It's a worrying quality of people that work in chemistry
02:33:18.260 | that they love.
02:33:19.380 | I think it is exactly that,
02:33:21.340 | the strong signal that the thing worked.
02:33:24.860 | - There is no doubt.
02:33:25.780 | - There's no doubt.
02:33:26.740 | There's some magic.
02:33:28.020 | It's almost like a reminder that physics works,
02:33:31.100 | that chemistry works.
02:33:32.660 | It's cool.
02:33:33.500 | It's almost like a little glimpse at nature
02:33:36.300 | that you yourself engineer.
02:33:38.140 | That's why I really like artificial intelligence,
02:33:40.540 | especially robotics, is you create a little piece of nature.
02:33:45.540 | - And in some sense, even for me with explosives,
02:33:48.900 | the motivation was creation rather than destruction.
02:33:51.300 | - Yes, exactly.
02:33:52.940 | In terms of advice, I forgot to ask
02:33:55.940 | about just machine learning and deep learning.
02:33:58.220 | For people who are specifically interested
02:34:00.500 | in machine learning,
02:34:02.300 | how would you recommend they get into the field?
02:34:04.540 | - So I would say re-implement everything.
02:34:07.580 | And also there is plenty of courses.
02:34:09.820 | - So like from scratch?
02:34:11.460 | - So on different levels of abstraction in some sense,
02:34:14.020 | but I would say re-implement something from scratch,
02:34:16.780 | re-implement something from a paper,
02:34:18.660 | re-implement something from podcasts
02:34:20.860 | that you have heard about.
02:34:22.540 | I would say that's a powerful way to understand things.
02:34:24.940 | So it's often the case that you read the description
02:34:28.260 | and you think you understand,
02:34:30.100 | but you truly understand once you build it,
02:34:33.540 | then you actually know what really mattered
02:34:35.940 | in the description.
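As a minimal sketch of the "re-implement it from scratch" advice, here is what that exercise can look like in plain Python with NumPy: a tiny two-layer network trained on XOR with hand-derived backpropagation. The architecture, learning rate, and step count are illustrative choices, not anything prescribed in the conversation.

```python
# A minimal "re-implement it from scratch" exercise: a 2-8-1 sigmoid network
# trained on XOR with hand-derived backpropagation. All hyperparameters here
# are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

# XOR dataset: inputs and targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialized weights for a 2-8-1 network.
W1 = rng.normal(0.0, 1.0, (2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(0.0, 1.0, (8, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass for squared error, derived by hand.
    grad_out = (out - y) * out * (1 - out)
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0, keepdims=True)
    grad_h = (grad_out @ W2.T) * h * (1 - h)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0, keepdims=True)

    # Plain gradient descent update.
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1

# Predictions should approach [[0], [1], [1], [0]].
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
```

Re-implementing even something this small forces you to get the shapes and gradients right, which is exactly the point being made above about only truly understanding once you build it.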
02:34:37.420 | - Is there a particular topic that you find people
02:34:39.940 | just fall in love with?
02:34:42.060 | So I've seen,
02:34:43.420 | I tend to really enjoy reinforcement learning
02:34:48.940 | because it's much easier to get to a point
02:34:54.060 | where you feel like you created something special,
02:34:56.940 | like fun games kind of things.
02:34:58.860 | - It's rewarding.
02:34:59.700 | - It's rewarding, yeah.
02:35:00.820 | As opposed to like re-implementing from scratch,
02:35:06.340 | more like supervised learning kind of things.
02:35:08.700 | - So if someone were to optimize for things being rewarding,
02:35:15.020 | then it feels that the things that are somewhat generative
02:35:18.260 | have such a property.
02:35:19.500 | So you have, for instance, adversarial networks,
02:35:22.380 | or you have just even generative language models.
02:35:25.740 | And you can even see,
02:35:28.060 | internally we have seen this thing with our releases.
02:35:31.940 | So we released recently two models.
02:35:35.100 | There is one model called DALL-E that generates images,
02:35:38.140 | and there is another model called CLIP
02:35:40.460 | that actually you provide various possibilities,
02:35:44.980 | what could be the answer to what is on the picture,
02:35:47.500 | and it can tell you which one is the most likely.
02:35:50.740 | And in some sense, in the case of the first one, DALL-E,
02:35:54.860 | it is very easy for you to understand
02:35:57.500 | that actually there is magic going on.
02:35:59.540 | And in the case of the second one,
02:36:02.700 | even though it is insanely powerful,
02:36:04.940 | and people from the vision community,
02:36:08.420 | as they started probing it inside,
02:36:10.100 | actually understood how far it goes,
02:36:14.860 | it's difficult for a person at first to see
02:36:19.180 | how well it works.
02:36:20.380 | And that's the same, as you said,
02:36:22.940 | that in the case of supervised learning models,
02:36:24.940 | you might not kind of see,
02:36:27.100 | or it's not that easy for you to understand the strength.
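For reference, a minimal sketch of the CLIP usage described here, assuming the open-source openai/CLIP package and PyTorch: you supply an image plus several candidate descriptions, and the model scores which description is most likely. The image path and candidate captions below are placeholders.

```python
# A sketch of zero-shot scoring with CLIP (github.com/openai/CLIP), assuming
# the `clip` and `torch` packages are installed. "photo.jpg" and the candidate
# captions are placeholders.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# The image whose content we want to identify.
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)

# You provide the possibilities; the model tells you which is most likely.
text = clip.tokenize(["a diagram", "a dog", "a cup of water"]).to(device)

with torch.no_grad():
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print(probs)  # Highest probability goes to the caption that best matches the image.
```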
02:36:31.340 | - Even though you don't believe in magic, to see the magic.
02:36:34.020 | - To see the magic, yeah.
02:36:35.180 | - The generative, that's really brilliant.
02:36:37.500 | So anything that's generative,
02:36:39.780 | 'cause then you are at the core of the creation.
02:36:43.060 | You get to experience creation without much effort,
02:36:46.620 | unless you have to do it from scratch.
02:36:48.660 | - And it feels that humans are wired.
02:36:51.980 | There is some level of reward for creating stuff.
02:36:54.860 | Of course, different people have a different weight
02:36:58.380 | on this reward.
02:36:59.220 | - Yeah, in the big objective function.
02:37:01.860 | - In the big objective function, of a person.
02:37:04.100 | - Of a person.
02:37:05.460 | You wrote that beautiful
02:37:08.700 | is what you intensely pay attention to.
02:37:11.940 | Even a cockroach is beautiful if you look very closely.
02:37:15.160 | Can you expand on this?
02:37:17.340 | What is beauty?
02:37:19.700 | - So what I wrote here actually corresponds
02:37:23.900 | to my subjective experience that I had
02:37:26.420 | through extended periods of meditation.
02:37:29.460 | It's pretty crazy that at some point,
02:37:33.180 | the meditation gets you to the place
02:37:35.100 | that you have really increased focus, increased attention.
02:37:40.100 | And then you look at the very simple objects
02:37:43.580 | that were all the time around you.
02:37:45.300 | You can look at a table or a pen or at nature.
02:37:49.420 | And you notice more and more details
02:37:53.620 | and it becomes very pleasant to look at it.
02:37:55.860 | And once again, it kind of reminds me of my childhood.
02:37:59.940 | Like just pure joy of being.
02:38:04.900 | It's also, I have seen even the reverse effect
02:38:08.380 | that by default, regardless of what we possess,
02:38:12.700 | we very quickly get used to it.
02:38:15.300 | And you can have a very beautiful house
02:38:18.780 | and if you don't put sufficient effort,
02:38:22.380 | you're just gonna get used to it
02:38:24.580 | and it doesn't bring any more joy
02:38:26.340 | regardless of what you have.
02:38:28.060 | - Yeah.
02:38:29.340 | Well, I actually,
02:38:30.340 | I find that material possessions get in the way
02:38:36.780 | of that experience of pure joy.
02:38:38.660 | So I've always, I've been very fortunate
02:38:43.460 | to just find joy in simple things.
02:38:46.460 | Just like you're saying, just like, I don't know,
02:38:49.900 | objects in my life, just stupid objects,
02:38:52.300 | like this cup, like thing, you know, just objects.
02:38:55.660 | Sounds, okay, I'm not being eloquent,
02:38:57.660 | but literally objects in the world.
02:38:59.740 | They're just full of joy 'cause it's like,
02:39:03.140 | I can't believe, one, I can't believe
02:39:05.940 | that I'm fortunate enough to be alive
02:39:08.380 | to experience these objects.
02:39:10.460 | And then two, I can't believe humans are clever enough
02:39:13.860 | to build these objects.
02:39:15.980 | The hierarchy of pleasure that that provides is infinite.
02:39:20.580 | - I mean, even if you look at the cup of water,
02:39:22.620 | so you see first like a level of like a reflection of light,
02:39:26.460 | but then you think, no man, there's like trillions
02:39:29.220 | upon trillions of particles bouncing against each other.
02:39:33.140 | There is also the tension on the surface,
02:39:37.020 | such that a bug could, like, stand on it
02:39:39.980 | and move around. And you think it also has this
02:39:42.700 | magical property that as you decrease temperature,
02:39:46.380 | it actually expands in volume, which allows for,
02:39:50.580 | you know, lakes to freeze on the surface
02:39:53.460 | and at the bottom to actually not freeze,
02:39:56.260 | which allows for, like, crazy...
02:39:59.020 | - Yeah.
02:39:59.860 | - You look in detail at some objects
02:40:01.900 | and you think actually, you know, this table,
02:40:04.420 | that was just the figment of someone's imagination
02:40:06.580 | at some point.
02:40:07.540 | And then there were, like, thousands of people involved
02:40:09.500 | to actually manufacture it and put it here.
02:40:12.300 | And by default, no one cares.
02:40:14.700 | - And then you can start thinking about evolution,
02:40:18.340 | how it all started from single cell organisms
02:40:21.140 | that led to this table.
02:40:22.540 | - And these thoughts, they give me life appreciation.
02:40:26.020 | - Yeah, exactly.
02:40:26.860 | - And even lack of thoughts, just the pure raw signal
02:40:29.420 | also gives the life appreciation.
02:40:31.740 | - See, the thing is, and then that's coupled for me
02:40:37.060 | with the sadness that the whole ride ends
02:40:40.180 | and perhaps is deeply coupled in that,
02:40:42.860 | the fact that this experience, this moment ends,
02:40:46.020 | gives it an intensity
02:40:49.100 | that I'm not sure I would otherwise have.
02:40:51.460 | So in that same way, I try to meditate on my own death often.
02:40:56.220 | Do you think about your mortality?
02:40:58.100 | Are you afraid of death?
02:41:03.220 | - So fear of death is like one of the most fundamental fears
02:41:07.500 | that each of us has.
02:41:09.220 | We might be not even aware of it.
02:41:11.100 | It requires to look inside to even recognize
02:41:13.700 | that it's out there.
02:41:16.100 | There is still, let's say, this property of nature
02:41:20.380 | that if things lasted forever,
02:41:23.260 | then they would also be boring to us.
02:41:26.180 | The fact that things change in some way
02:41:28.380 | gives meaning to them.
02:41:31.780 | I also found out that it seems to be very healing
02:41:36.780 | to people to have these short experiences,
02:41:43.060 | like I guess, psychedelic experiences
02:41:47.980 | in which they experience death of self,
02:41:52.620 | in which they let go of this fear
02:41:56.420 | and then maybe can even increase
02:41:59.220 | the appreciation of the moment.
02:42:01.660 | It seems that many people,
02:42:03.500 | they can easily comprehend the fact
02:42:08.500 | that money is finite
02:42:12.260 | while they don't see that time is finite.
02:42:15.020 | I have this discussion with Ilya from time to time.
02:42:18.820 | He's saying, "Man, life will pass very fast.
02:42:23.620 | "At some point, I will be 40, 50, 60, 70, and then it's over."
02:42:27.420 | This is true, which also makes me believe
02:42:30.940 | that every single moment, it is so unique
02:42:34.820 | that should be appreciated.
02:42:37.780 | And this also makes me think
02:42:39.740 | that I should be acting on my life
02:42:43.660 | because otherwise it will pass.
02:42:46.380 | I also like this framework of thinking from Jeff Bezos
02:42:51.020 | on regret minimization,
02:42:52.980 | that I would like, when I am on that deathbed,
02:42:56.380 | to look back on my life
02:43:00.620 | and not regret that I haven't done something.
02:43:03.420 | It's usually, you might regret that you haven't tried.
02:43:07.860 | I'm fine with failing.
02:43:09.140 | - But I haven't tried.
02:43:11.940 | What's the Nietzsche eternal recurrence?
02:43:15.500 | Try to live a life that,
02:43:17.100 | if you had to live it infinitely many times,
02:43:19.820 | you'd be okay with that kind of life.
02:43:24.820 | So try to live it optimally.
02:43:27.100 | - I can say that it's almost like
02:43:30.780 | unbelievable to me where I am in my life.
02:43:36.780 | I'm extremely grateful for actually people whom I met.
02:43:40.740 | I would say, I think that I'm decently smart and so on,
02:43:44.700 | but I think that actually to a great extent
02:43:49.620 | where I am has to do with the people who I met.
02:43:54.500 | - Would you be okay if after this conversation you died?
02:43:58.300 | - So if I'm dead, then it kind of,
02:44:01.500 | I don't have a choice anymore.
02:44:03.740 | So in some sense, there are, like, plenty of things
02:44:05.380 | that I would like to try out in my life.
02:44:07.700 | I feel that I'm gradually going one by one
02:44:11.700 | and I'm just doing them.
02:44:13.660 | I think that the list will be always infinite.
02:44:16.140 | - Yeah.
02:44:17.380 | So might as well go today.
02:44:19.020 | - Yeah, I mean, to be clear, I'm not looking forward to die.
02:44:23.860 | I would say if there is no choice, I would accept it.
02:44:27.500 | But like in some sense, if there would be a choice,
02:44:32.380 | if there would be possibility to live,
02:44:34.180 | I would fight for living.
02:44:35.500 | - I find it's more honest and real to think about
02:44:41.940 | dying today at the end of the day.
02:44:45.100 | That seems to me, at least to my brain,
02:44:49.100 | a more honest slap in the face,
02:44:51.780 | as opposed to I still have 10 years today.
02:44:56.100 | Then I'm much more about appreciating the cup and the table
02:44:59.380 | and so on, and less about silly worldly accomplishments
02:45:03.780 | and all those kinds of things.
02:45:05.260 | - We have in the company a person who, say, at some point
02:45:10.260 | found out that they have cancer.
02:45:11.980 | And that also gives huge perspective with respect
02:45:14.980 | to what matters now.
02:45:16.740 | And often people in situations like that,
02:45:19.340 | they conclude that actually what matters
02:45:20.860 | is human connection.
02:45:21.940 | - And love, and people also conclude, if you have kids,
02:45:27.820 | kids and family.
02:45:29.620 | You, I think tweeted,
02:45:31.900 | "We don't assign the minus infinity reward to our death.
02:45:36.220 | Such a reward would prevent us from taking any risk.
02:45:39.540 | We wouldn't be able to cross a road
02:45:41.380 | in fear of being hit by a car."
02:45:43.340 | So in the objective function, you mentioned fear of death
02:45:46.020 | might be fundamental to the human condition.
02:45:49.260 | - So as I said, let's assume that there are
02:45:51.700 | like reward functions in our brain.
02:45:53.500 | And the interesting thing is even the realization of
02:46:00.260 | how different reward functions can play with your behavior.
02:46:04.980 | As a matter of fact, I wouldn't say that you should assign
02:46:08.100 | infinite negative reward to anything
02:46:11.340 | because that messes up the math.
02:46:13.020 | - The math doesn't work out.
02:46:15.260 | - It doesn't work out.
02:46:16.100 | And as you said, even, you know,
02:46:18.580 | government or some insurance companies,
02:46:20.620 | you said they assign $9 million to human life.
02:46:24.380 | And I'm just saying it with respect to ourselves,
02:46:27.140 | that it might be a hard statement,
02:46:30.860 | but in some sense there is a finite value
02:46:33.220 | of our own life.
02:46:34.180 | I'm trying to put it from the perspective
02:46:40.820 | of being more egoless and realizing the fragility
02:46:46.380 | of my own life.
02:46:46.380 | And in some sense, the fear of death
02:46:52.380 | might prevent you from acting
02:46:54.900 | because anything can cause death.
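To make the point about infinite negative rewards concrete, here is a tiny expected-value sketch: with any finite penalty, a small enough risk of death can be outweighed by an ordinary upside, but an infinite penalty makes every action with nonzero risk infinitely bad. The probabilities are made up for illustration, and the -9,000,000 simply echoes the dollar figure mentioned above.

```python
# Why a minus-infinity reward on death "messes up the math": any nonzero risk
# makes the expected reward minus infinity, so the agent can never act.
def expected_reward(p_death, death_reward, upside):
    return p_death * death_reward + (1 - p_death) * upside

# Crossing the road with an (illustrative) one-in-a-hundred-million death risk:
print(expected_reward(1e-8, -9_000_000, 1.0))     # ~0.91: still worth doing
print(expected_reward(1e-8, float("-inf"), 1.0))  # -inf: never cross, never act
```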
02:46:57.180 | - Yeah, and I'm sure actually,
02:47:00.020 | if you were to put death in the objective function,
02:47:02.580 | there's probably so many aspects to death and fear of death
02:47:06.300 | and realization of death and mortality.
02:47:10.500 | There's just whole components of finiteness
02:47:14.300 | of not just your life, but every experience and so on
02:47:18.020 | that you're gonna have to formalize mathematically.
02:47:21.900 | - And also, you know, that might lead to
02:47:24.140 | you spending a lot of compute cycles
02:47:29.340 | on, like,
02:47:30.900 | deliberating this terrible future
02:47:34.740 | instead of experiencing the now.
02:47:36.380 | And then in some sense,
02:47:39.100 | it's also kind of unpleasant simulation
02:47:41.300 | to run in your head.
02:47:42.220 | - Yeah.
02:47:43.100 | (laughing)
02:47:44.580 | - Do you think there's an objective function
02:47:46.940 | that describes the entirety of human life?
02:47:51.620 | So, you know, usually the way you ask that
02:47:54.060 | is what is the meaning of life?
02:47:56.260 | Is there a universal objective function
02:47:59.620 | that captures the why of life?
02:48:02.220 | - So, yeah, I mean,
02:48:03.860 | I suspected that you would ask this question,
02:48:05.580 | but it's also a question that I ask myself many, many times.
02:48:09.380 | See, I can tell you a framework that I have these days
02:48:11.780 | to think about this question.
02:48:13.340 | So I think that fundamentally meaning of life
02:48:16.380 | has to do with some of our reward functions
02:48:19.540 | that we have in brain,
02:48:20.700 | and they might have to do with,
02:48:22.700 | let's say, for instance, curiosity or human connection,
02:48:27.060 | which might mean understanding others.
02:48:29.100 | It's also possible for a person
02:48:33.220 | to slightly modify their reward function.
02:48:35.740 | Usually they mostly stay fixed,
02:48:37.900 | but it's possible to modify reward function.
02:48:40.380 | And you can pretty much choose.
02:48:41.860 | So in some sense, reward functions,
02:48:43.420 | optimizing reward functions,
02:48:45.060 | they will give you a life satisfaction.
02:48:47.740 | - Is there some randomness in the function?
02:48:49.700 | - I think when you are born, there is some randomness.
02:48:51.660 | Like you can see that some people, for instance,
02:48:54.420 | they care more about building stuff.
02:48:57.900 | Some people care more about caring for others.
02:49:00.940 | Some people, there are all sorts of default reward functions.
02:49:05.020 | And then in some sense, you can ask yourself,
02:49:07.060 | what is the satisfying way for you
02:49:11.900 | to go after this reward function?
02:49:13.900 | And you just go after this reward function.
02:49:15.500 | And some people also ask,
02:49:17.820 | are these reward functions real?
02:49:20.060 | I almost think about it as, let's say,
02:49:24.860 | if you would have to discover mathematics,
02:49:27.700 | in mathematics, you are likely to run into various objects,
02:49:31.540 | like complex numbers or differentiation,
02:49:34.740 | some other objects.
02:49:35.820 | And these are very natural objects that arise.
02:49:38.460 | And similarly, the reward functions
02:49:40.220 | that we are having in our brain,
02:49:41.980 | they are somewhat very natural,
02:49:43.780 | that there is a reward function for understanding,
02:49:48.780 | like a comprehension, curiosity, and so on.
02:49:53.500 | So in some sense, they are in the same way natural
02:49:56.860 | as the natural objects in mathematics.
02:49:59.140 | - Interesting.
02:49:59.980 | So you know, there's the old sort of debate,
02:50:02.900 | is mathematics invented or discovered?
02:50:05.780 | You're saying reward functions are discovered.
02:50:07.980 | So nature provides-
02:50:08.820 | - So nature provided some,
02:50:10.700 | you can still, let's say, expand it throughout the life.
02:50:13.020 | Some of the reward functions, they might be futile.
02:50:15.540 | Like for instance, there might be a reward function,
02:50:18.380 | maximize amount of wealth.
02:50:20.420 | - Yeah.
02:50:21.260 | - And this is more like a learned reward function.
02:50:24.540 | But we know also that some reward functions,
02:50:27.620 | if you optimize them, you won't be quite satisfied.
02:50:30.860 | - Well, I don't know which part of your reward function
02:50:35.180 | resulted in you coming today,
02:50:36.980 | but I am deeply appreciative that you did
02:50:39.380 | spend your valuable time with me.
02:50:41.180 | Wojciech, it was really fun talking to you.
02:50:44.020 | You're brilliant, you're a good human being,
02:50:46.460 | and it's an honor to meet you and an honor to talk to you.
02:50:49.060 | Thanks for talking today, brother.
02:50:50.940 | - Thank you, Lex, a lot.
02:50:51.780 | I appreciated your questions, curiosity.
02:50:54.380 | I had a lot of fun being here.
02:50:55.980 | - Thanks for listening to this conversation
02:51:00.740 | with Wojciech Zaremba.
02:51:00.740 | To support this podcast,
02:51:02.220 | please check out our sponsors in the description.
02:51:05.540 | And now, let me leave you with some words
02:51:07.980 | from Arthur C. Clarke,
02:51:09.700 | who is the author of "2001: A Space Odyssey."
02:51:13.900 | It may be that our role on this planet
02:51:17.100 | is not to worship God, but to create him.
02:51:20.140 | Thank you for listening, and I hope to see you next time.
02:51:24.380 | (upbeat music)
02:51:26.960 | (upbeat music)