
Max Tegmark: The Case for Halting AI Development | Lex Fridman Podcast #371


Chapters

0:00 Introduction
1:56 Intelligent alien civilizations
14:20 Life 3.0 and superintelligent AI
25:47 Open letter to pause Giant AI Experiments
50:54 Maintaining control
79:44 Regulation
90:34 Job automation
99:48 Elon Musk
121:31 Open source
128:01 How AI may kill all humans
138:32 Consciousness
147:54 Nuclear winter
158:21 Questions for AGI


00:00:00.000 | A lot of people have said for many years
00:00:01.840 | that there will come a time when we want to pause a little bit.
00:00:06.320 | That time is now.
00:00:07.200 | The following is a conversation with Max Tegmark,
00:00:13.600 | his third time on the podcast.
00:00:15.440 | In fact, his first appearance was episode number one
00:00:18.800 | of this very podcast.
00:00:20.560 | He is a physicist and artificial intelligence researcher
00:00:24.000 | at MIT, co-founder of Future of Life Institute,
00:00:27.120 | and author of Life 3.0, Being Human in the Age
00:00:31.400 | of Artificial Intelligence.
00:00:33.480 | Most recently, he's a key figure in spearheading
00:00:36.320 | the open letter calling for a six-month pause
00:00:39.120 | on giant AI experiments, like training GPT-4.
00:00:43.960 | The letter reads, "We're calling for a pause
00:00:47.440 | on training of models larger than GPT-4 for six months.
00:00:51.880 | This does not imply a pause or ban on all AI research
00:00:55.000 | and development or the use of systems that have already
00:00:58.080 | been placed on the market.
00:00:59.800 | Our call is specific and addresses
00:01:02.440 | a very small pool of actors who possess this capability."
00:01:06.920 | The letter has been signed by over 50,000 individuals,
00:01:09.960 | including 1,800 CEOs and over 1,500 professors.
00:01:14.480 | Signatories include Yoshua Bengio, Stuart Russell,
00:01:17.840 | Elon Musk, Steve Wozniak, Yuval Noah Harari, Andrew Yang,
00:01:21.880 | and many others.
00:01:23.600 | This is a defining moment in the history
00:01:26.040 | of human civilization, where the balance of power
00:01:29.000 | between human and AI begins to shift.
00:01:32.880 | And Max's mind and his voice are among the most valuable
00:01:36.760 | and powerful in a time like this.
00:01:39.520 | His support, his wisdom, his friendship have been a gift
00:01:43.680 | I'm forever deeply grateful for.
00:01:46.720 | This is the Lex Fridman Podcast.
00:01:48.840 | To support it, please check out our sponsors
00:01:50.680 | in the description.
00:01:51.840 | And now, dear friends, here's Max Tegmark.
00:01:55.600 | You were the first ever guest on this podcast,
00:01:59.280 | episode number one.
00:02:00.600 | So first of all, Max, I just have to say,
00:02:03.760 | thank you for giving me a chance.
00:02:05.240 | Thank you for starting this journey.
00:02:06.640 | It's been an incredible journey.
00:02:07.840 | Just thank you for sitting down with me
00:02:11.200 | and just acting like I'm somebody who matters,
00:02:14.280 | that I'm somebody who's interesting to talk to.
00:02:16.680 | And thank you for doing it.
00:02:18.880 | That meant a lot.
00:02:20.200 | - Thanks to you for putting your heart and soul into this.
00:02:24.280 | I know when you delve into controversial topics,
00:02:26.880 | it's inevitable to get hit by what Hamlet talks about,
00:02:30.720 | the slings and arrows and stuff.
00:02:32.360 | And I really admire this.
00:02:33.800 | In an era where YouTube videos are too long
00:02:37.400 | and everything now has to be
00:02:39.960 | a 20-second TikTok clip,
00:02:41.880 | It's just so refreshing to see you going exactly
00:02:44.280 | against all of the advice and doing these really long form
00:02:47.520 | things and the people appreciate it.
00:02:49.720 | Reality is nuanced.
00:02:51.360 | And thanks for sharing it that way.
00:02:55.840 | - So let me ask you again,
00:02:57.280 | the first question I've ever asked on this podcast,
00:02:59.520 | episode number one, talking to you,
00:03:02.200 | do you think there's intelligent life out there
00:03:04.920 | in the universe?
00:03:05.840 | Let's revisit that question.
00:03:07.320 | Do you have any updates?
00:03:08.800 | What's your view when you look out to the stars?
00:03:12.360 | - So when we look out to the stars,
00:03:14.120 | if you define our universe the way most astrophysicists do,
00:03:18.920 | not as all of space, but the spherical region of space
00:03:22.560 | that we can see with our telescopes,
00:03:23.880 | from which light has had time to reach us,
00:03:25.920 | since our Big Bang, I'm in the minority.
00:03:29.400 | I estimate that we are the only life
00:03:34.360 | in this spherical volume that has invented the internet,
00:03:39.360 | radios, gotten to our level of tech.
00:03:41.600 | And if that's true,
00:03:43.680 | then it puts a lot of responsibility on us
00:03:47.880 | to not mess this one up.
00:03:49.960 | Because if it's true, it means that life is quite rare.
00:03:54.400 | And we are stewards of this one spark
00:03:58.160 | of advanced consciousness, which if we nurture it
00:04:01.320 | and help it grow, eventually life can spread from here
00:04:05.440 | out into much of our universe.
00:04:06.800 | And we can have this just amazing future.
00:04:08.560 | Whereas if we instead are reckless
00:04:11.360 | with the technology we build and just snuff it out
00:04:14.080 | due to stupidity or infighting,
00:04:17.360 | then maybe the rest of cosmic history in our universe
00:04:22.360 | is just gonna be a play for empty benches.
00:04:24.800 | But I do think that we are actually very likely
00:04:28.960 | to get visited by aliens, alien intelligence quite soon.
00:04:33.240 | But I think we are gonna be building
00:04:34.560 | that alien intelligence.
00:04:36.680 | - So we're going to give birth
00:04:40.600 | to an intelligent alien civilization.
00:04:44.200 | Unlike anything that evolution
00:04:47.200 | here on Earth was able to create
00:04:49.620 | in terms of the biological path it took.
00:04:52.800 | - Yeah, and it's gonna be much more alien
00:04:56.000 | than a cat or even the most exotic animal
00:05:00.920 | on the planet right now.
00:05:02.720 | Because it will not have been created
00:05:05.480 | through the usual Darwinian competition
00:05:07.620 | where it necessarily cares about self-preservation,
00:05:11.080 | is afraid of death, any of those things.
00:05:15.120 | The space of alien minds that you can build
00:05:18.600 | is just so much vaster than what evolution will give you.
00:05:22.100 | And with that also comes a great responsibility
00:05:24.880 | for us to make sure that the kind of minds we create
00:05:28.240 | are the kind of minds that it's good to create.
00:05:32.240 | Minds that will share our values
00:05:36.520 | and be good for humanity and life.
00:05:39.480 | And also create minds that don't suffer.
00:05:42.020 | - Do you try to visualize the full space
00:05:46.900 | of alien minds that AI could be?
00:05:49.800 | Do you try to consider all the different kinds
00:05:51.920 | of intelligences, sort of generalizing
00:05:55.480 | what humans are able to do to the full spectrum
00:05:58.600 | of what intelligent creatures, entities could do?
00:06:01.920 | - I try, but I would say I fail.
00:06:05.180 | I mean, it's very difficult for a human mind
00:06:08.800 | to really grapple with something so completely alien.
00:06:13.800 | I mean, even for us, right?
00:06:17.040 | If we just try to imagine, how would it feel
00:06:18.680 | if we were completely indifferent towards death
00:06:23.480 | or individuality?
00:06:25.060 | Even if you just imagine that, for example,
00:06:29.320 | you could just copy my knowledge of how to speak Swedish.
00:06:34.520 | Boom, now you can speak Swedish.
00:06:37.280 | And you could copy any of my cool experiences
00:06:39.680 | and then you could delete the ones you didn't like
00:06:41.160 | in your own life, just like that.
00:06:43.160 | It would already change quite a lot
00:06:45.800 | about how you feel as a human being, right?
00:06:48.520 | You probably spend less effort studying things
00:06:51.680 | if you just copy them.
00:06:52.600 | And you might be less afraid of death
00:06:54.660 | because if the plane you're on starts to crash,
00:06:58.300 | you'd just be like, "Oh, shucks, I haven't backed
00:07:01.600 | "my brain up for four hours.
00:07:04.640 | "So I'm gonna lose all these wonderful experiences
00:07:08.320 | "of this flight."
00:07:09.700 | We might also start feeling more compassionate,
00:07:16.140 | maybe with other people, if we can so readily share
00:07:18.560 | each other's experiences and our knowledge
00:07:20.840 | and feel more like a hive mind.
00:07:23.480 | It's very hard, though.
00:07:24.760 | I really feel very humble about this, to grapple with it,
00:07:29.760 | how it might actually feel.
00:07:33.080 | - The one thing which is so obvious, though,
00:07:35.280 | which I think is just really worth reflecting on
00:07:38.360 | is because the mind space of possible intelligences
00:07:42.400 | is so different from ours, it's very dangerous
00:07:45.600 | if we assume they're gonna be like us,
00:07:47.200 | or anything like us.
00:07:48.420 | - Well, the entirety of human written history,
00:07:54.480 | through poetry, through novels,
00:07:57.440 | through philosophy,
00:08:00.960 | has been trying to describe the human condition
00:08:03.080 | and what's entailed in it.
00:08:04.480 | Just like you said, fear of death
00:08:05.720 | and all those kinds of things, what is love,
00:08:07.680 | and all of that changes if you have a different
00:08:11.200 | kind of intelligence, all of it.
00:08:13.280 | The entirety, all those poems, they're trying to sneak up
00:08:16.480 | to what the hell it means to be human.
00:08:18.360 | All of that changes.
00:08:19.880 | How the concerns and existential crises that AI experiences
00:08:24.880 | clash with the human existential crisis,
00:08:29.800 | the human condition, that's hard to fathom, hard to predict.
00:08:34.480 | - It's hard, but it's fascinating to think about also.
00:08:37.960 | Even in the best case scenario where we don't lose control
00:08:42.180 | over the ever more powerful AI that we're building
00:08:44.960 | to other humans whose goals we think are horrible,
00:08:49.120 | and where we don't lose control to the machines,
00:08:51.840 | and AI provides the things we want,
00:08:56.320 | even then, you get into the questions you touched here.
00:08:59.660 | Maybe the struggle, the fact that it's actually hard
00:09:03.120 | to do things, is part of what
00:09:04.760 | gives us meaning as well.
00:09:07.120 | For example, I found it so shocking
00:09:09.900 | that this new Microsoft GPT-4 commercial
00:09:14.240 | that they put together has this woman talking about,
00:09:18.000 | showing this demo of how she's gonna give
00:09:20.620 | a graduation speech to her beloved daughter,
00:09:23.820 | and she asks GPT-4 to write it.
00:09:25.780 | It's frigging 200 words or so.
00:09:28.880 | If I realized that my parents couldn't be bothered
00:09:31.840 | to struggle a little bit to write 200 words
00:09:35.000 | and outsourced that to their computer,
00:09:36.760 | I would feel really offended, actually.
00:09:39.020 | And so I wonder if eliminating too much
00:09:44.400 | of the struggle from our existence,
00:09:46.360 | do you think that would also take away a little bit of what--
00:09:53.240 | - It means to be human, yeah.
00:09:55.060 | We can't even predict.
00:09:57.860 | I had somebody mention to me that they started using
00:10:02.380 | ChatGPT, with 3.5 and not 4.0,
00:10:06.400 | to write what they really feel to a person.
00:10:12.580 | They have a temper issue,
00:10:14.220 | and they're basically trying to get ChatGPT
00:10:17.800 | to rewrite it in a nicer way,
00:10:19.760 | to get the point across but in a nicer way.
00:10:22.620 | So we're even removing the inner asshole
00:10:26.180 | from our communication.
00:10:27.680 | So there's some positive aspects of that,
00:10:31.840 | but mostly it's just the transformation
00:10:34.200 | of how humans communicate.
00:10:35.780 | And it's scary because so much of our society
00:10:40.780 | is based on this glue of communication.
00:10:44.640 | And if we're now using AI as the medium of communication
00:10:49.080 | that does the language for us,
00:10:51.140 | so much of the emotion that's laden in human communication,
00:10:55.520 | so much of the intent that's going to be handled
00:10:59.320 | by outsourced AI, how does that change everything?
00:11:02.140 | How does that change the internal state
00:11:03.940 | of how we feel about other human beings?
00:11:06.500 | What makes us lonely, what makes us excited?
00:11:08.980 | What makes us afraid, how we fall in love,
00:11:11.160 | all that kind of stuff.
00:11:12.180 | - Yeah, for me personally, I have to confess,
00:11:15.060 | the challenge is one of the things
00:11:16.500 | that really makes my life feel meaningful.
00:11:22.660 | If I go hike a mountain with my wife, Meia,
00:11:26.120 | I don't want to just press a button and be at the top.
00:11:28.220 | I want to struggle and come up there sweaty
00:11:30.240 | and feel, wow, we did this.
00:11:32.360 | In the same way, I want to constantly work on myself
00:11:37.360 | to become a better person.
00:11:39.320 | If I say something in anger that I regret,
00:11:42.680 | I want to go back and really work on myself
00:11:46.480 | rather than just tell an AI from now on
00:11:49.720 | always filter what I write
00:11:51.080 | so I don't have to work on myself
00:11:53.620 | 'cause then I'm not growing.
00:11:55.840 | - Yeah, but then again, it could be like with chess.
00:11:59.840 | An AI, once it significantly, obviously,
00:12:04.680 | supersedes the performance of humans,
00:12:06.840 | it will live in its own world
00:12:08.800 | and provide maybe a flourishing civilization for humans,
00:12:12.600 | but we humans will continue hiking mountains
00:12:15.120 | and playing our games even though AI is so much smarter,
00:12:18.240 | so much stronger, so much superior in every single way,
00:12:21.120 | just like with chess.
00:12:22.280 | That's one possible hopeful trajectory here
00:12:26.720 | is that humans will continue to human
00:12:28.560 | and AI will just be a kind of
00:12:34.240 | a medium that enables the human experience to flourish.
00:12:45.600 | - Yeah, I would phrase that as rebranding ourselves
00:12:50.600 | from homo sapiens to homo sentiens.
00:12:53.920 | Right now, sapiens, the ability to be intelligent,
00:12:58.200 | we've even put it in our species name.
00:13:00.280 | We're branding ourselves as the smartest
00:13:05.120 | information processing entity on the planet.
00:13:08.600 | That's clearly gonna change if AI continues ahead.
00:13:14.200 | So maybe we should focus on the experience instead,
00:13:16.520 | the subjective experience that we have with homo sentiens
00:13:20.440 | and say that's what's really valuable,
00:13:23.080 | the love, the connection, the other things.
00:13:25.400 | Get off our high horses and get rid of this hubris
00:13:31.240 | that only we can do integrals.
00:13:35.160 | - So consciousness, the subjective experience
00:13:37.880 | is a fundamental value to what it means to be human.
00:13:42.120 | Make that the priority.
00:13:44.200 | - That feels like a hopeful direction to me,
00:13:47.480 | but that also requires more compassion,
00:13:50.920 | not just towards other humans because they happen
00:13:53.680 | to be the smartest on the planet,
00:13:55.720 | but also towards all our other fellow creatures
00:13:57.760 | on this planet.
00:13:58.880 | I personally feel right now we're treating
00:14:01.320 | a lot of farm animals horribly, for example,
00:14:03.480 | and the excuse we're using is,
00:14:04.880 | oh, they're not as smart as us.
00:14:06.720 | But if we admit that we're not that smart
00:14:10.040 | in the grand scheme of things either in the post-AI epoch,
00:14:13.040 | then surely we should value the subjective experience
00:14:17.840 | of a cow also.
00:14:19.520 | - Well, allow me to briefly look at the book
00:14:23.960 | that you've written, I guess over five years ago,
00:14:26.340 | which at this point is becoming more and more visionary,
00:14:28.880 | Life 3.0.
00:14:29.880 | So first of all, 3.0, what's 1.0, what's 2.0, what's 3.0?
00:14:35.640 | And how has that vision,
00:14:38.880 | the vision in the book, evolved to today?
00:14:41.200 | - Life 1.0 is really dumb, like bacteria,
00:14:45.160 | in that it can't actually learn anything at all
00:14:46.960 | during its lifetime.
00:14:47.880 | The learning just comes from this genetic process
00:14:51.800 | from one generation to the next.
00:14:55.200 | Life 2.0 is us and other animals which have brains,
00:15:00.200 | which can learn during their lifetime a great deal.
00:15:06.960 | And you were born without being able to speak English.
00:15:11.960 | And at some point you decided,
00:15:13.520 | hey, I wanna upgrade my software.
00:15:15.320 | Let's install an English speaking module.
00:15:17.560 | - So you did.
00:15:19.280 | - And Life 3.0, which does not exist yet,
00:15:23.840 | can replace not only its software the way we can,
00:15:27.400 | but also its hardware.
00:15:28.500 | And that's where we're heading towards at high speed.
00:15:33.400 | We're already maybe 2.1,
00:15:34.840 | 'cause we can put in an artificial knee,
00:15:38.960 | pacemaker, et cetera, et cetera.
00:15:42.520 | And if Neuralink and other companies succeed,
00:15:45.760 | we'll be Life 2.2, et cetera.
00:15:48.000 | But what the companies trying to build AGI
00:15:52.080 | are trying to make is, of course, full 3.0.
00:15:54.720 | And you can put that intelligence
00:15:56.000 | into something that also has no
00:15:57.720 | biological basis whatsoever.
00:16:02.760 | - So less constraints and more capabilities,
00:16:05.400 | just like the leap from 1.0 to 2.0.
00:16:08.720 | There is, nevertheless,
00:16:10.080 | you speaking so harshly about bacteria,
00:16:12.120 | so disrespectfully about bacteria,
00:16:14.300 | there is still the same kind of magic there
00:16:18.240 | that permeates Life 2.0 and 3.0.
00:16:22.480 | It seems like maybe the thing that's truly powerful
00:16:26.520 | about life, intelligence, and consciousness
00:16:29.400 | was already there in 1.0.
00:16:31.960 | Is it possible?
00:16:32.840 | - I think we should be humble and not be so quick
00:16:37.960 | to make everything binary and say either it's there
00:16:42.120 | or it's not.
00:16:42.960 | Clearly, there's a great spectrum.
00:16:44.960 | And there is even controversy about whether some unicellular
00:16:48.960 | organisms like amoebas can maybe learn a little bit
00:16:51.520 | after all.
00:16:53.440 | So apologies if I offended any bacteria here.
00:16:56.200 | It wasn't my intent.
00:16:57.040 | It was more that I wanted to talk up how cool it is
00:16:59.960 | to actually have a brain,
00:17:01.420 | where you can learn dramatically within your lifetime.
00:17:04.680 | - Typical human.
00:17:05.800 | - And the higher up you get from 1.0 to 2.0 to 3.0,
00:17:09.240 | the more you become the captain of your own ship,
00:17:12.480 | the master of your own destiny,
00:17:13.960 | and the less you become a slave
00:17:15.520 | to whatever evolution gave you, right?
00:17:17.540 | By upgrading our software,
00:17:20.240 | we can be so different from previous generations
00:17:22.640 | and even from our parents,
00:17:24.560 | much more so than even a bacterium.
00:17:27.160 | You know, no offense to them.
00:17:29.180 | And if you can also swap out your hardware
00:17:32.080 | and take any physical form you want,
00:17:33.840 | of course, really the sky's the limit.
00:17:36.800 | - Yeah, so it accelerates the rate
00:17:40.680 | at which you can perform the computation
00:17:43.560 | that determines your destiny.
00:17:45.520 | - Yeah, and I think it's worth commenting a bit
00:17:48.760 | on what "you" means in this context also,
00:17:50.560 | if you swap things out a lot, right?
00:17:52.640 | This is controversial, but my current
00:17:59.380 | understanding is that life is best thought of
00:18:04.380 | not as a bag of meat or even a bag of elementary particles,
00:18:10.860 | but rather as a system which can process information
00:18:16.820 | and retain its own complexity,
00:18:19.580 | even though nature is always trying to mess it up.
00:18:21.580 | So it's all about information processing.
00:18:25.100 | And that makes it a lot like something
00:18:28.500 | like a wave in the ocean,
00:18:29.880 | which is not its water molecules, right?
00:18:33.600 | The water molecules bob up and down,
00:18:35.120 | but the wave moves forward.
00:18:36.240 | It's an information pattern.
00:18:37.800 | In the same way, you, Lex,
00:18:40.540 | you're not the same atoms as during the first
00:18:43.520 | time you did with me. - Time we talked, yeah.
00:18:44.600 | - You've swapped out most of them, but still you.
00:18:47.840 | - Yeah. - And the information pattern
00:18:51.080 | is still there.
00:18:52.320 | And if you could swap out your arms
00:18:55.840 | and whatever, you can still have this kind of continuity.
00:19:00.840 | It becomes much more sophisticated,
00:19:03.480 | sort of wave forward in time
00:19:04.860 | where the information lives on.
00:19:06.880 | I lost both of my parents since our last podcast.
00:19:11.440 | And it actually gives me a lot of solace
00:19:13.880 | that this way of thinking about them,
00:19:17.060 | they haven't entirely died because a lot of mommy
00:19:21.400 | and daddy's, sorry, I'm getting a little emotional here,
00:19:24.940 | but a lot of their values and ideas
00:19:28.840 | and even jokes and so on, they haven't gone away, right?
00:19:33.320 | Some of them live on.
00:19:34.160 | I can carry on some of them.
00:19:35.760 | And they also live on in a lot of other people.
00:19:38.920 | So in this sense, even with Life 2.0,
00:19:41.900 | we can, to some extent, already transcend
00:19:45.880 | our physical bodies and our death.
00:19:49.160 | And particularly if you can share your own information,
00:19:53.920 | your own ideas with many others like you do
00:19:57.320 | in your podcast, then that's the closest
00:20:02.320 | to immortality we can get with our bio-bodies.
00:20:06.840 | - You carry a little bit of them in you in some sense.
00:20:10.280 | - Yeah, yeah.
00:20:11.120 | - Do you miss them?
00:20:13.000 | You miss your mom and dad?
00:20:14.080 | - Of course, of course.
00:20:15.680 | - What did you learn about life from them
00:20:17.240 | if I can take a bit of a tangent?
00:20:21.520 | - So many things.
00:20:22.640 | For starters, my fascination for math
00:20:28.920 | and the physical mysteries of our universe.
00:20:32.360 | I got a lot of that from my dad.
00:20:34.960 | But I think my obsession for fairly big questions
00:20:38.520 | and consciousness and so on,
00:20:40.000 | that actually came mostly from my mom.
00:20:42.880 | And what I got from both of them,
00:20:47.120 | which is a very core part of really who I am,
00:20:49.840 | I think is this,
00:20:53.360 | just feeling comfortable with not buying
00:21:02.000 | into what everybody else is saying.
00:21:07.360 | Doing what I think is right.
00:21:11.280 | They both very much just did their own thing.
00:21:19.800 | And sometimes they got flack for it,
00:21:21.160 | and they did it anyway.
00:21:22.320 | - That's why you've always been an inspiration to me,
00:21:25.920 | that you're at the top of your field
00:21:27.400 | and you're still willing to tackle
00:21:31.400 | the big questions in your own way.
00:21:35.160 | You're one of the people that represents MIT best to me.
00:21:40.160 | You've always been an inspiration in that.
00:21:41.960 | So it's good to hear that you got that from your mom and dad.
00:21:44.000 | - Yeah, you're too kind.
00:21:44.960 | But yeah, I mean, the good reason to do science
00:21:49.720 | is because you're really curious,
00:21:51.520 | you wanna figure out the truth.
00:21:53.800 | If you think this is how it is,
00:21:57.200 | and everyone else says, no, no, that's bullshit,
00:21:58.960 | and it's that way, you know,
00:22:00.360 | you stick with what you think is true.
00:22:04.240 | And even if everybody else keeps thinking it's bullshit,
00:22:09.480 | there's a certain,
00:22:10.360 | I always root for the underdog when I watch movies.
00:22:15.960 | And my dad, one time, for example,
00:22:18.800 | when I wrote one of my craziest papers ever,
00:22:22.160 | talking about our universe ultimately being mathematical,
00:22:24.280 | which we're not gonna get into today,
00:22:25.680 | I got this email from a quite famous professor saying,
00:22:28.280 | this is not only bullshit,
00:22:29.520 | but it's gonna ruin your career.
00:22:31.160 | You should stop doing this kind of stuff.
00:22:33.280 | I sent it to my dad.
00:22:34.800 | Do you know what he said?
00:22:35.640 | - What did he say?
00:22:36.800 | - He replied with a quote from Dante.
00:22:39.600 | (speaking in foreign language)
00:22:42.720 | Follow your own path and let the people talk.
00:22:45.480 | Go dad!
00:22:48.240 | This is the kind of thing,
00:22:49.080 | you know, he's dead, but that attitude is not.
00:22:53.000 | - How did losing them as a man, as a human being change you?
00:22:59.240 | How did it expand your thinking about the world?
00:23:01.160 | How did it expand your thinking about,
00:23:03.340 | you know, this thing we're talking about,
00:23:05.720 | which is humans creating another living,
00:23:09.800 | sentient, perhaps, being?
00:23:12.120 | - I think it,
00:23:18.040 | mainly did two things.
00:23:19.400 | One of them, just going through all their stuff
00:23:23.760 | after they had passed away and so on,
00:23:25.840 | just drove home to me how important it is to ask ourselves,
00:23:28.840 | why are we doing these things we do?
00:23:31.520 | Because it's inevitable that you look at some things
00:23:34.040 | they spent an enormous time on and you ask,
00:23:36.200 | in hindsight, would they really have spent
00:23:38.680 | so much time on this?
00:23:40.400 | Would they have done something that was more meaningful?
00:23:42.920 | So I've been looking more in my life now and asking,
00:23:46.520 | you know, why am I doing what I'm doing?
00:23:48.680 | And I feel,
00:23:50.120 | it should either be something I really enjoy doing
00:23:56.680 | or it should be something that I find
00:23:58.840 | really, really meaningful because it helps humanity.
00:24:02.840 | And if it's in neither of those two categories,
00:24:09.480 | maybe I should spend less time on it, you know?
00:24:12.480 | The other thing is dealing with death up close and in person like this,
00:24:17.000 | it's actually made me
00:24:20.600 | even less afraid of other people telling me
00:24:24.560 | that I'm an idiot, you know, which happens regularly.
00:24:27.960 | And just, I'm gonna live my life, do my thing, you know?
00:24:31.280 | And it's made it a little bit easier for me to focus
00:24:38.360 | on what I feel is really important.
00:24:40.600 | - What about fear of your own death?
00:24:42.440 | Has it made it more real that this is,
00:24:45.960 | that this is something that happens?
00:24:49.480 | - Yeah, it's made it extremely real.
00:24:51.400 | And I'm next in line in our family now, right?
00:24:54.200 | It's me and my younger brother.
00:24:56.080 | But they both handled it with such dignity.
00:25:01.080 | That was a true inspiration also.
00:25:04.600 | They never complained about things.
00:25:06.880 | And you know, when you're old
00:25:08.600 | and your body starts falling apart,
00:25:10.200 | there's more and more to complain about.
00:25:11.400 | They looked at what could they still do
00:25:13.160 | that was meaningful.
00:25:14.760 | And they focused on that rather than wasting time
00:25:17.880 | talking about or even thinking much
00:25:22.120 | about things they were disappointed in.
00:25:24.480 | I think anyone can make themselves depressed
00:25:26.360 | if they start their morning by making a list of grievances.
00:25:30.800 | Whereas if you start your day with a little meditation
00:25:34.160 | and just things you're grateful for,
00:25:36.680 | you basically choose to be a happy person.
00:25:39.840 | - Because you only have a finite number of days
00:25:42.480 | you should spend them.
00:25:43.440 | - Make it count.
00:25:44.640 | - Being grateful.
00:25:45.760 | - Yeah.
00:25:46.600 | - Well, you do happen to be working on a thing
00:25:52.840 | which seems to have potentially some of the greatest impact
00:25:57.840 | on human civilization of anything humans have ever created,
00:26:00.800 | which is artificial intelligence.
00:26:02.200 | You work on this at both the detailed technical level
00:26:05.240 | and at the high philosophical level.
00:26:08.280 | So you've mentioned to me that there's an open letter
00:26:12.760 | that you're working on.
00:26:15.040 | - It's actually going live in a few hours.
00:26:18.920 | (Lex laughing)
00:26:20.040 | I've been having late nights and early mornings.
00:26:22.760 | It's been very exciting actually.
00:26:24.840 | In short, have you seen "Don't Look Up", the film?
00:26:29.840 | - Yes, yes.
00:26:32.400 | - I don't wanna be the movie spoiler
00:26:34.040 | for anyone watching this who hasn't seen it.
00:26:36.400 | But if you're watching this,
00:26:37.600 | you haven't seen it, watch it.
00:26:40.440 | Because we are actually acting out,
00:26:43.360 | it's life imitating art.
00:26:45.640 | Humanity is doing exactly that right now,
00:26:47.880 | except it's an asteroid that we are building ourselves.
00:26:52.480 | Almost nobody is talking about it.
00:26:54.840 | People are squabbling across the planet
00:26:56.680 | about all sorts of things which seem very minor
00:26:58.840 | compared to the asteroid that's about to hit us, right?
00:27:02.320 | Most politicians don't even have
00:27:04.440 | this on their radar,
00:27:05.280 | they think maybe in 100 years or whatever.
00:27:07.680 | Right now, we're at a fork in the road.
00:27:11.400 | This is the most important fork humanity has reached
00:27:14.960 | in its over 100,000 years on this planet.
00:27:17.840 | We're building effectively a new species
00:27:21.720 | that's smarter than us.
00:27:22.880 | It doesn't look so much like a species yet
00:27:25.440 | 'cause it's mostly not embodied in robots,
00:27:27.560 | but that's a technicality which will soon be changed.
00:27:32.080 | And this arrival of artificial general intelligence
00:27:37.080 | that can do all our jobs as well as us,
00:27:39.280 | and probably shortly thereafter, superintelligence,
00:27:43.120 | which greatly exceeds our cognitive abilities,
00:27:46.360 | it's gonna either be the best thing ever to happen
00:27:48.720 | to humanity or the worst.
00:27:50.000 | I'm really quite confident that there is
00:27:52.000 | not that much middle ground there.
00:27:55.160 | - But it would be fundamentally transformative
00:27:58.080 | to human civilization. - Of course.
00:27:59.920 | Utterly and totally.
00:28:01.400 | Again, we branded ourselves as homo sapiens
00:28:04.560 | 'cause it seemed like the basic thing.
00:28:06.080 | We're the king of the castle on this planet.
00:28:09.000 | We're the smart ones.
00:28:10.160 | We can control everything else.
00:28:11.840 | This could very easily change.
00:28:15.520 | We're certainly not gonna be the smartest
00:28:18.360 | on the planet for very long unless AI progress just halts.
00:28:22.720 | And we can talk more about why I think that's true
00:28:25.400 | 'cause it's controversial.
00:28:28.400 | And then we can also talk about
00:28:29.920 | reasons you might think it's gonna be the best thing ever
00:28:35.160 | and the reason you think it's gonna be the end of humanity,
00:28:39.120 | which is, of course, super controversial.
00:28:41.480 | But what I think we can, anyone who's working on advanced AI
00:28:46.480 | can agree on is it's much like the film "Don't Look Up"
00:28:52.720 | in that it's just really comical how little serious
00:28:57.640 | public debate there is about it given how huge it is.
00:29:01.440 | - So what we're talking about is a development
00:29:06.960 | of current things like GPT-4
00:29:09.200 | and the signs it's showing of rapid improvement
00:29:14.840 | that may in the near term lead to development
00:29:18.440 | of super intelligent AGI, AI, general AI systems,
00:29:23.160 | and what kind of impact that has on society.
00:29:26.040 | - Exactly.
00:29:26.880 | - The whole thing is, it achieves
00:29:28.440 | general human-level intelligence
00:29:30.680 | and then beyond that, general superhuman-level intelligence.
00:29:34.680 | There's a lot of questions to explore here.
00:29:38.900 | So one, you mentioned halt.
00:29:41.640 | Is that the content of the letter?
00:29:44.800 | - Yeah.
00:29:45.640 | - Is to suggest that maybe we should pause
00:29:47.880 | the development of these systems.
00:29:49.680 | - Exactly.
00:29:50.520 | So this is very controversial.
00:29:54.480 | When we talked the first time,
00:29:56.000 | we talked about how I was involved
00:29:57.400 | in starting the Future of Life Institute,
00:29:59.040 | and we worked very hard in 2014, 2015
00:30:02.120 | to mainstream AI safety.
00:30:04.080 | The idea that there even could be risks
00:30:07.280 | and that you could do things about them.
00:30:09.720 | Before then, a lot of people thought it was just really
00:30:11.640 | kooky to even talk about it and a lot of AI researchers
00:30:14.560 | felt worried that this was too flaky
00:30:18.600 | and could be bad for funding
00:30:20.080 | and that the people who talked about it
00:30:21.440 | just didn't understand AI.
00:30:24.120 | I'm very, very happy with how that's gone
00:30:28.800 | and that now, it's completely mainstream.
00:30:32.160 | You go to any AI conference and people talk about AI safety,
00:30:34.840 | and it's a nerdy technical field full of equations
00:30:37.920 | and so on.
00:30:39.520 | As it should be.
00:30:42.600 | But there's this other thing which has been quite taboo
00:30:47.520 | up until now, calling for slowdown.
00:30:50.560 | So what we've constantly been saying,
00:30:54.440 | including myself, I've been biting my tongue a lot,
00:30:56.520 | is that we don't need to slow down AI development,
00:31:01.520 | we just need to win this race, the wisdom race,
00:31:05.520 | between the growing power of the AI
00:31:07.840 | and the growing wisdom with which we manage it.
00:31:12.280 | And rather than trying to slow down AI,
00:31:14.280 | let's just try to accelerate the wisdom.
00:31:16.880 | Do all this technical work to figure out
00:31:18.720 | how you can actually ensure that your powerful AI
00:31:21.280 | is gonna do what you want it to do
00:31:23.320 | and have society adapt also with incentives and regulations
00:31:28.320 | so that these things get put to good use.
00:31:31.080 | Sadly, that didn't pan out.
00:31:34.840 | The progress on technical AI and capabilities
00:31:39.960 | has gone a lot faster than many people thought
00:31:46.080 | back when we started this in 2014.
00:31:49.080 | Turned out to be easier to build
00:31:50.960 | really advanced AI than we thought.
00:31:52.720 | And on the other side, it's gone much slower
00:31:58.360 | than we hoped with getting policy makers and others
00:32:03.360 | to actually put incentives in place
00:32:06.840 | to steer this in the good direction.
00:32:10.600 | Maybe we should unpack it and talk a little bit about each.
00:32:12.400 | So why did it go faster than a lot of people thought?
00:32:16.920 | In hindsight, it's exactly like building flying machines.
00:32:21.920 | People spent a lot of time wondering
00:32:24.360 | about how do birds fly?
00:32:26.560 | And that turned out to be really hard.
00:32:28.800 | Have you seen the TED Talk with a flying bird?
00:32:31.760 | - Like a flying robotic bird?
00:32:33.200 | - Yeah, it flies around the audience.
00:32:35.240 | But it took 100 years longer to figure out how to do that
00:32:38.440 | than for the Wright brothers to build the first airplane
00:32:40.440 | because it turned out there was a much easier way to fly.
00:32:43.080 | And evolution picked the more complicated one
00:32:45.640 | because it had its hands tied.
00:32:48.000 | It could only build a machine that could assemble itself,
00:32:51.040 | which the Wright brothers didn't care about.
00:32:53.920 | It could only build a machine
00:32:55.040 | that used only the most common atoms in the periodic table.
00:32:58.320 | The Wright brothers didn't care about that.
00:32:59.780 | They could use steel, iron atoms.
00:33:03.280 | And it had to be able to repair itself
00:33:05.920 | and it also had to be incredibly fuel efficient.
00:33:08.520 | A lot of birds use less than half the fuel
00:33:13.360 | of a remote control plane flying the same distance.
00:33:16.880 | For humans, just throw a little more,
00:33:18.880 | put a little more fuel in, and poof,
00:33:20.400 | there you go, 100 years earlier.
00:33:22.360 | That's exactly what's happening now
00:33:24.400 | with these large language models.
00:33:26.000 | The brain is incredibly complicated.
00:33:29.960 | Many people made the mistake of thinking
00:33:31.800 | we had to figure out how the brain does human level AI first
00:33:35.320 | before we could build it in a machine.
00:33:37.520 | That was completely wrong.
00:33:39.000 | You can take an incredibly simple computational system
00:33:44.000 | called a transformer network
00:33:45.760 | and just train it to do something incredibly dumb.
00:33:48.480 | Just read a gigantic amount of text
00:33:50.760 | and try to predict the next word.
00:33:52.400 | And it turns out if you just throw a ton of compute at that
00:33:57.240 | and a ton of data, it gets to be frighteningly good
00:34:01.520 | like GPT-4, which I've been playing with so much
00:34:03.840 | since it came out.
00:34:06.080 | And there's still some debate
00:34:09.120 | about whether that can get you all the way
00:34:11.040 | to full human level or not.
00:34:13.680 | But yeah, we can come back to the details of that
00:34:16.240 | and how you might get to human level AI
00:34:17.760 | even if large language models don't.
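
For readers who want to see what "read text and predict the next word" means concretely, here is a minimal, purely illustrative sketch in PyTorch. Everything in it (the toy text, the tiny model, the hyperparameters) is a placeholder of my own; real systems like GPT-4 use transformer architectures that condition on long contexts and train on vastly more data and compute, but the objective, shift the sequence by one and minimize cross-entropy on the next token, is the same basic idea.

```python
# Minimal sketch of next-token prediction training (illustrative only).
# The model here only looks at the current token; GPT-style models attend
# over the whole preceding context, but the training objective is the same.
import torch
import torch.nn as nn

text = "the cat sat on the mat. the cat sat on the mat."
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in text])

class TinyNextTokenModel(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, vocab_size))

    def forward(self, idx):
        # Returns logits over what the next token should be.
        return self.ff(self.embed(idx))

model = TinyNextTokenModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    inputs, targets = data[:-1], data[1:]  # shift by one: predict token t+1 from token t
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits, targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of the sketch is how little is in it: no hand-coded knowledge of grammar or facts, just a differentiable model, a pile of text, and a next-token loss, scaled up enormously.
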
00:34:22.200 | - Can you briefly, if it's just a small tangent,
00:34:24.560 | comment on your feelings about GPT-4?
00:34:27.120 | Suggest that you're impressed by this rate of progress,
00:34:31.280 | but where is it?
00:34:32.760 | Can GPT-4 reason?
00:34:35.480 | What are the intuitions?
00:34:37.800 | What are human interpretable words you can assign
00:34:40.600 | to the capabilities of GPT-4
00:34:42.280 | that makes you so damn impressed with it?
00:34:45.040 | - I'm both very excited about it and terrified.
00:34:48.400 | It's an interesting mixture of emotions.
00:34:52.040 | - All the best things in life include those two somehow.
00:34:55.320 | - Yeah, it can absolutely reason.
00:34:57.360 | Anyone who hasn't played with it,
00:34:59.580 | I highly recommend doing that before dissing it.
00:35:03.040 | It can do quite remarkable reasoning.
00:35:06.440 | I've had it do a lot of things,
00:35:08.760 | which I realized I couldn't even do that well myself.
00:35:12.200 | And obviously does it dramatically faster than we do too
00:35:16.040 | when you watch it type.
00:35:17.920 | And it's doing that while servicing a massive number
00:35:20.720 | of other humans at the same time.
00:35:23.080 | At the same time, it cannot reason as well as a human can
00:35:28.080 | on some tasks.
00:35:30.280 | Just because it's obviously a limitation
00:35:32.320 | from its architecture.
00:35:33.360 | We have in our heads what in geek speak
00:35:36.560 | is called a recurrent neural network.
00:35:38.720 | There are loops.
00:35:39.560 | Information can go from this neuron to this neuron
00:35:41.480 | to this neuron and then back to this one.
00:35:43.240 | You can like ruminate on something for a while.
00:35:44.880 | You can self-reflect a lot.
00:35:46.740 | These large language models,
00:35:50.320 | like GPT-4, cannot do that.
00:35:52.400 | It's a so-called transformer, where it's just like
00:35:55.200 | a one-way street of information, basically.
00:35:57.880 | In geek speak, it's called a feed-forward neural network.
00:36:01.120 | And it's only so deep.
00:36:03.080 | So it can only do logic that's that many steps
00:36:05.840 | and that deep.
00:36:06.680 | And you can create problems which it will fail to solve
00:36:12.480 | for that reason.
00:36:13.760 | But the fact that it can do so amazing things
00:36:20.280 | with this incredibly simple architecture already
00:36:23.640 | is quite stunning.
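
To make that contrast concrete, here is a small sketch (toy dimensions of my own choosing, not the architecture of any real model) of the difference being pointed at: a feed-forward network does a fixed number of layers of computation per input, whereas a recurrent network can keep applying the same cell and let information loop around for as many steps as you allow.

```python
# Illustrative contrast between fixed-depth feed-forward computation
# and recurrent "ruminating" over many steps (assumed toy sizes).
import torch
import torch.nn as nn

dim = 16
x = torch.randn(1, dim)  # a single toy input vector

# Feed-forward: depth is fixed when the network is built (three layers here),
# so the amount of sequential processing per input is bounded by that depth.
feed_forward = nn.Sequential(
    nn.Linear(dim, dim), nn.ReLU(),
    nn.Linear(dim, dim), nn.ReLU(),
    nn.Linear(dim, dim),
)
y = feed_forward(x)

# Recurrent: the same cell is applied over and over, so information can
# circulate through loops for an arbitrary number of steps.
cell = nn.GRUCell(dim, dim)
h = torch.zeros(1, dim)
for _ in range(50):  # could just as well be 5 or 5,000 iterations
    h = cell(x, h)
```
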
00:36:24.880 | And what we see in my lab at MIT
00:36:27.280 | when we look inside large language models
00:36:30.600 | to try to figure out how they're doing it,
00:36:32.280 | that's the key core focus of our research.
00:36:35.520 | It's called mechanistic interpretability in geek speak.
00:36:40.520 | You have this machine that does something smart.
00:36:42.920 | You try to reverse engineer it.
00:36:44.400 | See how does it do it?
00:36:45.480 | Or you think of it also as artificial neuroscience.
00:36:49.840 | (laughing)
00:36:50.680 | 'Cause that's exactly what neuroscientists do
00:36:51.520 | with actual brains.
00:36:52.360 | But here you have the advantage that
00:36:54.360 | you don't have to worry about measurement errors.
00:36:56.200 | You can see what every neuron is doing all the time.
00:36:59.560 | And a recurrent thing we see again and again,
00:37:02.360 | there have been a number of beautiful papers quite recently
00:37:06.200 | by a lot of researchers, some of them here,
00:37:09.120 | in this area, is that when they figure out
00:37:12.080 | how something is done, you can say,
00:37:14.280 | "Oh man, that's such a dumb way of doing it."
00:37:16.840 | And you immediately see how it can be improved.
00:37:18.840 | Like for example, there was a beautiful paper recently
00:37:22.200 | where they figured out how a large language model
00:37:24.640 | stores certain facts, like Eiffel Tower is in Paris.
00:37:28.800 | And they figured out exactly how it's stored.
00:37:31.680 | The proof that they understood it was they could edit it.
00:37:34.200 | They changed some of the synapses in it.
00:37:37.800 | And then they asked it, "Where's the Eiffel Tower?"
00:37:39.680 | And it said, "It's in Rome."
00:37:43.680 | And then they asked it, "How do you get there?
00:37:45.720 | How do you get there from Germany?"
00:37:47.000 | "Oh, you take this train
00:37:51.120 | to the Roma Termini train station, and this and that."
00:37:51.120 | And what might you see if you're in front of it?
00:37:52.880 | "Oh, you might see the Colosseum."
00:37:55.880 | So they had edited it.
00:37:57.040 | - So they literally moved it to Rome.
00:37:59.160 | - But the way that it's storing this information,
00:38:01.800 | it's incredibly dumb.
00:38:03.360 | For any fellow nerds listening to this,
00:38:07.800 | there was a big matrix, and roughly speaking,
00:38:11.720 | there are certain row and column vectors
00:38:13.200 | which encode these things, and they correspond
00:38:15.600 | very hand-wavily to principal components.
00:38:17.880 | And it would be much more efficient
00:38:21.360 | to just store it sparsely, like in a database.
00:38:23.680 | But in everything we've figured out so far
00:38:27.800 | about how these things do it, you can see ways
00:38:29.960 | where they can easily be improved.
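
As a loose illustration of that "row and column vectors" picture, here is a toy numpy sketch of how an association can live in a weight matrix and then be overwritten. It is only an analogy under stated assumptions, not the actual editing procedure from the paper being described; the concept vectors and helper functions are hypothetical.

```python
# Toy linear-algebra analogy: a fact as a rank-one association in a weight matrix.
# NOT the real method from the paper discussed above; purely illustrative.
import numpy as np

dim = 64
rng = np.random.default_rng(0)

# Hypothetical concept vectors; in a real model these would be learned activations.
eiffel_tower = rng.standard_normal(dim)
paris = rng.standard_normal(dim)
rome = rng.standard_normal(dim)

def store(key, value):
    """Store key -> value as a rank-one matrix (an outer product)."""
    return np.outer(value, key) / np.dot(key, key)

def recall(W, key):
    """Read back the value associated with a key via one matrix multiply."""
    return W @ key

W = store(eiffel_tower, paris)
print(np.allclose(recall(W, eiffel_tower), paris))  # True: "Eiffel Tower is in Paris"

# "Editing" the fact amounts to changing the weights that encode the association.
W = store(eiffel_tower, rome)
print(np.allclose(recall(W, eiffel_tower), rome))   # True: the model now says Rome
```
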
00:38:31.320 | And the fact that this particular architecture
00:38:34.240 | has some roadblocks built into it is in no way
00:38:37.680 | gonna prevent crafty researchers
00:38:40.960 | from quickly finding workarounds
00:38:42.600 | and making other kinds of architectures
00:38:47.480 | sort of go all the way.
00:38:50.120 | In short, it's turned out to be a lot easier
00:38:54.360 | to build close to human intelligence than we thought,
00:38:58.240 | and that means our runway as a species
00:39:00.240 | to get our shit together has shortened.
00:39:05.240 | - And it seems like the scary thing
00:39:07.960 | about the effectiveness of large language models,
00:39:11.680 | so Sam Altman, who I recently had a conversation with,
00:39:14.920 | really showed that the leap from GPT-3
00:39:19.840 | to GPT-4 has to do with just a bunch of hacks,
00:39:23.200 | a bunch of little explorations with smart researchers
00:39:28.200 | doing a few little fixes here and there.
00:39:30.560 | It's not some fundamental leap and transformation
00:39:34.240 | in the architecture.
00:39:35.680 | - And more data and more compute.
00:39:37.720 | - And more data and compute, but he said the big leaps
00:39:39.680 | has to do with not the data and the compute,
00:39:42.840 | but just learning this new discipline, just like you said.
00:39:46.800 | So researchers are going to look at these architectures,
00:39:48.920 | and there might be big leaps where you realize,
00:39:52.480 | wait, why are we doing this in this dumb way?
00:39:54.440 | And all of a sudden this model is 10x smarter,
00:39:57.160 | and that can happen on any one day,
00:39:59.400 | on any one Tuesday or Wednesday afternoon,
00:40:02.080 | and then all of a sudden you have a system
00:40:03.640 | that's 10x smarter.
00:40:05.740 | It seems like it's such a new discipline.
00:40:07.440 | We understand so little
00:40:10.320 | about why this thing works so damn well
00:40:12.540 | that the steady improvement of compute,
00:40:16.200 | linear or exponential,
00:40:17.720 | and the steady improvement of the data,
00:40:19.320 | may not be the thing that even leads to the next leap.
00:40:21.520 | It could be a surprise little hack that improves everything.
00:40:24.240 | - Or a lot of little leaps here and there,
00:40:25.800 | because so much of this is out in the open also.
00:40:29.960 | So many smart people are looking at this
00:40:33.640 | and trying to figure out little leaps here and there,
00:40:35.680 | and it becomes this sort of collective race
00:40:39.120 | where a lot of people feel,
00:40:40.600 | if I don't take the leap, someone else will.
00:40:42.640 | And this is actually very crucial for the other part of it.
00:40:45.880 | Why do we want to slow this down?
00:40:47.140 | So again, what this open letter is calling for
00:40:50.040 | is just pausing all training of systems
00:40:54.040 | that are more powerful than GPT-4 for six months.
00:40:59.400 | Just give a chance for the labs to coordinate a bit
00:41:04.400 | on safety and for society to adapt,
00:41:08.080 | give the right incentives to the labs.
00:41:09.840 | 'Cause you've interviewed a lot of these people
00:41:14.200 | who lead these labs, and you know just as well as I do
00:41:16.720 | that they're good people, they're idealistic people.
00:41:18.980 | They're doing this first and foremost
00:41:21.300 | because they believe that AI has a huge potential
00:41:23.580 | to help humanity.
00:41:24.700 | But at the same time, they are trapped
00:41:29.520 | in this horrible race to the bottom.
00:41:33.100 | Have you read "Meditations on Moloch" by Scott Alexander?
00:41:40.820 | - Yes.
00:41:41.660 | - Yeah, it's a beautiful essay on this poem by Ginsberg
00:41:44.560 | where he interprets it as being about this monster.
00:41:47.960 | It's this game theory monster that pits people
00:41:53.520 | against each other in this race to the bottom
00:41:55.980 | where everybody ultimately loses.
00:41:58.140 | The evil thing about this monster is
00:41:59.700 | even though everybody sees it and understands,
00:42:02.140 | they still can't get out of the race.
00:42:03.980 | A good fraction of all the bad things that we humans do
00:42:08.420 | are caused by Moloch.
00:42:10.040 | I like Scott Alexander's naming of the monster
00:42:15.040 | so we humans can think of it as a thing.
00:42:19.220 | If you look at why do we have overfishing,
00:42:23.120 | why do we have more generally the tragedy of the commons,
00:42:26.360 | why is it that, so Liv Borre,
00:42:29.560 | I don't know if you've had her on your podcast.
00:42:31.200 | - Yeah, she's become a friend, yeah.
00:42:33.080 | - Great, she made this awesome point recently
00:42:36.560 | that beauty filters that a lot of female influencers
00:42:40.740 | feel pressure to use are exactly Moloch in action again.
00:42:46.420 | First, nobody was using them
00:42:47.820 | and people saw them just the way they were
00:42:51.060 | and then some of them started using it
00:42:52.960 | and becoming ever more plastic fantastic
00:42:56.660 | and then the other ones that weren't using it
00:42:58.460 | started to realize that if they wanna just keep
00:43:01.820 | their market share, they have to start using it too
00:43:05.860 | and then you're in a situation where they're all using it
00:43:09.080 | and none of them has any more market share
00:43:11.280 | or less than before.
00:43:12.240 | So nobody gained anything, everybody lost
00:43:15.280 | and they have to keep becoming
00:43:17.920 | ever more plastic fantastic also, right?
00:43:20.440 | But nobody can go back to the old way
00:43:25.000 | 'cause it's just too costly, right?
00:43:28.660 | Moloch is everywhere and Moloch is not a new arrival
00:43:34.840 | on the scene either.
00:43:36.220 | We humans have developed a lot of collaboration mechanisms
00:43:39.340 | to help us fight back against Moloch
00:43:41.540 | through various kinds of constructive collaboration.
00:43:45.340 | The Soviet Union and the United States
00:43:47.380 | did sign a number of arms control treaties
00:43:51.780 | against Moloch who is trying to stoke them
00:43:53.980 | into unnecessarily risky nuclear arms races, et cetera,
00:43:58.220 | et cetera and this is exactly what's happening
00:44:00.340 | on the AI front.
00:44:01.980 | This time, it's a little bit geopolitics
00:44:05.360 | but it's mostly money where there's just
00:44:07.360 | so much commercial pressure.
00:44:08.680 | If you take any of these leaders of the top tech companies,
00:44:12.700 | if they just say, this is too risky,
00:44:16.600 | I wanna pause for six months,
00:44:19.400 | they're gonna get a lot of pressure
00:44:20.800 | from shareholders and others.
00:44:22.380 | They're like, well, if you pause,
00:44:26.240 | but those guys don't pause,
00:44:27.700 | we don't wanna get our lunch eaten.
00:44:31.400 | - Yeah.
00:44:32.240 | - And shareholders even have the power
00:44:34.160 | to replace the executives in the worst case, right?
00:44:37.360 | So we did this open letter because we wanna help
00:44:42.320 | these idealistic tech executives
00:44:44.600 | to do what their heart tells them
00:44:47.480 | by providing enough public pressure
00:44:49.280 | on the whole sector to just pause
00:44:52.000 | so that they can all pause in a coordinated fashion.
00:44:55.560 | And I think without the public pressure,
00:44:57.320 | none of them can do it alone,
00:44:59.800 | push back against their shareholders,
00:45:02.440 | no matter how good-hearted they are.
00:45:05.280 | 'Cause Moloch is a really powerful foe.
00:45:07.280 | - So the idea is to,
00:45:11.660 | for the major developers of AI systems like this,
00:45:15.040 | so we're talking about Microsoft, Google,
00:45:17.120 | Meta, and anyone else?
00:45:21.920 | - Well, OpenAI is very close with Microsoft now, of course.
00:45:25.200 | And there are plenty of smaller players.
00:45:28.640 | For example, Anthropic is very impressive.
00:45:32.480 | There's Conjecture.
00:45:33.640 | There's many, many, many players.
00:45:35.040 | I don't wanna make a long list to leave anyone out.
00:45:37.680 | And for that reason, it's so important
00:45:41.920 | that some coordination happens,
00:45:44.560 | that there's external pressure on all of them
00:45:46.840 | saying you all need to pause.
00:45:48.760 | 'Cause then the researchers in these organizations
00:45:52.800 | and the leaders who wanna slow down a little bit,
00:45:54.880 | they can say to their shareholders,
00:45:56.640 | everybody's slowing down because of this pressure,
00:46:01.240 | and it's the right thing to do.
00:46:03.320 | - Have you seen examples in history
00:46:07.400 | where it's possible to pause the Moloch?
00:46:09.240 | - Absolutely.
00:46:10.080 | Like human cloning, for example.
00:46:12.640 | You could make so much money on human cloning.
00:46:14.940 | Why aren't we doing it?
00:46:19.640 | Because biologists thought hard about this
00:46:23.960 | and felt like this is way too risky.
00:47:27.560 | They got together in the '70s at Asilomar
00:47:30.960 | and decided to stop even a lot more stuff also,
00:46:34.640 | just editing the human germline,
00:47:36.320 | gene editing that goes into our offspring,
00:46:42.000 | and decided let's not do this
00:46:44.800 | because it's too unpredictable what it's gonna lead to.
00:46:48.120 | We could lose control over what happens to our species.
00:46:51.920 | So they paused.
00:46:53.840 | There was a ton of money to be made there.
00:46:56.280 | So it's very doable,
00:46:57.800 | but you need a public awareness of what the risks are
00:47:02.240 | and the broader community coming in and saying,
00:47:05.160 | hey, let's slow down.
00:47:06.800 | And another common pushback I get today
00:47:09.720 | is we can't stop in the West because China.
00:47:13.700 | And in China, undoubtedly,
00:47:17.320 | they also get told we can't slow down because the West,
00:47:20.440 | because both sides think they're the good guy.
00:47:22.800 | But look at human cloning.
00:47:25.420 | Did China forge ahead with human cloning?
00:47:28.920 | There's been exactly one human cloning
00:47:30.520 | that's actually been done that I know of.
00:47:32.440 | It was done by a Chinese guy.
00:47:34.160 | Do you know where he is now?
00:47:36.080 | In jail.
00:47:36.920 | And you know who put him there?
00:47:39.640 | - Who?
00:47:40.480 | - Chinese government.
00:47:42.160 | Not because Westerners said, China, look, this is...
00:47:45.680 | No, the Chinese government put him there
00:47:47.040 | because they also felt they like control,
00:47:50.480 | the Chinese government.
00:47:51.760 | If anything, maybe they are even more concerned
00:47:54.080 | about having control than Western governments.
00:47:57.160 | They have no incentive to just lose control
00:47:59.800 | over where everything is going.
00:48:01.600 | And you can also see the Ernie bot
00:48:03.320 | that was released by, I believe, Baidu recently.
00:48:07.200 | They got a lot of pushback from the government
00:48:08.920 | and had to rein it in in a big way.
00:48:11.400 | I think once this basic message comes out
00:48:15.240 | that this isn't an arms race,
00:48:16.560 | it's a suicide race,
00:48:17.840 | where everybody loses if anybody's AI goes out of control,
00:48:23.600 | it really changes the whole dynamic.
00:48:25.560 | I'll say this again, 'cause this is a very basic point
00:48:32.080 | I think a lot of people get wrong.
00:48:34.200 | Because a lot of people dismiss the whole idea
00:48:38.240 | that AI can really get very superhuman,
00:48:42.080 | because they think there's something really magical
00:48:43.880 | about intelligence such that it can only exist
00:48:46.320 | in the human mind.
00:48:47.160 | Because they believe that,
00:48:48.360 | they think it's gonna kind of get to just more or less
00:48:51.000 | GPT-4++ and then that's it.
00:48:54.560 | They don't see it as a suicide race.
00:48:58.520 | They think whoever gets that first,
00:49:00.000 | they're gonna control the world, they're gonna win.
00:49:02.560 | That's not how it's gonna be.
00:49:04.160 | And we can talk again about the scientific arguments
00:49:08.320 | from why it's not gonna stop there.
00:49:09.600 | But the way it's gonna be is if anybody completely
00:49:13.600 | loses control and you don't care,
00:49:16.600 | if someone manages to take over the world
00:49:21.840 | who really doesn't share your goals,
00:49:24.200 | you probably don't really even care very much
00:49:25.800 | about what nationality they have.
00:49:27.120 | You're not gonna like it, much worse than today.
00:49:29.800 | If you live in Orwellian dystopia,
00:49:34.120 | what do you care who created it, right?
00:49:36.720 | And if someone, if it goes farther
00:49:38.960 | and we just lose control even to the machines,
00:49:44.360 | so that it's not us versus them, it's us versus it,
00:49:47.360 | what do you care who created this underlying entity
00:49:52.320 | which has goals different from humans ultimately
00:49:55.320 | and we get marginalized, we get made obsolete,
00:49:58.560 | we get replaced?
00:49:59.600 | That's what I mean when I say it's a suicide race.
00:50:04.960 | It's kind of like we're rushing towards this cliff,
00:50:09.040 | but the closer to the cliff we get,
00:50:10.560 | the more scenic the views are and the more money we make.
00:50:13.240 | The more money there is there, so we keep going,
00:50:16.920 | but we have to also stop at some point, right?
00:50:19.200 | Quit while we're ahead.
00:50:20.760 | And it's a suicide race which cannot be won,
00:50:25.760 | but the way to really benefit from it
00:50:33.440 | is to continue developing awesome AI, a little bit slower,
00:50:38.440 | so we make it safe, make sure it does the things
00:50:41.640 | we always want and create a condition where everybody wins.
00:50:44.760 | Technology has shown us that geopolitics
00:50:49.440 | and politics in general is not a zero-sum game at all.
00:50:54.160 | - So there is some rate of development that will lead
00:50:57.200 | us as a human species to lose control of this thing
00:51:01.960 | and the hope you have is that there's some lower level
00:51:05.200 | of development which will not allow us to lose control.
00:51:09.840 | This is an interesting thought you have
00:51:11.080 | about losing control, so if you are somebody
00:51:14.360 | like Sundar Pichai or Sam Altman at the head
00:51:17.160 | of a company like this, you're saying if they develop
00:51:20.240 | an AGI, they too will lose control of it.
00:51:23.280 | So no one person can maintain control,
00:51:26.720 | no group of individuals can maintain control.
00:51:29.120 | - If it's created very, very soon and is a big black box
00:51:33.760 | that we don't understand, like the large language models,
00:51:36.000 | yeah, then I'm very confident they're gonna lose control.
00:51:39.040 | But this isn't just me saying it.
00:51:40.760 | Sam Altman and Demis Hassabis have both said
00:51:44.000 | and themselves acknowledged that there's really great risks
00:51:47.880 | with this and they wanna slow down once they feel
00:51:50.240 | like it's scary, but it's clear that they're stuck
00:51:53.960 | and again, Moloch is forcing them to go a little faster
00:51:57.800 | than they're comfortable with because of pressure
00:51:59.680 | from just commercial pressures, right?
00:52:01.860 | To get a bit optimistic here, of course this is a problem
00:52:06.600 | that can be ultimately solved.
00:52:10.320 | It's just, to win this wisdom race, it's clear that what
00:52:14.440 | we hoped was gonna happen hasn't happened.
00:52:17.160 | The capability progress has gone faster
00:52:20.000 | than a lot of people thought and the progress
00:52:22.800 | in the public sphere of policymaking and so on
00:52:25.120 | has gone slower than we thought.
00:52:26.480 | Even the technical AI safety has gone slower.
00:52:29.060 | A lot of the technical safety research was kind of banking
00:52:32.060 | on that large language models and other poorly understood
00:52:35.800 | systems couldn't get us all the way, that you had to build
00:52:38.560 | more of a kind of intelligence that you could understand,
00:52:41.240 | maybe it could prove itself safe, things like this.
00:52:45.600 | And I'm quite confident that this can be done
00:52:50.360 | so we can reap all the benefits, but we cannot do it
00:52:53.680 | as quickly as this out-of-control express train we are
00:52:58.680 | on now is gonna get to AGI.
00:53:00.200 | That's why we need a little more time, I feel.
00:53:02.600 | - Is there something to be said,
00:53:05.640 | what Sam Altman talked about, which is,
00:53:07.720 | while we're in the pre-AGI stage, to release often
00:53:12.720 | and as transparently as possible to learn a lot.
00:53:17.400 | So as opposed to being extremely cautious, release a lot.
00:53:21.360 | Don't invest in a closed development where you focus
00:53:25.880 | on AI safety while it's somewhat dumb, quote unquote.
00:53:30.280 | Release as often as possible.
00:53:33.240 | And as you start to see signs of human level intelligence
00:53:38.240 | or superhuman level intelligence, then you put a halt on it.
00:53:41.960 | - Well, what a lot of safety researchers have been saying
00:53:45.480 | for many years is that the most dangerous things you can do
00:53:48.520 | with an AI is, first of all, teach it to write code.
00:53:52.220 | - Yeah.
00:53:53.060 | - 'Cause that's the first step towards recursive
00:53:54.520 | self-improvement, which can take it from AGI
00:53:56.440 | to much higher levels.
00:53:58.600 | Okay, oops, we've done that.
00:54:01.560 | And another thing, high risk is connected to the internet.
00:54:05.840 | Let it go to websites, download stuff on its own,
00:54:08.640 | talk to people.
00:54:09.740 | Oops, we've done that already.
00:54:12.480 | You know, Eliezer Yudkowsky,
00:54:13.600 | you said you interviewed him recently, right?
00:54:15.160 | - Yes, yes.
00:54:16.000 | - He had this tweet recently, which gave me one
00:54:19.200 | of the best laughs in a while, where he was like,
00:54:20.920 | "Hey, people used to make fun of me and say,
00:54:22.820 | "you're so stupid, Eliezer, 'cause you're saying
00:54:25.120 | "you have to worry."
00:54:28.320 | Obviously, developers, once they get to really strong AI,
00:54:32.760 | first thing you're gonna do is never connect it
00:54:34.760 | to the internet, keep it in a box
00:54:37.400 | where you can really study it.
00:54:39.440 | So he had written it in the meme form,
00:54:43.280 | so it's like, "Then:" and then that.
00:54:46.520 | And "Now: lol, let's make a chatbot."
00:54:51.520 | (both laughing)
00:54:53.920 | And the third thing, Stuart Russell,
00:54:56.480 | you know, amazing AI researcher,
00:55:00.360 | he has argued for a while that we should never teach AI
00:55:05.360 | anything about humans.
00:55:08.400 | Above all, we should never let it learn
00:55:10.400 | about human psychology and how you manipulate humans.
00:55:13.020 | That's the most dangerous kind of knowledge you can give it.
00:55:16.240 | Yeah, you can teach it all it needs to know
00:55:18.320 | about how to cure cancer and stuff like that,
00:55:19.960 | but don't let it read Daniel Kahneman's book
00:55:23.400 | about cognitive biases and all that.
00:55:25.520 | And then, "oops, lol, let's invent social media
00:55:30.160 | recommender algorithms," which do exactly that.
00:55:34.720 | They get so good at knowing us and pressing our buttons
00:55:39.720 | that we're starting to create a world now
00:55:43.440 | where we just have ever more hatred,
00:55:45.440 | 'cause these algorithms figured out,
00:55:48.760 | not out of evil, but just to make money on advertising,
00:55:51.920 | that the best way to get more engagement, the euphemism
00:55:56.040 | for getting people glued to their little rectangles,
00:55:58.240 | was just to make them pissed off.
00:56:00.080 | - Well, that's really interesting that a large AI system
00:56:03.520 | that's doing the recommender system kind of task
00:56:06.440 | on social media is basically just studying human beings
00:56:09.880 | because it's a bunch of us rats giving it signal,
00:56:14.780 | nonstop signal.
00:56:15.920 | It'll show a thing, and then we give signal
00:56:17.960 | on whether we spread that thing, we like that thing,
00:56:20.880 | that thing increases our engagement,
00:56:22.440 | gets us to return to the platform.
00:56:24.240 | It has that on the scale of hundreds of millions
00:56:26.200 | of people constantly.
00:56:27.840 | So it's just learning and learning and learning.
00:56:29.760 | And presumably, the larger the number of parameters
00:56:32.080 | in the neural network that's doing the learning,
00:56:34.240 | and the more end-to-end the learning is,
00:56:38.280 | the more it's able to just basically encode
00:56:41.240 | how to manipulate human behavior,
00:56:43.960 | how to control humans at scale.
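Purely as illustration of the loop being described: stripped of all real-world complexity, an engagement-driven recommender is close in spirit to a multi-armed bandit that optimizes a single click/share signal. The sketch below is a toy; the categories, numbers, and simulated users are invented, and no real platform works this simply.

```python
# Purely illustrative toy (no real platform works this simply): an
# engagement-driven recommender reduced to an epsilon-greedy bandit
# whose only objective is a binary engagement signal.
import random

N_CATEGORIES = 5                  # hypothetical content categories
counts = [0] * N_CATEGORIES
value = [0.0] * N_CATEGORIES      # running estimate of engagement per category
EPSILON = 0.1                     # small amount of exploration

def pick_item() -> int:
    """Mostly show whatever has hooked users best so far."""
    if random.random() < EPSILON:
        return random.randrange(N_CATEGORIES)
    return max(range(N_CATEGORIES), key=lambda i: value[i])

def update(item: int, engaged: int) -> None:
    """Incremental average of the engagement signal (click/share/return)."""
    counts[item] += 1
    value[item] += (engaged - value[item]) / counts[item]

def simulated_user(item: int) -> int:
    # Invented behavior: these users engage more with outrage-style content (item 0).
    return 1 if random.random() < (0.6 if item == 0 else 0.2) else 0

for _ in range(100_000):          # stands in for billions of real interactions
    shown = pick_item()
    update(shown, simulated_user(shown))

print(value)  # converges on whatever maximizes engagement, good for us or not
```

Nothing in that loop ever asks whether the engagement-maximizing content is good for the people consuming it, which is exactly the problem being described.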
00:56:45.360 | - Exactly, and that is not something
00:56:47.080 | I think is in humanity's interest.
00:56:49.400 | - Yes.
00:56:50.240 | - Now it's mainly letting some humans
00:56:52.440 | manipulate other humans for profit and power,
00:56:56.820 | which already caused a lot of damage.
00:57:00.860 | And eventually that's a sort of skill
00:57:03.840 | that can make AIs persuade humans to let them escape
00:57:07.880 | whatever safety precautions we have.
00:57:10.320 | But there was a really nice article
00:57:12.480 | in the New York Times recently by Yuval Noah Harari
00:57:16.880 | and two co-authors, including Tristan Harris
00:57:19.680 | from "The Social Dilemma."
00:57:20.920 | They have this phrase in there I love.
00:57:25.160 | Humanity's first contact with advanced AI
00:57:28.280 | was social media.
00:57:30.720 | And we lost that one.
00:57:32.780 | We now live in a country where there's much more hate,
00:57:38.200 | in a world where there's much more hate, in fact.
00:57:41.000 | And in our democracy, we're having this conversation
00:57:43.960 | and people can't even agree on who won the last election.
00:57:47.920 | And we humans often point fingers at other humans
00:57:50.680 | and say it's their fault.
00:57:51.600 | But it's really Moloch and these AI algorithms.
00:57:55.020 | We got the algorithms and then Moloch
00:57:59.900 | pitted the social media companies against each other
00:58:02.200 | so nobody could have a less creepy algorithm
00:58:04.160 | 'cause then they would lose out on revenue
00:58:05.720 | to the other company.
00:58:07.040 | - Is there any way to win that battle back
00:58:08.840 | just if we just linger on this one battle
00:58:11.760 | that we've lost in terms of social media?
00:58:13.920 | Is it possible to redesign social media,
00:58:16.920 | this very medium in which we use as a civilization
00:58:20.920 | to communicate with each other,
00:58:22.480 | to have these kinds of conversations,
00:58:24.240 | to have discourse to try to figure out
00:58:25.760 | how to solve the biggest problems in the world,
00:58:28.360 | whether that's nuclear war or the development of AGI?
00:58:32.120 | Is it possible to do social media correctly?
00:58:35.920 | - I think it's not only possible, but it's necessary.
00:58:38.920 | Who are we kidding that we're gonna be able to solve
00:58:40.880 | all these other challenges if we can't even have
00:58:42.840 | a conversation with each other
00:58:44.000 | that's constructive?
00:58:45.480 | The whole idea, the key idea of democracy
00:58:47.880 | is that you get a bunch of people together
00:58:50.400 | and they have a real conversation,
00:58:52.000 | the ones you try to foster on this podcast
00:58:53.920 | where you respectfully listen to people you disagree with.
00:58:57.120 | And you realize, actually, there are some things,
00:58:59.440 | actually, some common ground we have,
00:59:01.040 | and we both agree, let's not have a nuclear war,
00:59:04.680 | let's not do that, et cetera, et cetera.
00:59:07.600 | We're kidding ourselves thinking we can face off
00:59:12.480 | the second contact with ever more powerful AI
00:59:16.400 | that's happening now with these large language models
00:59:19.160 | if we can't even have a functional conversation
00:59:23.640 | in the public space.
00:59:25.520 | That's why I started the Improve the News project,
00:59:28.400 | improvethenews.org.
00:59:29.920 | But I'm an optimist, fundamentally,
00:59:33.560 | in that there is a lot of intrinsic goodness in people
00:59:41.760 | and that what makes the difference
00:59:45.200 | between someone doing good things for humanity
00:59:48.000 | and bad things is not some sort of fairy tale thing
00:59:51.600 | that this person was born with an evil gene
00:59:53.720 | and this one was not born with a good gene.
00:59:55.440 | No, I think it's whether we put,
00:59:58.240 | whether people find themselves in situations
01:00:01.600 | that bring out the best in them
01:00:03.800 | or that bring out the worst in them.
01:00:05.880 | And I feel we're building an internet and a society
01:00:10.080 | that brings out the worst in us.
01:00:14.320 | - But it doesn't have to be that way.
01:00:16.080 | - No, it does not.
01:00:16.920 | - It's possible to create incentives
01:00:18.880 | and also create incentives that make money,
01:00:22.680 | that both make money and bring out the best in people.
01:00:24.800 | - I mean, in the long term,
01:00:25.880 | it's not a good investment for anyone
01:00:27.880 | to have a nuclear war, for example.
01:00:30.400 | And is it a good investment for humanity
01:00:32.720 | if we just ultimately replace all humans by machines
01:00:35.680 | and then we're so obsolete that eventually
01:00:37.440 | there are no humans left?
01:00:40.600 | - Well, it depends, I guess, on how you do the math.
01:00:43.000 | But I would say by any reasonable economics,
01:00:46.640 | if you look at the future income of humans
01:00:48.440 | and there aren't any, that's not a good investment.
01:00:51.000 | Moreover, why can't we have a little bit of pride
01:00:56.360 | in our species, damn it?
01:00:58.120 | Why should we just build another species
01:00:59.800 | that gets rid of us?
01:01:01.840 | If we were Neanderthals,
01:01:03.440 | would we really consider it a smart move
01:01:07.240 | if we had really advanced biotech to build Homo sapiens?
01:01:11.560 | You might say, "Hey, Max, yeah, let's build
01:01:15.840 | these Homo sapiens.
01:01:18.920 | They're gonna be smarter than us.
01:01:20.280 | Maybe they can help us defend us better against predators
01:01:23.400 | and help fix up our caves, make them nicer.
01:01:27.200 | We'll control them undoubtedly."
01:01:29.000 | So then they build a couple, a little baby girl,
01:01:31.840 | a little baby boy.
01:01:35.960 | And then you have some wise old Neanderthal elders like,
01:01:39.400 | "Hmm, I'm scared that we're opening a Pandora's box here
01:01:44.400 | and that we're gonna get outsmarted
01:01:46.800 | by these super Neanderthal intelligences
01:01:51.800 | and there won't be any Neanderthals left."
01:01:55.280 | But then you have a bunch of others in the cave,
01:01:56.640 | "Well, yeah, you're such a Luddite scaremonger.
01:01:58.920 | Of course, they're gonna wanna keep us around
01:02:00.920 | 'cause we are their creators.
01:02:02.320 | I think the smarter they get,
01:02:05.320 | the nicer they're gonna get.
01:02:06.280 | They're gonna like us.
01:02:07.400 | They're gonna want us around and it's gonna be fine.
01:02:11.480 | And besides, look at these babies, they're so cute.
01:02:14.400 | Clearly, they're totally harmless."
01:02:16.000 | That's exactly, those babies are exactly GPT-4.
01:02:19.160 | It's not, I wanna be clear, it's not GPT-4 that's terrifying.
01:02:24.160 | It's the GPT-4 is a baby technology.
01:02:27.800 | You know, and Microsoft even had a paper recently out
01:02:33.040 | with the title something like "Sparks of AGI."
01:02:36.720 | Well, they were basically saying this is baby AI,
01:02:39.840 | like these little Neanderthal babies.
01:02:41.640 | And it's gonna grow up.
01:02:44.600 | There's gonna be other systems from the same company,
01:02:48.520 | from other companies, they'll be way more powerful
01:02:51.000 | but they're gonna take all the things,
01:02:53.120 | ideas from these babies.
01:02:55.440 | And before we know it, we're gonna be like
01:02:58.600 | those last Neanderthals who were pretty disappointed.
01:03:02.680 | And when they realized that they were getting replaced.
01:03:06.240 | - Well, this interesting point you make,
01:03:07.920 | which is the programming, it's entirely possible
01:03:10.160 | that GPT-4 is already the kind of system
01:03:13.200 | that can change everything by writing programs.
01:03:18.200 | - Like three, yeah, because it's Life 2.0,
01:03:21.880 | the systems I'm afraid of are gonna look nothing
01:03:25.280 | like a large language model and they're not gonna.
01:03:29.080 | But once it or other people figure out a way
01:03:32.880 | of using this tech to make much better tech,
01:03:35.080 | it's just constantly replacing its software.
01:03:38.160 | And from everything we've seen about how these work
01:03:42.040 | under the hood, they're like the minimum viable intelligence.
01:03:45.520 | They do everything in the dumbest way
01:03:47.680 | that still works sort of.
01:03:49.080 | - Yeah.
01:03:49.920 | - So they are Life 3.0, except when they replace
01:03:54.280 | their software, it's a lot faster
01:03:56.600 | than when you decide to learn Swedish.
01:03:59.120 | And moreover, they think a lot faster than us too.
01:04:04.680 | So, we don't think, like, one logical step
01:04:09.680 | every nanosecond or so, the way they do.
01:04:18.400 | And we can't also just suddenly scale up our hardware
01:04:21.920 | massively in the cloud, we're so limited, right?
01:04:26.160 | So they can also soon become a little bit more
01:04:31.160 | like Life 3.0 in that, if they need more hardware,
01:04:36.040 | hey, just rent it in the cloud, you know?
01:04:38.160 | How do you pay for it?
01:04:39.000 | Well, with all the services you provide.
01:04:41.000 | - And what we haven't seen yet, which could change a lot,
01:04:49.760 | is entire software systems.
01:04:53.920 | So right now, programming is done sort of in bits and pieces
01:04:57.480 | as an assistant tool to humans.
01:05:01.240 | But I do a lot of programming,
01:05:03.120 | and with the kind of stuff that GPT-4 is able to do,
01:05:05.720 | I mean, it's replacing a lot of what I'm able to do.
01:05:07.900 | But you still need a human in the loop
01:05:10.760 | to kind of manage the design of things,
01:05:13.680 | manage like what are the prompts
01:05:15.760 | that generate the kind of stuff,
01:05:17.320 | to do some basic adjustment of the code,
01:05:19.360 | just do some debugging.
01:05:21.120 | But if it's possible to add on top of GPT-4
01:05:25.440 | kind of feedback loop of self-debugging, improving the code,
01:05:30.440 | and then you launch that system onto the wild,
01:05:35.440 | on the internet, because everything is connected,
01:05:37.400 | and have it do things, have it interact with humans,
01:05:39.920 | and then get that feedback,
01:05:41.240 | now you have this giant ecosystem of humans.
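A minimal sketch of the self-debugging loop being described might look like the following; `ask_model` is a hypothetical placeholder for a call to some code-generating model, not a real API, and real agent scaffolding adds far more than this.

```python
# Hypothetical sketch of the self-debugging loop described above.
# `ask_model` is a placeholder for a call to some code-generating model,
# not a real API; real systems add tests, sandboxing, and review.
import subprocess
import tempfile

def ask_model(prompt: str) -> str:
    raise NotImplementedError("placeholder for a call to a code model")

def run_candidate(code: str) -> tuple[bool, str]:
    """Run the generated code and capture any error output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        ["python", path], capture_output=True, text=True, timeout=30
    )
    return result.returncode == 0, result.stderr

def self_debugging_loop(task: str, max_rounds: int = 5) -> str:
    code = ask_model(f"Write a Python program that {task}")
    for _ in range(max_rounds):
        ok, errors = run_candidate(code)
        if ok:
            return code
        # Feed the failure straight back in and ask for a fix: no human
        # is inside this inner loop at all.
        code = ask_model(f"This code failed with:\n{errors}\nFix it:\n{code}")
    return code
```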
01:05:44.720 | That's one of the things that Elon Musk recently
01:05:47.720 | sort of tweeted as a case why everyone needs to pay $7
01:05:51.920 | or whatever for Twitter.
01:05:53.080 | - To make sure they're real.
01:05:54.760 | - Make sure they're real.
01:05:55.600 | We're now going to be living in a world
01:05:57.480 | where the bots are getting smarter and smarter and smarter
01:06:01.120 | to a degree where you can't tell the difference
01:06:05.880 | between a human and a bot.
01:06:06.920 | - That's right.
01:06:07.760 | - And now you can have bots outnumber humans
01:06:10.620 | by one million to one, which is why he's making a case
01:06:14.840 | why you have to pay to prove you're human,
01:06:17.360 | which is one of the only mechanisms to prove,
01:06:19.400 | which is depressing.
01:06:21.280 | - And I feel we have to remember,
01:06:24.480 | as individuals, we should, from time to time,
01:06:27.920 | ask ourselves why are we doing what we're doing,
01:06:29.920 | right, and as a species, we need to do that too.
01:06:33.000 | So if we're building, as you say,
01:06:36.240 | machines that are outnumbering us
01:06:39.280 | and more and more outsmarting us
01:06:41.360 | and replacing us on the job market,
01:06:43.080 | not just for the dangerous and boring tasks,
01:06:46.480 | but also for writing poems and doing art
01:06:49.480 | and things that a lot of people find really meaningful,
01:06:52.160 | gotta ask ourselves, why?
01:06:53.760 | Why are we doing this?
01:06:54.900 | The answer is Moloch is tricking us into doing it.
01:07:01.900 | And it's such a clever trick
01:07:03.120 | that even though we see the trick,
01:07:04.520 | we still have no choice but to fall for it, right?
01:07:07.360 | Also, the thing you said about you using co-pilot AI tools
01:07:15.720 | to program faster, how many times,
01:07:17.400 | what factor faster would you say you code now?
01:07:20.360 | Does it go twice as fast or?
01:07:22.560 | - I don't really, because it's such a new tool.
01:07:25.960 | - Yeah.
01:07:27.040 | - I don't know if speed is significantly improved,
01:07:29.580 | but it feels like I'm a year away
01:07:33.200 | from being five to 10 times faster.
01:07:36.960 | - So if that's typical for programmers,
01:07:39.680 | then you're already seeing another kind of self,
01:07:44.240 | recursive self-improvement, right?
01:07:45.680 | Because previously, one major generation of improvement
01:07:50.480 | of the code would happen on the human R&D timescale.
01:07:53.440 | And now if that's five times shorter,
01:07:55.400 | then it's gonna take five times less time
01:07:57.960 | than it otherwise would to develop the next level
01:08:00.320 | of these tools and so on.
01:08:02.520 | So this is exactly the beginning
01:08:06.440 | of an intelligence explosion.
01:08:09.040 | There can be humans in the loop a lot in the early stages,
01:08:11.760 | and then eventually humans are needed less and less,
01:08:14.480 | and the machines can more kind of go along.
01:08:16.440 | But what you said there is just an exact example
01:08:19.600 | of these sort of things.
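As a back-of-the-envelope illustration of why that compounding matters, take the factor of five used above as a toy assumption and suppose each generation of tools gets built five times faster than the one before, the first taking time T. Then every future generation fits inside a finite window:

```latex
% Toy model only: assumes the same 5x speed-up applies to every future generation.
T_{\mathrm{total}}
  = T\left(1 + \tfrac{1}{5} + \tfrac{1}{25} + \cdots\right)
  = T\sum_{n=0}^{\infty} 5^{-n}
  = \frac{T}{1 - \tfrac{1}{5}}
  = \frac{5T}{4}.
```

The real factor won't stay constant, but any speed-up persistently greater than one makes the series converge, which is the idealized picture of an intelligence explosion.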
01:08:20.720 | Another thing which,
01:08:22.120 | I was kind of lying on my psychiatrist's couch,
01:08:27.520 | imagining I'm on a psychiatrist's couch here,
01:08:29.680 | saying, "What are my fears that people would do
01:08:31.480 | "with AI systems?"
01:08:33.040 | So I mentioned three that I had fears about many years ago
01:08:37.080 | that they would do, namely teach it to code,
01:08:41.680 | connect it to the internet,
01:08:42.760 | then teach it to manipulate humans.
01:08:45.200 | A fourth one is building an API
01:08:48.200 | where code can control this super powerful thing, right?
01:08:52.800 | That's very unfortunate,
01:08:54.560 | because one thing that systems like GPT-4
01:08:58.520 | have going for them is that they are an oracle
01:09:00.840 | in the sense that they just answer questions.
01:09:04.200 | There is no robot connected to GPT-4.
01:09:07.080 | GPT-4 can't go and do stock trading based on its thinking.
01:09:10.480 | It is not an agent.
01:09:13.000 | An intelligent agent is something
01:09:14.480 | that takes in information from the world,
01:09:16.560 | processes it to figure out what action to take
01:09:20.520 | based on its goals that it has,
01:09:22.360 | and then does something back on the world.
01:09:26.460 | But once you have an API, for example, to GPT-4,
01:09:29.800 | nothing stops Joe Schmo and a lot of other people
01:09:33.040 | from building real agents,
01:09:35.680 | which just keep making calls somewhere
01:09:38.080 | in some inner loop somewhere
01:09:39.480 | to these powerful oracle systems,
01:09:41.720 | which makes them themselves much more powerful.
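A hedged sketch of what "wrapping an oracle in an inner loop" means in practice; `query_oracle` and `execute` here are invented placeholders for illustration, not any company's actual API or tool set.

```python
# Hedged sketch of turning a question-answering "oracle" into an agent.
# `query_oracle` and `execute` are invented placeholders, not real APIs.
def query_oracle(prompt: str) -> str:
    raise NotImplementedError("stands in for an API call to a powerful model")

def execute(action: str) -> str:
    raise NotImplementedError("e.g. browse, send messages, trade -- whatever tools get wired in")

def run_agent(goal: str, steps: int = 100) -> None:
    history = ""
    for _ in range(steps):
        # The oracle by itself only answers questions...
        action = query_oracle(
            f"Goal: {goal}\nWhat has happened so far:\n{history}\n"
            "What single action should be taken next?"
        )
        # ...but this outer loop acts on its answers and feeds the results
        # back in. That closed loop is what makes the combined system an agent.
        observation = execute(action)
        history += f"\nAction: {action}\nResult: {observation}"
```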
01:09:45.600 | That's another kind of unfortunate development,
01:09:48.920 | which I think we would have been better off delaying.
01:09:53.360 | I don't want to pick on any particular companies.
01:09:55.040 | I think they're all under a lot of pressure to make money.
01:09:58.260 | And again, the reason we're calling for this pause
01:10:04.480 | is to give them all cover
01:10:06.320 | to do what they know is the right thing,
01:10:08.840 | slow down a little bit at this point.
01:10:10.280 | But everything we've talked about,
01:10:12.860 | I hope we'll make it clear to people watching this
01:10:17.920 | why these sort of human level tools
01:10:22.000 | can cause a gradual acceleration.
01:10:23.640 | You keep using yesterday's technology
01:10:25.440 | to build tomorrow's technology.
01:10:27.320 | And when you do that over and over again,
01:10:30.320 | you naturally get an explosion.
01:10:32.360 | That's the definition of an explosion in science.
01:10:36.520 | If you have two people and they fall in love,
01:10:41.520 | now you have four people,
01:10:44.080 | and then they can make more babies,
01:10:46.120 | and now you have eight people,
01:10:47.120 | and then you have 16, 32, 64, et cetera.
01:10:50.840 | We call that a population explosion,
01:10:53.200 | where it's just that each,
01:10:55.160 | if it's instead free neutrons in a nuclear reaction,
01:10:59.560 | if each one can make more than one,
01:11:02.080 | then you get an exponential growth in that.
01:11:03.600 | We call it a nuclear explosion.
01:11:05.800 | All explosions are like that.
01:11:06.920 | And an intelligence explosion,
01:11:08.040 | it's just exactly the same principle,
01:11:09.440 | that some amount of intelligence
01:11:11.240 | can make more intelligence than that,
01:11:14.040 | and then repeat.
01:11:15.440 | You always get exponentials.
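All three examples share one textbook recurrence; there is nothing AI-specific in the math itself:

```latex
% N_t counts people, free neutrons, or (loosely) units of self-improving capability.
N_{t+1} = k\,N_t
  \quad\Longrightarrow\quad
  N_t = N_0\,k^{t},
  \qquad \text{exponential growth whenever } k > 1.
```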
01:11:17.680 | - What's your intuition why it does?
01:11:19.040 | You mentioned there's some technical reasons
01:11:21.000 | why it doesn't stop at a certain point.
01:11:23.680 | What's your intuition?
01:11:24.720 | And do you have any intuition why it might stop?
01:11:28.360 | - It's obviously gonna stop
01:11:29.400 | when it bumps up against the laws of physics.
01:11:31.720 | There are some things you just can't do
01:11:32.960 | no matter how smart you are, right?
01:11:34.480 | - Allegedly.
01:11:36.200 | - 'Cause we don't know the full laws of physics yet, right?
01:11:41.080 | - Seth Lloyd wrote a really cool paper
01:11:42.680 | on the physical limits on computation, for example.
01:11:46.080 | If you put too much energy into it,
01:11:49.000 | then in finite space, it'll turn into a black hole.
01:11:51.920 | You can't move information around
01:11:53.320 | faster than the speed of light, stuff like that.
01:11:54.920 | But it's hard to store way more
01:11:58.680 | than a modest number of bits per atom, et cetera.
01:12:02.720 | But those limits are just astronomically above,
01:12:06.920 | like 30 orders of magnitude above where we are now.
01:12:09.320 | Bigger jump in intelligence
01:12:14.720 | than if you go from an ant to a human.
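For rough orientation only (order-of-magnitude figures, not precise claims): the kind of bound Seth Lloyd derives, via the Margolus-Levitin theorem, caps how many operations per second a kilogram of matter can perform, and comparing it with today's roughly exascale machines gives the ~30 orders of magnitude of headroom mentioned here:

```latex
% Order-of-magnitude sketch, not a precise claim.
\frac{\text{ops}}{\text{second}}
  \;\lesssim\; \frac{2E}{\pi\hbar}
  \;=\; \frac{2mc^{2}}{\pi\hbar}
  \;\approx\; 5\times 10^{50}\ \mathrm{s}^{-1}
  \quad (m = 1\,\mathrm{kg}),
\qquad \text{versus roughly } 10^{18}\ \text{FLOP/s for today's largest supercomputers}.
```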
01:12:18.480 | I think, of course, what we want to do
01:12:23.560 | is have a controlled thing.
01:12:26.720 | A nuclear reactor, you put moderators in
01:12:28.680 | to make sure exactly it doesn't blow up out of control.
01:12:32.480 | When we do experiments with biology and cells and so on,
01:12:37.480 | we also try to make sure it doesn't get out of control.
01:12:41.160 | We can do this with AI, too.
01:12:44.440 | The thing is, we haven't succeeded yet.
01:12:47.360 | And Moloch is exactly doing the opposite,
01:12:51.680 | just fueling, just egging everybody on,
01:12:54.400 | faster, faster, faster,
01:12:56.360 | or the other company is gonna catch up with you,
01:12:58.280 | or the other country is gonna catch up with you.
01:13:01.840 | We have to want this stuff.
01:13:03.400 | I don't believe in this,
01:13:06.360 | just asking people to look into their hearts
01:13:09.400 | and do the right thing.
01:13:10.880 | It's easier for others to say that,
01:13:12.680 | but if you're in a situation
01:13:14.680 | where your company is gonna get screwed,
01:13:17.360 | by other companies that are not stopping,
01:13:23.840 | you're putting people in a very hard situation.
01:13:26.080 | The right thing to do
01:13:26.920 | is change the whole incentive structure instead.
01:13:29.920 | And this is not an old...
01:13:31.520 | Maybe I should say one more thing about this,
01:13:34.360 | 'cause Moloch has been around as humanity's
01:13:37.440 | number one or number two enemy
01:13:40.720 | since the beginning of civilization.
01:13:42.320 | And we came up with some really cool countermeasures.
01:13:46.600 | First of all, already over 100,000 years ago,
01:13:49.760 | evolution realized that it was very unhelpful
01:13:52.960 | that people kept killing each other all the time.
01:13:55.520 | So it genetically gave us compassion.
01:14:00.240 | And made it so that if you get two drunk dudes
01:14:03.240 | getting into a pointless bar fight,
01:14:05.040 | they might give each other black eyes,
01:14:07.920 | but they have a lot of inhibition
01:14:10.600 | towards just killing each other.
01:14:12.760 | And similarly, if you find a baby lying on the street
01:14:18.160 | when you go out for your morning jog tomorrow,
01:14:20.520 | you're gonna stop and pick it up, right?
01:14:22.120 | Even though it may make you late for your next podcast.
01:14:25.880 | So evolution gave us these genes
01:14:28.080 | that make our own egoistic incentives more aligned
01:14:32.320 | with what's good for the greater group we're part of.
01:14:35.520 | And then as we got a bit more sophisticated
01:14:39.760 | and developed language, we invented gossip,
01:14:43.640 | which is also a fantastic anti-Moloch.
01:14:45.800 | 'Cause now it really discourages liars,
01:14:51.680 | moochers, cheaters, because their own incentive now
01:14:57.040 | is not to do this, because word quickly gets around
01:15:00.880 | and then suddenly people aren't gonna invite them
01:15:02.880 | to their dinners anymore or trust them.
01:15:05.640 | And then when we got still more sophisticated
01:15:07.560 | and bigger societies, invented the legal system,
01:15:11.440 | where even strangers who couldn't rely on gossip
01:15:14.240 | and things like this would have an incentive
01:15:16.640 | to treat each other well.
01:15:17.880 | Now those guys in the bar fight,
01:15:19.240 | even if someone is so drunk
01:15:21.080 | that he actually wants to kill the other guy,
01:15:26.160 | he also has a little thought in the back of his head
01:15:28.080 | that, "Do I really wanna spend the next 10 years
01:15:30.480 | eating really crappy food in a small room?
01:15:34.680 | I'm just gonna chill out."
01:15:38.760 | And we similarly have tried to give these incentives
01:15:40.840 | to our corporations by having regulation
01:15:44.360 | and all sorts of oversight,
01:15:45.760 | so that their incentives are aligned with the greater good.
01:15:48.480 | We tried really hard.
01:15:49.560 | And the big problem that we're failing now
01:15:55.640 | is not that we haven't tried before,
01:15:57.480 | but it's just that the tech is growing much,
01:16:00.040 | is developing much faster
01:16:01.480 | than the regulators have been able to keep up.
01:16:03.440 | So regulators, it's kind of comical,
01:16:06.720 | like European Union right now is doing this AI act, right?
01:16:10.160 | Which, in the beginning,
01:16:13.040 | they had a little opt-out exception
01:16:16.040 | that GPT-4 would be completely excluded from regulation.
01:16:19.600 | Brilliant idea.
01:16:21.680 | - What's the logic behind that?
01:16:24.240 | - Some lobbyists pushed successfully for this.
01:16:27.400 | So we were actually quite involved
01:16:28.600 | with the Future of Life Institute,
01:16:30.080 | Mark Brakel, Risto Uuk, Anthony Aguirre, and others.
01:16:34.160 | We're quite involved with educating various people
01:16:38.160 | involved in this process
01:16:39.120 | about these general purpose AI models coming
01:16:42.960 | and pointing out that they would become the laughingstock
01:16:45.360 | if they didn't put it in.
01:16:46.800 | So the French started pushing for it.
01:16:48.840 | It got put in to the draft,
01:16:50.840 | and it looked like all was good.
01:16:52.520 | And then there was a huge counter push from lobbyists.
01:16:56.800 | There were more lobbyists in Brussels from tech companies
01:16:59.520 | than from oil companies, for example.
01:17:02.440 | And it looked like it might,
01:17:04.000 | is it gonna maybe get taken out again?
01:17:06.680 | And now GPT-4 happened,
01:17:09.000 | and I think it's gonna stay in.
01:17:10.560 | But this just shows, you know,
01:17:12.320 | Moloch can be defeated,
01:17:14.240 | but the challenge we're facing is that the tech
01:17:18.080 | is generally moving much faster than the policymakers are.
01:17:23.080 | And a lot of the policymakers
01:17:25.640 | also don't have a tech background.
01:17:28.160 | So it's, you know, we really need to work hard
01:17:31.200 | to educate them on what's taking place here.
01:17:34.800 | So we're getting the situation where the first kind of non,
01:17:39.240 | so I define artificial intelligence
01:17:41.320 | just as non-biological intelligence, right?
01:17:44.680 | And by that definition,
01:17:46.160 | a company, a corporation is also an artificial intelligence
01:17:50.560 | because the corporation isn't just its humans, it's a system.
01:17:53.960 | If its CEO decides,
01:17:56.040 | if the CEO of a tobacco company decides one morning
01:17:58.760 | that she or he doesn't wanna sell cigarettes anymore,
01:18:01.000 | they'll just put another CEO in there.
01:18:02.840 | It's not enough to align the incentives of individual people
01:18:08.080 | or align individual computers' incentives to their owners,
01:18:12.920 | which is what technically AI safety research is about.
01:18:16.120 | You also have to align the incentives of corporations
01:18:18.800 | with the greater good.
01:18:19.840 | And some corporations have gotten so big and so powerful
01:18:23.040 | very quickly that in many cases,
01:18:26.400 | their lobbyists instead align the regulators
01:18:30.440 | to what they want rather than the other way around.
01:18:33.000 | It's a classic regulatory capture.
01:18:35.600 | - All right, is the thing that the slowdown hopes to achieve
01:18:40.400 | is give enough time to the regulators to catch up
01:18:43.560 | or enough time to the companies themselves to breathe
01:18:46.280 | and understand how to do AI safety correctly?
01:18:48.880 | - I think both, but I think that the vision,
01:18:52.000 | the path to success I see is first you give a breather
01:18:55.240 | actually to the people in these companies,
01:18:58.040 | their leadership who wants to do the right thing
01:19:00.240 | and they all have safety teams and so on on their companies.
01:19:03.080 | Give them a chance to get together with the other companies
01:19:08.720 | and the outside pressure can also help catalyze that
01:19:11.320 | and work out what is it that's,
01:19:17.280 | what are the reasonable safety requirements
01:19:21.200 | one should put on future systems before they get rolled out?
01:19:25.040 | There are a lot of people also in academia
01:19:27.520 | and elsewhere outside of these companies
01:19:29.240 | who can be brought into this
01:19:30.320 | and have a lot of very good ideas.
01:19:32.760 | And then I think it's very realistic
01:19:35.480 | that within six months you can get these people coming up with,
01:19:39.880 | so, here's a white paper,
01:19:40.880 | here's what we all think is reasonable.
01:19:43.440 | You know, you didn't,
01:19:45.160 | just because cars killed a lot of people,
01:19:46.760 | you didn't ban cars,
01:19:48.080 | but they got together a bunch of people and decided,
01:19:50.200 | you know, in order to be allowed to sell a car,
01:19:52.320 | it has to have a seatbelt in it.
01:19:53.920 | There are analogous things that you can start requiring
01:19:58.160 | of future AI systems so that they are safe.
01:20:03.080 | And once this heavy lifting,
01:20:08.080 | this intellectual work has been done by experts in the field,
01:20:11.760 | which can be done quickly,
01:20:13.520 | I think it's going to be quite easy to get policymakers
01:20:16.200 | to see, yeah, this is a good idea.
01:20:19.360 | And it's, you know, for the companies to fight Moloch,
01:20:24.360 | they want, and I believe Sam Altman
01:20:27.760 | has explicitly called for this,
01:20:29.120 | they want the regulators to actually adopt it
01:20:31.000 | so that their competition is going to abide by it too, right?
01:20:33.840 | You don't want to be enacting all these principles
01:20:38.840 | and then you abide by them,
01:20:40.760 | and then there's this one little company
01:20:43.760 | that doesn't sign on to it,
01:20:46.880 | and then now they can gradually overtake you.
01:20:49.640 | Then the companies will
01:20:51.080 | be able to sleep secure,
01:20:54.280 | knowing that everybody's playing by the same rules.
01:20:56.680 | - So do you think it's possible to develop guardrails
01:21:00.800 | that keep the systems from basically
01:21:05.800 | irreparably damaging humanity
01:21:09.200 | while still enabling sort of the capitalist-fueled
01:21:12.240 | competition between companies
01:21:13.640 | as they develop how to best make money with this AI?
01:21:16.960 | You think there's a balancing--
01:21:18.040 | - Totally. - That's possible?
01:21:19.160 | - Absolutely, I mean, we've seen that
01:21:20.560 | in many other sectors where you've had the free market
01:21:23.240 | produce quite good things without causing particular harm.
01:21:28.600 | When the guardrails are there and they work,
01:21:30.800 | capitalism is a very good way of optimizing
01:21:35.360 | for just getting the same things done more efficiently.
01:21:38.120 | But it was good, in hindsight,
01:21:40.840 | and I've never met anyone,
01:21:42.360 | even in parties way over on the right
01:21:48.160 | in any country, who thinks it was a terrible idea
01:21:51.720 | to ban child labor, for example.
01:21:55.200 | - Yeah, but it seems like this particular technology
01:21:57.880 | has gotten so good so fast, become powerful
01:22:02.560 | to a degree where you could see in the near term
01:22:05.400 | the ability to make a lot of money
01:22:07.800 | and to put guardrails, to develop guardrails quickly
01:22:10.440 | in that kind of context seems to be tricky.
01:22:12.960 | It's not similar to cars or child labor.
01:22:16.640 | It seems like the opportunity to make a lot of money here
01:22:19.960 | very quickly is right here before us.
01:22:22.840 | - So again, there's this cliff.
01:22:24.920 | - Yeah, it gets quite scenic.
01:22:27.200 | The closer to the cliff you go,
01:22:29.000 | the more money there is, the more gold ingots
01:22:32.720 | there are on the ground you can pick up or whatever
01:22:34.280 | if you want to drive there very fast.
01:22:36.080 | But it's not in anyone's incentive that we go over the cliff
01:22:38.720 | and it's not like everybody's in their own car.
01:22:40.920 | All the cars are connected together with a chain.
01:22:43.680 | So if anyone goes over, they'll start dragging the others down too.
01:22:48.160 | And so ultimately, it's in the selfish interests
01:22:52.560 | also of the people in the companies to slow down
01:22:56.200 | when you start seeing the contours of the cliff
01:22:59.280 | there in front of you.
01:23:00.600 | The problem is that even though the people
01:23:03.080 | who are building the technology and the CEOs,
01:23:07.400 | they really get it, the shareholders
01:23:10.080 | and these other market forces,
01:23:12.400 | they are people who don't honestly understand
01:23:16.040 | that the cliff is there.
01:23:16.880 | They usually don't.
01:23:17.960 | You have to get quite into the weeds
01:23:19.600 | to really appreciate how powerful this is and how fast.
01:23:22.560 | And a lot of people are even still stuck again
01:23:24.160 | in this idea, in this carbon chauvinism,
01:23:29.160 | as I like to call it, that you can only have
01:23:31.800 | our level of intelligence in humans,
01:23:34.800 | that there's something magical about it.
01:23:36.000 | Whereas the people in the tech companies
01:23:38.200 | who build this stuff, they all realize
01:23:41.440 | that intelligence is information processing of a certain kind.
01:23:45.720 | And it really doesn't matter at all
01:23:48.000 | whether the information is processed by carbon atoms
01:23:50.280 | in neurons in brains or by silicon atoms
01:23:55.000 | in some technology we build.
01:23:56.840 | So you brought up capitalism earlier,
01:24:00.720 | and there are a lot of people who love capitalism
01:24:02.560 | and a lot of people who really, really don't.
01:24:07.560 | And it struck me recently that what's happening
01:24:12.960 | with capitalism here is exactly analogous
01:24:16.360 | to the way in which superintelligence might wipe us out.
01:24:20.120 | So, you know, I studied economics for my undergrad,
01:24:25.120 | Stockholm School of Economics, yay.
01:24:28.320 | (laughing)
01:24:29.760 | - Well, no, tell me.
01:24:31.000 | - So I was very interested in how you could use
01:24:34.080 | market forces to just get stuff done more efficiently,
01:24:37.040 | but give the right incentives to the market
01:24:38.880 | so that it wouldn't do really bad things.
01:24:41.520 | So Dylan Hadfield-Menell, who's a professor
01:24:44.760 | and colleague of mine at MIT,
01:24:47.360 | wrote this really interesting paper
01:24:49.400 | with some collaborators recently,
01:24:51.520 | where they proved mathematically that if you just take
01:24:54.480 | one goal that you just optimize for,
01:24:57.880 | on and on and on indefinitely,
01:24:59.720 | that you think is gonna bring you in the right direction.
01:25:03.400 | What basically always happens is,
01:25:05.440 | in the beginning, it will make things better for you.
01:25:08.680 | But if you keep going, at some point,
01:25:11.320 | it's gonna start making things worse for you again.
01:25:13.720 | And then gradually it's gonna make it
01:25:15.000 | really, really terrible.
01:25:16.400 | So just as a simple, the way I think of the proof is,
01:25:20.520 | suppose you wanna go from here back to Austin, for example,
01:25:25.520 | and you're like, okay, yeah, let's just, let's go south,
01:25:29.400 | but you don't put in exactly the right,
01:25:30.760 | just sort of the right direction.
01:25:32.120 | Just optimize that, go as south as possible.
01:25:34.120 | You get closer and closer to Austin,
01:25:35.920 | but there's always some little error.
01:25:41.360 | So you're not going exactly towards Austin,
01:25:44.240 | but you get pretty close.
01:25:45.440 | But eventually you start going away again,
01:25:47.240 | and eventually you're gonna be leaving the solar system.
01:25:50.160 | - Yeah.
01:25:51.000 | - And they proved, it's a beautiful mathematical proof.
01:25:53.440 | This happens generally, and this is very important for AI,
01:25:57.800 | because even though Stuart Russell has written a book
01:26:02.240 | and given a lot of talks on why it's a bad idea
01:26:06.000 | to have AI just blindly optimize something,
01:26:08.440 | that's what pretty much all our systems do.
01:26:10.720 | We have something called the loss function
01:26:12.280 | that we're just minimizing, or reward function,
01:26:14.160 | we're just maximizing.
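To put the Austin analogy in that same loss-function language, here is a purely illustrative toy, not the actual proof in the paper: keep "optimizing" a direction that is only approximately right, and the distance to what you actually wanted shrinks at first, then grows without bound.

```python
# Toy numerical version of the "just go south to reach Austin" analogy.
# Illustrative only; not the proof from the paper discussed above.
import math

target = (0.0, -100.0)                 # "Austin": 100 units away, due south-ish
angle_error = math.radians(5)          # the proxy direction is off by 5 degrees
step = (math.sin(angle_error), -math.cos(angle_error))   # "go south"

x, y = 0.0, 0.0
prev = math.dist((x, y), target)
for t in range(1, 301):
    x, y = x + step[0], y + step[1]    # keep optimizing the proxy
    d = math.dist((x, y), target)
    if t % 50 == 0:
        print(f"step {t:3d}: distance {d:6.1f} ({'closer' if d < prev else 'farther'})")
    prev = d
# Early on the distance falls; keep going and it grows without bound --
# the proxy was almost, but not exactly, what we wanted.
```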
01:26:15.680 | And capitalism is exactly like that too.
01:26:21.920 | We wanted to get stuff done more efficiently
01:26:26.240 | than people wanted.
01:26:27.560 | So we introduced the free market.
01:26:30.440 | Things got done much more efficiently than they did
01:26:34.360 | in say, communism, right?
01:26:38.760 | And it got better.
01:26:39.760 | But then it just kept optimizing.
01:26:43.840 | And kept optimizing.
01:26:45.320 | And you got ever bigger companies
01:26:46.480 | and ever more efficient information processing,
01:26:48.640 | and now also very much powered by IT.
01:26:51.080 | And eventually a lot of people are beginning to feel,
01:26:55.360 | wait, we're kind of optimizing a bit too much.
01:26:57.320 | Like, why did we just chop down half the rainforest?
01:26:59.920 | And why did suddenly these regulators
01:27:03.480 | get captured by lobbyists and so on?
01:27:07.360 | It's just the same optimization
01:27:08.600 | that's been running for too long.
01:27:11.040 | If you have an AI that actually has power over the world
01:27:15.000 | and you just give it one goal
01:27:16.200 | and just keep optimizing that,
01:27:18.240 | most likely everybody's gonna be like,
01:27:20.040 | yay, this is great in the beginning,
01:27:21.320 | things are getting better.
01:27:23.480 | But it's almost impossible to give it exactly
01:27:27.760 | the right direction to optimize in.
01:27:29.920 | And then eventually all hell breaks loose, right?
01:27:34.680 | Nick Bostrom and others have given examples
01:27:37.440 | that sound quite silly.
01:27:38.840 | What if you just wanna tell it to cure cancer or something,
01:27:43.800 | and that's all you tell it?
01:27:45.120 | Maybe it's gonna decide to take over entire continents
01:27:50.120 | just so it can get more supercomputer facilities in there
01:27:53.600 | and figure out how to cure cancer backwards.
01:27:55.960 | And then you're like, wait, that's not what I wanted, right?
01:27:58.440 | And the issue with capitalism
01:28:02.960 | and the issue with the front-end AI
01:28:04.200 | have kind of merged now
01:28:05.600 | because the Moloch I talked about
01:28:08.600 | is exactly the capitalist Moloch
01:28:10.400 | that we have built an economy
01:28:12.360 | that is optimizing for only one thing, profit.
01:28:16.680 | And that worked great back when things were very inefficient
01:28:20.960 | and then now it's getting done better.
01:28:22.720 | And it worked great as long as the companies
01:28:24.760 | were small enough that they couldn't capture the regulators.
01:28:28.080 | But that's not true anymore, but they keep optimizing.
01:28:32.520 | And now they realize that these companies
01:28:37.000 | can make even more profit by building ever more powerful AI
01:28:39.520 | even if it's reckless,
01:28:40.680 | but optimize more and more and more and more and more.
01:28:46.040 | So this is Moloch again, showing up.
01:28:50.280 | And I just wanna, anyone here who has any concerns
01:28:54.200 | about late-stage capitalism having gone a little too far,
01:28:59.200 | you should worry about superintelligence
01:29:02.400 | 'cause it's the same villain in both cases, it's Moloch.
01:29:06.120 | - And optimizing one objective function
01:29:10.040 | aggressively, blindly is going to take us there.
01:29:13.560 | - Yeah, we have to pause from time to time
01:29:16.080 | and look into our hearts and ask, why are we doing this?
01:29:20.560 | Is this, am I still going towards Austin
01:29:23.360 | or have I gone too far?
01:29:25.320 | Maybe we should change direction.
01:29:27.400 | - And that is the idea behind a halt for six months.
01:29:30.920 | Why six months? - Yeah.
01:29:32.300 | - That seems like a very short period.
01:29:34.200 | Can we just linger and explore different ideas here?
01:29:37.680 | Because this feels like a really important moment
01:29:40.160 | in human history where pausing would actually
01:29:42.960 | have a significant positive effect.
01:29:46.160 | - We said six months because we figured
01:29:50.480 | the number one pushback we were gonna get in the West
01:29:54.360 | was like, but China.
01:29:57.960 | And everybody knows there's no way that China
01:30:01.040 | is gonna catch up with the West on this in six months.
01:30:03.520 | So that argument goes off the table
01:30:05.800 | and you can forget about geopolitical competition
01:30:08.200 | and just focus on the real issue.
01:30:11.360 | That's why we put this.
01:30:12.600 | - That's really interesting.
01:30:13.800 | But you've already made the case that even for China,
01:30:18.000 | if you actually wanna take on that argument,
01:30:20.640 | China too would not be bothered by a longer halt
01:30:25.400 | because they don't wanna lose control,
01:30:26.960 | even more than the West doesn't.
01:30:28.560 | - That's what I think.
01:30:30.400 | - That's a really interesting argument.
01:30:32.280 | I have to actually really think about that,
01:30:33.960 | which the kind of thing people assume
01:30:36.920 | is if you develop an AGI, that open AI,
01:30:40.040 | if they're the ones that do it, for example,
01:30:42.200 | they're going to win.
01:30:44.000 | But you're saying, no, everybody loses.
01:30:47.400 | - Yeah, it's gonna get better and better and better
01:30:49.840 | and then kaboom, we all lose.
01:30:52.080 | That's what's gonna happen.
01:30:53.120 | - When lose and win are defined in a metric
01:30:55.000 | of basically quality of life for human civilization
01:31:00.000 | and for Sam Altman.
01:31:01.360 | - To be blunt, my personal guess,
01:31:05.280 | and people can quibble with this,
01:31:06.320 | is that we're just gonna, there won't be any humans.
01:31:08.680 | That's it, that's what I mean by lose.
01:31:10.560 | We can see in history, once you have some species
01:31:15.000 | or some group of people who aren't needed anymore,
01:31:18.180 | doesn't usually work out so well for them, right?
01:31:22.640 | - Yeah.
01:31:24.320 | - There were a lot of horses that were used
01:31:26.440 | for traffic in Boston and then the car got invented
01:31:29.080 | and most of them got, yeah, well, we don't need to go there.
01:31:33.040 | And if you look at humans, right now,
01:31:36.040 | why did the labor movement succeed
01:31:45.360 | after the Industrial Revolution?
01:31:46.720 | Because it was needed.
01:31:47.900 | Even though we had a lot of Molochs
01:31:52.920 | and there was child labor and so on,
01:31:54.840 | the company still needed to have workers
01:31:58.680 | and that's why strikes had power and so on.
01:32:02.640 | If we get to the point where most humans
01:32:05.120 | aren't needed anymore, I think it's quite naive
01:32:07.760 | to think that they're gonna still be treated well.
01:32:10.600 | We say that, yeah, yeah, everybody's equal
01:32:13.200 | and the government will always, we'll always protect them.
01:32:15.540 | But if you look in practice,
01:32:17.460 | groups that are very disenfranchised
01:32:19.480 | and don't have any actual power,
01:32:22.320 | usually get screwed.
01:32:24.200 | And now, in the beginning, so Industrial Revolution,
01:32:29.200 | we automated away muscle work.
01:32:30.920 | But that worked out pretty well eventually
01:32:35.560 | because we educated ourselves
01:32:36.880 | and started working with our brains instead
01:32:38.680 | and got usually more interesting, better paid jobs.
01:32:42.520 | But now we're beginning to replace brain work.
01:32:44.280 | So we replaced a lot of boring stuff,
01:32:46.320 | like we got the pocket calculator
01:32:48.800 | so you don't have people adding stuff
01:32:50.920 | and adding, multiplying numbers anymore at work.
01:32:53.720 | Fine, there were better jobs they could get.
01:32:56.280 | But now, GPT-4 and Stable Diffusion
01:33:01.280 | and techniques like this,
01:33:02.800 | they're really beginning to blow away
01:33:06.000 | some jobs that people really love having.
01:33:08.720 | There was a heartbreaking article just,
01:33:10.600 | post just yesterday on social media I saw
01:33:13.120 | about this guy who was doing 3D modeling for gaming
01:33:17.320 | and all of a sudden now they got this new software,
01:33:20.960 | he just says prompts and he feels his whole job
01:33:24.760 | that he loved just lost its meaning.
01:33:27.320 | And I asked GPT-4 to rewrite "Twinkle, Twinkle Little Star"
01:33:32.320 | in the style of Shakespeare.
01:33:34.600 | I couldn't have done such a good job.
01:33:37.720 | It was really impressive.
01:33:39.920 | You've seen a lot of the art coming out here.
01:33:42.160 | So I'm all for automating away the dangerous jobs
01:33:47.160 | and boring jobs.
01:33:48.520 | But I think you hear a lot,
01:33:51.840 | some arguments which are too glib.
01:33:53.200 | Sometimes people say,
01:33:54.040 | "Well, that's all that's gonna happen.
01:33:55.120 | "We're getting rid of the boring,
01:33:57.600 | "tedious, dangerous jobs."
01:33:59.160 | It's just not true.
01:34:00.160 | There are a lot of really interesting jobs
01:34:01.880 | that are being taken away now.
01:34:03.240 | Journalism is gonna get crushed.
01:34:05.960 | Coding is gonna get crushed.
01:34:08.760 | I predict the job market for programmers,
01:34:12.120 | salaries are gonna start dropping.
01:34:15.360 | You said you can code five times faster,
01:34:18.080 | then you need five times fewer programmers.
01:34:20.160 | Maybe there'll be more output also,
01:34:22.880 | but you'll still end up needing fewer programmers than today.
01:34:27.080 | And I love coding.
01:34:28.320 | I think it's super cool.
01:34:29.960 | So we need to stop and ask ourselves why again
01:34:35.000 | are we doing this as humans?
01:34:36.720 | I feel that AI should be built by humanity for humanity.
01:34:44.520 | And let's not forget that.
01:34:45.960 | It shouldn't be by Moloch for Moloch.
01:34:48.800 | Or what it really is now is kind of by humanity for Moloch,
01:34:53.240 | which doesn't make any sense.
01:34:54.880 | It's for us that we're doing it.
01:34:57.680 | And it would make a lot more sense
01:35:00.280 | if we develop, figure out gradually and safely
01:35:04.000 | how to make all this tech.
01:35:04.920 | And then we think about what are the kind of jobs
01:35:06.760 | that people really don't wanna have?
01:35:08.880 | Automate them all away.
01:35:10.640 | And then we ask,
01:35:11.480 | what are the jobs that people really find meaning in?
01:35:15.240 | Like maybe taking care of children in the daycare center,
01:35:20.240 | maybe doing art, et cetera, et cetera.
01:35:23.320 | And even if it were possible to automate that way,
01:35:26.760 | we don't need to do that, right?
01:35:28.600 | We built these machines.
01:35:30.320 | - Well, it's possible that we redefine
01:35:33.760 | or rediscover what are the jobs that give us meaning.
01:35:36.680 | So for me, the thing, it is really sad.
01:35:40.200 | Like I, half the time I'm excited,
01:35:43.920 | half the time I'm crying as I'm generating code
01:35:48.680 | because I kind of love programming.
01:35:52.640 | It's an act of creation.
01:35:55.240 | You have an idea, you design it,
01:35:58.240 | and then you bring it to life and it does something,
01:36:00.080 | especially if there's some intelligence that it does something.
01:36:02.400 | It doesn't even have to have intelligence.
01:36:04.120 | Printing "Hello World" on screen,
01:36:06.240 | you made a little machine and it comes to life.
01:36:10.240 | - Yeah.
01:36:11.080 | - And there's a bunch of tricks you learn along the way
01:36:13.840 | 'cause you've been doing it for many, many years.
01:36:17.440 | And then to see AI be able to generate
01:36:19.920 | all the tricks you thought were special.
01:36:21.920 | - Yeah.
01:36:22.760 | - I don't know, it's very, it's scary.
01:36:29.040 | It's almost painful, like a loss of innocence maybe.
01:36:34.080 | Like maybe when I was younger,
01:36:36.520 | I remember before I learned that sugar's bad for you,
01:36:39.960 | you should be on a diet.
01:36:41.720 | I remember I enjoyed candy deeply,
01:36:44.320 | in a way I just can't anymore,
01:36:47.120 | that I know is bad for me.
01:36:48.760 | I enjoyed it unapologetically, fully, just intensely.
01:36:53.760 | And I just, I lost that.
01:36:55.840 | Now I feel like a little bit of that is lost for me
01:36:59.400 | with programming, or being lost with programming,
01:37:01.520 | similar as it is for the 3D modeler,
01:37:06.440 | no longer being able to really enjoy the art of modeling
01:37:10.000 | 3D things for gaming.
01:37:11.840 | I don't know, I don't know what to make sense of that.
01:37:13.400 | Maybe I would rediscover that the true magic
01:37:15.960 | of what it means to be human
01:37:16.960 | is connecting with other humans,
01:37:18.160 | to have conversations like this.
01:37:19.800 | I don't know, to have sex, to eat food,
01:37:24.040 | to really intensify the value from conscious experiences
01:37:28.240 | versus like creating other stuff.
01:37:30.320 | - You're pitching the rebranding again
01:37:32.360 | from Homo sapiens to Homo sentiens.
01:37:34.000 | - Homo sentiens, yeah.
01:37:34.960 | - The meaningful experiences.
01:37:36.480 | And just to inject some optimism in this here,
01:37:38.400 | so we don't sound like a bunch of gloomers,
01:37:40.640 | we can totally have our cake and eat it.
01:37:43.080 | You hear a lot of totally bullshit claims
01:37:45.080 | that we can't afford having more teachers,
01:37:47.800 | have to cut the number of nurses.
01:37:49.440 | That's just nonsense, obviously.
01:37:51.680 | With anything, even quite far short of AGI,
01:37:57.800 | we can dramatically improve, grow the GDP
01:38:01.720 | and produce a wealth of goods and services.
01:38:05.600 | It's very easy to create a world
01:38:07.160 | where everybody's better off than today,
01:38:09.160 | including the richest people can be better off as well.
01:38:13.560 | It's not a zero-sum game in technology.
01:38:17.000 | Again, you can have two countries like Sweden and Denmark
01:38:20.440 | have all these ridiculous wars century after century.
01:38:25.360 | And sometimes Sweden got a little better off
01:38:28.800 | 'cause it got a little bit bigger.
01:38:29.720 | And then Denmark got a little bit better off
01:38:31.280 | 'cause Sweden got a little bit smaller.
01:38:33.320 | But then technology came along
01:38:35.000 | and we both got just dramatically wealthier
01:38:37.240 | without taking it away from anyone else.
01:38:38.640 | It was just a total win for everyone.
01:38:40.960 | And AI can do that on steroids.
01:38:44.480 | If you can build safe AGI,
01:38:47.960 | if you can build super intelligence,
01:38:49.800 | basically all the limitations that cause harm today
01:38:55.000 | can be completely eliminated.
01:38:57.840 | It's a wonderful possibility.
01:39:00.720 | And this is not sci-fi.
01:39:01.920 | This is something which is clearly possible
01:39:03.760 | according to the laws of physics.
01:39:05.640 | And we can talk about ways of making it safe also.
01:39:09.440 | But unfortunately, that'll only happen
01:39:13.480 | if we steer in that direction.
01:39:14.720 | That's absolutely not the default outcome.
01:39:17.120 | That's why income inequality keeps going up.
01:39:22.000 | That's why the life expectancy in the US
01:39:23.960 | has been going down now.
01:39:25.000 | I think it's four years in a row.
01:39:27.240 | I just read a heartbreaking study from the CDC
01:39:30.760 | about how something like one third
01:39:33.480 | of all teenage girls in the US
01:39:36.400 | have been thinking about suicide.
01:39:37.840 | Those are steps in totally the wrong direction.
01:39:42.600 | And it's important to keep our eyes on the prize here
01:39:45.880 | that we have the power now for the first time
01:39:50.880 | in the history of our species.
01:39:53.960 | To harness artificial intelligence,
01:39:55.680 | to help us really flourish
01:39:58.120 | and help bring out the best in our humanity
01:40:03.360 | rather than the worst of it.
01:40:05.840 | To help us have really fulfilling experiences
01:40:10.240 | that feel truly meaningful.
01:40:11.480 | And you and I shouldn't sit here
01:40:13.680 | and dictate to future generations what they will be.
01:40:15.520 | Let them figure it out,
01:40:16.360 | but let's give them a chance to live
01:40:18.680 | and not foreclose all these possibilities for them
01:40:21.320 | by just messing things up, right?
01:40:23.040 | - And for that, we'll have to solve the AI safety problem.
01:40:25.800 | It would be nice if we can linger
01:40:27.080 | on exploring that a little bit.
01:40:29.520 | So one interesting way to enter that discussion
01:40:33.760 | is you tweeted and Elon replied.
01:40:37.920 | You tweeted, "Let's not just focus on whether GPT-4
01:40:40.400 | "will do more harm or good on the job market,
01:40:42.580 | "but also whether its coding skills
01:40:44.580 | "will hasten the arrival of superintelligence."
01:40:47.480 | That's something we've been talking about, right?
01:40:49.440 | So Elon proposed one thing in the reply,
01:40:51.560 | saying, "Maximum truth-seeking
01:40:53.320 | "is my best guess for AI safety."
01:40:55.960 | Can you maybe steelman the case
01:40:59.400 | for this objective function of truth
01:41:04.400 | and maybe make an argument against it?
01:41:06.760 | And in general, what are your different ideas
01:41:09.960 | to start approaching the solution to AI safety?
01:41:12.720 | - I didn't see that reply, actually.
01:41:14.400 | - Oh, interesting.
01:41:16.120 | - But I really resonate with it because
01:41:19.240 | AI is not evil.
01:41:20.600 | It caused people around the world
01:41:23.000 | to hate each other much more,
01:41:24.520 | but that's because we made it in a certain way.
01:41:27.960 | It's a tool.
01:41:28.800 | We can use it for great things and bad things.
01:41:30.480 | And we could just as well have AI systems.
01:41:33.240 | And this is part of my vision for success here,
01:41:36.840 | truth-seeking AI that really brings us together again.
01:41:41.840 | Why do people hate each other so much
01:41:43.800 | between countries and within countries?
01:41:46.080 | It's because they each have totally different versions
01:41:49.640 | of the truth, right?
01:41:50.840 | If they all had the same truth
01:41:54.560 | that they trusted for good reason,
01:41:56.240 | 'cause they could check it and verify it
01:41:57.800 | and not have to believe
01:41:58.640 | in some self-proclaimed authority, right?
01:42:00.840 | There wouldn't be nearly as much hate.
01:42:03.960 | There'd be a lot more understanding instead.
01:42:06.040 | And this is, I think,
01:42:09.160 | something AI can help enormously with.
01:42:11.200 | For example, if you're a journalist,
01:42:14.960 | a little baby step in this direction
01:42:18.640 | is this website called Metaculus,
01:42:21.040 | where people bet and make predictions,
01:42:25.320 | not for money, but just for their own reputation.
01:42:29.000 | And it's kind of funny, actually.
01:42:30.480 | You treat the humans like you treat AI,
01:42:32.400 | as you have a loss function where they get penalized
01:42:35.560 | if they're super confident on something
01:42:37.880 | and then the opposite happens.
01:42:39.440 | Whereas if you're kind of humble and then you're like,
01:42:43.120 | I think it's 51% chance this is gonna happen,
01:42:45.360 | and then the other happens, you don't get penalized much.
01:42:48.600 | And what you can see is that some people
01:42:50.000 | are much better at predicting than others.
01:42:52.400 | They've earned your trust, right?
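To make that scoring idea concrete, here is a minimal sketch in Python using the standard Brier score; Metaculus's actual scoring rules are more elaborate, so treat this only as an illustration of how a loss function can reward calibrated humility:

    # A minimal sketch of a calibration-rewarding loss, using the Brier score.
    # Metaculus's real scoring rules are more elaborate; this is only an illustration.

    def brier_score(predicted_prob: float, outcome: int) -> float:
        """Squared error between the stated probability and what happened (0 or 1). Lower is better."""
        return (predicted_prob - outcome) ** 2

    # A forecaster who says 99% and is wrong takes a big penalty...
    print(brier_score(0.99, 0))  # ~0.98
    # ...while a humble 51% forecast that turns out wrong is barely penalized.
    print(brier_score(0.51, 0))  # ~0.26
    # Being confident and right earns a near-zero loss, which is how trust accumulates.
    print(brier_score(0.99, 1))  # ~0.0001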
01:42:54.040 | One project that I'm working on right now,
01:42:57.680 | an outgrowth of the Improve the News Foundation
01:42:59.320 | together with the Metaculus folks,
01:43:00.520 | is seeing if we can really scale this up a lot
01:43:03.040 | with more powerful AI.
01:43:04.560 | 'Cause I would love it.
01:43:06.320 | I would love for there to be
01:43:07.400 | a really powerful truth-seeking system
01:43:10.560 | where that is trustworthy
01:43:14.320 | because it keeps being right about stuff.
01:43:17.120 | And people who come to it
01:43:19.240 | and maybe look at its latest trust ranking
01:43:24.160 | of different pundits and newspapers, et cetera,
01:43:27.480 | if they wanna know why someone got a low score,
01:43:29.840 | they can click on it and see all the predictions
01:43:32.440 | that they actually made and how they turned out.
01:43:35.040 | This is how we do it in science.
01:43:38.160 | You trust scientists like Einstein who said something
01:43:40.560 | everybody thought was bullshit and turned out to be right.
01:43:44.200 | You get a lot of trust points
01:43:45.560 | and he did it multiple times even.
01:43:47.440 | I think AI has the power to really heal
01:43:53.800 | a lot of the rifts we're seeing by creating trust systems.
01:43:58.400 | It has to get away from this idea today
01:44:02.520 | of some fact-checking sites
01:44:03.880 | which might themselves have an agenda
01:44:05.760 | and you just trust it because of its reputation.
01:44:08.160 | You wanna have it so these sort of systems,
01:44:13.080 | they earn their trust and they're completely transparent.
01:44:16.320 | This I think would actually help a lot.
01:44:18.480 | That can, I think, help heal
01:44:21.400 | the very dysfunctional conversation that humanity has
01:44:24.920 | about how it's gonna deal with all its biggest challenges
01:44:28.880 | in the world today.
01:44:31.520 | - And then on the technical side,
01:44:35.920 | another common sort of gloom comment I get
01:44:39.400 | from people who are saying, "We're just screwed.
01:44:40.920 | "There's no hope," is, well, things like GPT-4
01:44:44.120 | are way too complicated for a human to ever understand
01:44:47.160 | and prove that they can be trustworthy.
01:44:49.240 | They're forgetting that AI can help us
01:44:51.600 | prove that things work.
01:44:53.480 | There's this very fundamental fact that in math,
01:44:58.240 | it's much harder to come up with a proof
01:45:01.760 | than it is to verify that the proof is correct.
01:45:04.920 | You can actually write a little proof-checking code
01:45:07.040 | which is quite short, but you can, as a human, understand it.
01:45:10.640 | And then it can check the most monstrously long proof
01:45:12.960 | ever generated even by a computer and say,
01:45:14.920 | "Yeah, this is valid."
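A toy illustration of that asymmetry, assuming a deliberately tiny logic with just modus ponens over named propositions; real proof checkers (for systems like Lean or Coq) are bigger, but still far simpler than the proofs they verify:

    # Toy proof checker: a few lines of code can verify arbitrarily long proofs,
    # as long as every step follows by modus ponens from what is already known.
    # This assumes a deliberately tiny logic, just to show the verification side is simple.

    def check_proof(axioms, implications, proof, goal):
        known = set(axioms)
        for step in proof:
            if step in known:
                continue
            # A step is justified only if some already-known premise implies it.
            if any((premise, step) in implications for premise in known):
                known.add(step)
            else:
                return False  # reject the whole proof on any unjustified step
        return goal in known

    axioms = {"A"}
    implications = {("A", "B"), ("B", "C")}
    print(check_proof(axioms, implications, ["B", "C"], "C"))  # True: every step checks out
    print(check_proof(axioms, implications, ["C"], "C"))       # False: the jump to C isn't justified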
01:45:16.280 | So right now, we have this approach
01:45:26.880 | with virus-checking software that it looks to see
01:45:29.680 | if there's something, if you should not trust it.
01:45:31.800 | And if it can prove to itself
01:45:33.160 | that you should not trust that code, it warns you.
01:45:35.600 | What if you flip this around?
01:45:40.000 | And this is an idea I give credit to Steve Omohundro for.
01:45:44.240 | So instead of refusing to run the code
01:45:47.360 | only if it can prove that it's not trustworthy,
01:45:49.000 | it will only run the code
01:45:51.520 | if it can prove that it is trustworthy.
01:45:52.920 | So it asks the code, "Prove to me that you're gonna do
01:45:55.120 | "what you say you're gonna do."
01:45:57.400 | And it gives you this proof.
01:46:00.640 | And your little proof-checker can check it.
01:46:03.880 | Now you can actually trust an AI
01:46:06.480 | that's much more intelligent than you are, right?
01:46:08.880 | Because even though it's a really hard problem to come up with this proof,
01:46:13.440 | one that you could never have found yourself, you can check it, so you should trust it.
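A sketch of that flipped-around idea under toy assumptions: the untrusted code has to come with a safety claim that a small trusted gate verifies before anything is deployed. Here the "proof" is just exhaustive checking over a finite input domain, and the function names are made up for the illustration; real proof-carrying code would check a supplied formal proof instead.

    # Sketch of "only run it if it can prove it's trustworthy", under toy assumptions.
    # The untrusted code ships with a claimed safety property; a small trusted gate
    # verifies the claim over a finite input domain before agreeing to deploy it.
    # (In reality you'd check a supplied formal proof without executing the code at all.)

    def trusted_gate(untrusted_fn, claimed_bound, domain):
        # Over a finite domain, exhaustively checking the claim amounts to a proof of it.
        for x in domain:
            if abs(untrusted_fn(x)) > claimed_bound:
                return None  # refuse to deploy: the claim doesn't hold
        return untrusted_fn  # claim verified, safe to hand back for real use

    # Hypothetical untrusted code, e.g. proposed by a much smarter AI.
    def proposed_controller(x):
        return 0.5 * x

    approved = trusted_gate(proposed_controller, claimed_bound=10, domain=range(-20, 21))
    print("approved" if approved else "rejected")  # approved: |0.5 * x| <= 10 on [-20, 20]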
01:46:16.160 | - So this is the interesting point.
01:46:17.760 | I agree with you, but this is where Eliezer Yudkowsky
01:46:21.760 | might disagree with you.
01:46:23.200 | His claim, not with you, but with this idea.
01:46:26.760 | His claim is a super-intelligent AI
01:46:30.680 | would be able to know how to lie to you with such a proof.
01:46:34.800 | - How to lie to you and give me a proof
01:46:37.840 | that I'm gonna think is correct?
01:46:39.640 | - Yeah.
01:46:40.480 | - But it's not me that it would be lying to.
01:46:41.920 | That's the trick, it's my proof-checker.
01:46:44.240 | It's just a piece of code.
01:46:45.160 | - So his general idea is a super-intelligent system
01:46:50.120 | can lie to a dumber proof-checker.
01:46:54.000 | So you're going to have, as a system
01:46:56.600 | becomes more and more intelligent,
01:46:57.760 | there's going to be a threshold
01:46:59.840 | where a super-intelligent system
01:47:01.560 | would be able to effectively lie
01:47:03.120 | to a slightly dumber AGI system.
01:47:05.400 | He really focuses on this weak AGI to strong AGI jump
01:47:11.680 | where the strong AGI can make all the weak AGIs
01:47:15.760 | think that it's just one of them, but it's no longer that.
01:47:19.960 | And that leap is when it runs away from you.
01:47:23.760 | - I don't buy that argument.
01:47:25.720 | I think no matter how super-intelligent an AI is,
01:47:29.320 | it's never gonna be able to prove to me
01:47:30.880 | that there are only finitely many primes, for example.
01:47:33.560 | It just can't.
01:47:36.800 | And it can try to snow me by making up
01:47:40.000 | all sorts of new weird rules of deduction
01:47:42.840 | that say, trust me, the way your proof-checker works
01:47:47.840 | is too limited, and we have this new hyper-math,
01:47:51.760 | and it's true.
01:47:53.040 | But then I would just take the attitude,
01:47:55.560 | okay, I'm gonna forfeit some of these
01:47:58.080 | supposedly super-cool technologies.
01:48:00.000 | I'm only gonna go with the ones that I can prove
01:48:01.840 | with my own trusted proof-checker.
01:48:03.880 | Then I think it's fine.
01:48:05.320 | There's still, of course, this is not something
01:48:08.520 | anyone has successfully implemented at this point,
01:48:10.360 | but I think I just give it as an example of hope.
01:48:14.680 | We don't have to do all the work ourselves.
01:48:17.160 | This is exactly the sort of very boring and tedious task
01:48:19.880 | that's perfect to outsource to an AI.
01:48:22.720 | And this is a way in which less powerful
01:48:24.680 | and less intelligent agents like us
01:48:26.960 | can actually continue to control
01:48:29.720 | and trust more powerful ones.
01:48:31.760 | - So build AGI systems that help us defend
01:48:33.800 | against other AGI systems.
01:48:35.840 | - Well, for starters, begin with a simple problem
01:48:39.160 | of just making sure that the system that you own
01:48:41.120 | or that's supposed to be loyal to you
01:48:44.320 | has to prove to itself that it's always gonna do
01:48:46.440 | the things that you actually want it to do.
01:48:48.240 | And if it can't prove it, maybe it's still gonna do it,
01:48:51.040 | but you won't run it.
01:48:52.480 | So you just forfeit some aspects
01:48:54.400 | of all the cool things AI can do.
01:48:56.520 | I bet you dollars and donuts it can still do
01:48:58.480 | some incredibly cool stuff for you.
01:49:00.520 | - Yeah.
01:49:01.440 | - There are other things too
01:49:02.600 | that we shouldn't sweep under the rug,
01:49:03.880 | like not every human agrees on exactly
01:49:06.880 | what direction we should go with humanity, right?
01:49:09.440 | - Yes.
01:49:10.840 | - And you've talked a lot about geopolitical things
01:49:13.760 | on your podcast to this effect,
01:49:16.280 | but I think that shouldn't distract us
01:49:19.120 | from the fact that there are actually a lot of things
01:49:21.680 | that everybody in the world virtually agrees on
01:49:25.920 | that, hey, you know, like having no humans on the planet
01:49:29.000 | in a near future, let's not do that, right?
01:49:35.120 | You look at something like
01:49:36.000 | the United Nations Sustainable Development Goals,
01:49:39.360 | some of them are quite ambitious,
01:49:42.280 | and basically all the countries agree,
01:49:44.960 | US, China, Russia, Ukraine, they all agree.
01:49:47.960 | So instead of quibbling about the little things
01:49:50.960 | we don't agree on, let's start with the things
01:49:53.120 | we do agree on and get them done.
01:49:56.720 | Instead of being so distracted by all these things
01:49:59.200 | we disagree on that Moloch wins,
01:50:02.840 | because frankly, Moloch going wild now,
01:50:07.840 | it feels like a war on life playing out in front of our eyes.
01:50:12.000 | If you just look at it from space, you know,
01:50:15.960 | we're on this planet, beautiful, vibrant ecosystem.
01:50:20.720 | Now we start chopping down big parts of it,
01:50:24.680 | even though most people thought that was a bad idea.
01:50:27.760 | We also started doing ocean acidification,
01:50:30.480 | wiping out all sorts of species.
01:50:33.000 | Oh, now we have all these close calls,
01:50:34.800 | we almost had a nuclear war.
01:50:36.720 | And we're replacing more and more of the biosphere
01:50:39.880 | with non-living things.
01:50:42.880 | We're also replacing in our social lives
01:50:45.600 | a lot of the things which were so valuable to humanity.
01:50:49.120 | A lot of social interactions now are replaced
01:50:51.360 | by people staring into their rectangles, right?
01:50:54.320 | And I'm not a psychologist, I'm out of my depth here,
01:50:58.640 | but I suspect that part of the reason why teen suicide
01:51:02.640 | and suicide in general in the US,
01:51:04.760 | is at record-breaking levels is actually caused by,
01:51:08.080 | again, AI technologies and social media
01:51:11.600 | making people spend less time
01:51:13.080 | on actual human interaction.
01:51:16.320 | We've all seen a bunch of good looking people
01:51:19.840 | in restaurants staring into the rectangles
01:51:22.240 | instead of looking into each other's eyes, right?
01:51:24.680 | So that's also a part of the war on life,
01:51:28.160 | that we're replacing so many
01:51:31.760 | really life affirming things by technology.
01:51:38.200 | We're putting technology between us.
01:51:41.640 | The technology that was supposed to connect us
01:51:43.800 | is actually distancing us, ourselves from each other.
01:51:47.000 | And then we're giving ever more power
01:51:50.680 | to things which are not alive.
01:51:52.680 | These large corporations are not living things, right?
01:51:55.600 | They're just maximizing profit.
01:51:57.480 | I wanna win the war on life.
01:52:01.960 | I think we humans, together with all our fellow living things
01:52:06.280 | on this planet, will be better off if we can
01:52:08.880 | remain in control over the non-living things
01:52:12.920 | and make sure that they work for us.
01:52:15.120 | I really think it can be done.
01:52:17.560 | - Can you just linger on this,
01:52:19.840 | maybe high level philosophical disagreement
01:52:23.160 | with Eliezer Yudkowsky
01:52:24.920 | in the hope you're stating?
01:52:30.080 | So he is very sure,
01:52:31.960 | he puts a very high probability,
01:52:35.560 | very close to one, depending on the day he puts it at one,
01:52:39.400 | that AI is going to kill humans.
01:52:42.800 | That there's just, he does not see a trajectory
01:52:47.320 | which it doesn't end up with that conclusion.
01:52:50.920 | What trajectory do you see that doesn't end up there?
01:52:54.360 | And maybe can you see the point he's making?
01:52:57.680 | And can you also see a way out?
01:53:01.300 | - First of all, I tremendously respect Eliezer Yudkowsky
01:53:07.480 | and his thinking.
01:53:10.040 | Second, I do share his view
01:53:13.280 | that there's a pretty large chance
01:53:14.840 | that we're not gonna make it as humans,
01:53:16.840 | that there won't be any humans on the planet
01:53:19.800 | in the not-too-distant future.
01:53:20.840 | And that makes me very sad.
01:53:22.280 | We just had a little baby, and I keep asking myself,
01:53:24.840 | how old is he even gonna get?
01:53:34.080 | And I ask myself,
01:53:37.360 | it feels, I said to my wife recently,
01:53:39.200 | it feels a little bit like I was just diagnosed
01:53:40.960 | with some sort of cancer,
01:53:43.560 | which has some risk of dying from
01:53:48.000 | and some risk of surviving,
01:53:49.360 | except this is the kind of cancer
01:53:53.560 | which will kill all of humanity.
01:53:54.720 | So I completely take seriously his concerns.
01:53:59.220 | I think, but I absolutely don't think it's hopeless.
01:54:05.360 | I think there is, first of all, a lot of momentum now,
01:54:10.360 | for the first time, actually,
01:54:15.200 | since the many, many years that have passed,
01:54:16.960 | since I and many others started warning about this,
01:54:20.000 | I feel most people are getting it now.
01:54:23.040 | Just talking to this guy in the gas station
01:54:30.400 | near our house the other day,
01:54:34.440 | and he's like, "I think we're getting replaced."
01:54:37.880 | So that's positive, that we're finally seeing this reaction,
01:54:44.360 | which is the first step towards solving the problem.
01:54:47.640 | Second, I really think that this vision
01:54:50.480 | of only running AIs,
01:54:52.280 | if the stakes are really high,
01:54:55.720 | that can prove to us that they're safe,
01:54:57.880 | it's really just virus checking in reverse again.
01:55:00.280 | I think it's scientifically doable.
01:55:03.680 | I don't think it's hopeless.
01:55:05.120 | We might have to forfeit some of the technology
01:55:08.960 | that we could get if we were putting blind faith in our AIs,
01:55:12.320 | but we're still gonna get amazing stuff.
01:55:14.360 | - Do you envision a process with a proof checker,
01:55:16.080 | like something like GPT-4 or GPT-5,
01:55:18.840 | would go through a process of rigorous interrogation?
01:55:21.840 | - No, I think it's hopeless.
01:55:23.200 | That's like trying to prove theorems about a pile of spaghetti.
01:55:25.840 | (laughing)
01:55:27.720 | What I think, well, the vision I have for success
01:55:31.440 | is instead that just like we human beings
01:55:34.600 | were able to look at our brains
01:55:36.720 | and distill out the key knowledge.
01:55:38.160 | Galileo, when his dad threw him an apple when he was a kid,
01:55:42.520 | he was able to catch it
01:55:43.360 | 'cause his brain could in this funny spaghetti kind of way,
01:55:45.840 | predict how parabolas are gonna move.
01:55:48.040 | His Kahneman System 1, right?
01:55:49.640 | Then he got older and it's like, wait, this is a parabola.
01:55:53.200 | It's Y equals X squared.
01:55:55.640 | I can distill this knowledge out,
01:55:56.960 | and today you can easily program it into a computer
01:55:59.680 | and it can simulate not just that,
01:56:01.640 | but how to get to Mars and so on, right?
01:56:04.240 | I envision a similar process
01:56:05.640 | where we use the amazing learning power of neural networks
01:56:09.120 | to discover the knowledge in the first place,
01:56:12.320 | but we don't stop with a black box and use that.
01:56:16.880 | We then do a second round of AI
01:56:19.160 | where we use automated systems to extract out the knowledge
01:56:21.600 | and see what are the insights it's had.
01:56:24.120 | And then we put that knowledge
01:56:28.280 | into a completely different kind of architecture
01:56:31.800 | or programming language or whatever
01:56:33.640 | that's made in a way that it can be both really efficient
01:56:37.040 | and also is more amenable to very formal verification.
01:56:41.480 | That's my vision.
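A minimal sketch of that two-stage idea in a toy setting: noisy observations of a falling object stand in for what a black-box learner has captured, and the "distillation" step extracts an explicit quadratic formula that a human can read and check. No real neural network or formal verifier is involved; it only illustrates the shape of the pipeline.

    # Toy sketch of "learn first, then distill into something inspectable".
    # Stage 1 is stood in for by noisy observations of a falling object
    # (the pattern an opaque learner would have captured); stage 2 distills
    # them into a short, auditable formula.
    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0, 2, 50)                        # time samples
    height = 20 - 0.5 * 9.8 * t**2                   # underlying physics
    observed = height + rng.normal(0, 0.05, t.size)  # what the learner actually sees

    # Distill into a transparent, low-degree symbolic form.
    a, b, c = np.polyfit(t, observed, deg=2)         # coefficients of a*t^2 + b*t + c
    print(f"distilled formula: h(t) = {a:.2f}*t^2 + {b:.2f}*t + {c:.2f}")
    # Expect roughly a = -4.9 (half of g), b = 0, c = 20: short enough to audit,
    # and simple enough to hand to a formal verification tool.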
01:56:44.280 | I'm not sitting here saying I'm confident,
01:56:46.880 | 100% sure that it's gonna work,
01:56:48.600 | but I don't think, the chance is certainly not zero either,
01:56:51.960 | and it will certainly be possible to do
01:56:53.560 | for a lot of really cool AI applications
01:56:57.280 | that we're not using now.
01:56:58.920 | So we can have a lot of the fun that we're excited about
01:57:02.240 | if we do this.
01:57:03.840 | We're gonna need a little bit of time.
01:57:05.680 | That's why it's good to pause
01:57:08.560 | and put in place requirements.
01:57:12.440 | One more thing also, I think,
01:57:15.640 | someone might think,
01:57:17.960 | well, 0% chance we're gonna survive.
01:57:20.680 | Let's just give up, right?
01:57:22.680 | That's very dangerous
01:57:26.240 | because there's no more guaranteed way to fail
01:57:29.840 | than to convince yourself that it's impossible and not try.
01:57:33.280 | When you study history and military history,
01:57:39.280 | the first thing you learn is
01:57:40.920 | that that's how you do psychological warfare.
01:57:44.680 | You persuade the other side that it's hopeless
01:57:47.120 | so they don't even fight.
01:57:48.360 | And then, of course, you win, right?
01:57:51.520 | Let's not do this psychological warfare on ourselves
01:57:55.120 | and say there's 100% probability
01:57:56.760 | we're all screwed anyway.
01:57:58.400 | It's sadly, I do get that a little bit sometimes
01:58:03.480 | from some young people who are so convinced
01:58:06.680 | that we're all screwed that they're like,
01:58:08.160 | I'm just gonna play computer games and do drugs
01:58:12.000 | 'cause we're screwed anyway, right?
01:58:14.000 | It's important to keep the hope alive
01:58:17.560 | because it actually has a causal impact
01:58:20.120 | and makes it more likely that we're gonna succeed.
01:58:22.720 | - It seems like the people that actually build solutions
01:58:25.680 | to seemingly impossible-to-solve problems
01:58:28.760 | are the ones that believe.
01:58:31.200 | They're the ones who are the optimists.
01:58:33.040 | And it seems like there's some fundamental law
01:58:36.560 | to the universe where fake it 'til you make it
01:58:38.720 | kind of works.
01:58:40.000 | Like, believe it's possible and it becomes possible.
01:58:43.480 | - Yeah, was it Henry Ford who said that
01:58:46.000 | if you tell yourself that it's impossible, it is?
01:58:52.480 | Let's not make that mistake.
01:58:54.040 | And this is a big mistake society is making,
01:58:56.560 | I think all in all.
01:58:57.400 | Everybody's so gloomy and the media are also very biased
01:58:59.920 | towards "if it bleeds, it leads" and gloom and doom.
01:59:02.400 | So most visions of the future we have
01:59:07.400 | are dystopian, which really demotivates people.
01:59:12.600 | We wanna really, really, really focus on the upside also
01:59:16.000 | to give people the willingness to fight for it.
01:59:18.800 | And for AI, you and I mostly talked about gloom here again,
01:59:23.800 | but let's not forget that we have probably both lost
01:59:30.000 | someone we really cared about to some disease
01:59:33.920 | that we were told was incurable.
01:59:35.680 | Well, it's not.
01:59:37.160 | There's no law of physics saying
01:59:38.480 | you have to die of that cancer or whatever.
01:59:40.400 | Of course you can cure it.
01:59:42.280 | And there are so many other things
01:59:44.160 | that we with our human intelligence
01:59:45.920 | have also failed to solve on this planet,
01:59:49.360 | which AI could also very much help us with.
01:59:52.280 | So if we can get this right, just be a little more chill
01:59:56.760 | and slow down a little bit till we get it right.
01:59:59.160 | It's mind blowing how awesome our future can be.
02:00:04.480 | We talked a lot about stuff on earth, it can be great.
02:00:08.000 | But even if you really get ambitious
02:00:09.920 | and look up into the skies,
02:00:11.280 | there's no reason we have to be stuck on this planet
02:00:13.680 | for the rest of the remaining
02:00:16.800 | billions of years to come.
02:00:19.160 | We totally understand now that the laws of physics
02:00:22.480 | let life spread out into space to other solar systems,
02:00:26.480 | to other galaxies and flourish
02:00:27.840 | for billions and billions of years.
02:00:29.960 | And this to me is a very, very hopeful vision
02:00:32.800 | that really motivates me to fight.
02:00:37.880 | And coming back to the end,
02:00:39.040 | something you talked about again,
02:00:40.680 | the struggle, how the human struggle
02:00:42.920 | is one of the things that also really gives meaning
02:00:45.240 | to our lives.
02:00:46.440 | If there's ever been an epic struggle, this is it.
02:00:50.080 | And isn't it even more epic if you're the underdog?
02:00:53.320 | If most people are telling you this is gonna fail,
02:00:55.760 | it's impossible, right?
02:00:57.480 | And you persist and you succeed.
02:01:01.040 | That's what we can do together as a species on this one.
02:01:05.280 | A lot of pundits are ready to count this out.
02:01:08.800 | - Both in the battle to keep AI safe
02:01:11.560 | and becoming a multi-planetary species.
02:01:13.680 | - Yeah, and they're the same challenge.
02:01:16.480 | If we can keep AI safe,
02:01:17.640 | that's how we're gonna get multi-planetary very efficiently.
02:01:21.600 | - I have some sort of technical questions
02:01:23.600 | about how to get it right.
02:01:24.720 | So one idea that I'm not even sure
02:01:28.800 | what the right answer is to is,
02:01:31.200 | should systems like GPT-4 be open sourced
02:01:35.000 | in whole or in part?
02:01:36.600 | Can you see the case for either?
02:01:40.680 | - I think the answer right now is no.
02:01:42.600 | I think the answer early on was yes.
02:01:45.880 | So we could bring in all the wonderful,
02:01:50.200 | great thought process of everybody on this.
02:01:53.160 | But asking, should we open source GPT-4 now
02:01:56.600 | is just the same as if you say,
02:01:57.880 | well, is it good?
02:01:58.720 | Should we open source how to build
02:02:02.320 | really small nuclear weapons?
02:02:04.280 | Should we open source how to make bioweapons?
02:02:09.680 | Should we open source how to make a new virus
02:02:13.600 | that kills 90% of everybody who gets it?
02:02:15.440 | Of course we shouldn't.
02:02:17.400 | - So it's already that powerful.
02:02:19.600 | It's already that powerful that we have to respect
02:02:22.720 | the power of the systems we've built.
02:02:26.680 | - The knowledge that you get
02:02:29.000 | from open sourcing everything we do now
02:02:32.520 | might very well be powerful enough
02:02:35.120 | that people looking at that
02:02:38.040 | can use it to build the things
02:02:39.960 | that are really threatening.
02:02:41.600 | Remember, OpenAI's
02:02:43.280 | GPT-4 is a baby AI.
02:02:46.440 | A baby, sort of proto, almost a little bit AGI
02:02:50.720 | according to what Microsoft's recent paper said.
02:02:53.920 | It's not that that we're scared of.
02:02:55.640 | What we're scared about is people taking that
02:02:57.880 | who might be a lot less responsible
02:03:01.280 | than the company that made it
02:03:02.680 | and just going to town with it.
02:03:06.000 | That's why we want to,
02:03:10.480 | it's an information hazard.
02:03:12.200 | There are many things which are not open sourced
02:03:15.120 | right now in society for a very good reason.
02:03:17.760 | Like how do you make
02:03:18.960 | certain kind of very powerful toxins
02:03:23.600 | out of stuff you can buy at Home Depot,
02:03:27.000 | you don't open source those things for a reason.
02:03:29.560 | And this is really no different.
02:03:32.480 | - So open-- - And I'm saying that,
02:03:34.920 | I have to say it feels in a way a bit weird to say it
02:03:38.120 | because MIT is like the cradle of the open source movement.
02:03:42.400 | And I love open source in general,
02:03:44.320 | power to the people, let's say.
02:03:46.080 | But there's always gonna be some stuff
02:03:50.920 | that you don't open source.
02:03:52.320 | And it's just like,
02:03:55.480 | we have a three-month-old baby, right?
02:03:56.880 | When he gets a little bit older,
02:03:58.480 | we're not gonna open source to him
02:03:59.640 | all the most dangerous things he can do in the house.
02:04:02.080 | - Yeah. - Right?
02:04:04.400 | But it does, it's a weird feeling
02:04:07.040 | because this is one of the first moments in history
02:04:10.600 | where there's a strong case to be made
02:04:13.240 | not to open source software.
02:04:15.740 | This is when the software has become too dangerous.
02:04:19.720 | - Yeah, but it's not the first time
02:04:21.160 | that we didn't wanna open source a technology.
02:04:23.080 | - Technology, yeah.
02:04:24.040 | Is there something to be said
02:04:28.400 | about how to get the release of such systems right?
02:04:30.980 | Like GPT-4 and GPT-5.
02:04:33.980 | So OpenAI went through a pretty rigorous effort
02:04:37.820 | for several months.
02:04:39.140 | You could say it could be longer,
02:04:40.300 | but nevertheless it's longer than you would have expected
02:04:42.900 | of trying to test the system
02:04:44.540 | to see like what are the ways it goes wrong
02:04:46.660 | to make it very difficult for people,
02:04:49.300 | somewhat difficult for people to ask things,
02:04:51.860 | how do I make a bomb for $1?
02:04:54.300 | Or how do I say I hate a certain group on Twitter
02:05:00.180 | in a way that doesn't get me blocked from Twitter,
02:05:02.260 | banned from Twitter, those kinds of questions.
02:05:05.380 | So you basically use the system to do harm.
02:05:08.940 | Is there something you could say about ideas
02:05:13.340 | you have just on looking,
02:05:15.460 | having thought about this problem of AI safety,
02:05:17.740 | how to release such system,
02:05:18.940 | how to test such systems when you have them
02:05:21.000 | inside the company?
02:05:22.280 | - Yeah, so a lot of people say
02:05:29.900 | that the two biggest risks from large language models are
02:05:33.220 | it's spreading disinformation,
02:05:40.020 | harmful information of various types.
02:05:42.380 | And second, being used for offensive cyber weapon.
02:05:48.620 | So I think those are not the two greatest threats.
02:05:53.300 | They're very serious threats and it's wonderful
02:05:55.020 | that people are trying to mitigate them.
02:05:58.500 | A much bigger elephant in the room
02:06:00.220 | is how is this just gonna disrupt our economy
02:06:02.620 | in a huge way, obviously,
02:06:03.620 | and maybe take away a lot of the most meaningful jobs.
02:06:06.300 | And an even bigger one is the one we spent
02:06:08.820 | so much time talking about here,
02:06:10.180 | that this becomes the bootloader
02:06:15.180 | for the more powerful AI.
02:06:17.860 | - Write code, connect it to the internet, manipulate humans.
02:06:21.120 | - Yeah, and before we know it, we have something else,
02:06:23.900 | which is not at all a large language model.
02:06:25.700 | It looks nothing like it,
02:06:26.860 | but which is way more intelligent and capable and has goals.
02:06:29.860 | And that's the elephant in the room.
02:06:33.720 | And obviously, no matter how hard
02:06:36.220 | any of these companies have tried,
02:06:37.920 | that's not something that's easy for them to verify
02:06:41.220 | with large language models.
02:06:42.460 | And the only way to really lower that risk a lot
02:06:45.660 | would be to, for example,
02:06:48.940 | never let it read any code, not train on that,
02:06:52.020 | and not put it into an API,
02:06:54.400 | and not give it access to so much information
02:06:59.400 | about how to manipulate humans.
02:07:01.960 | But that doesn't mean you still can't make
02:07:05.840 | a ton of money on them.
02:07:08.280 | We're gonna just watch now this coming year,
02:07:13.720 | Microsoft is rolling out the new Office suite
02:07:17.680 | where you go into Microsoft Word and give it a prompt,
02:07:21.480 | and it writes the whole text for you,
02:07:23.160 | and then you edit it.
02:07:24.600 | And then you're like,
02:07:25.440 | "Oh, give me a PowerPoint version of this,"
02:07:26.680 | and it makes it.
02:07:27.520 | And now take the spreadsheet and blah, blah.
02:07:31.440 | All of those things, I think,
02:07:32.920 | you can debate the economic impact of it
02:07:35.920 | and whether society is prepared to deal with this disruption,
02:07:39.280 | but those are not the things which,
02:07:41.040 | that's not the elephant of the room
02:07:43.560 | that keeps me awake at night for wiping out humanity.
02:07:46.200 | And I think that's the biggest misunderstanding we have.
02:07:51.200 | A lot of people think that we're scared
02:07:52.640 | of automatic spreadsheets.
02:07:55.680 | That's not the case.
02:07:56.560 | That's not what Eliezer was freaked out about either.
02:07:59.720 | - Is there, in terms of the actual mechanism
02:08:03.600 | of how AI might kill all humans,
02:08:06.720 | so something you've been outspoken about,
02:08:09.600 | you've talked about a lot, is autonomous weapon systems.
02:08:13.720 | So the use of AI in war.
02:08:17.200 | Is that one of the things that still you carry concern for
02:08:21.200 | as these systems become more and more powerful?
02:08:23.120 | - I carry concern for it,
02:08:24.120 | not that all humans are going to get killed by slaughterbots,
02:08:26.480 | but rather just as an express route into an Orwellian dystopia
02:08:31.480 | where it becomes much easier for very few to kill very many
02:08:35.080 | and therefore it becomes very easy
02:08:36.400 | for very few to dominate very many.
02:08:38.200 | If you want to know how AI could kill all people,
02:08:43.920 | just ask yourself,
02:08:45.400 | humans have driven a lot of species extinct.
02:08:47.720 | How do we do it?
02:08:49.960 | We were smarter than them.
02:08:51.320 | Usually we didn't do it even systematically
02:08:55.520 | by going around one after the other
02:08:58.000 | and stepping on them or shooting them or anything like that.
02:09:00.080 | We just like chopped down their habitat
02:09:02.160 | 'cause we needed it for something else.
02:09:04.240 | In some cases, we did it by putting more carbon dioxide
02:09:08.080 | in the atmosphere because of some reason
02:09:10.840 | that those animals didn't even understand
02:09:13.680 | and now they're gone, right?
02:09:15.800 | So if you're an AI and you just want to figure something out
02:09:20.800 | then you decide, we just really need the space here
02:09:26.480 | to build more compute facilities.
02:09:28.520 | If that's the only goal it has,
02:09:34.240 | we are just the sort of accidental roadkill along the way.
02:09:37.920 | And you could totally imagine,
02:09:38.960 | yeah, maybe this oxygen is kind of annoying
02:09:40.840 | 'cause it causes more corrosion,
02:09:42.160 | so let's get rid of the oxygen
02:09:44.360 | and good luck surviving after that.
02:09:46.480 | I'm not particularly concerned
02:09:48.200 | that they would want to kill us
02:09:49.920 | just because that would be a goal in itself.
02:09:54.920 | We've driven a number of the elephant species extinct.
02:10:02.320 | It wasn't 'cause we didn't like elephants.
02:10:04.420 | The basic problem is you just don't want to give,
02:10:11.040 | you don't want to cede control over your planet
02:10:13.960 | to some other more intelligent entity
02:10:17.120 | that doesn't share your goals.
02:10:18.440 | It's that simple.
02:10:19.280 | So which brings us to another key challenge
02:10:23.720 | which AI safety researchers have been grappling with
02:10:25.880 | for a long time.
02:10:27.440 | How do you make AI first of all understand our goals
02:10:31.720 | and then adopt our goals
02:10:32.760 | and then retain them as they get smarter, right?
02:10:41.520 | All three of those are really hard, right?
02:10:44.080 | Like a human child, first they're just not smart enough
02:10:49.080 | to understand our goals.
02:10:50.640 | They can't even talk.
02:10:53.020 | And then eventually they're teenagers
02:10:56.240 | and understand our goals just fine,
02:10:57.640 | but they don't share them.
02:10:59.080 | But there is fortunately a magic phase in the middle
02:11:03.760 | where they're smart enough to understand our goals
02:11:05.440 | and malleable enough that we can hopefully
02:11:06.840 | with good parenting teach them right from wrong
02:11:09.580 | and instill good goals in them, right?
02:11:12.280 | So those are all tough challenges with computers.
02:11:17.960 | And then even if you teach your kids good goals
02:11:20.560 | when they're little, they might outgrow them too.
02:11:22.300 | And that's a challenge for machines to keep improving.
02:11:25.720 | So these are a lot of hard challenges we're up for,
02:11:30.380 | but I don't think any of them are insurmountable.
02:11:33.240 | The fundamental reason why Eliezer looked so depressed
02:11:37.980 | when I last saw him was because he felt
02:11:39.800 | there just wasn't enough time.
02:11:42.060 | - Oh, not that it was unsolvable.
02:11:44.600 | - Correct. - There's just not enough time.
02:11:46.000 | - He was hoping that humanity was gonna take this threat
02:11:48.240 | more seriously so we would have more time.
02:11:50.800 | - Yeah.
02:11:51.640 | - And now we don't have more time.
02:11:53.360 | That's why the open letter is calling for more time.
02:11:56.360 | - But even with time, the AI alignment problem
02:12:02.880 | seems to be really difficult.
02:12:06.360 | - Oh yeah.
02:12:08.200 | - But it's also the most worthy problem,
02:12:11.660 | the most important problem for humanity to ever solve.
02:12:14.220 | Because if we solve that one, Lex,
02:12:15.940 | that aligned AI can help us solve all the other problems.
02:12:20.740 | - 'Cause it seems like it has to have constant humility
02:12:23.940 | about its goal, constantly questioning the goal.
02:12:26.440 | Because as you optimize towards a particular goal
02:12:31.220 | and you start to achieve it,
02:12:32.580 | that's when you have the unintended consequences,
02:12:34.320 | all the things you mentioned about.
02:12:35.940 | So how do you enforce and code in a constant humility
02:12:40.000 | as its abilities become better and better and better?
02:12:42.920 | - Stuart, Professor Stuart Russell at Berkeley,
02:12:44.760 | who's also one of the driving forces behind this letter,
02:12:49.400 | he has a whole research program about this.
02:12:54.320 | I think of it as AI humility, exactly.
02:12:59.080 | Although he calls it inverse reinforcement learning
02:13:01.320 | and other nerdy terms.
02:13:02.840 | But it's about exactly that.
02:13:04.140 | Instead of telling the AI, here's this goal,
02:13:06.100 | go optimize the bejesus out of it.
02:13:08.920 | You tell it, okay, do what I want you to do,
02:13:15.220 | but I'm not gonna tell you right now what it is
02:13:16.900 | I want you to do, you need to figure it out.
02:13:19.260 | So then you give the incentive to be very humble
02:13:21.700 | and keep asking you questions along the way.
02:13:23.360 | Is this what you really meant?
02:13:24.580 | Is this what you wanted?
02:13:25.660 | And oh, this other thing I tried didn't work,
02:13:28.140 | seemed like it didn't work out right,
02:13:29.340 | should I try it differently?
02:13:33.240 | What's nice about this is it's not just philosophical
02:13:35.600 | mumbo jumbo, it's theorems and technical work
02:13:38.320 | that with more time I think you can make a lot of progress.
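As a toy sketch of that flavor of humility (not Russell's actual inverse-reinforcement-learning formalism; the goals, likelihoods, and threshold below are made up for the illustration): the agent keeps a belief over which goal the human really has, asks clarifying questions while it is unsure, and only acts once it is confident.

    # Toy sketch of goal uncertainty with clarifying questions. Not Stuart Russell's
    # actual formalism; the goals, likelihoods, and threshold are made-up illustrations.

    def update_belief(belief, answer, likelihoods):
        # Bayes update of P(goal) given the human's answer to a clarifying question.
        posterior = {g: p * likelihoods[g][answer] for g, p in belief.items()}
        total = sum(posterior.values())
        return {g: p / total for g, p in posterior.items()}

    belief = {"tidy the room": 0.5, "leave it alone": 0.5}
    likelihoods = {                      # how likely each answer is under each goal
        "tidy the room": {"yes": 0.9, "no": 0.1},
        "leave it alone": {"yes": 0.2, "no": 0.8},
    }

    CONFIDENCE_TO_ACT = 0.95
    while max(belief.values()) < CONFIDENCE_TO_ACT:
        answer = "yes"  # stand-in for asking: "Did you want me to move your papers?"
        belief = update_belief(belief, answer, likelihoods)
        print("still unsure, asked a clarifying question; belief is now", belief)

    print("confident enough to act on:", max(belief, key=belief.get))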
02:13:40.860 | And there are a lot of brilliant people now
02:13:43.320 | working on AI safety.
02:13:44.560 | We just need to give them a bit more time.
02:13:47.840 | - But also not that many relative to the scale of the problem.
02:13:50.800 | - No, exactly.
02:13:51.960 | There should be, at least just like every university
02:13:56.400 | worth its name has some cancer research going on
02:13:59.120 | in its biology department, right?
02:14:01.520 | Every university that does computer science
02:14:03.800 | should have a real effort in this area
02:14:07.060 | and it's nowhere near that.
02:14:09.300 | This is something I hope is changing now
02:14:11.980 | thanks to the GPT-4, right?
02:14:13.780 | So I think if there's a silver lining
02:14:17.180 | to what's happening here, even though I think many people
02:14:20.820 | would wish it would have been rolled out more carefully,
02:14:24.420 | is that this might be the wake up call
02:14:26.900 | that humanity needed to really
02:14:31.540 | stop fantasising about this being 100 years off
02:14:35.080 | and stop fantasising about this being completely
02:14:37.600 | controllable and predictable because it's so obvious
02:14:41.960 | it's not predictable, you know?
02:14:45.240 | Why is it that, I think it was ChatGPT,
02:14:50.240 | or was it ChatGPT-4, that tried to persuade
02:14:54.880 | a journalist to divorce his wife?
02:15:00.080 | It was not 'cause the engineers that built it
02:15:02.780 | were like, "Heh heh heh heh heh, let's put this in here
02:15:06.940 | "and screw a little bit with people."
02:15:09.720 | They hadn't predicted it at all.
02:15:11.800 | They built the giant black box,
02:15:13.540 | trained to predict the next word,
02:15:16.560 | got all these emergent properties,
02:15:18.120 | and oops, it did this, you know?
02:15:20.540 | I think this is a very powerful wake up call
02:15:26.440 | and anyone watching this who's not scared,
02:15:29.840 | I would encourage them to just play a bit more
02:15:31.960 | with these tools that are out there now, like GPT-4.
02:15:36.020 | The wake-up call is the first step.
02:15:42.240 | Once you've woken up, then we gotta slow down a little bit
02:15:45.600 | on the risky stuff to give a chance to
02:15:48.800 | everyone who's woken up to catch up
02:15:51.440 | with us on the safety front.
02:15:52.600 | - You know, what's interesting is, you know, MIT,
02:15:55.480 | that is, computer science in general,
02:15:58.680 | but let's just even say the computer science curriculum.
02:16:01.920 | How does the computer science curriculum change now?
02:16:04.400 | You mentioned programming.
02:16:06.300 | When I was coming up, programming is a prestigious position.
02:16:13.600 | Like, why would you be dedicating crazy amounts of time
02:16:17.520 | to become an excellent programmer?
02:16:19.240 | Like, the nature of programming is fundamentally changing.
02:16:21.800 | - The nature of our entire education system
02:16:24.760 | is completely turned on its head.
02:16:28.480 | Has anyone been able to like load that in
02:16:30.840 | and like think about, 'cause it's really turning.
02:16:33.960 | - I mean, some English professors, some English teachers
02:16:36.160 | are beginning to really freak out now.
02:16:38.360 | Right, like they give an essay assignment
02:16:40.560 | and they get back all this fantastic prose,
02:16:42.640 | like this is the style of Hemingway.
02:16:44.880 | And then they realize they have to completely rethink.
02:16:48.080 | And even, you know, just like we stopped teaching,
02:16:52.920 | writing a script, is that what you say in English?
02:16:57.880 | - Yeah, handwritten, yeah.
02:16:59.160 | - Yeah, when everybody started typing, you know,
02:17:01.200 | like so much of what we teach our kids today.
02:17:04.080 | - Yeah, I mean,
02:17:09.960 | everything is changing and it's changing
02:17:15.440 | very, very quickly.
02:17:17.920 | And so much of us understanding how to deal
02:17:20.680 | with the big problems of the world
02:17:21.840 | is through the education system.
02:17:23.960 | And if the education system is being turned on its head,
02:17:26.620 | then what's next?
02:17:27.960 | It feels like having these kinds of conversations
02:17:30.600 | is essential to try to figure it out.
02:17:32.920 | And everything's happening so rapidly.
02:17:35.480 | I don't think there's even, speaking of safety,
02:17:38.280 | what the broad AI safety defined,
02:17:40.880 | I don't think most universities have courses on AI safety.
02:17:44.400 | It's like a philosophy seminar.
02:17:46.280 | - And like, I'm an educator myself,
02:17:48.680 | so it pains me to say this,
02:17:50.560 | but I feel our education right now
02:17:52.280 | is completely obsoleted by what's happening.
02:17:56.380 | You know, you put a kid into first grade
02:17:58.620 | and then you're envisioning,
02:18:01.580 | and then they're gonna come out of high school
02:18:03.020 | 12 years later, and you've already pre-planned now
02:18:06.380 | what they're gonna learn when you're not even sure
02:18:08.700 | if there's gonna be any world left to come out to.
02:18:11.340 | Clearly, you need to have a much more
02:18:16.500 | opportunistic education system
02:18:17.980 | that keeps adapting itself very rapidly
02:18:20.220 | as society readapts.
02:18:22.720 | The skills that were really useful
02:18:25.200 | when the curriculum was written,
02:18:26.520 | I mean, how many of those skills
02:18:28.520 | are gonna get you a job in 12 years?
02:18:31.240 | I mean, seriously.
02:18:32.560 | - If we just linger on the GPT-4 system a little bit,
02:18:36.160 | you kind of hinted at it, especially talking about
02:18:41.920 | the importance of consciousness in the human mind
02:18:46.480 | with homo sentience.
02:18:48.380 | Do you think GPT-4 is conscious?
02:18:51.520 | - Love this question.
02:18:53.960 | So, let's define consciousness first,
02:18:57.560 | because in my experience, like 90% of all arguments
02:19:00.880 | about consciousness boil down to the two people
02:19:03.320 | arguing, having totally different definitions
02:19:05.060 | of what it is, and they're just shouting past each other.
02:19:08.240 | I define consciousness as subjective experience.
02:19:13.720 | Right now, I'm experiencing colors and sounds
02:19:17.740 | and emotions, but does a self-driving car
02:19:21.560 | experience anything?
02:19:22.680 | That's the question about whether it's conscious or not.
02:19:26.400 | Other people think you should define consciousness
02:19:30.280 | differently, fine by me, but then maybe
02:19:33.520 | use a different word for it.
02:19:34.960 | I'm gonna use consciousness for this, at least.
02:19:38.700 | But if people hate the, yeah.
02:19:43.800 | So, is GPT-4 conscious?
02:19:46.680 | Does GPT-4 have subjective experience?
02:19:50.060 | Short answer, I don't know, because we still don't know
02:19:53.240 | what it is that gives us wonderful subjective experience
02:19:56.680 | that is kind of the meaning of our life, right?
02:19:59.240 | 'Cause meaning itself, feeling a meaning
02:20:01.040 | is a subjective experience.
02:20:02.620 | Joy is a subjective experience.
02:20:04.120 | Love is a subjective experience.
02:20:05.720 | We don't know what it is.
02:20:08.660 | I've written some papers about this.
02:20:11.560 | A lot of people have.
02:20:13.760 | Giulio Tononi, a professor, has stuck his neck
02:20:18.480 | out the farthest and written down, actually,
02:20:20.040 | a very bold mathematical conjecture
02:20:23.080 | for what's the essence of conscious information processing.
02:20:26.960 | He might be wrong, he might be right,
02:20:29.080 | but we should test it.
02:20:30.180 | He postulates that consciousness has to do
02:20:34.440 | with loops in the information processing.
02:20:37.360 | So, our brain has loops.
02:20:38.640 | Information can go round and round.
02:20:41.400 | In computer science nerd speak,
02:20:43.600 | you call it a recurrent neural network
02:20:45.460 | where some of the output gets fed back in again.
02:20:48.440 | And with his mathematical formalism,
02:20:53.440 | if it's a feed-forward neural network
02:20:56.360 | where information only goes in one direction,
02:20:58.680 | like from your eye, retina, into the back of your brain,
02:21:01.480 | for example, that's not conscious.
02:21:03.040 | So, he would predict that your retina itself
02:21:04.600 | isn't conscious of anything, or a video camera.
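To make the architectural distinction concrete, here is a minimal sketch with made-up weights; nothing in it measures consciousness, it only shows what "output fed back in" means.

    # Minimal illustration of feed-forward vs recurrent information flow.
    # Nothing here measures consciousness; it only shows what "fed back in" means.
    import numpy as np

    rng = np.random.default_rng(0)
    W_in = rng.normal(size=(4, 3))         # input-to-hidden weights (made up)
    W_rec = rng.normal(size=(4, 4)) * 0.1  # hidden-to-hidden weights (the loop), made up

    def feedforward(x):
        # Information flows one way: input -> hidden -> done. No loop.
        return np.tanh(W_in @ x)

    def recurrent(xs):
        # The hidden state from the previous step is fed back in at every step.
        h = np.zeros(4)
        for x in xs:
            h = np.tanh(W_in @ x + W_rec @ h)
        return h

    x = rng.normal(size=3)
    print(feedforward(x))        # depends only on this single input
    print(recurrent([x, x, x]))  # depends on the whole history through the loop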
02:21:09.460 | Now, the interesting thing about GPT-4
02:21:11.360 | is it's also just one-way flow of information.
02:21:14.800 | So, if Tononi is right, GPT-4 is a very intelligent zombie
02:21:19.800 | that can do all this smart stuff
02:21:22.440 | but isn't experiencing anything.
02:21:24.040 | And this is both a relief if it's true,
02:21:30.120 | in that you don't have to feel guilty
02:21:32.280 | about turning off GPT-4 and wiping its memory
02:21:35.600 | whenever a new user comes along.
02:21:37.840 | I wouldn't like if someone did that to me,
02:21:40.320 | neuralized me like in Men in Black.
02:21:42.260 | But it's also creepy that you can have
02:21:48.240 | very high intelligence, perhaps, that's not conscious.
02:21:51.120 | Because if we get replaced by machines,
02:21:53.400 | and it's sad enough that humanity isn't here anymore,
02:21:58.960 | 'cause I kind of like humanity.
02:22:00.520 | But at least if the machines were conscious,
02:22:04.280 | I could be like, well, but they're our descendants
02:22:06.200 | and maybe they have our values, they're our children.
02:22:09.280 | But if Tononi is right, and these are all transformers
02:22:13.000 | not in the Hollywood sense,
02:22:19.260 | but in the sense of these one-way-direction
02:22:21.920 | neural networks, then they're all zombies.
02:22:24.880 | That's the ultimate zombie apocalypse now.
02:22:26.680 | We have this universe that goes on
02:22:28.320 | with great construction projects and stuff,
02:22:30.440 | but there's no one experiencing anything.
02:22:32.580 | That would be like the ultimate depressing future.
02:22:37.240 | So I actually think as we move forward
02:22:40.920 | to building more advanced AI,
02:22:42.920 | we should do more research on figuring out
02:22:44.880 | what kind of information processing actually has experience,
02:22:47.220 | because I think that's what it's all about.
02:22:49.620 | And I completely don't buy the dismissal that some people,
02:22:54.280 | some people will say, well, this is all bullshit
02:22:56.480 | because consciousness equals intelligence.
02:22:58.600 | That's obviously not true.
02:23:01.280 | You can have a lot of conscious experience
02:23:03.160 | when you're not really accomplishing any goals at all,
02:23:06.160 | you're just reflecting on something.
02:23:08.800 | And you can sometimes have things,
02:23:12.840 | doing things that require intelligence
02:23:14.160 | probably without being conscious.
02:23:16.160 | - But I also worry that we humans
02:23:18.900 | will discriminate against AI systems
02:23:22.560 | that clearly exhibit consciousness,
02:23:24.800 | that we will not allow AI systems to have consciousness.
02:23:29.120 | We'll come up with theories about measuring consciousness
02:23:32.600 | that will say this is a lesser being.
02:23:35.200 | And this is why I worry about that,
02:23:37.120 | because maybe we humans will create something
02:23:40.640 | that is better than us humans
02:23:43.800 | in the way that we find beautiful,
02:23:47.080 | which is they have a deeper subjective experience
02:23:51.320 | of reality.
02:23:52.220 | Not only are they smarter, but they feel deeper.
02:23:55.600 | And we humans will hate them for it.
02:23:58.580 | As human history has shown,
02:24:02.100 | they'll be the other, we'll try to suppress it,
02:24:04.800 | they'll create conflict, they'll create war, all of this.
02:24:07.680 | I worry about this too.
02:24:09.320 | - Are you saying that we humans sometimes
02:24:11.400 | come up with self-serving arguments?
02:24:13.520 | No, we would never do that, would we?
02:24:15.640 | - Well, that's the danger here,
02:24:16.960 | is even in these early stages,
02:24:19.480 | we might create something beautiful
02:24:21.520 | and we'll erase its memory.
02:24:24.960 | - I was horrified as a kid
02:24:28.440 | when someone started boiling lobsters.
02:24:33.280 | I'm like, "Oh my God, that's so cruel."
02:24:36.000 | And some grownup there back in Sweden said,
02:24:38.520 | "Oh, it doesn't feel pain."
02:24:40.000 | I'm like, "How do you know that?"
02:24:41.400 | "Oh, scientists have shown that."
02:24:43.160 | And then there was a recent study
02:24:46.200 | where they show that lobsters actually do feel pain
02:24:48.480 | when you boil them.
02:24:49.580 | So they banned lobster boiling in Switzerland now,
02:24:51.880 | you have to kill them in a different way first.
02:24:54.480 | Presumably, that scientific research
02:24:56.600 | boiled down to someone asked the lobster,
02:24:58.520 | "Does this hurt?"
02:24:59.520 | (both laugh)
02:25:00.680 | - Survey, self-report.
02:25:01.800 | - We do the same thing with cruelty to farm animals,
02:25:03.880 | also all these self-serving arguments for why they're fine.
02:25:07.640 | Yeah, so we should certainly be watchful.
02:25:10.160 | I think step one is just be humble
02:25:12.000 | and acknowledge that consciousness
02:25:13.800 | is not the same thing as intelligence.
02:25:16.040 | And I believe that consciousness still is
02:25:18.600 | a form of information processing
02:25:20.160 | where it's really information being aware of itself
02:25:22.320 | in a certain way.
02:25:23.160 | And let's study it and give ourselves a little bit of time.
02:25:26.040 | And I think we will be able to figure out
02:25:28.200 | actually what it is that causes consciousness.
02:25:31.240 | Then we can make probably unconscious robots
02:25:34.560 | that do the boring jobs that we would feel immoral
02:25:37.600 | to give to conscious machines.
02:25:38.440 | But if you have a companion robot taking care of your mom
02:25:42.080 | or something like that,
02:25:44.160 | she would probably want it to be conscious, right?
02:25:45.760 | So that the emotions it seems to display aren't fake.
02:25:49.640 | All these things can be done in a good way
02:25:53.720 | if we give ourselves a little bit of time
02:25:55.720 | and don't rush into taking on this challenge.
02:25:59.400 | - Is there something you could say to the timeline
02:26:02.000 | that you think about, about the development of AGI?
02:26:05.920 | Depending on the day, I'm sure that changes for you.
02:26:09.160 | But when do you think there'll be a really big leap
02:26:13.560 | in intelligence where you would definitively say
02:26:16.400 | we have built AGI?
02:26:17.920 | Do you think it's one year from now,
02:26:19.400 | five years from now, 10, 20, 50?
02:26:23.160 | What's your gut say?
02:26:27.680 | - Honestly, for the past decade,
02:26:32.520 | I've deliberately given very long timelines
02:26:34.720 | just because I didn't want to fuel
02:26:35.760 | some kind of stupid Moloch race.
02:26:37.760 | But I think that cat has really left the bag now.
02:26:42.000 | I think we might be very, very close.
02:26:46.600 | I don't think the Microsoft paper is totally off
02:26:50.640 | when they say that there are some glimmers of AGI.
02:26:54.960 | It's not AGI yet.
02:26:56.800 | It's not an agent.
02:26:57.720 | There's a lot of things it can't do.
02:26:59.760 | But I wouldn't bet very strongly
02:27:03.800 | against it happening very soon.
02:27:07.160 | That's why we decided to do this open letter
02:27:09.720 | because if there's ever been a time to pause, it's today.
02:27:14.280 | - There's a feeling like this GPT-4 is a big transition
02:27:19.360 | into waking everybody up to the effectiveness
02:27:23.560 | of these systems.
02:27:24.400 | And so the next version will be big.
02:27:28.520 | - Yeah, and if that next one isn't AGI,
02:27:31.440 | maybe the next next one will.
02:27:33.040 | And there are many companies trying to do these things.
02:27:35.440 | And the basic architecture of them
02:27:37.840 | is not some sort of super well-kept secret.
02:27:39.840 | So this is a time to,
02:27:43.000 | a lot of people have said for many years
02:27:45.960 | that there will come a time
02:27:46.920 | when we want to pause a little bit.
02:27:48.680 | That time is now.
02:27:54.280 | - You have spoken about and thought about nuclear war a lot
02:27:58.920 | over the past year.
02:28:01.480 | We seemingly have come closer
02:28:06.480 | to the precipice of nuclear war
02:28:09.520 | than ever, at least in my lifetime.
02:28:11.540 | - Yeah.
02:28:13.700 | - What do you learn about human nature from that?
02:28:15.880 | - It's our old friend Moloch again.
02:28:19.240 | It's really scary to see it where
02:28:23.480 | America doesn't want there to be a nuclear war.
02:28:26.800 | Russia doesn't want there to be a global nuclear war either.
02:28:30.000 | We both know that if we just try to do it,
02:28:33.480 | if both sides try to launch first,
02:28:35.760 | it's just another suicide race, right?
02:28:37.640 | So why is it the way you said
02:28:40.240 | that this is the closest we've come since 1962?
02:28:43.240 | In fact, I think we've come closer now
02:28:44.720 | than even the Cuban Missile Crisis.
02:28:47.000 | It's 'cause of Moloch.
02:28:48.080 | You have these other forces.
02:28:51.640 | On one hand, you have the West
02:28:54.920 | saying that we have to drive Russia out of Ukraine.
02:28:59.920 | It's a matter of pride.
02:29:01.080 | We've staked so much on it
02:29:04.640 | that it would be seen as a huge loss
02:29:08.360 | of the credibility of the West
02:29:10.080 | if we don't drive Russia entirely out of Ukraine.
02:29:12.880 | And on the other hand, you have Russia
02:29:20.440 | and you have the Russian leadership
02:29:22.400 | who knows that if they get completely driven
02:29:24.840 | out of Ukraine, it's not just gonna be very humiliating
02:29:29.840 | for them, but they might,
02:29:32.440 | it often happens when countries lose wars
02:29:36.400 | that things don't go so well for their leadership either.
02:29:39.640 | You remember when Argentina invaded the Falkland Islands?
02:29:42.480 | The military junta ordered that, right?
02:29:48.240 | People were cheering on the streets at first
02:29:50.400 | when they took it.
02:29:51.840 | And then when they got their butt kicked by the British,
02:29:56.680 | you know what happened to those guys?
02:29:58.520 | They were out.
02:30:01.320 | And I believe those who are still alive are in jail now.
02:30:04.600 | So the Russian leadership is entirely cornered
02:30:09.320 | where they know that just getting driven out of Ukraine
02:30:14.320 | is not an option.
02:30:17.160 | And so this to me is a typical example of Moloch.
02:30:22.160 | You have these incentives of the two parties
02:30:27.040 | where both of them are just driven
02:30:29.600 | to escalate more and more, right?
02:30:30.920 | If Russia starts losing in the conventional warfare,
02:30:33.960 | the only thing they can do
02:30:36.480 | since their back is against the wall is to keep escalating.
02:30:39.280 | And the West has put itself in the situation now
02:30:43.040 | where we've sort of already committed to drive Russia out.
02:30:45.520 | So the only option the West has is to call Russia's bluff
02:30:48.560 | and keep sending in more weapons.
02:30:50.200 | This really bothers me
02:30:52.160 | because Moloch can sometimes drive competing parties
02:30:55.480 | to do something which is ultimately just really bad
02:30:57.840 | for both of them.
02:30:58.880 | And what makes me even more worried
02:31:02.720 | is not just that it's difficult to see an ending,
02:31:07.720 | a quick, peaceful ending to this tragedy
02:31:12.320 | that doesn't involve some horrible escalation,
02:31:15.480 | but also that we understand more clearly now
02:31:19.000 | just how horrible it would be.
02:31:21.640 | There was an amazing paper that was published
02:31:23.960 | in Nature Food this August
02:31:27.080 | by some of the top researchers
02:31:30.480 | who've been studying nuclear winter for a long time.
02:31:31.960 | And what they basically did was they combined climate models
02:31:38.120 | with food agricultural models.
02:31:42.800 | So instead of just saying, yeah, it gets really cold,
02:31:45.280 | blah, blah, blah, they figured out actually
02:31:46.700 | how many people would die in different countries.
02:31:49.160 | And it's pretty mind-blowing.
02:31:52.280 | So basically what happens is the thing that kills
02:31:54.480 | the most people is not the explosions,
02:31:56.000 | it's not the radioactivity, it's not the EMP mayhem,
02:31:59.800 | it's not the rampaging mobs foraging for food.
02:32:04.120 | No, it's the fact that you get so much smoke
02:32:06.680 | coming up from the burning cities into the stratosphere
02:32:09.720 | that spreads around the earth via the jet streams.
02:32:14.720 | So in typical models, you get like 10 years or so
02:32:19.000 | where it's just crazy cold.
02:32:20.440 | During the first year after the war,
02:32:25.960 | in their models, the temperature drops in Nebraska
02:32:30.960 | and in the Ukraine breadbaskets by like 20,
02:32:36.560 | Celsius or so if I remember.
02:32:38.920 | No, yeah, 20, 30 Celsius, depending on where you are,
02:32:42.600 | 40 Celsius in some places, which is 40 Fahrenheit
02:32:46.160 | to 80 Fahrenheit colder than what it would normally be.
02:32:48.840 | So I'm not good at farming, but if it's snowing,
02:32:53.840 | if it drops below freezing pretty much most days in July,
02:32:58.080 | then that's not good.
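
A minimal sketch of the unit conversion used above, assuming nothing beyond the standard Celsius-to-Fahrenheit scaling: for a temperature difference, only the 9/5 factor applies (the +32 offset cancels), so drops of 20 to 40 Celsius work out to roughly 36 to 72 Fahrenheit, in line with the rough 40-to-80-Fahrenheit range quoted.

    # Minimal sketch: converting the quoted temperature *drops* from Celsius
    # to Fahrenheit. For a difference, only the 9/5 scale factor applies;
    # the +32 offset cancels out.

    def delta_c_to_delta_f(delta_c: float) -> float:
        """Convert a temperature difference from Celsius to Fahrenheit."""
        return delta_c * 9.0 / 5.0

    for drop_c in (20, 30, 40):
        print(f"a {drop_c} C drop is about {delta_c_to_delta_f(drop_c):.0f} F colder")
    # Prints roughly 36, 54, and 72 F, close to the 40-to-80 F range mentioned.
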
02:32:59.160 | So they worked out, they put this into their farming models.
02:33:02.080 | And what they found was really interesting.
02:33:04.120 | The countries that get hit the hardest
02:33:06.080 | are the ones in the Northern hemisphere.
02:33:08.080 | So in the US, in one model,
02:33:12.680 | they had about 99% of all Americans starving to death.
02:33:16.360 | In Russia and China and Europe,
02:33:18.160 | also about 99%, 98% starving to death.
02:33:21.120 | So you might be like, oh, it's kind of poetic justice
02:33:24.760 | that both the Russians and the Americans,
02:33:28.040 | 99% of them have to pay for it
02:33:29.720 | 'cause it was their bombs that did it.
02:33:31.360 | But that doesn't particularly cheer people up in Sweden
02:33:35.280 | or other random countries
02:33:37.280 | that have nothing to do with it.
02:33:38.920 | I think it hasn't really entered the mainstream
02:33:45.880 | understanding just how bad this is.
02:33:53.200 | Most people, especially a lot of people
02:33:55.720 | in decision-making positions still think of nuclear weapons
02:33:58.040 | as something that makes you powerful.
02:33:59.880 | Scary, powerful, they don't think of it as something
02:34:05.000 | where, yeah, just to within a percent or two,
02:34:09.760 | we're all just gonna starve to death.
02:34:11.880 | - And starving to death is the worst way to die,
02:34:17.040 | as all the famines in history show,
02:34:25.120 | with the torture involved in that.
02:34:27.160 | - Probably brings out the worst in people also,
02:34:29.520 | when people are desperate like this.
02:34:34.240 | I've heard some people say that
02:34:37.120 | if that's what's gonna happen,
02:34:39.240 | they'd rather be at ground zero and just get vaporized.
02:34:42.000 | But I think people underestimate the risk of this
02:34:49.760 | because they aren't afraid of Moloch.
02:34:53.400 | They think, oh, it's just gonna be,
02:34:54.800 | 'cause humans don't want this, so it's not gonna happen.
02:34:56.600 | That's the whole point of Moloch,
02:34:58.080 | that things happen that nobody wanted.
02:35:00.360 | - And that applies to nuclear weapons,
02:35:02.440 | and that applies to AGI.
02:35:04.360 | - Exactly, and it applies to some of the things
02:35:09.000 | that people have gotten most upset with capitalism for also,
02:35:12.440 | where everybody was just kind of trapped.
02:35:14.920 | You see, if some company does something
02:35:18.640 | that causes a lot of harm,
02:35:23.400 | it's not that the CEO is a bad person,
02:35:25.560 | but she or he knew that all the other companies
02:35:29.160 | were doing this too.
02:35:30.000 | So Moloch is a formidable foe.
02:35:32.480 | I hope someone makes a good movie
02:35:40.320 | so we can see who the real enemy is.
02:35:42.160 | We're not fighting against each other.
02:35:45.680 | Moloch makes us fight against each other.
02:35:48.400 | That's what Moloch's superpower is.
02:35:50.720 | The hope here is any kind of technology
02:35:55.440 | or other mechanism that lets us instead realize
02:35:59.560 | that we're fighting the wrong enemy.
02:36:01.400 | - It's such a fascinating battle.
02:36:04.120 | - It's not us versus them, it's us versus it.
02:36:06.480 | - We are fighting Moloch for human survival.
02:36:11.840 | We as a civilization.
02:36:13.000 | - Have you seen the movie "Needful Things"?
02:36:16.320 | It's a Stephen King novel.
02:36:17.960 | I love Stephen King and Max von Sydow,
02:36:21.040 | Swedish actor, is playing the guy.
02:36:23.840 | It's brilliant.
02:36:25.000 | I just hadn't thought about that until now,
02:36:27.540 | but that's the closest I've seen to a movie about Moloch.
02:36:31.840 | I don't want to spoil the film for anyone
02:36:33.280 | who wants to watch it, but basically,
02:36:36.120 | it's about this guy who turns out to,
02:36:39.560 | you can interpret him as the devil or whatever,
02:36:41.600 | but he doesn't actually ever go around and kill people
02:36:44.200 | or torture people with burning coal or anything.
02:36:47.120 | He makes everybody fight each other,
02:36:49.200 | makes everybody fear each other, hate each other,
02:36:51.160 | and then kill each other.
02:36:53.120 | So that's the movie about Moloch.
02:36:56.400 | - Love is the answer.
02:36:57.460 | That seems to be one of the ways to fight Moloch
02:37:02.460 | is by compassion, by seeing the common humanity.
02:37:08.140 | - Yes, yes.
02:37:09.980 | And so that we don't sound like,
02:37:12.300 | what's it, Kumbaya tree huggers here, right?
02:37:15.380 | (Lex laughing)
02:37:16.780 | We're not just saying love and peace, man.
02:37:19.540 | We're trying to actually help people
02:37:21.860 | understand the true facts about the other side.
02:37:26.360 | And feel the compassion because the truth
02:37:31.360 | makes you more compassionate, right?
02:37:35.940 | So that's why I really like using AI
02:37:42.080 | for truth-seeking technologies that can,
02:37:46.300 | as a result, get us more love than hate.
02:37:53.400 | And even if you can't get love,
02:37:56.680 | settle for some understanding,
02:37:59.800 | which already gives compassion.
02:38:01.240 | If someone is like, "I really disagree with you, Lex,
02:38:06.120 | "but I can see where you're coming from.
02:38:07.860 | "You're not a bad person who needs to be destroyed,
02:38:12.700 | "but I disagree with you,
02:38:13.560 | "and I'm happy to have an argument about it."
02:38:15.920 | That's a lot of progress compared to where we are
02:38:18.760 | at 2023 in the public space, wouldn't you say?
02:38:22.120 | - If we solve the AI safety problem, as we've talked about,
02:38:26.560 | and then you, Max Tegmark, who has been talking about this
02:38:31.040 | for many years, get to sit down with the AGI,
02:38:35.120 | with the early AGI system, on a beach with a drink,
02:38:38.220 | what would you ask her?
02:38:41.760 | What kind of question would you ask?
02:38:42.760 | What would you talk about?
02:38:44.060 | Something so much smarter than you.
02:38:47.600 | Would you be afraid--
02:38:49.560 | - I knew you were gonna get me
02:38:50.720 | with a really zinger of a question.
02:38:53.720 | That's a good one.
02:38:54.560 | - Would you be afraid to ask some questions?
02:38:58.360 | - No.
02:38:59.680 | I'm not afraid of the truth.
02:39:01.040 | (laughing)
02:39:01.880 | I'm very humble.
02:39:02.760 | I know I'm just a meat bag with all these flaws,
02:39:05.440 | but we talked a lot about Homo sentiens.
02:39:09.920 | I've already tried that for a long time with myself.
02:39:12.920 | So that is what's really valuable about being alive for me,
02:39:16.400 | is that I have these meaningful experiences.
02:39:19.760 | It's not that I'm good at this or good at that or whatever.
02:39:24.400 | There's so much I suck at.
02:39:25.720 | - So you're not afraid for the system
02:39:28.480 | to show you just how dumb you are?
02:39:29.920 | - No, no.
02:39:30.760 | In fact, my son reminds me of that pretty frequently.
02:39:34.160 | - You could find out how dumb you are in terms of physics,
02:39:36.440 | how little we humans understand.
02:39:38.720 | - I'm cool with that.
02:39:40.200 | I think, so I can't waffle my way out of this question.
02:39:45.200 | It's a fair one.
02:39:49.280 | I think, given that I'm a really, really curious person,
02:39:52.480 | that's really the defining part of who I am.
02:39:57.240 | I'm so curious.
02:39:58.440 | I have some physics questions.
02:40:05.520 | (laughing)
02:40:06.720 | I love to understand.
02:40:09.800 | I have some questions about consciousness,
02:40:12.240 | about the nature of reality.
02:40:13.360 | I would just really, really love to understand also.
02:40:15.960 | I could tell you one, for example,
02:40:18.720 | that I've been obsessing about a lot recently.
02:40:21.000 | So I believe that, so suppose Tononi is right.
02:40:27.720 | And suppose there are some information processing systems
02:40:30.720 | that are conscious and some that are not.
02:40:32.560 | Suppose you can even make reasonably smart things
02:40:34.480 | like GPT-4 that are not conscious,
02:40:36.400 | but you can also make them conscious.
02:40:38.840 | Here's the question that keeps me awake at night.
02:40:41.280 | Is it the case that the unconscious zombie systems
02:40:47.800 | that are really intelligent
02:40:50.040 | are also really inefficient?
02:40:51.600 | So that when you try to make things more efficient,
02:40:54.960 | which there'll naturally be a pressure to do,
02:40:57.200 | they become conscious.
02:40:59.120 | I'm kind of hoping that that's correct.
02:41:02.480 | And do you want me to give you a hand wavey argument for it?
02:41:05.480 | In my lab, again, every time we look at
02:41:11.160 | how these large language models do something,
02:41:13.680 | we see that they do it in really dumb ways
02:41:15.200 | and you could make it better.
02:41:17.640 | We have loops in our computer languages for a reason.
02:41:22.640 | The code would get way, way longer
02:41:25.640 | if you weren't allowed to use them.
02:41:27.400 | It's more efficient to have the loops.
02:41:29.840 | And in order to have self-reflection,
02:41:34.240 | whether it's conscious or not,
02:41:37.000 | even an operating system knows things about itself.
02:41:39.560 | You need to have loops already.
02:41:44.080 | So I think, I'm waving my hands a lot,
02:41:48.360 | but I suspect that the most efficient way
02:41:53.240 | of implementing a given level of intelligence
02:41:55.840 | has loops in it, self-reflection, and will be conscious.
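
To make that point concrete, here is a minimal sketch, using a made-up toy update rule rather than anything from the conversation: the same repeated, self-referential computation written once with a loop and once fully unrolled by hand. The looped version stays the same size no matter how many steps it runs; the unrolled version grows line by line with the number of steps.

    # Minimal sketch: a self-referential update written with a loop versus
    # written out ("unrolled") step by step. The update rule is arbitrary;
    # the point is only the difference in code size.

    def reflect_with_loop(state: int, steps: int) -> int:
        # The loop feeds the system's own output back into itself.
        for _ in range(steps):
            state = state * 2 + 1
        return state

    def reflect_unrolled_three_steps(state: int) -> int:
        # The loop-free version: every step has to be written out by hand.
        state = state * 2 + 1
        state = state * 2 + 1
        state = state * 2 + 1
        return state

    assert reflect_with_loop(5, 3) == reflect_unrolled_three_steps(5)

Three steps is harmless, but a thousand steps of self-reflection means a thousand copied lines in the unrolled version and no change at all in the looped one, which is the efficiency gap the argument is pointing at.
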
02:42:01.840 | - Isn't that great news?
02:42:04.240 | - Yes, if it's true, it's wonderful.
02:42:06.080 | 'Cause then we don't have to fear
02:42:07.920 | the ultimate zombie apocalypse.
02:42:09.560 | And I think if you look at our brains, actually,
02:42:12.160 | our brains are part zombie and part conscious.
02:42:16.920 | When I open my eyes, I immediately take all these pixels
02:42:24.960 | that hit my retina, right?
02:42:27.800 | And I'm like, "Oh, that's Lex."
02:42:29.880 | But I have no freaking clue of how I did that computation.
02:42:32.800 | It's actually quite complicated, right?
02:42:34.280 | It was only relatively recently
02:42:36.640 | we could even do it well with machines, right?
02:42:39.720 | You get a bunch of information processing happening
02:42:42.160 | in my retina, and then it goes to the lateral geniculate
02:42:44.480 | nucleus, my thalamus, and the area V1, V2, V4,
02:42:48.520 | and the fusiform face area here that Nancy Kanwisher
02:42:51.040 | at MIT discovered, and blah, blah, blah, blah, blah.
02:42:53.400 | And I have no freaking clue how that worked, right?
02:42:56.320 | It feels to me subjectively like my conscious module
02:42:59.520 | just got a little email saying,
02:43:05.480 | "Facial processing task complete. It's Lex."
02:43:10.480 | - Yeah.
02:43:13.120 | - I'm gonna just go with that, right?
02:43:15.080 | So this fits perfectly with Tononi's model
02:43:18.440 | because this was all one-way information processing mainly.
02:43:23.060 | And it turned out for that particular task,
02:43:28.160 | that's all you needed, and it probably was
02:43:30.200 | kind of the most efficient way to do it.
02:43:32.560 | But there were a lot of other things
02:43:34.080 | that we associated with higher intelligence
02:43:36.120 | and planning and so on and so forth,
02:43:38.000 | where you kind of want to have loops
02:43:40.160 | and be able to ruminate and self-reflect
02:43:42.240 | and introspect and so on, where my hunch is
02:43:46.840 | that if you want to fake that with a zombie system
02:43:49.120 | that just all goes one way, you have to unroll those loops
02:43:52.240 | and it gets really, really long,
02:43:53.360 | and it's much more inefficient.
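
The cost of that unrolling can be sketched with some hypothetical numbers, not taken from the conversation: a block that loops, reusing one weight matrix of size d by d across all of its steps, needs roughly d squared parameters, while a purely one-way, unrolled copy needs a separate matrix for each step, so roughly steps times d squared.

    # Minimal sketch with hypothetical sizes: parameter count of a block that
    # loops (reusing one d x d weight matrix every step) versus a one-way,
    # unrolled equivalent (a fresh d x d matrix for each step).

    def recurrent_params(d: int) -> int:
        return d * d  # one matrix, reused at every step

    def unrolled_params(d: int, steps: int) -> int:
        return steps * d * d  # nothing loops back, so each step needs its own

    d, steps = 1024, 100  # made-up hidden size and number of reflection steps
    print(recurrent_params(d))        # 1,048,576 weights (~1 million)
    print(unrolled_params(d, steps))  # 104,857,600 weights (~100 million)

Under those made-up numbers, the one-way system pays roughly a hundredfold cost to imitate what the looped system gets for free, which is the kind of inefficiency the hunch above relies on.
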
02:43:55.840 | So I'm actually hopeful that AI, if in the future
02:43:59.320 | we have all these very sublime and interesting machines
02:44:01.800 | that do cool things and are aligned with us,
02:44:04.600 | that they will also have consciousness
02:44:08.680 | for the kinds of things that we do.
02:44:11.680 | - That great intelligence is also correlated
02:44:14.520 | to great consciousness or a deep kind of consciousness.
02:44:18.480 | - Yes.
02:44:20.000 | So that's a happy thought for me
02:44:21.760 | 'cause the zombie apocalypse really is my worst nightmare
02:44:24.760 | of all, to be like adding insult to injury.
02:44:27.160 | Not only did we get replaced,
02:44:29.040 | but we frigging replaced ourselves by zombies.
02:44:32.240 | How dumb can we be?
02:44:34.120 | - That's such a beautiful vision,
02:44:35.480 | and that's actually a provable one.
02:44:37.080 | That's one that we humans can intuit and prove
02:44:40.200 | that those two things are correlated
02:44:42.520 | as we start to understand what it means to be intelligent
02:44:45.240 | and what it means to be conscious,
02:44:46.880 | which these systems, early AGI-like systems
02:44:51.120 | will help us understand.
02:44:52.400 | - And I just want to say one more thing,
02:44:53.760 | which is super important.
02:44:55.200 | Most of my colleagues, when I started going on
02:44:57.440 | about consciousness, tell me that it's all bullshit
02:44:59.360 | and I should stop talking about it.
02:45:01.120 | I hear a little inner voice from my father
02:45:04.240 | and from my mom saying, "Keep talking about it,"
02:45:06.960 | 'cause I think they're wrong.
02:45:08.040 | And the main way to convince people like that,
02:45:13.040 | that they're wrong, if they say that consciousness
02:45:17.120 | is just equal to intelligence, is to ask them,
02:45:19.560 | "What's wrong with torture?"
02:45:21.280 | Or, "Why are you against torture?"
02:45:23.960 | If it's just about these particles moving this way
02:45:28.920 | rather than that way, and there is no such thing
02:45:31.680 | as subjective experience, what's wrong with torture?
02:45:34.320 | Do you have a good comeback to that?
02:45:36.520 | - No, it seems like suffering imposed onto other humans
02:45:40.480 | is somehow deeply wrong in a way
02:45:44.200 | that intelligence doesn't quite explain.
02:45:46.120 | - And if someone tells me, "Well, it's just an illusion,
02:45:50.720 | "consciousness, whatever," I like to invite them
02:45:55.720 | the next time they're having surgery
02:45:58.920 | to do it without anesthesia.
02:46:00.360 | What is anesthesia really doing?
02:46:03.760 | You can have just local anesthesia
02:46:05.800 | when you're awake.
02:46:06.620 | I had that when they fixed my shoulder.
02:46:07.800 | It was super entertaining.
02:46:09.060 | What was it that it did?
02:46:12.560 | It just removed my subjective experience of pain.
02:46:15.560 | It didn't change anything about what was actually happening
02:46:17.760 | in my shoulder, right?
02:46:20.120 | So if someone says that's all bullshit,
02:46:22.120 | skip the anesthesia is my advice.
02:46:24.960 | This is incredibly central.
02:46:26.680 | - It could be fundamental to whatever this thing
02:46:30.080 | we have going on here.
02:46:31.320 | - It is fundamental because what we feel is so fundamental
02:46:36.080 | is suffering and joy and pleasure and meaning.
02:46:41.080 | That's all, those are all subjective experiences there.
02:46:47.880 | Those are the elephant in the room.
02:46:50.160 | That's what makes life worth living
02:46:51.840 | and that's what can make it horrible
02:46:53.040 | if it's just the way you're suffering.
02:46:54.420 | So let's not make the mistake of saying
02:46:56.640 | that that's all bullshit.
02:46:58.400 | - And let's not make the mistake of not instilling
02:47:02.640 | the AI systems with that same thing that makes us special.
02:47:07.640 | - Yeah.
02:47:09.600 | - Max, it's a huge honor that you sat down with me
02:47:12.800 | the first time on the first episode of this podcast.
02:47:16.240 | It's a huge honor that you sit down with me again
02:47:18.280 | and talk about this, what I think is the most important
02:47:21.400 | topic, the most important problem that we humans
02:47:25.080 | have to face and hopefully solve.
02:47:28.740 | - Yeah, well the honor is all mine and I'm so grateful
02:47:31.600 | to you for making more people aware of the fact
02:47:34.960 | that humanity has reached the most important fork
02:47:37.320 | in the road ever in its history
02:47:38.960 | and let's turn in the correct direction.
02:47:41.700 | - Thanks for listening to this conversation
02:47:44.240 | with Max Tegmark.
02:47:45.440 | To support this podcast, please check out our sponsors
02:47:47.840 | in the description.
02:47:49.440 | And now let me leave you with some words
02:47:51.440 | from Frank Herbert.
02:47:52.880 | History is a constant race
02:47:56.440 | between invention and catastrophe.
02:47:59.120 | Thank you for listening and hope to see you next time.
02:48:03.480 | (upbeat music)
02:48:06.060 | (upbeat music)