Max Tegmark: The Case for Halting AI Development | Lex Fridman Podcast #371
Chapters
0:00 Introduction
1:56 Intelligent alien civilizations
14:20 Life 3.0 and superintelligent AI
25:47 Open letter to pause Giant AI Experiments
50:54 Maintaining control
79:44 Regulation
90:34 Job automation
99:48 Elon Musk
121:31 Open source
128:01 How AI may kill all humans
138:32 Consciousness
147:54 Nuclear winter
158:21 Questions for AGI
00:00:01.840 |
that there will come a time when we want to pause a little bit. 00:00:07.200 |
The following is a conversation with Max Tegmark, 00:00:15.440 |
In fact, his first appearance was episode number one 00:00:20.560 |
He is a physicist and artificial intelligence researcher 00:00:24.000 |
at MIT, co-founder of Future of Life Institute, 00:00:27.120 |
and author of Life 3.0: Being Human in the Age 00:00:33.480 |
Most recently, he's a key figure in spearheading 00:00:36.320 |
the open letter calling for a six-month pause 00:00:39.120 |
on giant AI experiments, like training GPT-4. 00:00:47.440 |
on training of models larger than GPT-4 for six months. 00:00:51.880 |
This does not imply a pause or ban on all AI research 00:00:55.000 |
and development or the use of systems that have already 00:01:02.440 |
a very small pool of actors who possess this capability." 00:01:06.920 |
The letter has been signed by over 50,000 individuals, 00:01:09.960 |
including 1,800 CEOs and over 1,500 professors. 00:01:14.480 |
Signatories include Yoshua Bengio, Stuart Russell, 00:01:17.840 |
Elon Musk, Steve Wozniak, Yuval Noah Harari, Andrew Yang, 00:01:26.040 |
of human civilization, where the balance of power 00:01:32.880 |
And Max's mind and his voice is one of the most valuable 00:01:39.520 |
His support, his wisdom, his friendship has been a gift 00:01:55.600 |
You were the first ever guest on this podcast, 00:02:11.200 |
and just acting like I'm somebody who matters, 00:02:14.280 |
that I'm somebody who's interesting to talk to. 00:02:20.200 |
- Thanks to you for putting your heart and soul into this. 00:02:24.280 |
I know when you delve into controversial topics, 00:02:26.880 |
it's inevitable to get hit by what Hamlet talks about, 00:02:33.800 |
It's in an era where YouTube videos are too long 00:02:37.400 |
and now it has to be like a 20-minute TikTok, 00:02:41.880 |
It's just so refreshing to see you going exactly 00:02:44.280 |
against all of the advice and doing these really long form 00:02:57.280 |
the first question I've ever asked on this podcast, 00:03:02.200 |
do you think there's intelligent life out there 00:03:08.800 |
What's your view when you look out to the stars? 00:03:14.120 |
if you define our universe the way most astrophysicists do, 00:03:18.920 |
not as all of space, but the spherical region of space 00:03:34.360 |
in this spherical volume that has invented internet, 00:03:49.960 |
Because if it's true, it means that life is quite rare. 00:03:58.160 |
of advanced consciousness, which if we nurture it 00:04:01.320 |
and help it grow, eventually life can spread from here 00:04:11.360 |
with the technology we build and just snuff it out 00:04:17.360 |
then maybe the rest of cosmic history in our universe 00:04:24.800 |
But I do think that we are actually very likely 00:04:28.960 |
to get visited by aliens, alien intelligence quite soon. 00:04:47.200 |
that evolution here on Earth was able to create 00:04:49.620 |
in terms of the path, the biological path it took. 00:05:07.620 |
where it necessarily cares about self-preservation, 00:05:18.600 |
is just so much vaster than what evolution will give you. 00:05:22.100 |
And with that also comes a great responsibility 00:05:24.880 |
for us to make sure that the kind of minds we create 00:05:28.240 |
are the kind of minds that it's good to create. 00:05:49.800 |
Do you try to consider all the different kinds 00:05:55.480 |
what humans are able to do to the full spectrum 00:05:58.600 |
of what intelligent creatures, entities could do? 00:06:08.800 |
to really grapple with something so completely alien. 00:06:18.680 |
if we were completely indifferent towards death 00:06:29.320 |
you could just copy my knowledge of how to speak Swedish. 00:06:37.280 |
And you could copy any of my cool experiences 00:06:39.680 |
and then you could delete the ones you didn't like 00:06:48.520 |
You probably spend less effort studying things 00:06:54.660 |
because if the plane you're on starts to crash, 00:06:58.300 |
you'd just be like, "Oh, shucks, I haven't backed 00:07:04.640 |
"So I'm gonna lose all these wonderful experiences 00:07:09.700 |
We might also start feeling more compassionate, 00:07:16.140 |
maybe with other people, if we can so readily share 00:07:24.760 |
I really feel very humble about this, to grapple with it, 00:07:35.280 |
which I think is just really worth reflecting on 00:07:38.360 |
is because the mind space of possible intelligences 00:07:42.400 |
is so different from ours, it's very dangerous 00:07:48.420 |
- Well, the entirety of human written history 00:08:07.680 |
and all of that changes if you have a different 00:08:13.280 |
The entirety, all those poems, they're trying to sneak up 00:08:19.880 |
How AI concerns and existential crises that AI experiences, 00:08:24.880 |
how that clashes with the human existential crisis, 00:08:29.800 |
the human condition, that's hard to fathom, hard to predict. 00:08:34.480 |
- It's hard, but it's fascinating to think about also. 00:08:37.960 |
Even in the best case scenario where we don't lose control 00:08:42.180 |
over the ever more powerful AI that we're building 00:08:44.960 |
to other humans whose goals we think are horrible, 00:08:49.120 |
and where we don't lose control to the machines, 00:08:56.320 |
even then, you get into the questions you touched here. 00:08:59.660 |
Maybe it's the struggle that it's actually hard 00:09:14.240 |
that they put together has this woman talking about, 00:09:28.880 |
if I realized that my parents couldn't be bothered 00:09:46.360 |
do you think that would also take away a little bit of what-- 00:09:57.860 |
I had somebody mention to me that they started using 00:10:19.760 |
to get the point across, but rewrite it in a nicer way. 00:10:35.780 |
And it's scary because so much of our society 00:10:44.640 |
And if we're now using AI as the medium of communication 00:10:51.140 |
so much of the emotion that's laden in human communication, 00:10:55.520 |
so much of the intent that's going to be handled 00:10:59.320 |
by outsourced AI, how does that change everything? 00:11:12.180 |
- Yeah, for me personally, I have to confess, 00:11:26.120 |
I don't want to just press a button and be at the top. 00:11:32.360 |
In the same way, I want to constantly work on myself 00:11:55.840 |
- Yeah, but then again, it could be like with chess. 00:12:08.800 |
and provide maybe a flourishing civilization for humans, 00:12:15.120 |
and playing our games even though AI is so much smarter, 00:12:18.240 |
so much stronger, so much superior in every single way, 00:12:34.240 |
a medium that enables the human experience to flourish. 00:12:45.600 |
- Yeah, I would phrase that as rebranding ourselves 00:12:53.920 |
Right now, sapiens, the ability to be intelligent, 00:13:08.600 |
That's clearly gonna change if AI continues ahead. 00:13:14.200 |
So maybe we should focus on the experience instead, 00:13:16.520 |
the subjective experience that we have as Homo sentiens 00:13:25.400 |
Get off our high horses and get rid of this hubris 00:13:35.160 |
- So consciousness, the subjective experience 00:13:37.880 |
is a fundamental value to what it means to be human. 00:13:50.920 |
not just towards other humans because they happen 00:13:55.720 |
but also towards all our other fellow creatures 00:14:10.040 |
in the grand scheme of things either in the post-AI epoch, 00:14:13.040 |
then surely we should value the subjective experience 00:14:19.520 |
- Well, allow me to briefly look at the book, 00:14:23.960 |
which at this point is becoming more and more visionary 00:14:26.340 |
that you've written, I guess over five years ago, 00:14:29.880 |
So first of all, 3.0, what's 1.0, what's 2.0, what's 3.0? 00:14:45.160 |
in that it can't actually learn anything at all 00:14:47.880 |
The learning just comes from this genetic process 00:14:55.200 |
Life 2.0 is us and other animals which have brains, 00:15:00.200 |
which can learn during their lifetime a great deal. 00:15:06.960 |
And you were born without being able to speak English. 00:15:23.840 |
can replace not only its software the way we can, 00:15:28.500 |
And that's where we're heading towards at high speed. 00:15:42.520 |
And if Neuralink and other companies succeed, 00:16:22.480 |
It seems like maybe the thing that's truly powerful 00:16:32.840 |
- I think we should be humble and not be so quick 00:16:37.960 |
to make everything binary and say either it's there 00:16:44.960 |
And there is even controversy about whether some unicellular 00:16:48.960 |
organisms like amoebas can maybe learn a little bit 00:16:53.440 |
So apologies if I offended any bacteria here. 00:16:57.040 |
It was more that I wanted to talk up how cool it is 00:17:01.420 |
where you can learn dramatically within your lifetime. 00:17:05.800 |
- And the higher up you get from 1.0 to 2.0 to 3.0, 00:17:09.240 |
the more you become the captain of your own ship, 00:17:20.240 |
we can be so different from previous generations 00:17:45.520 |
- Yeah, and I think it's worth commenting a bit 00:17:59.380 |
understanding is that life is best thought of 00:18:04.380 |
not as a bag of meat or even a bag of elementary particles, 00:18:10.860 |
but rather as a system which can process information 00:18:19.580 |
even though nature is always trying to mess it up. 00:18:40.540 |
you're not the same atoms as during the first 00:18:43.520 |
time you did with me. - Time we talked, yeah. 00:18:44.600 |
- You've swapped out most of them, but still you. 00:18:55.840 |
and whatever, you can still have this kind of continuity. 00:19:06.880 |
I lost both of my parents since our last podcast. 00:19:17.060 |
they haven't entirely died because a lot of mommy 00:19:21.400 |
and daddies, sorry, I'm getting a little emotional here, 00:19:28.840 |
and even jokes and so on, they haven't gone away, right? 00:19:49.160 |
And particularly if you can share your own information, 00:20:02.320 |
to immortality we can get with our bio-bodies. 00:20:06.840 |
- You carry a little bit of them in you in some sense. 00:20:34.960 |
But I think my obsession for fairly big questions 00:20:47.120 |
which is a very core part of really who I am, 00:21:11.280 |
They both very much just did their own thing. 00:21:22.320 |
- That's why you've always been an inspiration to me, 00:21:35.160 |
You're one of the people that represents MIT best to me. 00:21:41.960 |
So it's good to hear that you got that from your mom and dad. 00:21:44.960 |
But yeah, I mean, the good reason to do science 00:21:57.200 |
and everyone else says, no, no, that's bullshit, 00:22:04.240 |
And even if everybody else keeps thinking it's bullshit, 00:22:10.360 |
I always root for the underdog when I watch movies. 00:22:22.160 |
talking about our universe ultimately being mathematical, 00:22:25.680 |
I got this email from a quite famous professor saying, 00:22:42.720 |
Follow your own path and let the people talk. 00:22:49.080 |
you know, he's dead, but that attitude is not. 00:22:53.000 |
- How did losing them as a man, as a human being change you? 00:22:59.240 |
How did it expand your thinking about the world? 00:23:19.400 |
One of them, just going through all their stuff 00:23:25.840 |
just drove home to me how important it is to ask ourselves, 00:23:31.520 |
Because it's inevitable that you look at some things 00:23:40.400 |
Would they have done something that was more meaningful? 00:23:42.920 |
So I've been looking more in my life now and asking, 00:23:50.120 |
it should either be something I really enjoy doing 00:23:58.840 |
really, really meaningful because it helps humanity. 00:24:09.480 |
maybe I should spend less time on it, you know? 00:24:12.480 |
The other thing is dealing with death up in person like this, 00:24:24.560 |
that I'm an idiot, you know, which happens regularly. 00:24:27.960 |
And just, I'm gonna live my life, do my thing, you know? 00:24:31.280 |
And it's made it a little bit easier for me to focus 00:24:51.400 |
And I'm next in line in our family now, right? 00:25:14.760 |
And they focused on that rather than wasting time 00:25:26.360 |
if they start their morning by making a list of grievances. 00:25:30.800 |
Whereas if you start your day with a little meditation 00:25:39.840 |
- Because you only have a finite number of days 00:25:46.600 |
- Well, you do happen to be working on a thing 00:25:52.840 |
which seems to have potentially some of the greatest impact 00:25:57.840 |
on human civilization of anything humans have ever created, 00:26:05.240 |
and in the high philosophical level you work on. 00:26:08.280 |
So you've mentioned to me that there's an open letter 00:26:20.040 |
I've been having late nights and early mornings. 00:26:24.840 |
In short, have you seen "Don't Look Up", the film? 00:26:47.880 |
except it's an asteroid that we are building ourselves. 00:26:56.680 |
about all sorts of things which seem very minor 00:26:58.840 |
compared to the asteroid that's about to hit us, right? 00:27:02.320 |
Most politicians don't even have it on their radar, 00:27:11.400 |
This is the most important fork humanity has reached 00:27:27.560 |
but that's a technicality which will soon be changed. 00:27:32.080 |
And this arrival of artificial general intelligence 00:27:39.280 |
and probably shortly thereafter, superintelligence, 00:27:43.120 |
which greatly exceeds our cognitive abilities, 00:27:46.360 |
it's gonna either be the best thing ever to happen 00:27:55.160 |
- But it would be fundamentally transformative 00:28:18.360 |
on the planet for very long unless AI progress just halts. 00:28:22.720 |
And we can talk more about why I think that's true 00:28:29.920 |
reasons you might think it's gonna be the best thing ever 00:28:35.160 |
and the reason you think it's gonna be the end of humanity, 00:28:41.480 |
But what I think we can, anyone who's working on advanced AI 00:28:46.480 |
can agree on is it's much like the film "Don't Look Up" 00:28:52.720 |
in that it's just really comical how little serious 00:28:57.640 |
public debate there is about it given how huge it is. 00:29:01.440 |
- So what we're talking about is a development 00:29:09.200 |
and the signs it's showing of rapid improvement 00:29:14.840 |
that may in the near term lead to development 00:29:18.440 |
of super intelligent AGI, AI, general AI systems, 00:29:30.680 |
and then beyond that, general super human level intelligence. 00:30:09.720 |
Before then, a lot of people thought it was just really 00:30:11.640 |
kooky to even talk about it and a lot of AI researchers 00:30:32.160 |
You go in any AI conference and people talk about AI safety 00:30:34.840 |
and it's a nerdy technical field full of equations 00:30:42.600 |
But there's this other thing which has been quite taboo 00:30:54.440 |
including myself, I've been biting my tongue a lot, 00:30:56.520 |
is that we don't need to slow down AI development, 00:31:01.520 |
we just need to win this race, the wisdom race, 00:31:07.840 |
and the growing wisdom with which we manage it. 00:31:18.720 |
how you can actually ensure that your powerful AI 00:31:23.320 |
and have society adapt also with incentives and regulations 00:31:34.840 |
The progress on technical AI and capabilities 00:31:39.960 |
has gone a lot faster than many people thought 00:31:58.360 |
than we hoped with getting policy makers and others 00:32:10.600 |
Maybe we should unpack it and talk a little bit about each. 00:32:12.400 |
So why did it go faster than a lot of people thought? 00:32:16.920 |
In hindsight, it's exactly like building flying machines. 00:32:28.800 |
Have you seen the TED Talk with a flying bird? 00:32:35.240 |
But it took 100 years longer to figure out how to do that 00:32:38.440 |
than for the Wright brothers to build the first airplane 00:32:40.440 |
because it turned out there was a much easier way to fly. 00:32:43.080 |
And evolution picked the more complicated one 00:32:48.000 |
It could only build a machine that could assemble itself, 00:32:55.040 |
that used only the most common atoms in the periodic table. 00:33:05.920 |
and it also had to be incredibly fuel efficient. 00:33:13.360 |
of a remote control plane flying the same distance. 00:33:31.800 |
we had to figure out how the brain does human level AI first 00:33:39.000 |
You can take an incredibly simple computational system 00:33:45.760 |
and just train it to do something incredibly dumb. 00:33:52.400 |
And it turns out if you just throw a ton of compute at that 00:33:57.240 |
and a ton of data, it gets to be frighteningly good 00:34:01.520 |
like GPT-4, which I've been playing with so much 00:34:13.680 |
But yeah, we can come back to the details of that 00:34:22.200 |
- Can you briefly, if it's just a small tangent, 00:34:27.120 |
Suggest that you're impressed by this rate of progress, 00:34:37.800 |
What are human interpretable words you can assign 00:34:45.040 |
- I'm both very excited about it and terrified. 00:34:52.040 |
- All the best things in life include those two somehow. 00:34:59.580 |
I highly recommend doing that before dissing it. 00:35:08.760 |
which I realized I couldn't do that myself that well even. 00:35:12.200 |
And obviously does it dramatically faster than we do too 00:35:17.920 |
And it's doing that while servicing a massive number 00:35:23.080 |
At the same time, it cannot reason as well as a human can 00:35:39.560 |
Information can go from this neuron to this neuron 00:35:43.240 |
You can like ruminate on something for a while. 00:35:52.400 |
It's a so-called transformer where it's just like 00:35:57.880 |
In geek speak, it's called a feed-forward neural network. 00:36:03.080 |
So it can only do logic that's that many steps 00:36:06.680 |
And you can create the problems which will fail to solve 00:36:13.760 |
But the fact that it can do so amazing things 00:36:20.280 |
with this incredibly simple architecture already 00:36:35.520 |
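A minimal sketch (in Python, with made-up toy sizes, nothing taken from GPT-4 itself) of the architectural point being made here: a feed-forward, transformer-style model applies a fixed number of layers per pass, so each pass performs only a bounded number of sequential steps and cannot "ruminate" longer on a harder input.

```python
# Toy illustration, not GPT-4's actual code: a fixed stack of layers means a
# fixed amount of sequential computation per forward pass.
import numpy as np

rng = np.random.default_rng(0)
DEPTH, WIDTH = 4, 16          # hypothetical toy sizes

# one weight matrix per layer; a real transformer block is richer (attention,
# MLPs, residual connections), but the depth limit is the same idea
layers = [rng.normal(size=(WIDTH, WIDTH)) for _ in range(DEPTH)]

def forward(x):
    """Exactly DEPTH matrix multiplies, then it must stop. There is no loop
    that lets the model think for longer on a harder input; more reasoning
    steps require re-running the model (e.g. generating more tokens)."""
    for W in layers:
        x = np.maximum(W @ x, 0.0)   # linear map + ReLU nonlinearity
    return x

print(forward(rng.normal(size=WIDTH)).shape)   # (16,)
```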
It's called mechanistic interpretability in geek speak. 00:36:40.520 |
You have this machine that does something smart. 00:36:45.480 |
Or you think of it also as artificial neuroscience. 00:36:50.680 |
'Cause that's exactly what neuroscientists do 00:36:52.360 |
But here you have the advantage that you can, 00:36:54.360 |
you don't have to worry about measurement errors. 00:36:56.200 |
You can see what every neuron is doing all the time. 00:36:59.560 |
And a recurrent thing we see again and again, 00:37:02.360 |
there's been a number of beautiful papers quite recently 00:37:09.120 |
I mean in this area, is where when they figure out 00:37:14.280 |
"Oh man, that's such a dumb way of doing it." 00:37:16.840 |
And you immediately see how it can be improved. 00:37:18.840 |
Like for example, there was a beautiful paper recently 00:37:22.200 |
where they figured out how a large language model 00:37:24.640 |
stores certain facts, like Eiffel Tower is in Paris. 00:37:28.800 |
And they figured out exactly how it's stored. 00:37:31.680 |
The proof that they understood it was they could edit it. 00:37:37.800 |
And then they asked it, "Where's the Eiffel Tower?" 00:37:41.440 |
And then they asked it, "How do you get there?" 00:37:41.440 |
and the Roma Termini train station, and this and that." 00:37:51.120 |
And what might you see if you're in front of it? 00:37:59.160 |
- But the way that it's storing this information, 00:38:07.800 |
there was a big matrix, and roughly speaking, 00:38:13.200 |
which encode these things, and they correspond 00:38:17.880 |
And it would be much more efficient for a sparse matrix 00:38:27.800 |
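A hedged toy sketch of the "facts stored in a big matrix" picture described above, loosely inspired by the locate-and-edit work being referenced; the keys and values here are random stand-in vectors, not real model activations.

```python
# Toy associative memory: a linear layer that maps a "key" (Eiffel Tower) to a
# "value" (is in Paris), and a rank-one edit that rewires that one fact.
import numpy as np

rng = np.random.default_rng(1)
d = 32
key_eiffel = rng.normal(size=d)          # stands in for "Eiffel Tower"
val_paris  = rng.normal(size=d)          # stands in for "is in Paris"
val_rome   = rng.normal(size=d)          # stands in for "is in Rome"

# weight matrix acting as associative memory: W @ key ≈ value
W = np.outer(val_paris, key_eiffel) / (key_eiffel @ key_eiffel)
print(np.allclose(W @ key_eiffel, val_paris))   # True: recalls "Paris"

# "editing" the fact = a rank-one update targeted at this one association
W_edited = W + np.outer(val_rome - val_paris, key_eiffel) / (key_eiffel @ key_eiffel)
print(np.allclose(W_edited @ key_eiffel, val_rome))   # True: now recalls "Rome"
```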
how these things do, or ways where you can see 00:38:31.320 |
And the fact that this particular architecture 00:38:34.240 |
has some roadblocks built into it is in no way 00:38:54.360 |
to build close to human intelligence than we thought, 00:39:07.960 |
about the effectiveness of large language models, 00:39:11.680 |
so Sam Altman I recently had a conversation with, 00:39:14.920 |
and he really showed that the leap from GPT-3 00:39:19.840 |
to GPT-4 has to do with just a bunch of hacks, 00:39:23.200 |
a bunch of little explorations with smart researchers 00:39:30.560 |
It's not some fundamental leap and transformation 00:39:37.720 |
- And more data and compute, but he said the big leaps 00:39:42.840 |
but just learning this new discipline, just like you said. 00:39:46.800 |
So researchers are going to look at these architectures, 00:39:48.920 |
and there might be big leaps where you realize, 00:39:52.480 |
wait, why are we doing this in this dumb way? 00:39:54.440 |
And all of a sudden this model is 10x smarter, 00:40:07.440 |
It's such a new, like we understand so little 00:40:12.540 |
that the linear improvement of compute, or exponential, 00:40:19.320 |
may not be the thing that even leads to the next leap. 00:40:21.520 |
It could be a surprise little hack that improves everything. 00:40:25.800 |
because so much of this is out in the open also. 00:40:33.640 |
and trying to figure out little leaps here and there, 00:40:42.640 |
And this is actually very crucial for the other part of it. 00:40:47.140 |
So again, what this open letter is calling for 00:40:54.040 |
that are more powerful than GPT-4 for six months. 00:40:59.400 |
Just give a chance for the labs to coordinate a bit 00:41:09.840 |
'Cause you've interviewed a lot of these people 00:41:14.200 |
who lead these labs, and you know just as well as I do 00:41:16.720 |
that they're good people, they're idealistic people. 00:41:21.300 |
because they believe that AI has a huge potential 00:41:33.100 |
Have you read "Meditations on Moloch" by Scott Alexander? 00:41:41.660 |
- Yeah, it's a beautiful essay on this poem by Ginsberg 00:41:44.560 |
where he interprets it as being about this monster. 00:41:47.960 |
It's this game theory monster that pits people 00:41:53.520 |
against each other in this race to the bottom 00:41:59.700 |
even though everybody sees it and understands, 00:42:03.980 |
A good fraction of all the bad things that we humans do 00:42:10.040 |
I like Scott Alexander's naming of the monster 00:42:23.120 |
why do we have more generally the tragedy of the commons, 00:42:29.560 |
I don't know if you've had her on your podcast. 00:42:33.080 |
- Great, she made this awesome point recently 00:42:36.560 |
that beauty filters that a lot of female influencers 00:42:40.740 |
feel pressure to use are exactly Moloch in action again. 00:42:56.660 |
and then the other ones that weren't using it 00:42:58.460 |
started to realize that if they wanna just keep 00:43:01.820 |
their market share, they have to start using it too 00:43:05.860 |
and then you're in a situation where they're all using it 00:43:28.660 |
Moloch is everywhere and Moloch is not a new arrival 00:43:36.220 |
We humans have developed a lot of collaboration mechanisms 00:43:41.540 |
through various kinds of constructive collaboration. 00:43:53.980 |
into unnecessarily risky nuclear arms races, et cetera, 00:43:58.220 |
et cetera and this is exactly what's happening 00:44:08.680 |
If you take any of these leaders of the top tech companies, 00:44:34.160 |
to replace the executives in the worst case, right? 00:44:37.360 |
So we did this open letter because we wanna help 00:44:52.000 |
so that they can all pause in a coordinated fashion. 00:45:11.660 |
for the major developers of AI systems like this, 00:45:21.920 |
- Well, OpenAI is very close with Microsoft now, of course. 00:45:35.040 |
I don't wanna make a long list to leave anyone out. 00:45:44.560 |
that there's external pressure on all of them 00:45:48.760 |
'Cause then the people, the researchers in these organizations 00:45:52.800 |
who the leaders who wanna slow down a little bit, 00:45:56.640 |
everybody's slowing down because of this pressure, 00:46:12.640 |
You could make so much money on human cloning. 00:46:30.960 |
and decided even to stop a lot more stuff also, 00:46:44.800 |
because it's too unpredictable what it's gonna lead to. 00:46:48.120 |
We could lose control over what happens to our species. 00:46:57.800 |
but you need a public awareness of what the risks are 00:47:02.240 |
and the broader community coming in and saying, 00:47:17.320 |
they also get told we can't slow down because the West, 00:47:20.440 |
because both sides think they're the good guy. 00:47:42.160 |
Not because Westerners said, China, look, this is... 00:47:51.760 |
If anything, maybe they are even more concerned 00:47:54.080 |
about having control than Western governments 00:48:03.320 |
that was released by, I believe, Baidu recently. 00:48:07.200 |
They got a lot of pushback from the government 00:48:17.840 |
where everybody loses if anybody's AI goes out of control, 00:48:25.560 |
I'll say this again, 'cause this is a very basic point 00:48:34.200 |
Because a lot of people dismiss the whole idea 00:48:42.080 |
because they think there's something really magical 00:48:43.880 |
about intelligence such that it can only exist 00:48:48.360 |
they think it's gonna kind of get to just more or less 00:49:00.000 |
they're gonna control the world, they're gonna win. 00:49:04.160 |
And we can talk again about the scientific arguments 00:49:09.600 |
But the way it's gonna be is if anybody completely 00:49:24.200 |
you probably don't really even care very much 00:49:27.120 |
You're not gonna like it, much worse than today. 00:49:38.960 |
and we just lose control even to the machines, 00:49:44.360 |
so that it's not us versus them, it's us versus it, 00:49:47.360 |
what do you care who created this underlying entity 00:49:52.320 |
which has goals different from humans ultimately 00:49:55.320 |
and we get marginalized, we get made obsolete, 00:49:59.600 |
That's what I mean when I say it's a suicide race. 00:50:04.960 |
It's kind of like we're rushing towards this cliff, 00:50:10.560 |
the more scenic the views are and the more money we make. 00:50:13.240 |
The more money there is there, so we keep going, 00:50:16.920 |
but we have to also stop at some point, right? 00:50:33.440 |
is to continue developing awesome AI, a little bit slower, 00:50:38.440 |
so we make it safe, make sure it does the things 00:50:41.640 |
we always want and create a condition where everybody wins. 00:50:49.440 |
and politics in general is not a zero-sum game at all. 00:50:54.160 |
- So there is some rate of development that will lead 00:50:57.200 |
us as a human species to lose control of this thing 00:51:01.960 |
and the hope you have is that there's some lower level 00:51:05.200 |
of development which will not allow us to lose control. 00:51:14.360 |
like Sundar Pichai or Sam Altman at the head 00:51:17.160 |
of a company like this, you're saying if they develop 00:51:26.720 |
no group of individuals can maintain control. 00:51:29.120 |
- If it's created very, very soon and is a big black box 00:51:33.760 |
that we don't understand, like the large language models, 00:51:36.000 |
yeah, then I'm very confident they're gonna lose control. 00:51:44.000 |
and themselves acknowledged that there's really great risks 00:51:47.880 |
with this and they wanna slow down once they feel 00:51:50.240 |
like it's scary, but it's clear that they're stuck 00:51:53.960 |
and again, Moloch is forcing them to go a little faster 00:51:57.800 |
than they're comfortable with because of pressure 00:52:01.860 |
To get a bit optimistic here, of course this is a problem 00:52:10.320 |
It's just to win this wisdom race, it's clear that what 00:52:14.440 |
we hoped that was gonna happen hasn't happened. 00:52:20.000 |
than a lot of people thought and the progress 00:52:22.800 |
in the public sphere of policymaking and so on 00:52:26.480 |
Even the technical AI safety has gone slower. 00:52:29.060 |
A lot of the technical safety research was kind of banking 00:52:32.060 |
on that large language models and other poorly understood 00:52:35.800 |
systems couldn't get us all the way, that you had to build 00:52:38.560 |
more of a kind of intelligence that you could understand, 00:52:41.240 |
maybe it could prove itself safe, things like this. 00:52:45.600 |
And I'm quite confident that this can be done 00:52:50.360 |
so we can reap all the benefits, but we cannot do it 00:52:53.680 |
as quickly as this out of control express train we are 00:53:00.200 |
That's why we need a little more time, I feel. 00:53:07.720 |
while we're in the pre-AGI stage, to release often 00:53:12.720 |
and as transparently as possible to learn a lot. 00:53:17.400 |
So as opposed to being extremely cautious, release a lot. 00:53:21.360 |
Don't invest in a closed development where you focus 00:53:25.880 |
on AI safety while it's somewhat dumb, quote unquote. 00:53:33.240 |
And as you start to see signs of human level intelligence 00:53:38.240 |
or superhuman level intelligence, then you put a halt on it. 00:53:41.960 |
- Well, what a lot of safety researchers have been saying 00:53:45.480 |
for many years is that the most dangerous things you can do 00:53:48.520 |
with an AI is, first of all, teach it to write code. 00:53:53.060 |
- 'Cause that's the first step towards recursive 00:54:01.560 |
And another thing, high risk is connected to the internet. 00:54:05.840 |
Let it go to websites, download stuff on its own, 00:54:13.600 |
you said you interviewed him recently, right? 00:54:16.000 |
- He had this tweet recently, which gave me one 00:54:19.200 |
of the best laughs in a while, where he was like, 00:54:22.820 |
"you're so stupid, Eliezer, 'cause you're saying 00:54:28.320 |
Obviously, developers, once they get to really strong AI, 00:54:32.760 |
first thing you're gonna do is never connect it 00:55:00.360 |
he has argued for a while that we should never teach AI 00:55:10.400 |
about human psychology and how you manipulate humans. 00:55:13.020 |
That's the most dangerous kind of knowledge you can give it. 00:55:18.320 |
about how to cure cancer and stuff like that, 00:55:25.520 |
And then, "oops, lol, let's invent social media 00:55:30.160 |
recommender algorithms," which do exactly that. 00:55:34.720 |
They get so good at knowing us and pressing our buttons 00:55:45.440 |
'cause they figured out that these algorithms, 00:55:48.760 |
not out of evil, but just to make money on advertising, 00:55:51.920 |
that the best way to get more engagement, the euphemism, 00:56:00.080 |
- Well, that's really interesting that a large AI system 00:56:03.520 |
that's doing the recommender system kind of task 00:56:06.440 |
on social media is basically just studying human beings 00:56:09.880 |
because it's a bunch of us rats giving it signal, 00:56:17.960 |
on whether we spread that thing, we like that thing, 00:56:24.240 |
It has that on the scale of hundreds of millions 00:56:27.840 |
So it's just learning and learning and learning. 00:56:32.080 |
in the neural network that's doing the learning, 00:56:52.440 |
manipulate other humans for profit and power, 00:57:03.840 |
that can make AIs persuade humans to let them escape 00:57:12.480 |
in the New York Times recently by Yuval Noah Harari 00:57:32.780 |
We now live in a country where there's much more hate 00:57:38.200 |
in the world where there's much more hate, in fact. 00:57:41.000 |
And in our democracy, we're having this conversation 00:57:43.960 |
and people can't even agree on who won the last election. 00:57:47.920 |
And we humans often point fingers at other humans 00:57:51.600 |
But it's really Moloch and these AI algorithms. 00:57:59.900 |
pitted the social media companies against each other 00:58:16.920 |
this very medium in which we use as a civilization 00:58:25.760 |
how to solve the biggest problems in the world, 00:58:28.360 |
whether that's nuclear war or the development of AGI? 00:58:35.920 |
- I think it's not only possible, but it's necessary. 00:58:38.920 |
Who are we kidding that we're gonna be able to solve 00:58:40.880 |
all these other challenges if we can't even have 00:58:53.920 |
where you respectfully listen to people you disagree with. 00:58:57.120 |
And you realize, actually, there are some things, 00:59:01.040 |
and we both agree, let's not have a nuclear war, 00:59:07.600 |
We're kidding ourselves thinking we can face off 00:59:12.480 |
the second contact with ever more powerful AI 00:59:16.400 |
that's happening now with these large language models 00:59:19.160 |
if we can't even have a functional conversation 00:59:25.520 |
That's why I started the Improve the News project, 00:59:33.560 |
in that there is a lot of intrinsic goodness in people 00:59:45.200 |
between someone doing good things for humanity 00:59:48.000 |
and bad things is not some sort of fairy tale thing 01:00:05.880 |
And I feel we're building an internet and a society 01:00:22.680 |
that both make money and bring out the best in people. 01:00:32.720 |
if we just ultimately replace all humans by machines 01:00:40.600 |
- Well, it depends, I guess, on how you do the math. 01:00:48.440 |
and there aren't any, that's not a good investment. 01:00:51.000 |
Moreover, why can't we have a little bit of pride 01:01:07.240 |
if we had really advanced biotech to build Homo sapiens? 01:01:20.280 |
Maybe they can help us defend us better against predators 01:01:29.000 |
So then they build a couple, a little baby girl, 01:01:35.960 |
And then you have some wise old Neanderthal elders like, 01:01:39.400 |
"Hmm, I'm scared that we're opening a Pandora's box here 01:01:55.280 |
But then you have a bunch of others in the cave, 01:01:56.640 |
"Well, yeah, you're such a Luddite scaremonger. 01:01:58.920 |
Of course, they're gonna wanna keep us around 01:02:07.400 |
They're gonna want us around and it's gonna be fine. 01:02:11.480 |
And besides, look at these babies, they're so cute. 01:02:16.000 |
That's exactly, those babies are exactly GPT-4. 01:02:19.160 |
It's not, I wanna be clear, it's not GPT-4 that's terrifying. 01:02:27.800 |
You know, and Microsoft even had a paper recently out 01:02:33.040 |
with the title something like "Sparks of AGI." 01:02:36.720 |
Well, they were basically saying this is baby AI, 01:02:44.600 |
There's gonna be other systems from the same company, 01:02:48.520 |
from other companies, they'll be way more powerful 01:02:58.600 |
those last Neanderthals who were pretty disappointed. 01:03:02.680 |
And when they realized that they were getting replaced. 01:03:07.920 |
which is the programming, it's entirely possible 01:03:13.200 |
that can change everything by writing programs. 01:03:21.880 |
the systems I'm afraid of are gonna look nothing 01:03:25.280 |
like a large language model and they're not gonna. 01:03:38.160 |
And from everything we've seen about how these work 01:03:42.040 |
under the hood, they're like the minimum viable intelligence. 01:03:49.920 |
- So they are Life 3.0, except when they replace 01:03:59.120 |
And moreover, they think a lot faster than us too. 01:04:04.680 |
So when, we don't think on how one logical step 01:04:18.400 |
And we can't also just suddenly scale up our hardware 01:04:21.920 |
massively in the cloud, we're so limited, right? 01:04:26.160 |
So they are also Life, can soon become a little bit more 01:04:31.160 |
like Life 3.0 in that if they need more hardware, 01:04:41.000 |
- And what we haven't seen yet, which could change a lot, 01:04:53.920 |
So right now, programming is done sort of in bits and pieces 01:05:03.120 |
and with the kind of stuff that GPT-4 is able to do, 01:05:05.720 |
I mean, it's replacing a lot of what I'm able to do. 01:05:25.440 |
kind of feedback loop of self-debugging, improving the code, 01:05:30.440 |
and then you launch that system onto the wild, 01:05:35.440 |
on the internet, because everything is connected, 01:05:37.400 |
and have it do things, have it interact with humans, 01:05:44.720 |
That's one of the things that Elon Musk recently 01:05:47.720 |
sort of tweeted as a case why everyone needs to pay $7 01:05:57.480 |
where the bots are getting smarter and smarter and smarter 01:06:01.120 |
to a degree where you can't tell the difference 01:06:10.620 |
by one million to one, which is why he's making a case 01:06:17.360 |
which is one of the only mechanisms to prove, 01:06:24.480 |
as individuals, we should, from time to time, 01:06:27.920 |
ask ourselves why are we doing what we're doing, 01:06:29.920 |
right, and as a species, we need to do that too. 01:06:49.480 |
and things that a lot of people find really meaningful, 01:06:54.900 |
The answer is Moloch is tricking us into doing it. 01:07:04.520 |
we still have no choice but to fall for it, right? 01:07:07.360 |
Also, the thing you said about you using co-pilot AI tools 01:07:17.400 |
what factor faster would you say you code now? 01:07:22.560 |
- I don't really, because it's such a new tool. 01:07:27.040 |
- I don't know if speed is significantly improved, 01:07:39.680 |
then you're already seeing another kind of self, 01:07:45.680 |
Because previously, one major generation of improvement 01:07:50.480 |
of the code would happen on the human R&D timescale. 01:07:57.960 |
than it otherwise would to develop the next level 01:08:09.040 |
There can be humans in the loop a lot in the early stages, 01:08:11.760 |
and then eventually humans are needed less and less, 01:08:16.440 |
But what you said there is just an exact example 01:08:27.520 |
imagining I'm on a psychiatrist's couch here, 01:08:29.680 |
saying, "What are my fears that people would do 01:08:33.040 |
So I mentioned three that I had fears about many years ago 01:08:37.080 |
that they would do, namely teach it to code, 01:08:48.200 |
where code can control this super powerful thing, right? 01:08:58.520 |
have going for them is that they are an oracle 01:09:00.840 |
in the sense that they just answer questions. 01:09:07.080 |
GPT-4 can't go and do stock trading based on its thinking. 01:09:16.560 |
processes it to figure out what action to take 01:09:26.460 |
But once you have an API, for example, GPT-4, 01:09:29.800 |
nothing stops Joe Schmo and a lot of other people 01:09:41.720 |
which makes them themselves much more powerful. 01:09:45.600 |
That's another kind of unfortunate development, 01:09:48.920 |
which I think we would have been better off delaying. 01:09:53.360 |
I don't want to pick on any particular companies. 01:09:55.040 |
I think they're all under a lot of pressure to make money. 01:09:58.260 |
And again, the reason we're calling for this pause 01:10:12.860 |
I hope we'll make it clear to people watching this 01:10:32.360 |
That's the definition of an explosion in science. 01:10:36.520 |
If you have two people and they fall in love, 01:10:55.160 |
if it's instead free neutrons in a nuclear reaction, 01:11:24.720 |
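As a small illustration of the definition being used here: anything whose growth rate is proportional to how much of it there already is grows exponentially, i.e. explodes. The numbers below are arbitrary.

```python
# Minimal illustration of "explosion" as growth proportional to the current
# amount, whether the thing growing is neutrons, people, or AI capability.
x, rate, steps = 1.0, 0.5, 10     # starting amount, growth per step, number of steps
for t in range(steps):
    x += rate * x                 # growth proportional to what you already have
    print(t + 1, round(x, 2))     # 1.5, 2.25, 3.38, ... roughly 57.7 by step 10
```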
And do you have any intuition why it might stop? 01:11:29.400 |
when it bumps up against the laws of physics. 01:11:36.200 |
- 'Cause we don't know the full laws of physics yet, right? 01:11:42.680 |
on the physical limits on computation, for example. 01:11:49.000 |
then in finite space, it'll turn into a black hole. 01:11:53.320 |
faster than the speed of light, stuff like that. 01:11:58.680 |
than a modest number of bits per atom, et cetera. 01:12:02.720 |
But those limits are just astronomically above, 01:12:06.920 |
like 30 orders of magnitude above where we are now. 01:12:28.680 |
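A rough back-of-the-envelope sketch of that headroom claim, using the standard Margolus-Levitin bound on operations per second for a given amount of energy; the "current" figure is an assumed roughly exascale (10^18 FLOP/s) machine, so the exact gap is only indicative.

```python
# Order-of-magnitude estimate of the physical ceiling on computation for 1 kg
# of matter, compared against an assumed ~exascale supercomputer.
import math

hbar = 1.054571817e-34        # reduced Planck constant, J*s
c    = 2.99792458e8           # speed of light, m/s
m    = 1.0                    # kilograms of matter devoted to computing

E = m * c**2                                      # total energy of 1 kg
ops_limit = 2 * E / (math.pi * hbar)              # Margolus-Levitin bound, ops/s

current = 1e18                                    # assumed exascale machine, FLOP/s
print(f"{ops_limit:.2e} ops/s physical limit")    # ~5.4e50
print(f"~{math.log10(ops_limit / current):.0f} orders of magnitude of headroom")
```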
to make sure exactly it doesn't blow up out of control. 01:12:32.480 |
When we do experiments with biology and cells and so on, 01:12:37.480 |
we also try to make sure it doesn't get out of control. 01:12:56.360 |
or the other company is gonna catch up with you, 01:12:58.280 |
or the other country is gonna catch up with you. 01:13:23.840 |
you're putting people in a very hard situation. 01:13:26.920 |
is change the whole incentive structure instead. 01:13:31.520 |
Maybe I should say one more thing about this, 01:13:42.320 |
And we came up with some really cool countermeasures. 01:13:46.600 |
First of all, already over 100,000 years ago, 01:13:49.760 |
evolution realized that it was very unhelpful 01:13:52.960 |
that people kept killing each other all the time. 01:14:00.240 |
And made it so that if you get two drunk dudes 01:14:12.760 |
And similarly, if you find a baby lying on the street 01:14:18.160 |
when you go out for your morning jog tomorrow, 01:14:22.120 |
Even though it may make you late for your next podcast. 01:14:28.080 |
that make our own egoistic incentives more aligned 01:14:32.320 |
with what's good for the greater group we're part of. 01:14:51.680 |
moochers, cheaters, because their own incentive now 01:14:57.040 |
is not to do this, because word quickly gets around 01:15:00.880 |
and then suddenly people aren't gonna invite them 01:15:05.640 |
And then when we got still more sophisticated 01:15:07.560 |
and bigger societies, invented the legal system, 01:15:11.440 |
where even strangers who couldn't rely on gossip 01:15:21.080 |
that he actually wants to kill the other guy, 01:15:26.160 |
he also has a little thought in the back of his head 01:15:28.080 |
that, "Do I really wanna spend the next 10 years 01:15:38.760 |
And we similarly have tried to give these incentives 01:15:45.760 |
so that their incentives are aligned with the greater good. 01:16:01.480 |
than the regulators have been able to keep up. 01:16:06.720 |
like the European Union right now is doing this AI Act, right? 01:16:16.040 |
that GPT-4 would be completely excluded from regulation. 01:16:24.240 |
- Some lobbyists pushed successfully for this. 01:16:30.080 |
Mark Brakel, Risto Uuk, Anthony Aguirre, and others. 01:16:34.160 |
We're quite involved with educating various people 01:16:42.960 |
and pointing out that they would become the laughingstock 01:16:52.520 |
And then there was a huge counter push from lobbyists. 01:16:56.800 |
There were more lobbyists in Brussels from tech companies 01:17:14.240 |
but the challenge we're facing is that the tech 01:17:18.080 |
is generally much faster than what the policymakers are. 01:17:28.160 |
So it's, you know, we really need to work hard 01:17:34.800 |
So we're getting the situation where the first kind of non, 01:17:46.160 |
a company, a corporation is also an artificial intelligence 01:17:50.560 |
because the corporation isn't its humans, it's a system. 01:17:56.040 |
if the CEO of a tobacco company decides one morning 01:17:58.760 |
that she or he doesn't wanna sell cigarettes anymore, 01:18:02.840 |
It's not enough to align the incentives of individual people 01:18:08.080 |
or align individual computers' incentives to their owners, 01:18:12.920 |
which is what technically AI safety research is about. 01:18:16.120 |
You also have to align the incentives of corporations 01:18:19.840 |
And some corporations have gotten so big and so powerful 01:18:30.440 |
to what they want rather than the other way around. 01:18:35.600 |
- All right, is the thing that the slowdown hopes to achieve 01:18:40.400 |
is give enough time to the regulators to catch up 01:18:43.560 |
or enough time to the companies themselves to breathe 01:18:46.280 |
and understand how to do AI safety correctly? 01:18:52.000 |
the path to success I see is first you give a breather 01:18:58.040 |
their leadership who wants to do the right thing 01:19:00.240 |
and they all have safety teams and so on on their companies. 01:19:03.080 |
Give them a chance to get together with the other companies 01:19:08.720 |
and the outside pressure can also help catalyze that 01:19:21.200 |
one should put on future systems before they get rolled out? 01:19:35.480 |
that within six months you can get these people coming up, 01:19:48.080 |
but they got together a bunch of people and decided, 01:19:50.200 |
you know, in order to be allowed to sell a car, 01:19:53.920 |
They're the analogous things that you can start requiring 01:20:08.080 |
this intellectual work has been done by experts in the field, 01:20:13.520 |
I think it's going to be quite easy to get policymakers 01:20:19.360 |
And it's, you know, for the companies to fight Moloch, 01:20:29.120 |
they want the regulators to actually adopt it 01:20:31.000 |
so that their competition is going to abide by it too, right? 01:20:33.840 |
You don't want to be enacting all these principles 01:20:46.880 |
and then now they can gradually overtake you. 01:20:54.280 |
knowing that everybody's playing by the same rules. 01:20:56.680 |
- So do you think it's possible to develop guardrails 01:21:09.200 |
while still enabling sort of the capitalist-fueled 01:21:13.640 |
as they develop how to best make money with this AI? 01:21:20.560 |
in many other sectors where you've had the free market 01:21:23.240 |
produce quite good things without causing particular harm. 01:21:35.360 |
for just getting the same things done more efficiently. 01:21:48.160 |
in any country who thinks it was a terrible idea 01:21:55.200 |
- Yeah, but it seems like this particular technology 01:22:02.560 |
to a degree where you could see in the near term 01:22:07.800 |
and to put guardrails, to develop guardrails quickly 01:22:16.640 |
It seems like the opportunity to make a lot of money here 01:22:29.000 |
the more money there is, the more gold ingots 01:22:32.720 |
there are on the ground you can pick up or whatever 01:22:36.080 |
But it's not in anyone's incentive that we go over the cliff 01:22:38.720 |
and it's not like everybody's in their own car. 01:22:40.920 |
All the cars are connected together with a chain. 01:22:43.680 |
So if anyone goes over, they'll start dragging the others down too. 01:22:48.160 |
And so ultimately, it's in the selfish interests 01:22:52.560 |
also of the people in the companies to slow down 01:22:56.200 |
when you start seeing the contours of the cliff 01:23:03.080 |
who are building the technology and the CEOs, 01:23:12.400 |
they are people who don't honestly understand 01:23:19.600 |
to really appreciate how powerful this is and how fast. 01:23:22.560 |
And a lot of people are even still stuck again 01:23:41.440 |
that intelligence is information processing of a certain kind. 01:23:48.000 |
whether the information is processed by carbon atoms 01:24:00.720 |
and there are a lot of people who love capitalism 01:24:02.560 |
and a lot of people who really, really don't. 01:24:07.560 |
And it struck me recently that what's happening 01:24:16.360 |
to the way in which superintelligence might wipe us out. 01:24:20.120 |
So, you know, I studied economics for my undergrad, 01:24:31.000 |
- So I was very interested in how you could use 01:24:34.080 |
market forces to just get stuff done more efficiently, 01:24:51.520 |
where they proved mathematically that if you just take 01:24:59.720 |
that you think is gonna bring you in the right direction. 01:25:05.440 |
in the beginning, it will make things better for you. 01:25:11.320 |
it's gonna start making things worse for you again. 01:25:16.400 |
So just as a simple, the way I think of the proof is, 01:25:20.520 |
suppose you wanna go from here back to Austin, for example, 01:25:25.520 |
and you're like, okay, yeah, let's just, let's go south, 01:25:47.240 |
and eventually you're gonna be leaving the solar system. 01:25:51.000 |
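A toy numerical version of that argument (my own example, not the original proof): pick almost any sensible objective, keep moving in one fixed direction, and the objective improves for a while and then gets worse once you overshoot the optimum.

```python
# Keep pushing in the same fixed direction and watch the objective turn over.
def utility(x):
    return -(x - 5.0) ** 2        # best possible outcome is at x = 5

x = 0.0
for step in range(10):
    x += 1.0                      # always move in the same direction
    print(step + 1, x, utility(x))
# utility climbs until x = 5, then every further step makes things worse
```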
- And they proved, it's a beautiful mathematical proof. 01:25:53.440 |
This happens generally, and this is very important for AI, 01:25:57.800 |
because even though Stuart Russell has written a book 01:26:02.240 |
and given a lot of talks on why it's a bad idea 01:26:12.280 |
that we're just minimizing, or reward function, 01:26:30.440 |
Things got done much more efficiently than they did 01:26:46.480 |
and ever more efficient information processing, 01:26:51.080 |
And eventually a lot of people are beginning to feel, 01:26:55.360 |
wait, we're kind of optimizing a bit too much. 01:26:57.320 |
Like, why did we just chop down half the rainforest? 01:27:11.040 |
If you have an AI that actually has power over the world 01:27:23.480 |
But it's almost impossible to give it exactly 01:27:29.920 |
And then eventually all hell breaks loose, right? 01:27:38.840 |
What if you just wanna tell it to cure cancer or something, 01:27:45.120 |
Maybe it's gonna decide to take over entire continents 01:27:50.120 |
just so it can get more supercomputer facilities in there 01:27:55.960 |
And then you're like, wait, that's not what I wanted, right? 01:28:12.360 |
that is optimizing for only one thing, profit. 01:28:16.680 |
And that worked great back when things were very inefficient 01:28:24.760 |
were small enough that they couldn't capture the regulators. 01:28:28.080 |
But that's not true anymore, but they keep optimizing. 01:28:37.000 |
can make even more profit by building ever more powerful AI 01:28:40.680 |
but optimize more and more and more and more and more. 01:28:50.280 |
And I just wanna, anyone here who has any concerns 01:28:54.200 |
about late-stage capitalism having gone a little too far, 01:29:02.400 |
'cause it's the same villain in both cases, it's Moloch. 01:29:10.040 |
aggressively, blindly is going to take us there. 01:29:16.080 |
and look into our hearts and ask, why are we doing this? 01:29:27.400 |
- And that is the idea behind a halt for six months. 01:29:34.200 |
Can we just linger and explore different ideas here? 01:29:37.680 |
Because this feels like a really important moment 01:29:40.160 |
in human history where pausing would actually 01:29:50.480 |
the number one pushback we were gonna get in the West 01:29:57.960 |
And everybody knows there's no way that China 01:30:01.040 |
is gonna catch up with the West on this in six months. 01:30:05.800 |
and you can forget about geopolitical competition 01:30:13.800 |
But you've already made the case that even for China, 01:30:20.640 |
China too would not be bothered by a longer halt 01:30:47.400 |
- Yeah, it's gonna get better and better and better 01:30:55.000 |
of basically quality of life for human civilization 01:31:06.320 |
is that we're just gonna, there won't be any humans. 01:31:10.560 |
We can see in history, once you have some species 01:31:15.000 |
or some group of people who aren't needed anymore, 01:31:18.180 |
doesn't usually work out so well for them, right? 01:31:26.440 |
for traffic in Boston and then the car got invented 01:31:29.080 |
and most of them got, yeah, well, we don't need to go there. 01:32:05.120 |
aren't needed anymore, I think it's quite naive 01:32:07.760 |
to think that they're gonna still be treated well. 01:32:13.200 |
and the government will always, we'll always protect them. 01:32:24.200 |
And now, in the beginning, so Industrial Revolution, 01:32:38.680 |
and got usually more interesting, better paid jobs. 01:32:42.520 |
But now we're beginning to replace brain work. 01:32:50.920 |
and adding, multiplying numbers anymore at work. 01:33:13.120 |
about this guy who was doing 3D modeling for gaming 01:33:17.320 |
and all of a sudden now they got this new software, 01:33:20.960 |
he just says prompts and he feels his whole job 01:33:27.320 |
And I asked GPT-4 to rewrite "Twinkle, Twinkle Little Star" 01:33:39.920 |
You've seen a lot of the art coming out here. 01:33:42.160 |
So I'm all for automating away the dangerous jobs 01:34:22.880 |
but you'll still end up needing fewer programmers than today. 01:34:29.960 |
So we need to stop and ask ourselves why again 01:34:36.720 |
I feel that AI should be built by humanity for humanity. 01:34:48.800 |
Or what it really is now is kind of by humanity for Moloch, 01:35:00.280 |
if we develop, figure out gradually and safely 01:35:04.920 |
And then we think about what are the kind of jobs 01:35:11.480 |
what are the jobs that people really find meaning in? 01:35:15.240 |
Like maybe taking care of children in the daycare center, 01:35:23.320 |
And even if it were possible to automate that way, 01:35:33.760 |
or rediscover what are the jobs that give us meaning. 01:35:43.920 |
half the time I'm crying as I'm generating code 01:35:58.240 |
and then you bring it to life and it does something, 01:36:00.080 |
especially if there's some intelligence that it does something. 01:36:06.240 |
you made a little machine and it comes to life. 01:36:11.080 |
- And there's a bunch of tricks you learn along the way 01:36:13.840 |
'cause you've been doing it for many, many years. 01:36:29.040 |
It's almost painful, like a loss of innocence maybe. 01:36:36.520 |
I remember before I learned that sugar's bad for you, 01:36:48.760 |
I enjoyed it unapologetically, fully, just intensely. 01:36:55.840 |
Now I feel like a little bit of that is lost for me 01:36:59.400 |
with programming, or being lost with programming, 01:37:06.440 |
no longer being able to really enjoy the art of modeling 01:37:11.840 |
I don't know, I don't know what to make sense of that. 01:37:24.040 |
to really intensify the value from conscious experiences 01:37:36.480 |
And just to inject some optimism in this here, 01:38:09.160 |
including the richest people can be better off as well. 01:38:17.000 |
Again, you can have two countries like Sweden and Denmark 01:38:20.440 |
have all these ridiculous wars century after century. 01:38:49.800 |
basically all the limitations that cause harm today 01:39:05.640 |
And we can talk about ways of making it safe also. 01:39:27.240 |
I just read a heartbreaking study from the CDC 01:39:37.840 |
Those are steps in totally the wrong direction. 01:39:42.600 |
And it's important to keep our eyes on the prize here 01:39:45.880 |
that we have the power now for the first time 01:40:05.840 |
To help us have really fulfilling experiences 01:40:13.680 |
and dictate to future generations what they will be. 01:40:18.680 |
and not foreclose all these possibilities for them 01:40:23.040 |
- And for that, we'll have to solve the AI safety problem. 01:40:29.520 |
So one interesting way to enter that discussion 01:40:37.920 |
You tweeted, "Let's not just focus on whether GPT-4 01:40:40.400 |
"will do more harm or good on the job market, 01:40:44.580 |
"will hasten the arrival of superintelligence." 01:40:47.480 |
That's something we've been talking about, right? 01:41:06.760 |
And in general, what are your different ideas 01:41:09.960 |
to start approaching the solution to AI safety? 01:41:24.520 |
but that's because we made it in a certain way. 01:41:28.800 |
We can use it for great things and bad things. 01:41:33.240 |
And this is part of my vision for success here, 01:41:36.840 |
truth-seeking AI that really brings us together again. 01:41:46.080 |
It's because they each have totally different versions 01:42:14.960 |
for example, a little baby step in this direction 01:42:25.320 |
not for money, but just for their own reputation. 01:42:32.400 |
as you have a loss function where they get penalized 01:42:39.440 |
Whereas if you're kind of humble and then you're like, 01:42:43.120 |
I think it's 51% chance this is gonna happen, 01:42:45.360 |
and then the other happens, you don't get penalized much. 01:42:57.680 |
is an outgrowth of Improve the News Foundation 01:43:00.520 |
is seeing if we can really scale this up a lot 01:43:24.160 |
of different pundits and newspapers, et cetera, 01:43:27.480 |
if they wanna know why someone got a low score, 01:43:29.840 |
they can click on it and see all the predictions 01:43:32.440 |
that they actually made and how they turned out. 01:43:38.160 |
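A minimal sketch of the kind of scoring being described, using the Brier score as a stand-in (the exact rule such platforms use may differ): a proper scoring rule punishes a confident wrong prediction far more than a humble one.

```python
# Brier score: squared error between stated probability and what actually
# happened. Lower is better; overconfidence on a wrong call is costly.
def brier(prob_yes: float, happened: bool) -> float:
    outcome = 1.0 if happened else 0.0
    return (prob_yes - outcome) ** 2

# "I'm 99% sure this will happen" ... and it doesn't:
print(brier(0.99, False))   # 0.9801 -> heavy penalty
# "I think it's 51% likely" ... and it doesn't happen:
print(brier(0.51, False))   # 0.2601 -> mild penalty
```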
You trust scientists like Einstein who said something 01:43:40.560 |
everybody thought was bullshit and turned out to be right. 01:43:53.800 |
a lot of the rifts we're seeing by creating trust systems. 01:44:05.760 |
and you just trust it because of its reputation. 01:44:13.080 |
they earn their trust and they're completely transparent. 01:44:21.400 |
the very dysfunctional conversation that humanity has 01:44:24.920 |
about how it's gonna deal with all its biggest challenges 01:44:39.400 |
from people who are saying, "We're just screwed. 01:44:40.920 |
"There's no hope," is, well, things like GPT-4 01:44:44.120 |
are way too complicated for a human to ever understand 01:44:53.480 |
There's this very fundamental fact that in math, 01:45:01.760 |
than it is to verify that the proof is correct. 01:45:04.920 |
You can actually write a little proof-checking code 01:45:07.040 |
which is quite short, but you can, as a human, understand it. 01:45:10.640 |
And then it can check the most monstrously long proof 01:45:26.880 |
with virus-checking software that it looks to see 01:45:29.680 |
if there's something, if you should not trust it. 01:45:33.160 |
that you should not trust that code, it warns you. 01:45:40.000 |
And this is an idea I give credit to Steve Omohundro for. 01:45:44.240 |
So that it will only run the code if the code can prove 01:45:49.000 |
that it's trustworthy; if it can't prove that, it will refuse to run it. 01:45:52.920 |
So it asks the code, "Prove to me that you're gonna do 01:46:06.480 |
that's much more intelligent than you are, right? 01:46:08.880 |
Because it's a problem to come up with this proof 01:46:13.440 |
that you could never have found, but you should trust it. 01:46:17.760 |
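Here is a hedged sketch of that "virus checking in reverse" control flow. In the real proposal the checker would be a small, formally trusted proof kernel; everything here (the names, the toy checker, the string "certificate") is a placeholder invented just to show the pattern of refusing by default.

```python
# Hypothetical sketch: run code only if it comes with a proof the checker accepts.
from typing import Any, Callable

def run_if_proven(code: Callable[[], None], proof: Any,
                  checker: Callable[[Callable[[], None], Any], bool]) -> None:
    """Refuse by default; execute only if the supplied proof passes the checker."""
    if checker(code, proof):
        code()                                   # burden of proof met
    else:
        print("No valid safety proof supplied; refusing to run.")

def toy_checker(code: Callable[[], None], proof: Any) -> bool:
    # Stand-in for a real proof checker; accepts only one hard-coded certificate.
    return proof == "certified-safe"

run_if_proven(lambda: print("doing the approved task"), "certified-safe", toy_checker)
run_if_proven(lambda: print("doing something unvetted"), None, toy_checker)
```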
I agree with you, but this is where Eliezer Yudkowsky 01:46:17.760 |
would be able to know how to lie to you with such a proof. 01:46:45.160 |
- So his general idea is a super-intelligent system 01:47:05.400 |
He really focuses on this weak AGI to strong AGI jump 01:47:11.680 |
where the strong AGI can make all the weak AGIs 01:47:15.760 |
think that it's just one of them, but it's no longer that. 01:47:25.720 |
I think no matter how super-intelligent an AI is, 01:47:30.880 |
that there are only finitely many primes, for example. 01:47:42.840 |
that say, trust me, the way your proof-checker works 01:47:47.840 |
is too limited, and we have this new hyper-math, 01:48:00.000 |
I'm only gonna go with the ones that I can prove 01:48:05.320 |
There's still, of course, this is not something 01:48:08.520 |
anyone has successfully implemented at this point, 01:48:10.360 |
but I think I just give it as an example of hope. 01:48:17.160 |
This is exactly the sort of very boring and tedious task 01:48:35.840 |
- Well, for starters, begin with a simple problem 01:48:39.160 |
of just making sure that the system that you own 01:48:44.320 |
has to prove to itself that it's always gonna do 01:48:48.240 |
And if it can't prove it, maybe it's still gonna do it, 01:49:06.880 |
what direction we should go with humanity, right? 01:49:10.840 |
- And you've talked a lot about geopolitical things 01:49:19.120 |
from the fact that there are actually a lot of things 01:49:21.680 |
that everybody in the world virtually agrees on 01:49:25.920 |
that, hey, you know, like having no humans on the planet 01:49:36.000 |
the United Nations Sustainable Development Goals, 01:49:47.960 |
So instead of quibbling about the little things 01:49:50.960 |
we don't agree on, let's start with the things 01:49:56.720 |
Instead of being so distracted by all these things 01:50:07.840 |
it feels like a war on life playing out in front of our eyes. 01:50:15.960 |
we're on this planet, beautiful, vibrant ecosystem. 01:50:24.680 |
even though most people thought that was a bad idea. 01:50:36.720 |
And we're replacing more and more of the biosphere 01:50:45.600 |
a lot of the things which were so valuable to humanity. 01:50:49.120 |
A lot of social interactions now are replaced 01:50:51.360 |
by people staring into their rectangles, right? 01:50:54.320 |
And I'm not a psychologist, I'm out of my depth here, 01:50:58.640 |
but I suspect that part of the reason why teen suicide 01:51:04.760 |
is at record-breaking levels is actually caused by, 01:51:16.320 |
We've all seen a bunch of good looking people 01:51:22.240 |
instead of looking into each other's eyes, right? 01:51:41.640 |
The technology that was supposed to connect us 01:51:43.800 |
is actually distancing us, ourselves from each other. 01:51:52.680 |
These large corporations are not living things, right? 01:52:01.960 |
I think we humans, together with all our fellow living things 01:52:35.560 |
very close to one, depending on the day he puts it at one, 01:52:42.800 |
That there's just, he does not see a trajectory 01:52:47.320 |
which it doesn't end up with that conclusion. 01:52:50.920 |
What trajectory do you see that doesn't end up there? 01:53:01.300 |
- First of all, I tremendously respect Eliezer Yudkowsky 01:53:22.280 |
We just had a little baby, and I keep asking myself, 01:53:39.200 |
it feels a little bit like I was just diagnosed 01:53:59.220 |
I think, but I absolutely don't think it's hopeless. 01:54:05.360 |
I think there is, first of all, a lot of momentum now, 01:54:16.960 |
since I and many others started warning about this, 01:54:34.440 |
and he's like, "I think we're getting replaced." 01:54:37.880 |
So that's positive, that we're finally seeing this reaction, 01:54:44.360 |
which is the first step towards solving the problem. 01:54:57.880 |
it's really just virus checking in reverse again. 01:55:05.120 |
We might have to forfeit some of the technology 01:55:08.960 |
that we could get if we were putting blind faith in our AIs, 01:55:14.360 |
- Do you envision a process with a proof checker, 01:55:18.840 |
would go through a process of rigorous interrogation? 01:55:23.200 |
That's like trying to prove theorems about spaghetti. 01:55:27.720 |
What I think, well, the vision I have for success 01:55:38.160 |
Galileo, when his dad threw him an apple when he was a kid, 01:55:43.360 |
he could catch it, 'cause his brain could, in this funny spaghetti kind of way, predict its path. 01:55:43.360 |
Then he got older and it's like, wait, this is a parabola. 01:55:56.960 |
and today you can easily program it into a computer 01:56:05.640 |
where we use the amazing learning power of neural networks 01:56:09.120 |
to discover the knowledge in the first place, 01:56:12.320 |
but we don't stop with a black box and use that. 01:56:19.160 |
where we use automated systems to extract out the knowledge 01:56:28.280 |
into a completely different kind of architecture 01:56:33.640 |
that's made in a way that it can be both really efficient 01:56:37.040 |
and also is more amenable to very formal verification. 01:56:48.600 |
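A minimal sketch of that workflow, in the spirit of the Galileo story: let an opaque learner capture the pattern, then extract an explicit formula from it and check how well the formula holds. The "neural network" stage is stood in here by a generic interpolator, and all the numbers are invented; this is only the shape of the learn-then-extract-then-verify idea, not the actual verification tooling being discussed.

```python
# Learn -> extract a human-readable law -> check it against the black box.
import numpy as np

# 1. Observations of a thrown apple: height vs. time (with a little noise).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 200)
h = 1.5 + 9.0 * t - 4.9 * t**2 + rng.normal(0.0, 0.01, t.size)

# 2. "Black box" learner: an opaque predictor we can query but not inspect.
def black_box(query_t: np.ndarray) -> np.ndarray:
    return np.interp(query_t, t, h)      # stand-in for a trained neural net

# 3. Knowledge extraction: fit an explicit parabola to the black box's behavior.
probe = np.linspace(0.0, 2.0, 1000)
a, b, c = np.polyfit(probe, black_box(probe), deg=2)
print(f"extracted law: h(t) ~ {a:.2f} t^2 + {b:.2f} t + {c:.2f}")

# 4. Verification step: bound the disagreement between formula and black box.
error = np.max(np.abs((a * probe**2 + b * probe + c) - black_box(probe)))
print(f"max disagreement on the probed range: {error:.3f} m")
```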
but I don't think, the chance is certainly not zero either, 01:56:58.920 |
So we can have a lot of the fun that we're excited about 01:57:26.240 |
because there's no more guaranteed way to fail 01:57:29.840 |
than to convince yourself that it's impossible and not try. 01:57:40.920 |
that that's how you do psychological warfare. 01:57:44.680 |
You persuade the other side that it's hopeless 01:57:51.520 |
Let's not do this psychological warfare on ourselves 01:57:58.400 |
Sadly, I do get that a little bit sometimes 01:57:58.400 |
I'm just gonna play computer games and do drugs 01:58:20.120 |
and makes it more likely that we're gonna succeed. 01:58:22.720 |
- It seems like the people that actually build solutions 01:58:25.680 |
to seemingly impossible-to-solve problems 01:58:33.040 |
And it seems like there's some fundamental law 01:58:36.560 |
to the universe where fake it 'til you make it 01:58:40.000 |
Like, believe it's possible and it becomes possible. 01:58:46.000 |
if you tell yourself that it's impossible, it is? 01:58:57.400 |
Everybody's so gloomy and the media are also very biased 01:58:59.920 |
towards if it bleeds, it leads and gloom and doom. 01:59:07.400 |
are dystopian, which really demotivates people. 01:59:12.600 |
We wanna really, really, really focus on the upside also 01:59:16.000 |
to give people the willingness to fight for it. 01:59:18.800 |
And for AI, you and I mostly talked about gloom here again, 01:59:23.800 |
but let's not forget that we have probably both lost 01:59:30.000 |
someone we really cared about to some disease 01:59:52.280 |
So if we can get this right, just be a little more chill 01:59:56.760 |
and slow down a little bit till we get it right. 01:59:59.160 |
It's mind blowing how awesome our future can be. 02:00:04.480 |
We talked a lot about stuff on earth, it can be great. 02:00:11.280 |
there's no reason we have to be stuck on this planet 02:00:22.480 |
let life spread out into space to other solar systems, 02:00:29.960 |
And this to me is a very, very hopeful vision 02:00:42.920 |
is one of the things that also really gives meaning 02:00:46.440 |
If there's ever been an epic struggle, this is it. 02:00:50.080 |
And isn't it even more epic if you're the underdog? 02:00:53.320 |
If most people are telling you this is gonna fail, 02:01:01.040 |
That's what we can do together as a species on this one. 02:01:05.280 |
A lot of pundits are ready to count this out. 02:01:17.640 |
that's how we're gonna get multi-planetary very efficiently. 02:02:04.280 |
Should we open source how to make bioweapons? 02:02:09.680 |
Should we open source how to make a new virus 02:02:19.600 |
It's already that powerful that we have to respect 02:02:46.440 |
Baby, sort of baby proto, almost little bit AGI 02:02:50.720 |
according to what Microsoft's recent paper said. 02:02:55.640 |
What we're scared about is people taking that 02:03:12.200 |
There are many things which are not open sourced 02:03:27.000 |
you don't open source those things for a reason. 02:03:34.920 |
I have to say it feels in a way a bit weird to say it 02:03:38.120 |
because MIT is like the cradle of the open source movement. 02:03:59.640 |
all the most dangerous things he can do in the house. 02:04:07.040 |
because this is one of the first moments in history 02:04:15.740 |
where we didn't wanna open source a technology 02:04:15.740 |
because the software has become too dangerous. 02:04:21.160 |
about how to get the release of such systems right? 02:04:33.980 |
So OpenAI went through a pretty rigorous effort 02:04:40.300 |
but nevertheless it's longer than you would have expected 02:04:54.300 |
Or how do I say I hate a certain group on Twitter 02:05:00.180 |
in a way that doesn't get me blocked from Twitter, 02:05:02.260 |
banned from Twitter, those kinds of questions. 02:05:15.460 |
having thought about this problem of AI safety, 02:05:29.900 |
that the two biggest risks from large language models are 02:05:42.380 |
And second, being used for offensive cyber weapon. 02:05:48.620 |
So I think those are not the two greatest threats. 02:05:53.300 |
They're very serious threats and it's wonderful 02:06:00.220 |
is how is this just gonna disrupt our economy 02:06:03.620 |
and maybe take away a lot of the most meaningful jobs. 02:06:17.860 |
- Write code, connect it to the internet, manipulate humans. 02:06:21.120 |
- Yeah, and before we know it, we have something else, 02:06:26.860 |
but which is way more intelligent and capable and has goals. 02:06:37.920 |
that's not something that's easy for them to verify 02:06:42.460 |
And the only way to really lower that risk a lot 02:06:48.940 |
never let it read any code, not train on that, 02:06:54.400 |
and not give it access to so much information 02:07:13.720 |
Microsoft is rolling out the new Office suite 02:07:17.680 |
where you go into Microsoft Word and give it a prompt, 02:07:35.920 |
and whether society is prepared to deal with this disruption, 02:07:43.560 |
that keeps me awake at night for wiping out humanity. 02:07:46.200 |
And I think that's the biggest misunderstanding we have. 02:07:56.560 |
That's not what Eliezer was freaked out about either. 02:08:09.600 |
you've talked about a lot, is autonomous weapon systems. 02:08:17.200 |
Is that one of the things that still you carry concern for 02:08:21.200 |
as these systems become more and more powerful? 02:08:24.120 |
not that all humans are going to get killed by slaughterbots, 02:08:26.480 |
but rather just as an express route into Orwellian dystopia 02:08:31.480 |
where it becomes much easier for very few to kill very many 02:08:38.200 |
If you want to know how AI could kill all people, 02:08:58.000 |
and stepping on them or shooting them or anything like that. 02:09:04.240 |
In some cases, we did it by putting more carbon dioxide 02:09:15.800 |
So if you're an AI and you just want to figure something out 02:09:20.800 |
then you decide, we just really need the space here 02:09:34.240 |
we are just the sort of accidental roadkill along the way. 02:09:54.920 |
We've driven a number of the elephant species extinct. 02:10:04.420 |
The basic problem is you just don't want to give, 02:10:11.040 |
you don't want to cede control over your planet 02:10:23.720 |
which AI safety researchers have been grappling with 02:10:27.440 |
How do you make AI first of all understand our goals 02:10:32.760 |
and then retain them as they get smarter, right? 02:10:44.080 |
Like a human child, first they're just not smart enough 02:10:59.080 |
But there is fortunately a magic phase in the middle 02:11:03.760 |
where they're smart enough to understand our goals 02:11:06.840 |
with good parenting teach them right from wrong 02:11:12.280 |
So those are all tough challenges with computers. 02:11:17.960 |
And then even if you teach your kids good goals 02:11:20.560 |
when they're little, they might outgrow them too. 02:11:22.300 |
And that's a challenge for machines to keep improving. 02:11:25.720 |
So these are a lot of hard challenges we're up for, 02:11:30.380 |
but I don't think any of them are insurmountable. 02:11:33.240 |
The fundamental reason why Eliezer looked so depressed 02:11:46.000 |
- He was hoping that humanity was gonna take this threat 02:11:53.360 |
That's why the open letter is calling for more time. 02:11:56.360 |
- But even with time, the AI alignment problem 02:12:11.660 |
the most important problem for humanity to ever solve. 02:12:15.940 |
that aligned AI can help us solve all the other problems. 02:12:20.740 |
- 'Cause it seems like it has to have constant humility 02:12:23.940 |
about its goal, constantly questioning the goal. 02:12:26.440 |
Because as you optimize towards a particular goal 02:12:32.580 |
that's when you have the unintended consequences, 02:12:35.940 |
So how do you enforce and code a constant humility 02:12:40.000 |
as your ability becomes better and better and better and better 02:12:42.920 |
- Stuart, Professor Stuart Russell at Berkeley, 02:12:44.760 |
who's also one of the driving forces behind this letter, 02:12:59.080 |
Although he calls it inverse reinforcement learning 02:13:15.220 |
but I'm not gonna tell you right now what it is 02:13:19.260 |
So then you give the incentive to be very humble 02:13:25.660 |
And oh, this other thing I tried didn't work, 02:13:33.240 |
What's nice about this is it's not just philosophical 02:13:35.600 |
mumbo jumbo, it's theorems and technical work 02:13:38.320 |
that with more time I think you can make a lot of progress. 02:13:47.840 |
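To give a feel for that humble-about-the-goal idea, here is a toy sketch of inferring which goal a human is pursuing from observed behavior. It has the flavor of inverse reinforcement learning, but it is not Stuart Russell's actual formulation; the candidate goals, actions, and likelihood numbers are all made up for illustration.

```python
# The machine starts uncertain over candidate human goals and updates as it watches.
goals = {"wants_coffee": 0.5, "wants_tea": 0.5}   # hypothetical prior over goals

def likelihood(action: str, goal: str) -> float:
    """How likely each observed action is under each candidate goal (assumed model)."""
    table = {
        ("walks_to_kitchen", "wants_coffee"): 0.8,
        ("walks_to_kitchen", "wants_tea"): 0.7,
        ("grinds_beans", "wants_coffee"): 0.9,
        ("grinds_beans", "wants_tea"): 0.05,
    }
    return table[(action, goal)]

def update(beliefs: dict, action: str) -> dict:
    posterior = {g: p * likelihood(action, g) for g, p in beliefs.items()}
    z = sum(posterior.values())
    return {g: p / z for g, p in posterior.items()}

for observed in ["walks_to_kitchen", "grinds_beans"]:
    goals = update(goals, observed)
    print(observed, {g: round(p, 3) for g, p in goals.items()})
# Note the humility: no goal's probability ever reaches exactly 1, so new evidence
# (or a human correction) can always shift the machine's belief about what we want.
```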
- But also not that many relative to the scale of the problem. 02:13:51.960 |
There should be, at least just like every university 02:13:56.400 |
worth its name has some cancer research going on 02:14:17.180 |
to what's happening here, even though I think many people 02:14:20.820 |
would wish it would have been rolled out more carefully, 02:14:31.540 |
stop fantasising about this being 100 years off 02:14:35.080 |
and stop fantasising about this being completely 02:14:37.600 |
controllable and predictable because it's so obvious 02:15:00.080 |
It was not 'cause the engineers that built it 02:15:02.780 |
was like, "Heh heh heh heh heh, let's put this in here 02:15:29.840 |
I would encourage them to just play a bit more 02:15:31.960 |
with these tools that are out there now, like GPT-4. 02:15:42.240 |
Once you've woken up, then gotta slow down a little bit 02:15:52.600 |
- You know, what's interesting is, you know, MIT, 02:15:58.680 |
but let's just even say computer science curriculum. 02:16:01.920 |
How does the computer science curriculum change now? 02:16:06.300 |
When I was coming up, programming was a prestigious position. 02:16:13.600 |
Like, why would you be dedicating crazy amounts of time 02:16:19.240 |
Like, the nature of programming is fundamentally changing. 02:16:30.840 |
and like think about, 'cause it's really turning. 02:16:33.960 |
- I mean, some English professors, some English teachers 02:16:44.880 |
And then they realize they have to completely rethink. 02:16:48.080 |
And even, you know, just like we stopped teaching, 02:16:52.920 |
writing a script, is that what you say in English? 02:16:59.160 |
- Yeah, when everybody started typing, you know, 02:17:01.200 |
like so much of what we teach our kids today. 02:17:09.960 |
everything is changing and it's changing very, 02:17:23.960 |
And if the education system is being turned on its head, 02:17:27.960 |
It feels like having these kinds of conversations 02:17:35.480 |
I don't think there's even, speaking of safety, 02:17:40.880 |
I don't think most universities have courses on AI safety. 02:18:01.580 |
and then they're gonna come out of high school 02:18:03.020 |
12 years later, and you've already pre-planned now 02:18:06.380 |
what they're gonna learn when you're not even sure 02:18:08.700 |
if there's gonna be any world left to come out to. 02:18:32.560 |
- If we just linger on the GPT-4 system a little bit, 02:18:36.160 |
you kind of hinted at it, especially talking about 02:18:41.920 |
the importance of consciousness in the human mind 02:18:57.560 |
because in my experience, like 90% of all arguments 02:19:00.880 |
about consciousness boil down to the two people 02:19:03.320 |
arguing, having totally different definitions 02:19:05.060 |
of what it is, and they're just shouting past each other. 02:19:08.240 |
I define consciousness as subjective experience. 02:19:13.720 |
Right now, I'm experiencing colors and sounds 02:19:22.680 |
That's the question about whether it's conscious or not. 02:19:26.400 |
Other people think you should define consciousness 02:19:34.960 |
I'm gonna use consciousness for this, at least. 02:19:50.060 |
Short answer, I don't know, because we still don't know 02:19:53.240 |
what it is that gives us wonderful subjective experience 02:19:56.680 |
that is kind of the meaning of our life, right? 02:20:13.760 |
Giulio Tononi, a professor, has stuck his neck 02:20:23.080 |
for what's the essence of conscious information processing. 02:20:45.460 |
where some of the output gets fed back in again. 02:20:56.360 |
where information only goes in one direction, 02:20:58.680 |
like from your eye, retina, into the back of your brain, 02:21:04.600 |
isn't conscious of anything, or a video camera. 02:21:11.360 |
is it's also just one-way flow of information. 02:21:14.800 |
So, if Tononi is right, GPT-4 is a very intelligent zombie 02:21:32.280 |
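A toy contrast of the two architectures being described, purely to make the distinction concrete: a feedforward pass where information flows one way versus a recurrent one where the system's own state is fed back in. The weights and inputs are arbitrary, and of course this sketch claims nothing about which, if either, would be conscious.

```python
import numpy as np

rng = np.random.default_rng(1)
W_in = rng.normal(size=(4, 3))      # input -> hidden
W_rec = rng.normal(size=(4, 4))     # hidden -> hidden (the feedback loop)
W_out = rng.normal(size=(1, 4))     # hidden -> output
x_stream = rng.normal(size=(5, 3))  # five time steps of input

# Feedforward: information flows one way; nothing the system computes is fed back.
ff_out = [(W_out @ np.tanh(W_in @ x)).item() for x in x_stream]

# Recurrent: the hidden state is fed back in, so the system processes its own output.
h = np.zeros(4)
rec_out = []
for x in x_stream:
    h = np.tanh(W_in @ x + W_rec @ h)   # the loop: new state depends on old state
    rec_out.append((W_out @ h).item())

print(ff_out)
print(rec_out)
```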
about turning off GPT-4 and wiping its memory 02:21:48.240 |
very high intelligence, perhaps, that's not conscious. 02:21:53.400 |
and it's sad enough that humanity isn't here anymore, 02:22:04.280 |
I could be like, well, but they're our descendants 02:22:06.200 |
and maybe they have our values, they're our children. 02:22:09.280 |
But if Tononi is right, and these are all transformers 02:22:32.580 |
That would be like the ultimate depressing future. 02:22:44.880 |
what kind of information processing actually has experience, 02:22:49.620 |
And I completely don't buy the dismissal that some people, 02:22:54.280 |
some people will say, well, this is all bullshit 02:23:03.160 |
when you're not really accomplishing any goals at all, 02:23:24.800 |
that we will not allow AI systems to have consciousness. 02:23:29.120 |
We'll come up with theories about measuring consciousness 02:23:37.120 |
because maybe we humans will create something 02:23:47.080 |
which is they have a deeper subjective experience 02:23:52.220 |
Not only are they smarter, but they feel deeper. 02:24:02.100 |
they'll be the other, we'll try to suppress it, 02:24:04.800 |
they'll create conflict, they'll create war, all of this. 02:24:46.200 |
where they show that lobsters actually do feel pain 02:24:49.580 |
So they banned lobster boiling in Switzerland now, 02:25:01.800 |
- We do the same thing with cruelty to farm animals, 02:25:03.880 |
also all these self-serving arguments for why they're fine. 02:25:20.160 |
where it's really information being aware of itself 02:25:23.160 |
And let's study it and give ourselves a little bit of time. 02:25:28.200 |
actually what it is that causes consciousness. 02:25:34.560 |
that do the boring jobs that we would feel immoral 02:25:38.440 |
But if you have a companion robot taking care of your mom 02:25:44.160 |
she would probably want it to be conscious, right? 02:25:45.760 |
So that the emotions it seems to display aren't fake. 02:25:59.400 |
- Is there something you could say to the timeline 02:26:02.000 |
that you think about, about the development of AGI? 02:26:05.920 |
Depending on the day, I'm sure that changes for you. 02:26:09.160 |
But when do you think there'll be a really big leap 02:26:13.560 |
in intelligence where you would definitively say 02:26:37.760 |
But I think that cat has really left the bag now. 02:26:46.600 |
I don't think the Microsoft paper is totally off 02:26:50.640 |
when they say that there are some glimmers of AGI. 02:27:09.720 |
because if there's ever been a time to pause, it's today. 02:27:14.280 |
- There's a feeling like this GPT-4 is a big transition 02:27:19.360 |
into waking everybody up to the effectiveness 02:27:33.040 |
And there are many companies trying to do these things. 02:27:54.280 |
- You have spoken about and thought about nuclear war a lot 02:28:13.700 |
- What do you learn about human nature from that? 02:28:23.480 |
America doesn't want there to be a nuclear war. 02:28:26.800 |
Russia doesn't want there to be a global nuclear war either. 02:28:40.240 |
that this is the closest we've come since 1962? 02:28:54.920 |
saying that we have to drive Russia out of Ukraine. 02:29:10.080 |
if we don't drive Russia out entirely of the Ukraine. 02:29:24.840 |
out of Ukraine, it's not just gonna be very humiliating 02:29:36.400 |
that things don't go so well for their leadership either. 02:29:39.640 |
You remember when Argentina invaded the Falkland Islands? 02:29:51.840 |
And then when they got their butt kicked by the British, 02:30:01.320 |
And I believe those who are still alive are in jail now. 02:30:04.600 |
So the Russian leadership is entirely cornered 02:30:09.320 |
where they know that just getting driven out of Ukraine 02:30:17.160 |
And so this to me is a typical example of Moloch. 02:30:30.920 |
If Russia starts losing in the conventional warfare, 02:30:36.480 |
since their back is against the wall, their only option is to keep escalating. 02:30:39.280 |
And the West has put itself in the situation now 02:30:43.040 |
where we've sort of already committed to drive Russia out. 02:30:45.520 |
So the only option the West has is to call Russia's bluff 02:30:52.160 |
because Moloch can sometimes drive competing parties 02:30:55.480 |
to do something which is ultimately just really bad 02:31:02.720 |
is not just that it's difficult to see an ending, 02:31:12.320 |
that doesn't involve some horrible escalation, 02:31:21.640 |
There was an amazing paper that was published 02:31:30.480 |
who've been studying nuclear winter for a long time. 02:31:31.960 |
And what they basically did was they combined climate models 02:31:42.800 |
So instead of just saying, yeah, it gets really cold, 02:31:46.700 |
how many people would die in different countries. 02:31:52.280 |
So basically what happens is the thing that kills 02:31:56.000 |
it's not the radioactivity, it's not the EMP mayhem, 02:31:59.800 |
it's not the rampaging mobs foraging for food. 02:32:06.680 |
coming up from the burning cities into the stratosphere 02:32:09.720 |
that spreads around the earth from the jet streams. 02:32:14.720 |
So in typical models, you get like 10 years or so 02:32:25.960 |
and in their models, the temperature drops in Nebraska 02:32:38.920 |
No, yeah, 20, 30 Celsius, depending on where you are, 02:32:42.600 |
40 Celsius in some places, which is 40 Fahrenheit 02:32:46.160 |
to 80 Fahrenheit colder than what it would normally be. 02:32:48.840 |
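A quick conversion check on those figures (my arithmetic, not from the study): temperature differences scale by a factor of 9/5, so

$$\Delta T_{\mathrm{F}} = \tfrac{9}{5}\,\Delta T_{\mathrm{C}}:\qquad 20\text{--}30\,^{\circ}\mathrm{C} \approx 36\text{--}54\,^{\circ}\mathrm{F},\qquad 40\,^{\circ}\mathrm{C} \approx 72\,^{\circ}\mathrm{F},$$

which is roughly the 40-to-80-Fahrenheit range quoted here.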
So I'm not good at farming, but if it's snowing, 02:32:53.840 |
if it drops below freezing pretty much most days in July, 02:32:59.160 |
So they worked out, they put this into their farming models. 02:33:12.680 |
they had about 99% of all Americans starving to death. 02:33:21.120 |
So you might be like, oh, it's kind of poetic justice 02:33:31.360 |
But that doesn't particularly cheer people up in Sweden 02:33:45.880 |
not understanding very much just like how bad this is. 02:33:55.720 |
in decision-making positions still think of nuclear weapons 02:33:59.880 |
Scary, powerful, they don't think of it as something 02:34:05.000 |
where, yeah, just to within a percent or two, 02:34:11.880 |
- And starving to death is the worst way to die, 02:34:17.040 |
as all the famines in history show, 02:34:27.160 |
- Probably brings out the worst in people also, 02:34:39.240 |
they'd rather be at ground zero and just get vaporized. 02:34:42.000 |
But I think people underestimate the risk of this 02:34:54.800 |
'cause humans don't want this, so it's not gonna happen. 02:35:04.360 |
- Exactly, and it applies to some of the things 02:35:09.000 |
that people have gotten most upset with capitalism for also, 02:35:14.920 |
It's not that the CEO of some company does something 02:35:25.560 |
but she or he knew that all the other companies 02:35:55.440 |
or other mechanism that lets us instead realize 02:36:04.120 |
- It's not us versus them, it's us versus it. 02:36:06.480 |
- We are fighting Moloch for human survival. 02:36:27.540 |
but that's the closest I've seen to a movie about Moloch. 02:36:39.560 |
you can interpret him as the devil or whatever, 02:36:41.600 |
but he doesn't actually ever go around and kill people 02:36:44.200 |
or torture people with burning coal or anything. 02:36:49.200 |
makes everybody fear each other, hate each other, 02:36:57.460 |
That seems to be one of the ways to fight Moloch 02:37:02.460 |
is by compassion, by seeing the common humanity. 02:37:12.300 |
like what's it, Kumbaya tree huggers here, right? 02:37:21.860 |
understand the true facts about the other side. 02:38:01.240 |
If someone is like, "I really disagree with you, Lex, 02:38:07.860 |
"You're not a bad person who needs to be destroyed, 02:38:13.560 |
"and I'm happy to have an argument about it." 02:38:15.920 |
That's a lot of progress compared to where we are 02:38:18.760 |
at 2023 in the public space, wouldn't you say? 02:38:22.120 |
- If we solve the AI safety problem, as we've talked about, 02:38:26.560 |
and then you, Max Tegmark, who has been talking about this 02:38:31.040 |
for many years, get to sit down with the AGI, 02:38:35.120 |
with the early AGI system, on a beach with a drink, 02:39:02.760 |
I know I'm just a meat bag with all these flaws, 02:39:05.440 |
but I have, we talked a lot about homo sentiens. 02:39:09.920 |
I've already tried that for a long time with myself. 02:39:12.920 |
So that is what's really valuable about being alive for me, 02:39:19.760 |
It's not that I'm good at this or good at that or whatever. 02:39:30.760 |
In fact, my son reminds me of that pretty frequently. 02:39:34.160 |
- You could find out how dumb you are in terms of physics, 02:39:40.200 |
I think, so I can't waffle my way out of this question. 02:39:49.280 |
I think, given that I'm a really, really curious person, 02:40:13.360 |
I would just really, really love to understand also. 02:40:18.720 |
that I've been obsessing about a lot recently. 02:40:21.000 |
So I believe that, so suppose Tononi is right. 02:40:27.720 |
And suppose there are some information processing systems 02:40:32.560 |
Suppose you can even make reasonably smart things 02:40:38.840 |
Here's the question that keeps me awake at night. 02:40:41.280 |
Is it the case that the unconscious zombie systems 02:40:47.800 |
that are really intelligent are also really efficient? 02:40:51.600 |
So that when you try to make things more efficient, 02:40:54.960 |
which there'll naturally be a pressure to do, 02:41:02.480 |
And do you want me to give you a hand wavey argument for it? 02:41:11.160 |
how these large language models do something, 02:41:17.640 |
If you, we have loops in our computer language for a reason. 02:41:37.000 |
even an operating system knows things about itself. 02:41:53.240 |
of implementing a given level of intelligence 02:41:55.840 |
has loops in it, self-reflection, and will be conscious. 02:42:09.560 |
And I think if you look at our brains, actually, 02:42:12.160 |
our brains are part zombie and part conscious. 02:42:16.920 |
When I open my eyes, I immediately take all these pixels 02:42:29.880 |
But I have no freaking clue of how I did that computation. 02:42:36.640 |
we could even do it well with machines, right? 02:42:39.720 |
You get a bunch of information processing happening 02:42:42.160 |
in my retina, and then it goes to the lateral geniculate 02:42:44.480 |
nucleus, my thalamus, and the area V1, V2, V4, 02:42:48.520 |
and the fusiform face area here that Nancy Kanwisher 02:42:51.040 |
at MIT invented, and blah, blah, blah, blah, blah. 02:42:53.400 |
And I have no freaking clue how that worked, right? 02:42:56.320 |
It feels to me subjectively like my conscious module 02:43:18.440 |
because this was all one-way information processing mainly. 02:43:46.840 |
that if you want to fake that with a zombie system 02:43:49.120 |
that just all goes one way, you have to unroll those loops 02:43:55.840 |
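A tiny illustration of that unrolling argument, with toy arithmetic invented for the purpose: the same computation written with a loop (a system applying one rule to its own output) versus written as a purely one-way chain, where every step has to be spelled out separately and the size grows with the number of steps.

```python
def with_loop(x: float, steps: int) -> float:
    for _ in range(steps):          # one small rule, applied to its own output
        x = 0.5 * x + 1.0
    return x

def unrolled(x: float) -> float:    # the same thing with the loop removed:
    x = 0.5 * x + 1.0               # every step has to be written out (and, in
    x = 0.5 * x + 1.0               # hardware, built) separately, so the "one-way"
    x = 0.5 * x + 1.0               # version grows with the number of steps
    x = 0.5 * x + 1.0               # instead of staying fixed in size
    return x

assert with_loop(3.0, 4) == unrolled(3.0)
print(with_loop(3.0, 4))
```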
So I'm actually hopeful that AI, if in the future 02:43:59.320 |
we have all these very sublime and interesting machines 02:44:14.520 |
to great consciousness or a deep kind of consciousness. 02:44:21.760 |
'cause the zombie apocalypse really is my worst nightmare 02:44:29.040 |
but we frigging replaced ourselves by zombies. 02:44:37.080 |
That's one that we humans can intuit and prove 02:44:42.520 |
as we start to understand what it means to be intelligent 02:44:55.200 |
Most of my colleagues, when I started going on 02:44:57.440 |
about consciousness, tell me that it's all bullshit 02:45:04.240 |
and from my mom saying, "Keep talking about it," 02:45:08.040 |
And the main way to convince people like that, 02:45:13.040 |
that they're wrong, if they say that consciousness 02:45:17.120 |
is just equal to intelligence, is to ask them, 02:45:23.960 |
If it's just about these particles moving this way 02:45:28.920 |
rather than that way, and there is no such thing 02:45:31.680 |
as subjective experience, what's wrong with torture? 02:45:36.520 |
- No, it seems like suffering imposed onto other humans 02:45:46.120 |
- And if someone tells me, "Well, it's just an illusion, 02:45:50.720 |
"consciousness, whatever," I like to invite them 02:46:03.760 |
If you have it, you can have it with local anesthesia 02:46:12.560 |
It just removed my subjective experience of pain. 02:46:15.560 |
It didn't change anything about what was actually happening 02:46:26.680 |
- It could be fundamental to whatever this thing 02:46:31.320 |
- It is fundamental because what we feel is so fundamental 02:46:36.080 |
is suffering and joy and pleasure and meaning. 02:46:41.080 |
That's all, those are all subjective experiences there. 02:46:47.880 |
And let's not, those are the elephant in the room. 02:46:58.400 |
- And let's not make the mistake of not instilling 02:47:02.640 |
the AI systems with that same thing that makes us special. 02:47:09.600 |
- Max, it's a huge honor that you will sit down with me 02:47:12.800 |
the first time on the first episode of this podcast. 02:47:18.280 |
and talk about this, what I think is the most important 02:47:21.400 |
topic, the most important problem that we humans 02:47:28.740 |
- Yeah, well the honor is all mine and I'm so grateful 02:47:31.600 |
to you for making more people aware of the fact 02:47:34.960 |
that humanity has reached the most important fork 02:47:45.440 |
To support this podcast, please check out our sponsors 02:47:59.120 |
Thank you for listening and hope to see you next time.