
Stuart Russell: Long-Term Future of Artificial Intelligence | Lex Fridman Podcast #9



00:00:00.000 | The following is a conversation with Stuart Russell. He's a professor of computer science at
00:00:04.720 | UC Berkeley and a co-author of a book that introduced me and millions of other people
00:00:10.240 | to the amazing world of AI called Artificial Intelligence: A Modern Approach. So it was an
00:00:16.720 | honor for me to have this conversation as part of the MIT course on Artificial General Intelligence
00:00:23.120 | and the Artificial Intelligence podcast. If you enjoy it, please subscribe on YouTube,
00:00:28.560 | iTunes, or your podcast provider of choice, or simply connect with me on Twitter @lexfridman,
00:00:34.320 | spelled F-R-I-D. And now, here's my conversation with Stuart Russell.
00:00:40.080 | So you've mentioned that in 1975, in high school, you created one of your first AI programs that played
00:00:48.080 | chess. Were you ever able to build a program that beat you at chess or another board game?
00:00:58.240 | >>Stuart Russell So my program never beat me at chess. I actually wrote the program
00:01:04.640 | at Imperial College. So I used to take the bus every Wednesday with a box of cards this big
00:01:12.160 | and shove them into the card reader. And they gave us eight seconds of CPU time.
00:01:17.200 | It took about five seconds to read the cards in and compile the code. So we had three seconds
00:01:24.560 | of CPU time, which was enough to make one move with a not very deep search. And then we would
00:01:31.040 | print that move out, and then we'd have to go to the back of the queue and wait to feed the cards
00:01:35.200 | in again. >>Lex: How deep was the search? Are we talking about one move, two moves, three moves?
00:01:39.280 | >>Stuart Russell So no, I think we got to an eight-move search,
00:01:42.560 | depth eight with alpha-beta. And we had some tricks of our own about move ordering and some
00:01:51.120 | pruning of the tree. >>Lex: But you were still able to beat that program?
00:01:54.880 | >>Stuart Russell Yeah, yeah. I was a reasonable chess player in my youth. I did an Othello program
00:02:00.880 | and a backgammon program. So when I got to Berkeley, I worked a lot on what we call meta
00:02:08.160 | reasoning, which really means reasoning about reasoning. And in the case of a game playing
00:02:13.840 | program, you need to reason about what parts of the search tree you're actually going to explore,
00:02:19.200 | because the search tree is enormous, bigger than the number of atoms in the universe.
00:02:24.320 | And the way programs succeed and the way humans succeed is by only looking at a small fraction
00:02:32.240 | of the search tree. And if you look at the right fraction, you play really well. If you look at the
00:02:37.360 | wrong fraction, if you waste your time thinking about things that are never going to happen,
00:02:42.320 | moves that no one's ever going to make, then you're going to lose, because you won't be able
00:02:47.440 | to figure out the right decision. So that question of how machines can manage their own computation,
00:02:54.960 | how they decide what to think about, is the meta reasoning question. We developed some methods
00:03:00.960 | for doing that. And very simply, a machine should think about whatever thoughts are going to improve
00:03:08.880 | its decision quality. We were able to show that both for Othello, which is a standard two-player
00:03:15.440 | game, and for Backgammon, which includes dice rolls, so it's a two-player game with uncertainty.
00:03:21.840 | For both of those cases, we could come up with algorithms that were actually much more efficient
00:03:27.680 | than the standard alpha-beta search, which chess programs at the time were using. And that,
00:03:34.400 | those programs could beat me. And I think you can see the same basic ideas in AlphaGo and AlphaZero
00:03:44.320 | today. The way they explore the tree is using a form of meta reasoning to select what to think
00:03:53.200 | about based on how useful it is to think about it. Are there any insights you can describe, without
00:03:59.520 | Greek symbols, of how we select which paths to go down? There's really two kinds of learning
00:04:06.800 | going on. So as you say, AlphaGo learns to evaluate board positions. So it can look at a go board
00:04:14.080 | and it actually has probably a superhuman ability to instantly tell how promising that situation is.
00:04:22.320 | To me, the amazing thing about AlphaGo is not that
00:04:26.960 | it can beat the world champion with its hands tied behind its back, but the fact that
00:04:36.240 | if you stop it from searching altogether, so you say, okay, you're not allowed to do
00:04:41.280 | any thinking ahead, right? You can just consider each of your legal moves and then look at the
00:04:46.480 | resulting situation and evaluate it. So what we call a depth one search. So just the immediate
00:04:53.280 | outcome of your moves and decide if that's good or bad. That version of AlphaGo
00:04:57.920 | can still play at a professional level, right? And human professionals are sitting there for
00:05:05.200 | five, 10 minutes deciding what to do. And AlphaGo in less than a second can instantly intuit what
00:05:13.440 | is the right move to make based on its ability to evaluate positions. And that is remarkable
00:05:18.800 | because we don't have that level of intuition about Go. We actually have to think about
00:05:25.360 | the situation. So anyway, that capability that AlphaGo has is one big part of why it
00:05:34.560 | beats humans. The other big part is that it's able to look ahead 40, 50, 60 moves into the future.
00:05:43.200 | And if it was considering all possibilities, 40 or 50 or 60 moves into the future, that would be
00:05:52.400 | 10 to the 200 possibilities. So way, way more than atoms in the universe and so on. So it's very,
00:06:03.440 | very selective about what it looks at. So let me try to give you an intuition about
00:06:09.680 | how you decide what to think about. It's a combination of two things. One is how promising
00:06:16.960 | it is. So if you're already convinced that a move is terrible, there's no point spending a lot more
00:06:24.640 | time convincing yourself that it's terrible because it's probably not going to change your
00:06:30.320 | mind. So the real reason you think is because there's some possibility of changing your mind
00:06:35.680 | about what to do. And it's that changing of mind that would result then in a better
00:06:42.240 | final action in the real world. So that's the purpose of thinking, is to improve the final
00:06:48.480 | action in the real world. And so if you think about a move that is guaranteed to be terrible,
00:06:55.440 | you can convince yourself it's terrible, you're still not going to change your mind.
00:06:58.880 | Right? But on the other hand, you suppose you had a choice between two moves. One of them you've
00:07:04.320 | already figured out is guaranteed to be a draw, let's say. And then the other one looks a little
00:07:10.320 | bit worse. Like it looks fairly likely that if you make that move, you're going to lose.
00:07:14.000 | But there's still some uncertainty about the value of that move. There's still some possibility that
00:07:20.880 | it will turn out to be a win. Right? Then it's worth thinking about that. So even though it's
00:07:25.840 | less promising on average than the other move, which is guaranteed to be a draw, there's still
00:07:31.280 | some purpose in thinking about it, because there's a chance that you will change your mind and
00:07:35.680 | discover that in fact it's a better move. So it's a combination of how good the move appears to be
00:07:42.080 | and how much uncertainty there is about its value. The more uncertainty, the more it's worth thinking
00:07:48.000 | about. Because there's a higher upside if you want to think of it that way.
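As a rough illustration of that selection rule, here is a toy sketch (my own made-up numbers and scoring rule, in the spirit of value-of-computation ideas, not the actual AlphaGo or Russell-era algorithms): a move is worth thinking about in proportion to the chance that further thought would change which move ends up looking best.

```python
import math

# Toy sketch: which candidate move is worth thinking about next?
moves = {
    "a": {"value": 0.0,  "std": 0.00},   # known draw: certain, nothing to gain by more thought
    "b": {"value": -0.2, "std": 0.30},   # looks slightly worse, but the estimate is uncertain
    "c": {"value": -0.8, "std": 0.05},   # clearly bad, and we are already sure it is bad
}

def worth_thinking_about(best_value, move):
    """Crude proxy for the value of computation: the probability, under a normal
    belief over the move's true value, that it is actually better than the current
    best move. Zero uncertainty means thinking cannot change our mind."""
    if move["std"] == 0.0:
        return 0.0
    z = (best_value - move["value"]) / move["std"]
    return 0.5 * math.erfc(z / math.sqrt(2))   # = P(true value > current best estimate)

best_value = max(m["value"] for m in moves.values())
chosen = max(moves, key=lambda name: worth_thinking_about(best_value, moves[name]))
print(chosen)   # -> "b": uncertain enough that more thought might change the decision
```

With these numbers, the guaranteed draw and the clearly lost move score essentially zero, and the slightly worse but uncertain move is the one selected for further search, exactly the situation described above.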
00:07:52.080 | And of course in the beginning, especially in the AlphaGo Zero formulation, everything is shrouded
00:07:59.760 | in uncertainty. So you're really swimming in a sea of uncertainty. So it benefits you to,
00:08:06.240 | I mean, actually following the same process as you described, but because you're so uncertain
00:08:11.120 | about everything, you basically have to try a lot of different directions.
00:08:15.280 | Yeah. So the early parts of the search tree are fairly bushy, that it will look at a lot
00:08:22.400 | of different possibilities, but fairly quickly, the degree of certainty about some of the moves,
00:08:27.840 | I mean, if a move is really terrible, you'll pretty quickly find out, right? You'll lose half
00:08:31.920 | your pieces or half your territory. And then you'll say, "Okay, this is not worth thinking
00:08:37.200 | about anymore." And then, so further down, the tree becomes very long and narrow. And you're
00:08:44.400 | following various lines of play, 10, 20, 30, 40, 50 moves into the future. And that again is
00:08:54.560 | something that human beings have a very hard time doing, mainly because they just lack the short
00:09:01.280 | term memory. You just can't remember a sequence of moves that's 50 moves long. And you can't
00:09:07.920 | imagine the board correctly for that many moves into the future. Of course, the top players,
00:09:15.520 | I'm much more familiar with chess, but the top players probably have echoes of the same kind of
00:09:22.160 | intuition, instinct that in a moment's time, AlphaGo applies when they see a board. I mean,
00:09:28.640 | they've seen those patterns. Human beings have seen those patterns before at the top,
00:09:33.600 | at the grandmaster level. It seems that there are some similarities, or maybe it's our imagination
00:09:43.600 | creates a vision of those similarities, but it feels like this kind of pattern recognition that
00:09:49.760 | the AlphaGo approaches are using is similar to what human beings at the top level are using.
00:09:55.760 | I think there's some truth to that.
00:10:00.480 | But not entirely.
00:10:02.400 | Yeah, I mean, I think the extent to which a human grandmaster can reliably,
00:10:08.960 | instantly recognize the right move, instantly recognize the value of a position,
00:10:13.680 | I think that's a little bit overrated.
00:10:17.280 | But if you sacrifice a queen, for example, I mean, there's these beautiful games of chess
00:10:22.160 | with Bobby Fischer or somebody, where he seems to make a bad move. And I'm not sure there's a
00:10:31.040 | perfect degree of calculation involved where they've calculated all the possible things that
00:10:36.080 | happen, but there's an instinct there, right, that somehow adds up to go down that path.
00:10:42.240 | Yeah, so I think what happens is you get a sense that there's some possibility in the position,
00:10:47.920 | even if you make a weird looking move, that it opens up some lines of
00:10:57.520 | calculation that otherwise would be definitely bad. And it's that intuition that there's something
00:11:06.960 | here in this position that might yield a win.
00:11:12.080 | Down the set of possibilities.
00:11:13.760 | And then you follow that, right. And in some sense, when a chess player is following a line
00:11:19.760 | in his or her mind, they're mentally simulating what the other person is going to do, what the
00:11:26.640 | opponent is going to do. And they can do that as long as the moves are kind of forced, right.
00:11:33.120 | As long as there's a forcing variation where the opponent doesn't really have much choice how to
00:11:39.440 | respond. And then you see if you can force them into a situation where you win. You know, we see
00:11:44.640 | plenty of mistakes, even in Grandmaster games, where they just miss some simple three, four,
00:11:52.960 | five move combination that wasn't particularly apparent in the position, but was still there.
00:12:00.320 | That's the thing that makes us human.
00:12:02.400 | So you mentioned that in Othello, the program, after some meta-reasoning improvements and
00:12:10.400 | research, was able to beat you. How did that make you feel?
00:12:14.240 | Part of the meta reasoning capability that it had was based on learning. And
00:12:21.360 | you could sit down the next day and you could just feel that it had got a lot smarter.
00:12:28.080 | You know, and all of a sudden you really felt like you're sort of pressed against
00:12:33.200 | the wall because it was much more aggressive and was totally unforgiving of any minor mistake that
00:12:41.680 | you might make. And actually, it seemed to understand the game better than I did.
00:12:47.760 | And Garry Kasparov has this quote where during his match against Deep Blue, he said he suddenly
00:12:55.520 | felt that there was a new kind of intelligence across the board.
00:12:58.160 | Do you think that's a scary or an exciting possibility for Garry Kasparov and for yourself
00:13:06.240 | in the context of chess, purely sort of in this, like, that feeling, whatever that is?
00:13:14.320 | I think it's definitely an exciting feeling. You know, this is what made me
00:13:18.880 | work on AI in the first place, was as soon as I really understood what a computer was,
00:13:24.560 | I wanted to make it smart. You know, I started out with the first program I wrote was for the
00:13:30.800 | Sinclair programmable calculator. And I think you could write a 21-step algorithm. That was
00:13:38.960 | the biggest program you could write, something like that. And do little arithmetic calculations.
00:13:44.800 | So I think I implemented Newton's method for square roots and a few other things like that.
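For reference, a minimal Newton's-method square root of the kind described fits in a few lines; this is a generic textbook sketch, not the original calculator program:

```python
def newton_sqrt(a, tolerance=1e-10):
    """Approximate sqrt(a) for a > 0 by Newton's method: repeatedly replace the
    guess x with the average of x and a/x."""
    x = a if a > 1 else 1.0          # any positive starting guess will do
    while abs(x * x - a) > tolerance:
        x = 0.5 * (x + a / x)
    return x

print(newton_sqrt(2.0))              # ~1.4142135...
```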
00:13:49.840 | But then, you know, I thought, "Okay, if I just had more space, I could make this thing intelligent."
00:13:57.120 | And so I started thinking about AI. And I think the thing that's
00:14:07.520 | scary is not the chess program, because, you know, chess programs,
00:14:15.360 | they're not in the taking over the world business. But if you extrapolate,
00:14:24.240 | you know, there are things about chess that don't resemble the real world, right? We know the rules
00:14:32.080 | of chess. The chessboard is completely visible to the program, where of course the real world is
00:14:40.320 | not. Most of the real world is not visible from wherever you're sitting, so to speak.
00:14:45.840 | And to overcome those kinds of problems, you need qualitatively different algorithms.
00:14:56.160 | Another thing about the real world is that, you know, we regularly
00:15:00.160 | plan ahead on the timescales involving billions or trillions of steps.
00:15:07.360 | Now, we don't plan those in detail, but, you know, when you choose to do a PhD at Berkeley,
00:15:14.960 | that's a five-year commitment. And that amounts to about a trillion motor control
00:15:20.880 | steps that you will eventually be committed to.
00:15:23.920 | - Including going up the stairs, opening doors, drinking water.
00:15:28.640 | - Yeah, I mean, every finger movement while you're typing, every character of every paper
00:15:33.680 | and the thesis and everything. So you're not committing in advance to the specific motor
00:15:37.600 | control steps, but you're still reasoning on a timescale that will eventually reduce to
00:15:43.760 | trillions of motor control actions. And so for all these reasons,
00:15:52.080 | you know, AlphaGo and Deep Blue and so on don't represent any kind of threat to humanity,
00:15:58.000 | but they are a step towards it, right? And progress in AI occurs by essentially removing
00:16:07.040 | one by one these assumptions that make problems easy, like the assumption of complete observability
00:16:14.560 | of the situation, right? If we remove that assumption, you need a much more complicated
00:16:20.080 | kind of computing design, and you need something that actually keeps track of all the things you
00:16:25.440 | can't see and tries to estimate what's going on, and there's inevitable uncertainty in that. So
00:16:31.440 | it becomes a much more complicated problem. But, you know, we are removing those assumptions. We
00:16:37.520 | are starting to have algorithms that can cope with much longer timescales, that can cope with
00:16:43.120 | uncertainty, that can cope with partial observability. And so each of those steps sort of
00:16:50.400 | magnifies by a thousand the range of things that we can do with AI systems.
00:16:55.760 | So the way I started in AI, I wanted to be a psychiatrist for a long time. I wanted to
00:16:59.040 | understand the mind in high school, and of course programming and so on. And I showed up at the
00:17:04.240 | University of Illinois, at an AI lab, and they said, "Okay, I don't have time for you, but here's a
00:17:10.880 | book, AI: A Modern Approach." I think it was the first edition at the time. "Here, go learn this."
00:17:18.480 | And I remember the lay of the land was, "Well, it's incredible that we solved chess,
00:17:22.800 | but we'll never solve Go." I mean, it was pretty certain that Go, in the way we thought about
00:17:29.440 | systems that reason, was impossible to solve, and now we've solved it. So it's a very...
00:17:34.240 | Well, I think I would have said that it's unlikely we could take the kind of algorithm
00:17:41.040 | that was used for chess and just get it to scale up and work well for Go.
00:17:46.480 | And at the time, what we thought was that in order to solve Go, we would have to do something
00:17:57.520 | similar to the way humans manage the complexity of Go, which is to break it down
00:18:03.600 | into kind of sub-games. So when a human thinks about a Go board, they think about different
00:18:08.080 | parts of the board as sort of weakly connected to each other. And they think about, "Okay,
00:18:13.360 | within this part of the board, here's how things could go. In that part of the board,
00:18:17.600 | here's how things could go." And then you try to sort of couple those two analyses together
00:18:21.840 | and deal with the interactions and maybe revise your views of how things are going to go in each
00:18:27.040 | part. And then you've got maybe five, six, seven, ten parts of the board. And that actually resembles
00:18:35.120 | the real world much more than chess does. Because in the real world, we have work, we have home
00:18:43.600 | life, we have sport, whatever, different kinds of activities, shopping. These all are connected to
00:18:52.800 | each other, but they're weakly connected. So when I'm typing a paper, I don't simultaneously have to
00:19:00.320 | decide which order I'm going to get the milk and the butter. That doesn't affect the typing.
00:19:06.960 | But I do need to realize, "Okay, I better finish this before the shops close because I don't have
00:19:12.080 | anything. I don't have any food at home." So there's some weak connection, but not in the way
00:19:17.760 | that chess works, where everything is tied into a single stream of thought. So the thought was that
00:19:24.720 | to solve Go, we'd have to make progress on stuff that would be useful for the real world. And in
00:19:29.600 | a way, AlphaGo is a little bit disappointing. Because the program design for AlphaGo is
00:19:35.600 | actually not that different from Deep Blue or even from Arthur Samuel's checker playing program from
00:19:44.960 | the 1950s. And in fact, the two things that make AlphaGo work is one is its amazing ability to
00:19:54.080 | evaluate the positions, and the other is the meta-reasoning capability, which allows it to
00:19:59.200 | explore some paths in the tree very deeply and to abandon other paths very quickly.
00:20:06.960 | - So this word meta-reasoning, while technically correct, inspires perhaps the wrong degree of
00:20:16.640 | power that AlphaGo has, for example. The word reasoning is a powerful word. Let me ask you,
00:20:21.760 | you were part of the symbolic AI world for a while, like where AI was, there's a lot of excellent,
00:20:30.880 | interesting ideas there that unfortunately met a winter. So do you think it re-emerges?
00:20:40.000 | - Oh, so I would say, yeah, it's not quite as simple as that. So the AI winter,
00:20:46.800 | the first winter that was actually named as such was the one in the late 80s.
00:20:56.400 | And that came about because in the mid-80s there was a,
00:21:01.120 | really a concerted attempt to push AI out into the real world, using what was called expert
00:21:10.880 | system technology. And for the most part, that technology was just not ready for prime time.
00:21:18.560 | They were trying in many cases to do a form of uncertain reasoning, judgment, combinations of
00:21:27.200 | evidence, diagnosis, those kinds of things, which was simply invalid. And when you try to apply
00:21:34.640 | invalid reasoning methods to real problems, you can fudge it for small versions of the problem,
00:21:40.960 | but when it starts to get larger, the thing just falls apart. So many companies found that
00:21:48.080 | the stuff just didn't work and they were spending tons of money on consultants to
00:21:52.880 | try to make it work. And there were other practical reasons, like they were asking
00:21:59.600 | the companies to buy incredibly expensive Lisp machine workstations, which were literally between
00:22:08.400 | $50,000 and $100,000 in, you know, 1980s money, which would be like between
00:22:14.720 | $150,000 and $300,000 per workstation in current prices. So...
00:22:20.800 | And the bottom line, they weren't seeing a profit from it.
00:22:24.080 | Yeah. In many cases, I think there were some successes, there's no doubt about that, but
00:22:29.840 | people, I would say, over-invested. Every major company was starting an AI department, just like
00:22:37.760 | now. And I worry a bit that we might see similar disappointments, not because the current technology
00:22:46.800 | is invalid, but it's limited in its scope. And it's almost the dual of the, you know,
00:22:57.440 | the scope problems that expert systems had. So...
00:23:00.000 | What have you learned from that hype cycle? And what can we do to prevent another winter,
00:23:05.680 | for example?
00:23:07.120 | Yeah. So when I'm giving talks these days, that's one of the warnings that I give. So this is a
00:23:14.320 | two-part warning slide. One is that, you know, rather than data being the new oil, data is the
00:23:20.240 | new snake oil.
00:23:21.040 | That's a good line.
00:23:22.800 | And then the other is that we might see a kind of very visible failure in some of the major
00:23:34.080 | application areas. And I think self-driving cars would be the flagship. And I think when you look
00:23:43.680 | at the history... So the first self-driving car was on the freeway, driving itself, changing lanes,
00:23:52.320 | overtaking in 1987. And so it's more than 30 years. And that kind of looks like where we are today.
00:24:04.560 | You know, prototypes on the freeway, changing lanes and overtaking. Now,
00:24:08.880 | I think significant progress has been made, particularly on the perception side. So
00:24:14.960 | we worked a lot on autonomous vehicles in the early-mid-90s at Berkeley.
00:24:21.040 | And we had our own big demonstrations. You know, we put congressmen into self-driving cars and
00:24:29.040 | had them zooming along the freeway. And the problem was clearly perception.
00:24:36.000 | At the time, the problem was perception.
00:24:39.120 | Yeah. So in simulation, with perfect perception, you could actually show that you can
00:24:44.720 | drive safely for a long time, even if the other car is misbehaving and so on. But
00:24:49.040 | simultaneously, we worked on machine vision for detecting cars and tracking pedestrians and so on.
00:24:57.680 | And we couldn't get the reliability of detection and tracking up to a high enough level,
00:25:06.080 | particularly in bad weather conditions, nighttime rain, fog.
00:25:11.520 | Good enough for demos, but perhaps not good enough to cover the general operation.
00:25:15.920 | Yeah. See, the thing about driving is, you know, so suppose you're a taxi driver,
00:25:19.440 | you know, and you drive every day, eight hours a day for 10 years, right? That's
00:25:23.440 | a hundred million seconds of driving, you know. And any one of those seconds,
00:25:27.760 | you can make a fatal mistake. So you're talking about eight nines of reliability, right?
00:25:34.960 | Now, if your vision system only detects 98.3% of the vehicles, right? Then that's sort of,
00:25:43.280 | you know, one and a bit nines of reliability. So you have another seven orders of magnitude to go.
00:25:52.160 | And this is what people don't understand. They think, "Oh, because I had a successful demo,
00:25:56.480 | I'm pretty much done." But you're not even within seven orders of magnitude of being done.
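A quick back-of-envelope check of those figures, using the numbers quoted above (the arithmetic is mine):

```python
import math

seconds_of_driving = 8 * 3600 * 365 * 10           # 8 hours a day for 10 years: ~1.05e8 seconds
required_nines = math.log10(seconds_of_driving)    # ~8: roughly one tolerable fatal error per 10^8 seconds

detection_rate = 0.983                             # "only detects 98.3% of the vehicles"
achieved_nines = -math.log10(1 - detection_rate)   # ~1.8 "nines" of reliability

print(round(required_nines, 1),                    # ~8.0
      round(achieved_nines, 1),                    # ~1.8
      round(required_nines - achieved_nines, 1))   # a gap of six-plus orders of magnitude,
                                                   # the ballpark of the "seven" quoted above
```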
00:26:03.920 | And that's the difficulty. And it's not the, "Can I follow a white line?" That's not the problem,
00:26:12.320 | right? We follow a white line all the way across the country. But it's the weird stuff that happens.
00:26:20.800 | It's all the edge cases. Yeah.
00:26:22.160 | The edge case, other drivers doing weird things. So if you talk to Google, right? So
00:26:28.480 | they had actually a very classical architecture where, you know, you had machine vision,
00:26:35.360 | which would detect all the other cars and pedestrians and the white lines and the road
00:26:40.240 | signs. And then basically that was fed into a logical database. And then you had a classical
00:26:47.040 | 1970s rule-based expert system telling you, "Okay, if you're in the middle lane and there's a
00:26:54.480 | bicyclist in the right lane, who is signaling this, then you do that." Right? And what they
00:27:00.880 | found was that every day they'd go out and there'd be another situation that the rules didn't cover.
00:27:05.680 | You know, so they'd come to a traffic circle and there's a little girl riding her bicycle
00:27:10.160 | the wrong way around the traffic circle. Okay, what do you do? "We don't have a rule." "Oh my
00:27:13.440 | God. Okay, stop." And then, you know, they'd come back and add more rules and they just found that
00:27:19.280 | this was not really converging. And if you think about it, right, how do you deal with
00:27:26.080 | an unexpected situation? Meaning one that you've never previously encountered and the sort of
00:27:32.320 | the reasoning required to figure out the solution for that situation has never been done. It doesn't
00:27:39.280 | match any previous situation in terms of the kind of reasoning you have to do. Well, you know, in
00:27:45.840 | chess programs this happens all the time. You're constantly coming up with situations you haven't
00:27:51.200 | seen before and you have to reason about them and you have to think about, "Okay, here are the
00:27:56.480 | possible things I could do. Here are the outcomes. Here's how desirable the outcomes are." And then
00:28:01.600 | pick the right one. You know, in the 90s we were saying, "Okay, this is how you're going to have to
00:28:05.440 | do automated vehicles. They're going to have to have look-ahead capability." But the look-ahead
00:28:10.880 | for driving is more difficult than it is for chess. Because of humans. Right, there's humans and
00:28:16.960 | they're less predictable than chess pieces. Well, then you have an opponent in chess who's also
00:28:22.960 | somewhat unpredictable. But, for example, in chess you always know the opponent's intention. They're
00:28:29.920 | trying to beat you. Right? Whereas in driving you don't know, "Is this guy trying to turn left or has
00:28:36.320 | he just forgotten to turn off his turn signal or is he drunk or is he, you know, changing the channel
00:28:42.320 | on his radio or whatever it might be?" You've got to try and figure out the mental state, the intent
00:28:48.160 | of the other drivers to forecast the possible evolutions of their trajectories. And then you've
00:28:55.440 | got to figure out, "Okay, which is the trajectory for me that's going to be safest?" And those all
00:29:01.200 | interact with each other because the other driver is going to react to your trajectory and so on.
00:29:07.040 | So, you know, you've got the classic merging onto the freeway problem where you're kind of
00:29:11.760 | racing a vehicle that's already on the freeway and you either pull ahead of them or you're going
00:29:16.080 | to let them go first and pull in behind. And you get this sort of uncertainty about who's going
00:29:20.800 | first. So all those kinds of things mean that you need a decision-making architecture that's
00:29:31.120 | very different from either a rule-based system or, it seems to me, a kind of an end-to-end neural
00:29:38.000 | network system. You know, so just as AlphaGo is pretty good when it doesn't do any look ahead,
00:29:44.320 | but it's way, way, way, way better when it does, I think the same is going to be true for driving.
00:29:50.640 | You can have a driving system that's pretty good when it doesn't do any look ahead, but that's not
00:29:56.400 | good enough. You know, and we've already seen multiple deaths caused by poorly designed machine
00:30:05.040 | learning algorithms that don't really understand what they're doing. Yeah, on several levels. I
00:30:11.280 | think on the perception side, there's mistakes being made by those algorithms where the perception
00:30:17.760 | is very shallow. On the planning side, the look ahead, like you said. And the thing that we come
00:30:24.640 | up against that's really interesting when you try to deploy systems in the real world is you can't
00:30:33.600 | think of an artificial intelligence system as a thing that responds to the world always. You have
00:30:38.640 | to realize that it's an agent that others will respond to as well. So in order to drive successfully,
00:30:44.400 | you can't just try to do obstacle avoidance. Right, you can't pretend that you're invisible.
00:30:49.360 | Right? You're the invisible car. Right. It doesn't work that way. I mean, but you have to assert,
00:30:55.200 | yet others have to be scared of you. There's this tension. There's this game. So we study a lot of
00:31:02.640 | work with pedestrians. If you approach pedestrians as purely an obstacle avoidance, so you're doing
00:31:09.040 | look ahead as in modeling the intent, they're not going to take advantage of you. They're not going
00:31:15.520 | to respect you at all. There has to be a tension, a fear, some amount of uncertainty. That's how
00:31:21.520 | we have created... Or at least just a kind of a resoluteness. Right, yes. You have to display
00:31:29.040 | a certain amount of resoluteness. You can't be too tentative. And yeah, so the solutions then
00:31:38.800 | become pretty complicated. Right? You get into game theoretic analyses. And so we're, you know,
00:31:45.360 | at Berkeley now we're working a lot on this kind of interaction between machines and humans.
00:31:51.440 | And that's exciting. Yep.
00:31:53.440 | And so my colleague, Anca Dragan, actually, you know, if you formulate the problem game
00:32:03.840 | theoretically, and you just let the system figure out the solution, you know, it does
00:32:08.320 | interesting, unexpected things. Like sometimes at a stop sign, if no one is going first,
00:32:14.400 | right, the car will actually back up a little, right? Just to indicate to the other cars that
00:32:21.200 | they should go. And that's something it invented entirely by itself.
00:32:25.120 | That's interesting.
00:32:25.840 | Right? There was, you know, we didn't say this is the language of communication at stop signs,
00:32:29.920 | it figured it out.
00:32:32.240 | That's really interesting. So let me one just step back for a second. Just this beautiful
00:32:38.960 | philosophical notion. So Pamela McCorduck in 1979 wrote, "AI began with the ancient wish
00:32:46.880 | to forge the gods." So when you think about the history of our civilization,
00:32:51.840 | do you think that there is an inherent desire to create, let's not say gods,
00:33:00.320 | but to create super intelligence? Is it inherent to us? Is it in our genes, that the natural arc of
00:33:07.280 | human civilization is to create things that are of greater and greater power, and perhaps
00:33:14.960 | echoes of ourselves? So to create the gods, as Pamela said?
00:33:25.200 | It may be. I mean, you know, we're all individuals. But certainly we see over and
00:33:32.720 | over again in history, individuals who thought about this possibility.
00:33:38.720 | Hopefully, I'm not being too philosophical here. But if you look at the arc of this,
00:33:45.600 | you know, where this is going, and we'll talk about AI safety, we'll talk about greater and
00:33:49.520 | greater intelligence. Do you see that there, when you created the Othello program, and you felt this
00:33:57.280 | excitement, what was that excitement? Was it excitement of a tinkerer who created something
00:34:02.640 | cool, like a clock? Or was there a magic, or was it more like a child being born?
00:34:10.480 | Yeah, so I mean, I certainly understand that viewpoint. And if you look at
00:34:17.120 | the Lighthill report, which was... So in the 70s, there was a lot of controversy in the UK about
00:34:24.400 | AI, and you know, whether it was for real, and how much money the government should invest. And
00:34:30.480 | so it was a long story. But the government commissioned a report by
00:34:36.320 | Lighthill, who was a physicist, and he wrote a very damning report about AI, which I think was
00:34:46.160 | the point. And he said that these are, you know, frustrated men who, unable to have children,
00:34:55.760 | would like to, you know, create life as a kind of replacement,
00:35:02.960 | which I think is really pretty unfair.
00:35:13.120 | But there is, I mean, there is a kind of magic, I would say, when you build something,
00:35:21.600 | and what you're building in is really just, you're building in some understanding of the principles
00:35:30.400 | of learning and decision making. And to see those principles actually then turn into
00:35:38.800 | intelligent behavior in specific situations, it's an incredible thing. And, you know, that
00:35:50.240 | is naturally going to make you think, okay, where does this end?
00:35:58.480 | And so there's magical, optimistic views of where it ends. Whatever your view of optimism is,
00:36:08.320 | whatever your view of utopia is, is probably different for everybody. But you've often talked
00:36:13.360 | about concerns you have of how things may go wrong. So I've talked to Max Tegmark,
00:36:24.160 | there's a lot of interesting ways to think about AI safety. You're one of the
00:36:30.640 | seminal people thinking about this problem amongst sort of being in the weeds of actually
00:36:36.880 | solving specific AI problems. You're also thinking about the big picture of where we're going.
00:36:42.160 | So can you talk about several elements of it? Let's just talk about maybe the control problem.
00:36:47.840 | So this idea of losing ability to control the behavior in our AI system. So how do you see that?
00:36:58.240 | How do you see that coming about? What do you think we can do to manage it?
00:37:05.520 | Well, so it doesn't take a genius to realize that if you make something that's smarter than you,
00:37:11.520 | you might have a problem. You know, Alan Turing wrote about this and gave lectures about this,
00:37:20.480 | you know, in I think 1951. He did a lecture on the radio. And he basically says, you know,
00:37:30.160 | once the machine thinking method starts, you know, very quickly they'll outstrip humanity.
00:37:37.280 | And, you know, if we're lucky, we might be able to, I think he says,
00:37:42.400 | if we may be able to turn off the power at strategic moments, but even so our species
00:37:49.040 | would be humbled. Yeah, you actually, he was wrong about that, right? Because you know,
00:37:54.560 | if it's a sufficiently intelligent machine, it's not going to let you switch it off.
00:37:58.560 | It's actually in competition with you.
00:38:00.240 | So, just for a quick tangent, what do you think he meant by that? That if we shut off this
00:38:05.840 | super intelligent machine, our species would be humbled.
00:38:10.880 | I think he means that we would realize that we are inferior, right? That we only survive by the skin
00:38:20.560 | of our teeth because we happen to get to the off switch, you know, just in time, you know, and if
00:38:26.800 | we hadn't, then we would have lost control over the earth.
00:38:30.240 | So do you, are you more worried when you think about this stuff about super intelligent AI,
00:38:36.800 | or are you more worried about super powerful AI that's not aligned with our values? So the paper
00:38:43.600 | clip scenarios kind of.
00:38:47.120 | I think, so the main problem I'm working on is the control problem, the problem of
00:38:55.280 | machines pursuing objectives that are, as you say, not aligned with human objectives. And
00:39:01.520 | this has been the way we've thought about AI since the beginning.
00:39:07.520 | You build a machine for optimizing and then you put in some objective and it optimizes,
00:39:16.080 | right. And, you know, we can think of this as the King Midas problem, right?
00:39:23.920 | Because if, you know, so King Midas put in this objective, right, everything I touch should turn
00:39:29.440 | to gold and the gods, you know, that's like the machine, they said, okay, done. You know,
00:39:34.720 | you now have this power and of course his food and his drink and his family all turned to gold.
00:39:40.080 | And then he dies in misery and starvation. And this is, you know, it's a warning. It's
00:39:48.240 | a failure mode that pretty much every culture in history has had some story along
00:39:54.560 | the same lines. You know, there's the genie that gives you three wishes. And, you know, the
00:39:58.640 | third wish is always, you know, please undo the first two wishes because I messed up.
00:40:02.800 | And, you know, when Arthur Samuel wrote his checker-playing program, which learned
00:40:11.520 | to play checkers considerably better than Arthur Samuel could play and actually reached a pretty
00:40:16.400 | decent standard. Norbert Wiener, who was one of the major mathematicians of the 20th century,
00:40:24.720 | sort of the father of modern automation control systems. You know, he saw this and he basically
00:40:31.680 | extrapolated, you know, as Turing did and said, okay, this is how we could lose control.
00:40:39.840 | And specifically that we have to be certain that the purpose we put into the machine is
00:40:49.600 | the purpose which we really desire. And the problem is we can't do that.
00:40:56.320 | Right. You mean that it's very difficult to encode, that to put our values on
00:41:02.160 | paper is really difficult, or are you just saying it's impossible?
00:41:05.920 | Theoretically it's possible, but in practice, it's extremely unlikely
00:41:17.120 | that we could specify correctly in advance the full range of concerns of humanity.
00:41:24.400 | Yeah. You've talked about cultural transmission of values. I think that's how human-to-human
00:41:29.360 | transmission of values happens. Right.
00:41:31.840 | Well, we learn, yeah. I mean, as we grow up, we learn about the values that matter,
00:41:38.400 | how things should go, what is reasonable to pursue and what isn't reasonable
00:41:43.040 | to pursue.
00:41:44.080 | I think machines can learn in the same kind of way.
00:41:46.640 | Yeah. So I think that what we need to do is to get away from this idea that you build
00:41:53.040 | an optimizing machine and then you put the objective into it. Because if it's possible
00:42:00.800 | that you might put in a wrong objective, and we already know this is possible because it's
00:42:04.480 | happened lots of times, right? That means that the machine should never take an objective
00:42:12.000 | that's given as gospel truth. Because once it takes the objective as gospel truth,
00:42:18.960 | right? Then it believes that whatever actions it's taking in pursuit of that objective are
00:42:27.040 | the correct things to do. So you could be jumping up and down and saying, you know,
00:42:30.720 | no, no, no, you're going to destroy the world. But the machine knows what the true objective
00:42:35.600 | is and is pursuing it and tough luck to you. And this is not restricted to AI, right? This is,
00:42:42.080 | you know, I think many of the 20th century technologies, right? So in statistics,
00:42:46.800 | you minimize a loss function. The loss function is exogenously specified. In control
00:42:51.840 | theory, you minimize a cost function. In operations research, you maximize a reward function,
00:42:57.920 | and so on. So in all these disciplines, this is how we conceive of the problem. And it's the wrong
00:43:04.400 | problem. Because we cannot specify with certainty the correct objective, right? We need
00:43:12.560 | uncertainty, we need the machine to be uncertain about what it is that it's supposed to be
00:43:18.720 | maximizing. - Favorite idea of yours, I've heard you say somewhere, well, I shouldn't pick favorites,
00:43:25.200 | but it just sounds beautiful: we need to teach machines humility.
00:43:29.040 | - Yeah, I mean, - It's a beautiful way to put it. I love it.
00:43:33.680 | - That they're humble, in that they know that they don't know what it is they're supposed to be
00:43:40.560 | doing. And that those objectives, I mean, they exist, they're within us, but we may not be able
00:43:48.560 | to explicate them. We may not even know how we want our future to go. - Exactly.
00:43:58.160 | - And the machine, a machine that's uncertain is going to be deferential to us. So if we say,
00:44:06.800 | don't do that, well, now the machine learns something a bit more about our true objectives,
00:44:11.840 | because something that it thought was reasonable in pursuit of our objective, it turns out not to
00:44:17.200 | be, so now it's learned something. So it's going to defer because it wants to be doing what we
00:44:22.480 | really want. And that point I think is absolutely central to solving the control problem. And it's
00:44:32.800 | a different kind of AI when you take away this idea that the objective is known, then, in fact,
00:44:42.160 | a lot of the theoretical frameworks that we're so familiar with, you know, Markov decision processes,
00:44:49.840 | goal-based planning, you know, standard games research, all of these techniques actually become
00:44:58.560 | inapplicable. And you get a more complicated problem because now the interaction with the
00:45:09.920 | human becomes part of the problem. Because the human, by making choices, is giving you more
00:45:19.520 | information about the true objective, and that information helps you achieve the objective
00:45:25.040 | better. And so that really means that you're mostly dealing with game theoretic problems,
00:45:31.520 | where you've got the machine and the human and they're coupled together,
00:45:35.840 | rather than a machine going off by itself with a fixed objective.
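Here is a toy numerical sketch of why uncertainty produces deference (my own simplification, in the spirit of the off-switch-game style of analysis, not a model quoted from the conversation): if the machine is unsure whether its proposed action is actually good for us, letting a human veto it can only help in expectation.

```python
import random

random.seed(0)

# The machine's belief about the true utility U of its proposed action: probably
# good, but not certainly. The mean and spread here are arbitrary assumptions.
def sample_utility():
    return random.gauss(0.3, 1.0)

N = 100_000
samples = [sample_utility() for _ in range(N)]

# Option 1: just act. Expected utility is E[U].
act_directly = sum(samples) / N

# Option 2: propose the action and let an (idealized, rational) human veto it
# whenever U < 0. Expected utility is E[max(U, 0)], which is never smaller.
defer_to_human = sum(max(u, 0.0) for u in samples) / N

print(round(act_directly, 2), round(defer_to_human, 2))   # roughly 0.30 vs 0.57
```

The gap between the two numbers shrinks as the machine's uncertainty about the objective shrinks, which is exactly why a machine that is certain of its objective has no incentive to defer or to let itself be switched off.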
00:45:39.040 | Which is fascinating on the machine and the human level that we, when you don't have an objective,
00:45:47.360 | means you're together coming up with an objective. I mean, there's a lot of philosophy that, you
00:45:53.440 | know, you could argue that life doesn't really have meaning. We together agree on what gives
00:45:59.120 | it meaning, and we kind of culturally create things that give a reason why the heck we are on this
00:46:05.600 | earth anyway. We together as a society create that meaning, and you have to learn that objective.
00:46:10.800 | And one of the biggest, I thought that's where you were going to go for a second,
00:46:14.880 | one of the biggest troubles we run into outside of statistics and machine learning and AI,
00:46:21.200 | in just human civilization, is when you look at, I came from, I was born in the Soviet Union,
00:46:28.080 | and the history of the 20th century, we ran into the most trouble, us humans, when there was a
00:46:34.240 | certainty about the objective. And you do whatever it takes to achieve that objective,
00:46:40.800 | whether you're talking about Germany or communist Russia. You get into trouble with humans.
00:46:46.720 | I would say with, you know, corporations, in fact, some people argue that, you know,
00:46:52.400 | we don't have to look forward to a time when AI systems take over the world. They already have,
00:46:57.200 | and they call it corporations, right? That corporations happen to be using people as
00:47:03.760 | components right now, but they are effectively algorithmic machines, and they're optimizing
00:47:10.160 | an objective, which is quarterly profit that isn't aligned with overall well-being of the human race,
00:47:17.520 | and they are destroying the world. They are primarily responsible for our inability to tackle
00:47:23.440 | climate change. So I think that's one way of thinking about what's going on with corporations.
00:47:30.240 | But I think the point you're making is valid, that there are many systems in the real world
00:47:39.360 | where we've sort of prematurely fixed on the objective and then decoupled the machine from
00:47:47.920 | those that it's supposed to be serving. And I think you see this with government, right? Government is
00:47:54.400 | supposed to be a machine that serves people, but instead it tends to be taken over by people who
00:48:02.320 | have their own objective and use government to optimize that objective regardless of what people
00:48:07.840 | want. - Do you find appealing the idea of almost arguing machines, where you have multiple AI
00:48:15.440 | systems with a clear fixed objective? We have in government the red team and the blue team,
00:48:20.960 | they're very fixed on their objectives, and they argue and they kind of,
00:48:24.560 | I may disagree, but it kind of seems to make it work somewhat, that the duality of it.
00:48:34.720 | Okay, let's go 100 years back, when that was still going on, or to the founding of this country,
00:48:41.680 | there were disagreements, and that disagreement is where, so there's a balance between certainty
00:48:48.640 | and forced humility because the power was distributed. - Yeah, I think that the
00:48:55.360 | nature of debate and disagreement argument takes as a premise the idea that you could be wrong,
00:49:07.200 | right? Which means that you're not necessarily absolutely convinced that your objective is
00:49:13.920 | the correct one, right? If you were absolutely convinced, there'd be no point in having any
00:49:20.000 | discussion or argument because you would never change your mind, and there wouldn't be any
00:49:24.480 | sort of synthesis or anything like that. So I think you can think of argumentation as an
00:49:32.000 | implementation of a form of uncertain reasoning. And I've been reading recently about
00:49:42.800 | utilitarianism and the history of efforts to define, in a sort of clear mathematical way,
00:49:50.640 | I feel like a formula for moral or political decision making. And it's really interesting
00:50:00.240 | that the parallels between the philosophical discussions going back 200 years and what you
00:50:06.640 | see now in discussions about existential risk, because it's almost exactly the same. So someone
00:50:14.320 | would say, "Okay, well here's a formula for how we should make decisions." So utilitarianism is
00:50:19.040 | roughly, each person has a utility function and then we make decisions to maximize the sum of
00:50:26.560 | everybody's utility. And then people point out, "Well, in that case, the best policy is one that
00:50:35.360 | leads to the enormously vast population, all of whom are living a life that's barely worth living."
00:50:42.480 | And this is called the repugnant conclusion. And another version is that we should maximize
00:50:51.200 | pleasure. And that's what we mean by utility. And then you'll get people effectively saying,
00:50:57.680 | "Well, in that case, we might as well just have everyone hooked up to a heroin drip."
00:51:01.600 | And they didn't use those words, but that debate was happening in the 19th century,
00:51:07.840 | as it is now about AI. That if we get the formula wrong, we're going to have AI systems
00:51:16.320 | working towards an outcome that in retrospect would be exactly wrong.
00:51:21.920 | - Do you think there's, as beautifully put, so the echoes are there, but do you think,
00:51:26.960 | I mean, if you look at Sam Harris, our imagination worries about the AI version of that, because of
00:51:35.040 | the speed at which the things going wrong in the utilitarian context could happen.
00:51:44.640 | - Yeah.
00:51:45.360 | - Is that a worry for you?
00:51:47.280 | - Yeah, I think that in most cases, not in all, but if we have a wrong political idea,
00:51:55.360 | we see it starting to go wrong and we're not completely stupid. And so we sort of, "Okay,
00:52:01.200 | maybe that was a mistake. Let's try something different."
00:52:05.840 | And also we're very slow and inefficient about implementing these things and so on. So you have
00:52:12.640 | to worry when you have corporations or political systems that are extremely efficient. But when we
00:52:19.680 | look at AI systems, or even just computers in general, they have this different characteristic
00:52:28.320 | from ordinary human activity in the past. So let's say you were a surgeon, you had some idea about
00:52:35.200 | how to do some operation. And let's say you were wrong, that that way of doing the operation
00:52:41.600 | would mostly kill the patient. Well, you'd find out pretty quickly, like after three,
00:52:48.000 | maybe three or four tries, right? But that isn't true for pharmaceutical companies,
00:52:56.400 | because they don't do three or four operations. They manufacture three or four billion pills,
00:53:03.840 | and they sell them. And then they find out maybe six months or a year later that,
00:53:08.080 | "Oh, people are dying of heart attacks or getting cancer from this drug."
00:53:12.080 | And so that's why we have the FDA, right? Because of the scalability of pharmaceutical production.
00:53:19.280 | And there have been some unbelievably bad episodes in the history of pharmaceuticals and
00:53:30.400 | adulteration of products and so on that have killed tens of thousands or paralyzed hundreds
00:53:36.640 | of thousands of people. Now, with computers, we have that same scalability problem, that you can
00:53:43.760 | sit there and type "for i equals one to five billion do," right? And all of a sudden you're
00:53:49.760 | having an impact on a global scale. And yet we have no FDA, right? There's absolutely no controls
00:53:56.160 | at all over what a bunch of undergraduates with too much caffeine can do to the world.
00:54:03.440 | And we look at what happened with Facebook, well, social media in general, and click-through
00:54:10.160 | optimization. So you have a simple feedback algorithm that's trying to just optimize
00:54:18.480 | click-through, right? That sounds reasonable, right? Because you don't want to be feeding
00:54:23.840 | people ads that they don't care about or not interested in. And you might even think of
00:54:31.920 | that process as simply adjusting the feeding of ads or news articles or whatever it might be
00:54:40.560 | to match people's preferences, right? Which sounds like a good idea.
00:54:45.440 | But in fact, that isn't how the algorithm works, right? You make more money, the algorithm makes
00:54:54.880 | more money if it can better predict what people are going to click on, because then it can feed
00:55:02.080 | them exactly that, right? So the way to maximize click-through is actually to modify the people,
00:55:09.760 | to make them more predictable. And one way to do that is to feed them information which will change
00:55:18.560 | their behavior and preferences towards extremes that make them predictable. Whatever is the
00:55:25.680 | nearest extreme or the nearest predictable point, that's where you're going to end up.
00:55:30.640 | And the machines will force you there. Now, and I think there's a reasonable argument to say
00:55:38.080 | that this, among other things, is contributing to the destruction of democracy in the world.
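A deliberately crude toy simulation of that feedback loop (all of the modeling choices are mine and not how any real platform is built: a one-dimensional opinion spectrum, an engagement score that rewards both familiarity and extremity, and a user whose views drift toward whatever they are shown):

```python
# Items and user views live on an opinion spectrum in [-1, 1].
def engagement(user_view, item):
    closeness = 1.0 - 0.5 * abs(user_view - item)   # familiar content engages
    extremity_bonus = 0.6 * abs(item)               # assumed: extreme content engages more
    return closeness + extremity_bonus

user_view = 0.1                                     # the user starts near the middle
items = [-1.0, -0.5, 0.0, 0.5, 1.0]

for _ in range(100):
    # Greedy choice: show whatever is predicted to be engaged with the most.
    shown = max(items, key=lambda it: engagement(user_view, it))
    # Feedback: the user's view shifts a little toward the content they consume.
    user_view += 0.05 * (shown - user_view)

print(round(user_view, 2))   # ~0.99: the user has been pulled out to the nearest extreme
```

Even in this tiny model, the greedy engagement maximizer never asks whether shifting the user is desirable; it simply ends up at the point where the user is most predictable.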
00:55:47.280 | And where was the oversight of this process? Where were the people saying, "Okay, you would
00:55:53.680 | like to apply this algorithm to five billion people on the face of the earth. Can you show
00:55:59.760 | me that it's safe? Can you show me that it won't have various kinds of negative effects?" No,
00:56:05.200 | there was no one asking that question. There was no one placed between, you know, the undergrads
00:56:12.400 | with too much caffeine and the human race. They just did it. - And this is way outside the scope
00:56:19.920 | of my knowledge, but some economists would argue that the, what is it, the invisible hand,
00:56:24.400 | so the capitalist system, it was the oversight. So if you're going to corrupt society with whatever
00:56:31.040 | decision you make as a company, then that's going to be reflected in people not using your product.
00:56:37.040 | That's one model of oversight. - We shall see, but in the meantime, you know, but you might even
00:56:43.760 | have broken the political system that enables capitalism to function. - Well, you've changed it.
00:56:51.440 | - We shall see. - Change is often painful. So my question is absolutely, it's fascinating.
00:57:00.000 | You're absolutely right that there was zero oversight on algorithms that can have a profound
00:57:06.320 | civilization-changing effect. So do you think it's possible, I mean, I haven't,
00:57:13.040 | have you seen government? So do you think it's possible to create regulatory bodies,
00:57:19.840 | oversight over AI algorithms, which are inherently such cutting-edge set of ideas and technologies?
00:57:30.960 | - Yeah, but I think it takes time to figure out what kind of oversight, what kinds of controls.
00:57:37.520 | I mean, it took time to design the FDA regime, you know, and some people still don't like it,
00:57:42.560 | and they want to fix it. And I think there are clear ways that it could be improved.
00:57:47.920 | But the whole notion that you have stage one, stage two, stage three, and here are the criteria for
00:57:54.480 | what you have to do to pass a stage one trial, right? - Yes. - We haven't even thought about
00:58:00.320 | what those would be for algorithms. So, I mean, I think there are things we could do right now
00:58:06.880 | with regard to bias, for example. We have a pretty good technical handle on
00:58:13.760 | how to detect algorithms that are propagating bias that exists in datasets, how to de-bias those
00:58:22.800 | algorithms, and even what it's going to cost you to do that. So I think we could start having some
00:58:30.640 | standards on that. I think there are things to do with impersonation and falsification
00:58:38.080 | that we could work on. - Like fakes, yeah. - A very simple point. So impersonation is a machine
00:58:48.800 | acting as if it was a person. I can't see a real justification
00:58:53.840 | for why we shouldn't insist that machines self-identify as machines.
00:58:59.280 | You know, where is the social benefit in fooling people into thinking that this is really a person
00:59:08.480 | when it isn't? You know, I don't mind if it uses a human-like voice that's easy to understand,
00:59:14.560 | that's fine, but it should just say, "I'm a machine" in some form. - And not many people
00:59:21.120 | are speaking to that, I would think, relatively obvious fact. So I think most people... - Yeah,
00:59:26.160 | I mean, there is actually a law in California that bans impersonation, but only in certain
00:59:32.320 | restricted circumstances. So for the purpose of engaging in a fraudulent transaction,
00:59:40.800 | and for the purpose of modifying someone's voting behavior. So those are the circumstances where
00:59:48.160 | machines have to self-identify. But I think, you know, arguably it should be in all circumstances.
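(To make the earlier de-biasing point concrete: a minimal sketch of what a measurable standard could look like, assuming two illustrative choices on my part, a demographic-parity gap as the bias measure and Kamiran-Calders-style reweighing as the mitigation; Russell does not name a specific technique in the conversation.)

```python
# Two measurable quantities behind "detect bias" and "de-bias, and know what it costs":
# (1) how much positive-decision rates differ across groups, and
# (2) reweighting examples so that group membership and label become independent.

from collections import Counter

def demographic_parity_gap(groups, decisions):
    """Difference in positive-decision rates between groups (0 = parity)."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = sum(decisions[i] for i in idx) / len(idx)
    values = sorted(rates.values())
    return values[-1] - values[0]

def reweighing_weights(groups, labels):
    """Kamiran-Calders style weights: w(g, y) = P(g) * P(y) / P(g, y).
    Training on the reweighted data makes group and label statistically
    independent, usually at some cost in raw accuracy -- one way to quantify
    'what it's going to cost you' to de-bias."""
    n = len(groups)
    count_g = Counter(groups)
    count_y = Counter(labels)
    count_gy = Counter(zip(groups, labels))
    return [
        (count_g[g] / n) * (count_y[y] / n) / (count_gy[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Toy data: group A receives positive outcomes far more often than group B.
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]
decisions = [ 1,   1,   1,   0,   1,   0,   0,   0 ]

print("parity gap:", demographic_parity_gap(groups, decisions))  # 0.5
print("weights   :", reweighing_weights(groups, decisions))      # up-weights (A,0) and (B,1)
```

The point of a sketch like this is that both the measurement and the cost of mitigation are things a standards body could, in principle, specify and audit.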
00:59:55.440 | And then when you talk about deepfakes, you know, we're just at the beginning,
01:00:01.520 | but already it's possible to make a movie of anybody saying anything in ways that are pretty
01:00:09.680 | hard to detect. - Including yourself, because you're on camera now, and your voice is coming
01:00:14.480 | through with high resolution. - Yeah, so you could take what I'm saying and replace it with
01:00:18.720 | pretty much anything else you wanted me to be saying. And even it would change my lips and
01:00:23.120 | facial expressions to fit. And there's actually not much in the way of real legal protection
01:00:34.640 | against that. I think in the commercial area you could say, "Yeah, you're using my brand,"
01:00:41.120 | and so on. There are rules about that. But in the political sphere, I think it's...
01:00:46.160 | At the moment, it's, you know, anything goes. So that could be really, really damaging.
01:00:52.480 | - And let me just try to make, not an argument, but try to look back at history
01:01:00.720 | and say something dark. In essence, while regulation seems to be, oversight seems to be
01:01:09.040 | exactly the right thing to do here, it seems that human beings, what they naturally do is they wait
01:01:14.480 | for something to go wrong. If you're talking about nuclear weapons, you can't talk about
01:01:20.080 | nuclear weapons being dangerous until somebody actually, like the United States, drops the bomb.
01:01:25.920 | Or Chernobyl melting. Do you think we will have to wait for things going wrong in a way that's
01:01:34.960 | obviously damaging to society? Not an existential risk, but obviously damaging.
01:01:39.840 | Or do you have faith that... - I hope not. But I mean, I think we do have to look at history.
01:01:48.000 | And, you know, so the two examples you gave, nuclear weapons and nuclear power,
01:01:55.840 | are very, very interesting because, you know, in nuclear weapons, we knew in the early years of the
01:02:04.800 | 20th century that atoms contained a huge amount of energy, right? We had E equals mc squared,
01:02:09.920 | we knew the mass differences between the different atoms and their components, and we knew that
01:02:15.360 | you might be able to make an incredibly powerful explosive. So H.G. Wells wrote a science fiction
01:02:23.440 | book, I think in 1912. Frederick Soddy, who was the guy who discovered isotopes,
01:02:30.320 | is a Nobel Prize winner. He gave a speech in 1915 saying that, you know, one pound of this
01:02:38.720 | new explosive would be the equivalent of 150 tons of dynamite, which turns out to be about right.
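(A rough check of Soddy's figure, using standard physical constants; the comparison with Little Boy is my own illustration, not something stated in the conversation.)

```latex
% Full fission of one pound (0.454 kg) of U-235, at about 200 MeV per fission:
E \;\approx\; \frac{0.454}{0.235}\times 6.0\times 10^{23}\times 200\,\text{MeV}\times 1.6\times 10^{-13}\,\tfrac{\text{J}}{\text{MeV}}
  \;\approx\; 3.7\times 10^{13}\,\text{J}
  \;\approx\; 9{,}000\ \text{tons of TNT} \quad (1\ \text{ton TNT}=4.184\times 10^{9}\,\text{J}).
```

Real weapons fission only a few percent of their material: Little Boy yielded roughly 15 kilotons from about 64 kg (140 lb) of uranium, on the order of 100 tons of TNT per pound of material, so 150 tons per pound is indeed about right for a practical weapon.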
01:02:44.160 | And, you know, this was in World War I, right? So he was imagining how much worse the world
01:02:52.480 | war would be if we were using that kind of explosive. But the physics establishment simply
01:02:58.000 | refused to believe that these things could be made. - Including the people who were making it.
01:03:05.760 | - Well, so they were doing the nuclear physics... - I mean, eventually they were the ones who made it.
01:03:11.200 | You're talking about Fermi or whoever. - Well, so up to... The development
01:03:19.760 | was mostly theoretical. So it was people using sort of primitive kinds of particle acceleration
01:03:24.240 | and doing experiments at the level of single particles or collections of particles. They
01:03:31.040 | weren't yet thinking about how to actually make a bomb or anything like that. But they knew the
01:03:38.640 | energy was there, and they figured if they understood it better, it might be possible.
01:03:42.960 | But the physics establishment, their view, and I think because they did not want it to be true,
01:03:49.520 | their view was that it could not be true. That this could not provide a way to make a super weapon.
01:03:57.040 | And, you know, there was this famous speech given by Rutherford, who was the sort of leader of
01:04:04.880 | nuclear physics. And it was on September 11th, 1933. And he said, you know, anyone who talks
01:04:12.800 | about the possibility of obtaining energy from transformation of atoms is talking complete
01:04:18.720 | moonshine. And the next morning, Leo Szilard read about that speech and then invented the
01:04:27.360 | nuclear chain reaction. And so as soon as he invented, as soon as he had that idea that you
01:04:34.320 | could make a chain reaction with neutrons, because neutrons were not repelled by the nucleus, so they
01:04:39.280 | could enter the nucleus and then continue the reaction. As soon as he has that idea, he instantly
01:04:46.400 | realized that the world was in deep doo-doo. Because this is 1933, right? Hitler had recently
01:04:56.400 | come to power in Germany. Szilard was in London, and eventually became a refugee,
01:05:02.720 | and came to the US. And in the process of having the idea about the chain reaction,
01:05:12.480 | he figured out basically how to make a bomb and also how to make a reactor.
01:05:16.720 | And he patented the reactor in 1934. But because of the situation, the great power conflict
01:05:26.560 | situation that he could see happening, he kept that a secret. And so between then
01:05:35.680 | and the beginning of World War II, people were working, including the Germans, on
01:05:42.320 | how to actually create neutron sources, right? What specific fission reactions would produce
01:05:51.840 | neutrons of the right energy to continue the reaction. And that was demonstrated in Germany,
01:05:59.200 | I think, in 1938, if I remember correctly. The first nuclear weapon patent was 1939,
01:06:06.640 | by the French. So this was actually going on well before World War II really got going.
01:06:18.720 | And then the British probably had the most advanced capability in this area, but for safety
01:06:24.880 | reasons, among others, and just sort of resources, they moved the program from Britain
01:06:30.800 | to the US, and then that became the Manhattan Project. So the reason why we couldn't
01:06:37.920 | have any kind of oversight of nuclear weapons and nuclear technology
01:06:44.960 | was because we were basically already in an arms race and a war.
01:06:52.320 | - But you mentioned then in the 20s and 30s, so what are the echoes?
01:06:57.520 | The way you've described this story, I mean, there's clearly echoes. Why do you think most
01:07:03.920 | AI researchers, folks who are really close to the metal, they really are not concerned about it,
01:07:11.440 | they don't think about it, whether it's they don't want to think about it. But what are the,
01:07:16.880 | yeah, why do you think that is? What are the echoes of the nuclear situation to the current
01:07:23.440 | AI situation? And what can we do about it? - I think there is a kind of motivated cognition,
01:07:32.720 | which is a term in psychology that means you believe what you would like to be true,
01:07:39.200 | rather than what is true. And it's unsettling to think that what you're working on might
01:07:49.120 | be the end of the human race, obviously. So you would rather instantly deny it,
01:07:55.520 | come up with some reason why it couldn't be true. And I have collected a long list of reasons that
01:08:04.320 | extremely intelligent, competent AI scientists have come up with for why we shouldn't worry
01:08:10.640 | about this. For example, calculators are superhuman at arithmetic and they haven't
01:08:17.600 | taken over the world, so there's nothing to worry about. Well, okay, my five-year-old could have
01:08:23.440 | figured out why that was an unreasonable and really quite weak argument. Another one was
01:08:33.280 | that while it's theoretically possible that you could have superhuman AI destroy the world,
01:08:42.720 | it's also theoretically possible that a black hole could materialize right next to the Earth and
01:08:48.080 | destroy humanity. I mean, yes, it's theoretically possible, quantum theoretically extremely unlikely
01:08:54.720 | that it would just materialize right there. But that's a completely bogus analogy, because
01:09:01.680 | if the whole physics community on Earth was working to materialize a black hole in near-Earth orbit,
01:09:07.360 | wouldn't you ask them, "Is that a good idea? Is that going to be safe? What if you succeed?"
01:09:13.760 | And that's the thing. The AI community has sort of refused to ask itself, "What if you succeed?"
01:09:22.080 | And initially I think that was because it was too hard, but Alan Turing asked himself that,
01:09:30.560 | and he said, "We'd be toast." Right? If we were lucky, we might be able to switch off the power,
01:09:36.800 | but probably we'd be toast. But there's also an aspect that because we're not exactly
01:09:44.240 | sure what the future holds, it's not clear exactly, so technically, what to worry about,
01:09:50.640 | sort of how things go wrong. And so there is something, it feels like, maybe you can correct
01:09:59.040 | me if I'm wrong, but there's something paralyzing about worrying about something that logically is
01:10:06.000 | inevitable, but you don't really know what that will look like. - Yeah, I think that's a reasonable
01:10:13.360 | point. And you know, it's certainly in terms of existential risks, it's different from, you know,
01:10:21.120 | the asteroid collides with the Earth, right? Which again is quite possible. You know, it's happened
01:10:27.120 | in the past, it'll probably happen again, we don't know right now, but if we did detect an asteroid
01:10:33.760 | that was going to hit the Earth in 75 years time, we'd certainly be doing something about it.
01:10:39.600 | - Well, it's clear there's a big rock, and we'll probably have a meeting and see what do we do
01:10:44.000 | about the big rock with AI. - Right, with AI, I mean, there are very few people who think it's
01:10:49.120 | not gonna happen within the next 75 years. I know Rod Brooks doesn't think it's gonna happen,
01:10:55.200 | maybe Andrew Ng doesn't think it's gonna happen, but you know, a lot of the people who work day-to-day,
01:11:00.880 | you know, as you say, at the rock face, they think it's gonna happen. I think the median
01:11:07.280 | estimate from AI researchers is somewhere in 40 to 50 years from now, or maybe even a little,
01:11:14.080 | you know, I think in Asia they think it's gonna be even faster than that.
01:11:17.200 | I am a little bit more conservative, I think it'd probably take longer than that, but I think it's
01:11:25.120 | you know, as happened with nuclear weapons, - It can happen overnight.
01:11:30.080 | - It can happen overnight that you have these breakthroughs, and we need more than one
01:11:33.360 | breakthrough, but you know, it's on the order of half a dozen, I mean, this is a very rough scale,
01:11:40.000 | but sort of half a dozen breakthroughs of that nature would have to happen for us to reach
01:11:48.640 | superhuman AI. But the AI research community is vast now, the massive investments from governments,
01:11:57.280 | from corporations, tons of really, really smart people, you know, you just have to look at the
01:12:03.360 | rate of progress in different areas of AI to see that things are moving pretty fast. So to say,
01:12:09.200 | "Oh, it's just gonna be thousands of years," I don't see any basis for that. You know, I see,
01:12:15.920 | you know, for example, the Stanford 100-Year AI Project, right, which is supposed to be sort of,
01:12:26.400 | you know, the serious establishment view, their most recent report actually said it's probably
01:12:32.480 | not even possible. - Oh, wow.
01:12:34.720 | - Right, which if you want a perfect example of people in denial, that's it. Because, you know,
01:12:42.960 | for the whole history of AI, we've been saying to philosophers who said it wasn't possible, "Well,
01:12:49.760 | you have no idea what you're talking about. Of course it's possible, right? Give me an argument
01:12:54.000 | for why it couldn't happen." And there isn't one, right? And now, because people are worried that
01:13:00.480 | maybe AI might get a bad name, or I just don't want to think about this, they're saying, "Okay,
01:13:06.080 | well, of course it's not really possible." You know, and we imagine, right? Imagine if, you know,
01:13:10.160 | the leaders of the cancer biology community got up and said, "Well, you know, of course,
01:13:16.560 | curing cancer, it's not really possible." There'd be complete outrage and dismay, and,
01:13:23.600 | you know, I find this really a strange phenomenon. So, okay, so if you accept that it's possible,
01:13:35.680 | and if you accept that it's probably going to happen, the point that you're making that,
01:13:42.400 | you know, how does it go wrong, a valid question without that, without an answer to that question,
01:13:50.240 | then you're stuck with what I call the gorilla problem, which is, you know, the problem that the
01:13:54.480 | gorillas face, right? They made something more intelligent than them, namely us, a few million
01:14:00.480 | years ago, and now they're in deep doo-doo. So there's really nothing they can do. They've lost
01:14:07.680 | the control. They failed to solve the control problem of controlling humans, and so they've
01:14:13.760 | lost. So we don't want to be in that situation, and if the gorilla problem is the only formulation
01:14:20.240 | you have, there's not a lot you can do, right? Other than to say, "Okay, we should try to stop.
01:14:26.560 | You know, we should just not make the humans, or in this case, not make the AI." And I think
01:14:31.760 | that's really hard to do. I'm not actually proposing that that's a feasible course of action.
01:14:40.640 | And I also think that, you know, if properly controlled, AI could be incredibly beneficial.
01:14:46.080 | But it seems to me that there's a consensus that one of the major failure modes
01:14:55.920 | is this loss of control. That we create AI systems that are pursuing incorrect objectives,
01:15:02.880 | and because the AI system believes it knows what the objective is, it has no incentive
01:15:11.040 | to listen to us anymore, so to speak, right? It's just carrying out the strategy that it
01:15:19.360 | has computed as being the optimal solution. And, you know, it may be that in the process,
01:15:26.880 | it needs to acquire more resources to increase the possibility of success, or prevent various
01:15:34.640 | failure modes by defending itself against interference. And so that collection of
01:15:40.880 | problems, I think, is something we can address. The other problems are, roughly speaking,
01:15:51.120 | misuse. So even if we solve the control problem, we make perfectly safe, controllable AI systems,
01:15:58.720 | well why? Why is Dr. Evil going to use those? He wants to just take over the world, and he'll
01:16:04.320 | make unsafe AI systems that then get out of control. So that's one problem which is sort of a
01:16:10.960 | you know, partly a policing problem, partly a sort of a cultural problem for the profession
01:16:19.600 | of how we teach people what kinds of AI systems are safe.
01:16:23.440 | - You talk about autonomous weapon systems and how pretty much everybody agrees that there's
01:16:28.240 | too many ways that that can go horribly wrong. You have this great Slaughterbots movie that kind of
01:16:33.920 | illustrates that beautifully. - Well, I want to talk about that; that's another
01:16:38.000 | topic I'm happy to talk about. I just want to mention that what I see is the
01:16:42.400 | third major failure mode, which is overuse, not so much misuse, but overuse of AI.
01:16:48.080 | That we become overly dependent. So I call this the WALL-E problem. So if you've seen WALL-E,
01:16:55.280 | the movie, all right, all the humans are on the spaceship, and the machines look after everything
01:17:00.800 | for them, and they just watch TV and drink Big Gulps. And they're all sort of obese and stupid,
01:17:07.680 | and they sort of totally lost any notion of human autonomy. And, you know, so in effect, right,
01:17:17.040 | this would happen like the slow boiling frog, right? We would gradually turn over more and more
01:17:24.960 | of the management of our civilization to machines, as we are already doing. And this, you know,
01:17:30.000 | if this process continues, you know, we sort of gradually switch from sort of being the masters
01:17:37.920 | of technology to just being the guests, right? So we become guests on a cruise ship, you know,
01:17:44.240 | which is fine for a week, but not for the rest of eternity. You know, and it's almost irreversible,
01:17:52.160 | right? Once you lose the incentive to, for example, you know, learn to be an engineer or a
01:17:59.520 | doctor or a sanitation operative or any other of the infinitely many ways that we maintain and
01:18:08.720 | propagate our civilization, you know, if you don't have the incentive to do any of that, you won't.
01:18:15.040 | And then it's really hard to recover.
01:18:18.400 | - And of course, AI is just one of the technologies that could, that third failure
01:18:22.240 | mode result in that. There's probably other technology in general detaches us from...
01:18:27.360 | - It does a bit, but the difference is that in terms of the knowledge to run our civilization,
01:18:35.520 | you know, up to now we've had no alternative but to put it into people's heads.
01:18:40.000 | - Right.
01:18:40.800 | - Right. And if you...
01:18:41.520 | - Or software with Google, I mean, so software in general, so AI broadly defined.
01:18:45.840 | - Computers in general, but the, you know, the knowledge of how, you know,
01:18:51.440 | how a sanitation system works, you know, the AI has to understand that.
01:18:55.760 | It's no good putting it into Google. So, I mean, we've always put knowledge in on paper,
01:19:00.800 | but paper doesn't run our civilization. It only runs when it goes from the paper
01:19:05.600 | into people's heads again. Right. So we've always propagated civilization through human minds.
01:19:12.000 | And we've spent about a trillion person years doing that. Literally, right? You can work it out.
01:19:18.480 | It's about right. There are just over a hundred billion people who've ever lived.
01:19:22.880 | And each of them has spent about 10 years learning stuff to keep their civilization going. And so
01:19:29.200 | that's a trillion person years we put into this effort.
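(Working that out with the round numbers just given:)

```latex
\underbrace{\sim 1.1\times 10^{11}}_{\text{people who have ever lived}}
\times
\underbrace{\sim 10\ \text{years}}_{\text{of learning, per person}}
\;\approx\; 10^{12}\ \text{person-years, i.e. about a trillion.}
```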
01:19:31.520 | - Beautiful way to describe all of civilization.
01:19:33.920 | - And now we're, you know, we're in danger of throwing that away. So this is a problem that
01:19:38.480 | AI can't solve. It's not a technical problem. It's, you know, and if we do our job right,
01:19:43.920 | the AI systems will say, you know, the human race doesn't in the long run want to be passengers in
01:19:50.880 | a cruise ship. The human race wants autonomy. This is part of human preferences. So we, the AI
01:19:58.160 | systems are not going to do this stuff for you. You've got to do it for yourself. Right? I'm not
01:20:03.680 | going to carry you to the top of Everest in an autonomous helicopter. You have to climb it if
01:20:08.640 | you want to get the benefit and so on. So, but I'm afraid that because we are short-sighted and lazy,
01:20:18.160 | we're going to override the AI systems. And there's an amazing short story that I recommend
01:20:25.280 | to everyone that I talk to about this called "The Machine Stops" written in 1909 by E.M. Forster,
01:20:33.360 | who, you know, wrote novels about the British empire and sort of things that became costume
01:20:39.200 | dramas on the BBC. But he wrote this one science fiction story, which is an amazing
01:20:45.840 | vision of the future. It has basically iPads. It has video conferencing. It has MOOCs.
01:20:52.640 | It has computer-induced obesity. I mean, literally, it's what people spend their time doing
01:21:01.280 | is giving online courses or listening to online courses and talking about ideas, but they never
01:21:06.960 | get out there in the real world. They don't really have a lot of face-to-face contact.
01:21:12.320 | Everything is done online, you know. So, all the things we're worrying about now
01:21:16.480 | were described in this story. And then the human race becomes more and more dependent on the
01:21:22.080 | machine, loses knowledge of how things really run, and then becomes vulnerable to collapse.
01:21:29.840 | And so, it's a pretty unbelievably amazing story for someone writing in 1909 to imagine all this.
01:21:38.800 | So, there are very few people who represent artificial intelligence
01:21:43.920 | more than you, Stuart Russell.
01:21:46.080 | If you say so, okay, that's very kind. So, it's all my fault.
01:21:50.800 | It's all your fault. No, right. You're often brought up as the person, well, Stuart Russell,
01:21:59.680 | like the AI person is worried about this. That's why you should be worried about it.
01:22:06.080 | Do you feel the burden of that? I don't know if you feel that at all, but when I talk to people,
01:22:11.280 | you talk about people outside of computer science when they think about this. Stuart Russell
01:22:17.600 | is worried about AI safety. You should be worried too. Do you feel the burden of that?
01:22:22.880 | I mean, in a practical sense, yeah. Because I get a dozen, sometimes 25, invitations a day
01:22:34.400 | to talk about it, to give interviews, to write press articles, and so on. So,
01:22:39.840 | in that very practical sense, I'm seeing that people are concerned and really interested about
01:22:47.120 | this. But you're worried that you could be wrong, as all good scientists are.
01:22:52.240 | Of course. I worry about that all the time. I mean, that's always been the way that I've worked,
01:22:58.800 | you know, is like I have an argument in my head with myself, right? So, I have some idea,
01:23:04.720 | and then I think, "Okay, how could that be wrong? Or did someone else already have that idea?" So,
01:23:11.120 | I'll go and, you know, search in as much literature as I can to see whether someone
01:23:16.720 | else already thought of that or even refuted it. So, you know, right now I'm reading a lot
01:23:23.840 | of philosophy because, you know, in the form of the debates over utilitarianism and other kinds of
01:23:34.480 | moral formulas, shall we say, people have already thought through some of these issues. But,
01:23:44.000 | you know, one of the things I'm not seeing in a lot of these debates is this specific idea about
01:23:52.640 | the importance of uncertainty in the objective. That this is the way we should think about
01:23:59.040 | machines that are beneficial to humans. So, this idea of provably beneficial machines based on
01:24:06.240 | explicit uncertainty in the objective, you know, it seems to be, you know, my gut feeling is this
01:24:15.200 | is the core of it. It's going to have to be elaborated in a lot of different directions.
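(A toy numerical illustration of that core idea, in the spirit of the off-switch argument: a machine that is uncertain about the human's objective gains by deferring. The belief distribution and payoffs below are invented for illustration; they are not from the conversation or from Russell's formal treatment.)

```python
# The robot is unsure whether its proposed action is good for the human (utility u).
# It can (a) act now, (b) switch itself off, or (c) defer: propose the action and
# let the human, who knows u, approve or veto it. Deferring is never worse, and is
# strictly better whenever the robot's belief puts weight on both signs of u.

import random

random.seed(0)

def expected_values(belief_samples):
    """belief_samples: draws of u from the robot's belief about the human's utility."""
    act_now = sum(belief_samples) / len(belief_samples)                      # E[u]
    switch_off = 0.0                                                         # do nothing
    defer = sum(max(u, 0.0) for u in belief_samples) / len(belief_samples)   # E[max(u, 0)]
    return act_now, switch_off, defer

# Robot's belief: the action is probably good for the human, but might be quite bad.
samples = [random.gauss(0.5, 1.0) for _ in range(100_000)]

act_now, switch_off, defer = expected_values(samples)
print(f"act now   : {act_now:+.3f}")
print(f"switch off: {switch_off:+.3f}")
print(f"defer     : {defer:+.3f}  <- best: letting the human veto has positive value")
```

The structural point the numbers make is that as long as the machine remains genuinely uncertain about what we want, keeping us in the loop has positive expected value for it, so it keeps the incentive to listen.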
01:24:20.560 | And there are a lot of... - Provably beneficial.
01:24:22.800 | - Yeah, but there are, I mean, it has to be, right? We can't afford, you know, hand-wavy
01:24:28.800 | beneficial because there are, you know, whenever we do hand-wavy stuff, there are loopholes. And
01:24:34.160 | the thing about super-intelligent machines is they find the loopholes. You know, just like,
01:24:39.040 | you know, tax evaders. If you don't write your tax law properly, people will find the loopholes
01:24:45.440 | and end up paying no tax. And so, you should think of it this way. And getting those definitions
01:24:54.160 | right, you know, it is really a long process, you know. So, you can define mathematical frameworks
01:25:04.400 | and within that framework, you can prove mathematical theorems that, yes, this theoretical
01:25:10.480 | entity will be provably beneficial to that theoretical entity. But that framework may
01:25:15.920 | not match the real world in some crucial way. - It's a long process of thinking through it,
01:25:20.960 | iterating and so on. Last question. - Yep.
01:25:23.120 | - You have 10 seconds to answer it. What is your favorite sci-fi movie about AI?
01:25:30.080 | - I would say Interstellar has my favorite robots. - Oh, beats Space Odyssey?
01:25:36.640 | - Yeah, yeah, yeah. So, TARS, one of the robots in Interstellar, is the way robots
01:25:42.960 | should behave. And I would say Ex Machina is in some ways the one that makes you think
01:25:54.000 | in a nervous kind of way about where we're going.
01:25:57.920 | - Well, Stuart, thank you so much for talking today.
01:25:59.840 | - Pleasure.
01:26:00.640 | - Thank you.
01:26:02.100 | - Thank you.