Stuart Russell: Long-Term Future of Artificial Intelligence | Lex Fridman Podcast #9
00:00:00.000 |
The following is a conversation with Stuart Russell. He's a professor of computer science at 00:00:04.720 |
UC Berkeley and a co-author of a book that introduced me and millions of other people 00:00:10.240 |
to the amazing world of AI, called Artificial Intelligence: A Modern Approach. So it was an 00:00:16.720 |
honor for me to have this conversation as part of MIT course in Artificial General Intelligence 00:00:23.120 |
and the Artificial Intelligence podcast. If you enjoy it, please subscribe on YouTube, 00:00:28.560 |
iTunes, or your podcast provider of choice, or simply connect with me on Twitter @lexfridman, 00:00:34.320 |
spelled F-R-I-D. And now, here's my conversation with Stuart Russell. 00:00:40.080 |
So you've mentioned that in 1975, in high school, you created one of your first AI programs that played 00:00:48.080 |
chess. Were you ever able to build a program that beat you at chess or another board game? 00:00:58.240 |
>>Stuart Russell: So my program never beat me at chess. I actually wrote the program 00:01:04.640 |
at Imperial College. So I used to take the bus every Wednesday with a box of cards this big 00:01:12.160 |
and shove them into the card reader. And they gave us eight seconds of CPU time. 00:01:17.200 |
It took about five seconds to read the cards in and compile the code. So we had three seconds 00:01:24.560 |
of CPU time, which was enough to make one move with a not very deep search. And then we would 00:01:31.040 |
print that move out, and then we'd have to go to the back of the queue and wait to feed the cards 00:01:35.200 |
in again. >>Lex: How deep was the search? Are we talking about one move, two moves, three moves? 00:01:39.280 |
>>Stuart Russell: So no, I think we got an eight-move, 00:01:42.560 |
depth-eight search with alpha-beta. And we had some tricks of our own about move ordering and some 00:01:51.120 |
pruning of the tree. >>Lex: But you were still able to beat that program? 00:01:54.880 |
>>Stuart Russell: Yeah, yeah. I was a reasonable chess player in my youth. I did an Othello program 00:02:00.880 |
and a backgammon program. So when I got to Berkeley, I worked a lot on what we call meta 00:02:08.160 |
reasoning, which really means reasoning about reasoning. And in the case of a game playing 00:02:13.840 |
program, you need to reason about what parts of the search tree you're actually going to explore, 00:02:19.200 |
because the search tree is enormous, bigger than the number of atoms in the universe. 00:02:24.320 |
And the way programs succeed and the way humans succeed is by only looking at a small fraction 00:02:32.240 |
of the search tree. And if you look at the right fraction, you play really well. If you look at the 00:02:37.360 |
wrong fraction, if you waste your time thinking about things that are never going to happen, 00:02:42.320 |
moves that no one's ever going to make, then you're going to lose, because you won't be able 00:02:47.440 |
to figure out the right decision. So that question of how machines can manage their own computation, 00:02:54.960 |
how they decide what to think about, is the meta reasoning question. We developed some methods 00:03:00.960 |
for doing that. And very simply, a machine should think about whatever thoughts are going to improve 00:03:08.880 |
its decision quality. We were able to show that both for Othello, which is a standard two-player 00:03:15.440 |
game, and for Backgammon, which includes dice rolls, so it's a two-player game with uncertainty. 00:03:21.840 |
For both of those cases, we could come up with algorithms that were actually much more efficient 00:03:27.680 |
than the standard alpha-beta search, which chess programs at the time were using. And that, 00:03:34.400 |
those programs could beat me. And I think you can see the same basic ideas in AlphaGo and AlphaZero 00:03:44.320 |
today. The way they explore the tree is using a form of meta reasoning to select what to think 00:03:53.200 |
about based on how useful it is to think about it. Are there any insights you can describe, without 00:03:59.520 |
Greek symbols, of how we select which paths to go down? There's really two kinds of learning 00:04:06.800 |
going on. So as you say, AlphaGo learns to evaluate board positions. So it can look at a go board 00:04:14.080 |
and it actually has probably a superhuman ability to instantly tell how promising that situation is. 00:04:22.320 |
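A minimal sketch of what playing from such an evaluator alone might look like, the "depth one search" Russell describes next: score the position reached by each legal move and pick the best, with no look-ahead. The evaluate, legal_moves, and apply_move functions here are hypothetical stand-ins for a trained value network and the game rules, not AlphaGo's actual code.

```python
# Hypothetical depth-one play: one evaluator call per legal move, no search beyond that.
def depth_one_move(board, evaluate, legal_moves, apply_move):
    """Pick the move whose immediate resulting position the evaluator likes best."""
    best_move, best_value = None, float("-inf")
    for move in legal_moves(board):
        value = evaluate(apply_move(board, move))  # a single call to the learned evaluator
        if value > best_value:
            best_move, best_value = move, value
    return best_move
```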
To me, the amazing thing about AlphaGo is not that 00:04:26.960 |
it can beat the world champion with its hands tied behind its back, but the fact that 00:04:36.240 |
if you stop it from searching altogether, so you say, okay, you're not allowed to do 00:04:41.280 |
any thinking ahead, right? You can just consider each of your legal moves and then look at the 00:04:46.480 |
resulting situation and evaluate it. So what we call a depth one search. So just the immediate 00:04:53.280 |
outcome of your moves and decide if that's good or bad. That version of AlphaGo 00:04:57.920 |
can still play at a professional level, right? And human professionals are sitting there for 00:05:05.200 |
five, 10 minutes deciding what to do. And AlphaGo in less than a second can instantly intuit what 00:05:13.440 |
is the right move to make based on its ability to evaluate positions. And that is remarkable 00:05:18.800 |
because we don't have that level of intuition about Go. We actually have to think about 00:05:25.360 |
the situation. So anyway, that capability that AlphaGo has is one big part of why it 00:05:34.560 |
beats humans. The other big part is that it's able to look ahead 40, 50, 60 moves into the future. 00:05:43.200 |
And if it was considering all possibilities, 40 or 50 or 60 moves into the future, that would be 00:05:52.400 |
10 to the 200 possibilities. So way, way more than atoms in the universe and so on. So it's very, 00:06:03.440 |
very selective about what it looks at. So let me try to give you an intuition about 00:06:09.680 |
how you decide what to think about. It's a combination of two things. One is how promising 00:06:16.960 |
it is. So if you're already convinced that a move is terrible, there's no point spending a lot more 00:06:24.640 |
time convincing yourself that it's terrible because it's probably not going to change your 00:06:30.320 |
mind. So the real reason you think is because there's some possibility of changing your mind 00:06:35.680 |
about what to do. And it's that changing of mind that would result then in a better 00:06:42.240 |
final action in the real world. So that's the purpose of thinking, is to improve the final 00:06:48.480 |
action in the real world. And so if you think about a move that is guaranteed to be terrible, 00:06:55.440 |
you can convince yourself it's terrible, you're still not going to change your mind. 00:06:58.880 |
Right? But on the other hand, you suppose you had a choice between two moves. One of them you've 00:07:04.320 |
already figured out is guaranteed to be a draw, let's say. And then the other one looks a little 00:07:10.320 |
bit worse. Like it looks fairly likely that if you make that move, you're going to lose. 00:07:14.000 |
But there's still some uncertainty about the value of that move. There's still some possibility that 00:07:20.880 |
it will turn out to be a win. Right? Then it's worth thinking about that. So even though it's 00:07:25.840 |
less promising on average than the other move, which is guaranteed to be a draw, there's still 00:07:31.280 |
some purpose in thinking about it, because there's a chance that you will change your mind and 00:07:35.680 |
discover that in fact it's a better move. So it's a combination of how good the move appears to be 00:07:42.080 |
and how much uncertainty there is about its value. The more uncertainty, the more it's worth thinking 00:07:48.000 |
about. Because there's a higher upside if you want to think of it that way. 00:07:52.080 |
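A rough sketch of that selection rule: computation is only worth spending where there is some chance of changing which move you finally pick, so a move's claim on your thinking combines how good it looks with how uncertain you still are about it. The Gaussian value estimates are an assumption made for this illustration; this is not the algorithm from Russell's Othello and backgammon work or from AlphaGo.

```python
# Illustrative "what to think about next" rule: prefer the move whose value is uncertain
# enough that more thinking might overturn the current best choice.
# Gaussian value estimates are an assumed simplification for this sketch.
from math import erf, sqrt

def prob_exceeds(mean, std, threshold):
    """P(value > threshold) under a Gaussian estimate; 0 or 1 when there is no uncertainty."""
    if std == 0:
        return 1.0 if mean > threshold else 0.0
    return 0.5 * (1.0 - erf((threshold - mean) / (std * sqrt(2.0))))

def choose_move_to_think_about(estimates):
    """estimates: {move: (mean_value, std)}. Return the move most worth further thought."""
    best_move = max(estimates, key=lambda m: estimates[m][0])
    best_mean = estimates[best_move][0]
    # A move deserves thought in proportion to the chance it beats the current best,
    # i.e. the chance that thinking about it changes your mind about what to play.
    rivals = {m: prob_exceeds(mu, sd, best_mean)
              for m, (mu, sd) in estimates.items() if m != best_move}
    return max(rivals, key=rivals.get) if rivals else best_move

# A guaranteed draw (0.5, no uncertainty) vs. a worse-looking but uncertain move.
print(choose_move_to_think_about({"drawish_line": (0.5, 0.0), "risky_line": (0.3, 0.2)}))
# -> "risky_line": the uncertain move is the one worth thinking about.
```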
And of course in the beginning, especially in the AlphaGo Zero formulation, everything is shrouded 00:07:59.760 |
in uncertainty. So you're really swimming in a sea of uncertainty. So it benefits you to, 00:08:06.240 |
I mean, actually following the same process as you described, but because you're so uncertain 00:08:11.120 |
about everything, you basically have to try a lot of different directions. 00:08:15.280 |
Yeah. So the early parts of the search tree are fairly bushy, that it will look at a lot 00:08:22.400 |
of different possibilities, but fairly quickly, the degree of certainty about some of the moves, 00:08:27.840 |
I mean, if a move is really terrible, you'll pretty quickly find out, right? You'll lose half 00:08:31.920 |
your pieces or half your territory. And then you'll say, "Okay, this is not worth thinking 00:08:37.200 |
about anymore." And then, so further down, the tree becomes very long and narrow. And you're 00:08:44.400 |
following various lines of play, 10, 20, 30, 40, 50 moves into the future. And that again is 00:08:54.560 |
something that human beings have a very hard time doing, mainly because they just lack the short 00:09:01.280 |
term memory. You just can't remember a sequence of moves that's 50 moves long. And you can't 00:09:07.920 |
imagine the board correctly for that many moves into the future. Of course, the top players, 00:09:15.520 |
I'm much more familiar with chess, but the top players probably have echoes of the same kind of 00:09:22.160 |
intuition, instinct that in a moment's time, AlphaGo applies when they see a board. I mean, 00:09:28.640 |
they've seen those patterns. Human beings have seen those patterns before at the top, 00:09:33.600 |
at the grandmaster level. It seems that there is some similarities, or maybe it's our imagination 00:09:43.600 |
creates a vision of those similarities, but it feels like this kind of pattern recognition that 00:09:49.760 |
the AlphaGo approaches are using is similar to what human beings at the top level are using. 00:10:02.400 |
Yeah, I mean, I think the extent to which a human grandmaster can reliably, 00:10:08.960 |
instantly recognize the right move, instantly recognize the value of a position... 00:10:17.280 |
But if you sacrifice a queen, for example, I mean, there's these beautiful games of chess 00:10:22.160 |
with Bobby Fischer or somebody, where he's seemingly making a bad move. And I'm not sure there's a 00:10:31.040 |
perfect degree of calculation involved where they've calculated all the possible things that 00:10:36.080 |
happen, but there's an instinct there, right, that somehow adds up to go down that path. 00:10:42.240 |
Yeah, so I think what happens is you get a sense that there's some possibility in the position, 00:10:47.920 |
even if you make a weird looking move, that it opens up some lines of 00:10:57.520 |
calculation that otherwise would be definitely bad. And it's that intuition that there's something 00:11:06.960 |
here in this position that might yield a win. 00:11:13.760 |
And then you follow that, right. And in some sense, when a chess player is following a line 00:11:19.760 |
in his or her mind, they're mentally simulating what the other person is going to do, what the 00:11:26.640 |
opponent is going to do. And they can do that as long as the moves are kind of forced, right. 00:11:33.120 |
As long as there's a forcing variation where the opponent doesn't really have much choice how to 00:11:39.440 |
respond. And then you see if you can force them into a situation where you win. You know, we see 00:11:44.640 |
plenty of mistakes, even in Grandmaster games, where they just miss some simple three, four, 00:11:52.960 |
five move combination that wasn't particularly apparent in the position, but was still there. 00:12:02.400 |
So you mentioned that in Othello, the program, after some meta-reasoning improvements and 00:12:10.400 |
research, was able to beat you. How did that make you feel? 00:12:14.240 |
Part of the meta reasoning capability that it had was based on learning. And 00:12:21.360 |
you could sit down the next day and you could just feel that it had got a lot smarter. 00:12:28.080 |
You know, and all of a sudden you really felt like you're sort of pressed against 00:12:33.200 |
the wall because it was much more aggressive and was totally unforgiving of any minor mistake that 00:12:41.680 |
you might make. And actually, it seemed like it understood the game better than I did. 00:12:47.760 |
And Garry Kasparov has this quote where during his match against Deep Blue, he said he suddenly 00:12:55.520 |
felt that there was a new kind of intelligence across the board. 00:12:58.160 |
Do you think that's a scary or an exciting possibility for Garry Kasparov and for yourself 00:13:06.240 |
in the context of chess, purely sort of in this, like, that feeling, whatever that is? 00:13:14.320 |
I think it's definitely an exciting feeling. You know, this is what made me 00:13:18.880 |
work on AI in the first place, was as soon as I really understood what a computer was, 00:13:24.560 |
I wanted to make it smart. You know, I started out with the first program I wrote was for the 00:13:30.800 |
Sinclair programmable calculator. And I think you could write a 21-step algorithm. That was 00:13:38.960 |
the biggest program you could write, something like that. And do little arithmetic calculations. 00:13:44.800 |
So I think I implemented Newton's method for square roots and a few other things like that. 00:13:49.840 |
But then, you know, I thought, "Okay, if I just had more space, I could make this thing intelligent." 00:13:57.120 |
And so I started thinking about AI. And I think the thing that's 00:14:07.520 |
scary is not the chess program, because, you know, chess programs, 00:14:15.360 |
they're not in the taking over the world business. But if you extrapolate, 00:14:24.240 |
you know, there are things about chess that don't resemble the real world, right? We know the rules 00:14:32.080 |
of chess. The chessboard is completely visible to the program, where of course the real world is 00:14:40.320 |
not. Most of the real world is not visible from wherever you're sitting, so to speak. 00:14:45.840 |
And to overcome those kinds of problems, you need qualitatively different algorithms. 00:14:56.160 |
Another thing about the real world is that, you know, we regularly 00:15:00.160 |
plan ahead on the timescales involving billions or trillions of steps. 00:15:07.360 |
Now, we don't plan those in detail, but, you know, when you choose to do a PhD at Berkeley, 00:15:14.960 |
that's a five-year commitment. And that amounts to about a trillion motor control 00:15:20.880 |
steps that you will eventually be committed to. 00:15:23.920 |
- Including going up the stairs, opening doors, drinking water. 00:15:28.640 |
- Yeah, I mean, every finger movement while you're typing, every character of every paper 00:15:33.680 |
and the thesis and everything. So you're not committing in advance to the specific motor 00:15:37.600 |
control steps, but you're still reasoning on a timescale that will eventually reduce to 00:15:43.760 |
trillions of motor control actions. And so for all these reasons, 00:15:52.080 |
you know, AlphaGo and Deep Blue and so on don't represent any kind of threat to humanity, 00:15:58.000 |
but they are a step towards it, right? And progress in AI occurs by essentially removing 00:16:07.040 |
one by one these assumptions that make problems easy, like the assumption of complete observability 00:16:14.560 |
of the situation, right? If we remove that assumption, you need a much more complicated 00:16:20.080 |
kind of computing design, and you need something that actually keeps track of all the things you 00:16:25.440 |
can't see and tries to estimate what's going on, and there's inevitable uncertainty in that. So 00:16:31.440 |
it becomes a much more complicated problem. But, you know, we are removing those assumptions. We 00:16:37.520 |
are starting to have algorithms that can cope with much longer timescales, that can cope with 00:16:43.120 |
uncertainty, that can cope with partial observability. And so each of those steps sort of 00:16:50.400 |
magnifies by a thousand the range of things that we can do with AI systems. 00:16:55.760 |
So the way I started in AI, I wanted to be a psychiatrist for a long time. I wanted to 00:16:59.040 |
understand the mind in high school, and of course to program and so on. And I showed up at the 00:17:04.240 |
University of Illinois at an AI lab and they said, "Okay, I don't have time for you, but here's a 00:17:10.880 |
book, AI: A Modern Approach." I think it was the first edition at the time. "Here, go learn this." 00:17:18.480 |
And I remember the lay of the land was, "Well, it's incredible that we solved chess, 00:17:22.800 |
but we'll never solve Go." I mean, it was pretty certain that Go, in the way we thought about 00:17:29.440 |
systems that reason, was impossible to solve, and now we've solved it. So it's a very... 00:17:34.240 |
Well, I think I would have said that it's unlikely we could take the kind of algorithm 00:17:41.040 |
that was used for chess and just get it to scale up and work well for Go. 00:17:46.480 |
And at the time, what we thought was that in order to solve Go, we would have to do something 00:17:57.520 |
similar to the way humans manage the complexity of Go, which is to break it down 00:18:03.600 |
into kind of sub-games. So when a human thinks about a Go board, they think about different 00:18:08.080 |
parts of the board as sort of weakly connected to each other. And they think about, "Okay, 00:18:13.360 |
within this part of the board, here's how things could go. In that part of the board, 00:18:17.600 |
here's how things could go." And then you try to sort of couple those two analyses together 00:18:21.840 |
and deal with the interactions and maybe revise your views of how things are going to go in each 00:18:27.040 |
part. And then you've got maybe five, six, seven, ten parts of the board. And that actually resembles 00:18:35.120 |
the real world much more than chess does. Because in the real world, we have work, we have home 00:18:43.600 |
life, we have sport, whatever, different kinds of activities, shopping. These all are connected to 00:18:52.800 |
each other, but they're weakly connected. So when I'm typing a paper, I don't simultaneously have to 00:19:00.320 |
decide which order I'm going to get the milk and the butter. That doesn't affect the typing. 00:19:06.960 |
But I do need to realize, "Okay, I better finish this before the shops close because I don't have 00:19:12.080 |
anything. I don't have any food at home." So there's some weak connection, but not in the way 00:19:17.760 |
that chess works, where everything is tied into a single stream of thought. So the thought was that 00:19:24.720 |
to solve Go, we'd have to make progress on stuff that would be useful for the real world. And in 00:19:29.600 |
a way, AlphaGo is a little bit disappointing. Because the program design for AlphaGo is 00:19:35.600 |
actually not that different from Deep Blue or even from Arthur Samuel's checker playing program from 00:19:44.960 |
the 1950s. And in fact, the two things that make AlphaGo work is one is its amazing ability to 00:19:54.080 |
evaluate the positions, and the other is the meta-reasoning capability, which allows it to 00:19:59.200 |
explore some paths in the tree very deeply and to abandon other paths very quickly. 00:20:06.960 |
- So this word meta-reasoning, while technically correct, inspires perhaps the wrong degree of 00:20:16.640 |
power that AlphaGo has, for example. The word reasoning is a powerful word. Let me ask you, 00:20:21.760 |
you were part of the symbolic AI world for a while, like where AI was, there's a lot of excellent, 00:20:30.880 |
interesting ideas there that unfortunately met a winter. So do you think it re-emerges? 00:20:40.000 |
- Oh, so I would say, yeah, it's not quite as simple as that. So the AI winter, 00:20:46.800 |
the first winter that was actually named as such was the one in the late 80s. 00:20:56.400 |
And that came about because in the mid-80s there was a, 00:21:01.120 |
really a concerted attempt to push AI out into the real world, using what was called expert 00:21:10.880 |
system technology. And for the most part, that technology was just not ready for prime time. 00:21:18.560 |
They were trying in many cases to do a form of uncertain reasoning, judgment, combinations of 00:21:27.200 |
evidence, diagnosis, those kinds of things, which was simply invalid. And when you try to apply 00:21:34.640 |
invalid reasoning methods to real problems, you can fudge it for small versions of the problem, 00:21:40.960 |
but when it starts to get larger, the thing just falls apart. So many companies found that 00:21:48.080 |
the stuff just didn't work and they were spending tons of money on consultants to 00:21:52.880 |
try to make it work. And there were other practical reasons, like they were asking 00:21:59.600 |
the companies to buy incredibly expensive Lisp machine workstations, which were literally between 00:22:08.400 |
$50,000 and $100,000 in, you know, 1980s money, which would be like between 00:22:14.720 |
$150,000 and $300,000 per workstation in current prices. So... 00:22:20.800 |
And the bottom line, they weren't seeing a profit from it. 00:22:24.080 |
Yeah. In many cases, I think there were some successes, there's no doubt about that, but 00:22:29.840 |
people, I would say, over-invested. Every major company was starting an AI department, just like 00:22:37.760 |
now. And I worry a bit that we might see similar disappointments, not because the current technology 00:22:46.800 |
is invalid, but it's limited in its scope. And it's almost the dual of the, you know, 00:22:57.440 |
the scope problems that expert systems had. So... 00:23:00.000 |
What have you learned from that hype cycle? And what can we do to prevent another winter? 00:23:07.120 |
Yeah. So when I'm giving talks these days, that's one of the warnings that I give. So this is a 00:23:14.320 |
two-part warning slide. One is that, you know, rather than data being the new oil, data is the new snake oil. 00:23:22.800 |
And then the other is that we might see a kind of very visible failure in some of the major 00:23:34.080 |
application areas. And I think self-driving cars would be the flagship. And I think when you look 00:23:43.680 |
at the history... So the first self-driving car was on the freeway, driving itself, changing lanes, 00:23:52.320 |
overtaking in 1987. And so it's more than 30 years. And that kind of looks like where we are today. 00:24:04.560 |
You know, prototypes on the freeway, changing lanes and overtaking. Now, 00:24:08.880 |
I think significant progress has been made, particularly on the perception side. So 00:24:14.960 |
we worked a lot on autonomous vehicles in the early-mid-90s at Berkeley. 00:24:21.040 |
And we had our own big demonstrations. You know, we put congressmen into self-driving cars and 00:24:29.040 |
had them zooming along the freeway. And the problem was clearly perception. 00:24:39.120 |
Yeah. So in simulation, with perfect perception, you could actually show that you can 00:24:44.720 |
drive safely for a long time, even if the other car is misbehaving and so on. But 00:24:49.040 |
simultaneously, we worked on machine vision for detecting cars and tracking pedestrians and so on. 00:24:57.680 |
And we couldn't get the reliability of detection and tracking up to a high enough level, 00:25:06.080 |
particularly in bad weather conditions, nighttime rain, fog. 00:25:11.520 |
Good enough for demos, but perhaps not good enough to cover the general operation. 00:25:15.920 |
Yeah. See, the thing about driving is, you know, so suppose you're a taxi driver, 00:25:19.440 |
you know, and you drive every day, eight hours a day for 10 years, right? That's 00:25:23.440 |
a hundred million seconds of driving, you know. And any one of those seconds, 00:25:27.760 |
you can make a fatal mistake. So you're talking about eight nines of reliability, right? 00:25:34.960 |
Now, if your vision system only detects 98.3% of the vehicles, right? Then that's sort of, 00:25:43.280 |
you know, one and a bit nines of reliability. So you have another seven orders of magnitude to go. 00:25:52.160 |
And this is what people don't understand. They think, "Oh, because I had a successful demo, 00:25:56.480 |
I'm pretty much done." But you're not even within seven orders of magnitude of being done. 00:26:03.920 |
And that's the difficulty. And it's not the, "Can I follow a white line?" That's not the problem, 00:26:12.320 |
right? We follow a white line all the way across the country. But it's the weird stuff that happens. 00:26:22.160 |
The edge case, other drivers doing weird things. So if you talk to Google, right? So 00:26:28.480 |
they had actually a very classical architecture where, you know, you had machine vision, 00:26:35.360 |
which would detect all the other cars and pedestrians and the white lines and the road 00:26:40.240 |
signs. And then basically that was fed into a logical database. And then you had a classical 00:26:47.040 |
1970s rule-based expert system telling you, "Okay, if you're in the middle lane and there's a 00:26:54.480 |
bicyclist in the right lane, who is signaling this, then you do that." Right? And what they 00:27:00.880 |
found was that every day they'd go out and there'd be another situation that the rules didn't cover. 00:27:05.680 |
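A caricature of that kind of rule-based layer, with every condition invented for illustration: anything the rules don't anticipate falls through to a default, so the rule list keeps growing without ever converging.

```python
# Toy hand-written driving policy (all rules and situations here are made up).
def rule_based_policy(situation):
    if situation.get("lane") == "middle" and situation.get("cyclist_signaling_right"):
        return "slow_and_yield"
    if situation.get("pedestrian_in_crosswalk") or situation.get("light") == "red":
        return "stop"
    # Every unanticipated situation lands here, so engineers keep appending rules forever.
    return "stop_and_wait_for_human"

print(rule_based_policy({"lane": "traffic_circle", "child_cycling_wrong_way": True}))
# -> "stop_and_wait_for_human": no rule covers this case.
```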
You know, so they'd come to a traffic circle and there's a little girl riding her bicycle 00:27:10.160 |
the wrong way around the traffic circle. Okay, what do you do? "We don't have a rule." "Oh my 00:27:13.440 |
God. Okay, stop." And then, you know, they'd come back and add more rules and they just found that 00:27:19.280 |
this was not really converging. And if you think about it, right, how do you deal with 00:27:26.080 |
an unexpected situation? Meaning one that you've never previously encountered and the sort of 00:27:32.320 |
the reasoning required to figure out the solution for that situation has never been done. It doesn't 00:27:39.280 |
match any previous situation in terms of the kind of reasoning you have to do. Well, you know, in 00:27:45.840 |
chess programs this happens all the time. You're constantly coming up with situations you haven't 00:27:51.200 |
seen before and you have to reason about them and you have to think about, "Okay, here are the 00:27:56.480 |
possible things I could do. Here are the outcomes. Here's how desirable the outcomes are." And then 00:28:01.600 |
pick the right one. You know, in the 90s we were saying, "Okay, this is how you're going to have to 00:28:05.440 |
do automated vehicles. They're going to have to have look-ahead capability." But the look-ahead 00:28:10.880 |
for driving is more difficult than it is for chess. Because of humans. Right, there's humans and 00:28:16.960 |
they're less predictable than chess pieces. Well, then you have an opponent in chess who's also 00:28:22.960 |
somewhat unpredictable. But, for example, in chess you always know the opponent's intention. They're 00:28:29.920 |
trying to beat you. Right? Whereas in driving you don't know, "Is this guy trying to turn left or has 00:28:36.320 |
he just forgotten to turn off his turn signal or is he drunk or is he, you know, changing the channel 00:28:42.320 |
on his radio or whatever it might be?" You've got to try and figure out the mental state, the intent 00:28:48.160 |
of the other drivers to forecast the possible evolutions of their trajectories. And then you've 00:28:55.440 |
got to figure out, "Okay, which is the trajectory for me that's going to be safest?" And those all 00:29:01.200 |
interact with each other because the other driver is going to react to your trajectory and so on. 00:29:07.040 |
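A minimal sketch of that intent-inference step: keep a probability distribution over what the other driver is trying to do and update it from observed cues. The candidate intents, cues, and numbers are all invented for illustration; this is not taken from any real driving stack.

```python
# Toy Bayesian intent inference for another driver (intents and likelihoods are invented).
def update_intent(prior, likelihoods, observation):
    """prior: {intent: P(intent)}; likelihoods: {intent: {cue: P(cue | intent)}}."""
    posterior = {i: prior[i] * likelihoods[i].get(observation, 1e-6) for i in prior}
    total = sum(posterior.values())
    return {i: p / total for i, p in posterior.items()}

prior = {"turning_left": 0.2, "forgot_signal": 0.5, "distracted": 0.3}
likelihoods = {
    "turning_left":  {"signal_on": 0.9, "slowing_down": 0.8},
    "forgot_signal": {"signal_on": 0.9, "slowing_down": 0.1},
    "distracted":    {"signal_on": 0.1, "slowing_down": 0.3},
}

belief = update_intent(prior, likelihoods, "signal_on")
belief = update_intent(belief, likelihoods, "slowing_down")
print(belief)  # slowing down with the signal on shifts most of the belief to "turning_left"
```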
So, you know, you've got the classic merging onto the freeway problem where you're kind of 00:29:11.760 |
racing a vehicle that's already on the freeway and you either pull ahead of them or you're going 00:29:16.080 |
to let them go first and pull in behind. And you get this sort of uncertainty about who's going 00:29:20.800 |
first. So all those kinds of things mean that you need a decision-making architecture that's 00:29:31.120 |
very different from either a rule-based system or, it seems to me, a kind of an end-to-end neural 00:29:38.000 |
network system. You know, so just as AlphaGo is pretty good when it doesn't do any look ahead, 00:29:44.320 |
but it's way, way, way, way better when it does, I think the same is going to be true for driving. 00:29:50.640 |
You can have a driving system that's pretty good when it doesn't do any look ahead, but that's not 00:29:56.400 |
good enough. You know, and we've already seen multiple deaths caused by poorly designed machine 00:30:05.040 |
learning algorithms that don't really understand what they're doing. Yeah, on several levels. I 00:30:11.280 |
think on the perception side, there's mistakes being made by those algorithms where the perception 00:30:17.760 |
is very shallow. On the planning side, the look ahead, like you said. And the thing that we come 00:30:24.640 |
up against that's really interesting when you try to deploy systems in the real world is you can't 00:30:33.600 |
think of an artificial intelligence system as a thing that responds to the world always. You have 00:30:38.640 |
to realize that it's an agent that others will respond to as well. So in order to drive successfully, 00:30:44.400 |
you can't just try to do obstacle avoidance. Right, you can't pretend that you're invisible. 00:30:49.360 |
Right? You're the invisible car. Right. It doesn't work that way. I mean, but you have to assert, 00:30:55.200 |
yet others have to be scared of you. There's this tension. There's this game. So we study a lot of 00:31:02.640 |
work with pedestrians. If you approach pedestrians as purely an obstacle-avoidance problem, so you're not doing 00:31:09.040 |
look-ahead or modeling their intent, they're going to take advantage of you. They're not going 00:31:15.520 |
to respect you at all. There has to be a tension, a fear, some amount of uncertainty. That's how 00:31:21.520 |
we have created... Or at least just a kind of a resoluteness. Right, yes. You have to display 00:31:29.040 |
a certain amount of resoluteness. You can't be too tentative. And yeah, so the solutions then 00:31:38.800 |
become pretty complicated. Right? You get into game theoretic analyses. And so we're, you know, 00:31:45.360 |
at Berkeley now we're working a lot on this kind of interaction between machines and humans. 00:31:53.440 |
And so my colleague, Anca Dragan, actually, you know, if you formulate the problem game 00:32:03.840 |
theoretically, and you just let the system figure out the solution, you know, it does 00:32:08.320 |
interesting, unexpected things. Like sometimes at a stop sign, if no one is going first, 00:32:14.400 |
right, the car will actually back up a little, right? Just to indicate to the other cars that 00:32:21.200 |
they should go. And that's something it invented entirely by itself. 00:32:25.840 |
Right? There was, you know, we didn't say this is the language of communication at stop signs, 00:32:32.240 |
That's really interesting. So let me just step back for a second. Just this beautiful 00:32:38.960 |
philosophical notion. So Pamela McCorduck in 1979 wrote, "AI began with the ancient wish 00:32:46.880 |
to forge the gods." So when you think about the history of our civilization, 00:32:51.840 |
do you think that there is an inherent desire to create, let's not say gods, 00:33:00.320 |
but to create super intelligence? Is it inherent to us? Is it in our genes, that the natural arc of 00:33:07.280 |
human civilization is to create things that are of greater and greater power, and perhaps 00:33:14.960 |
echoes of ourselves? So to create the gods, as Pamela said? 00:33:25.200 |
It may be. I mean, you know, we're all individuals. But certainly we see over and 00:33:32.720 |
over again in history, individuals who thought about this possibility. 00:33:38.720 |
Hopefully, I'm not being too philosophical here. But if you look at the arc of this, 00:33:45.600 |
you know, where this is going, and we'll talk about AI safety, we'll talk about greater and 00:33:49.520 |
greater intelligence. Do you see that there, when you created the Othello program, and you felt this 00:33:57.280 |
excitement, what was that excitement? Was it excitement of a tinkerer who created something 00:34:02.640 |
cool, like a clock? Or was there a magic, or was it more like a child being born? 00:34:10.480 |
Yeah, so I mean, I certainly understand that viewpoint. And if you look at 00:34:17.120 |
the Lighthill report, which was... So in the 70s, there was a lot of controversy in the UK about 00:34:24.400 |
AI, and you know, whether it was for real, and how much money the government should invest. And 00:34:30.480 |
so it was a long story. But the government commissioned a report by 00:34:36.320 |
Lighthill, who was a physicist, and he wrote a very damning report about AI, which I think was 00:34:46.160 |
the point. And he said that these are, you know, frustrated men who, unable to have children, 00:34:55.760 |
would like to, you know, create life as a kind of replacement. 00:35:13.120 |
But there is, I mean, there is a kind of magic, I would say, when you build something, 00:35:21.600 |
and what you're building in is really just, you're building in some understanding of the principles 00:35:30.400 |
of learning and decision making. And to see those principles actually then turn into 00:35:38.800 |
intelligent behavior in specific situations, it's an incredible thing. And, you know, that 00:35:50.240 |
is naturally going to make you think, okay, where does this end? 00:35:58.480 |
And so there's magical, optimistic views of where it ends. Whatever your view of optimism is, 00:36:08.320 |
whatever your view of utopia is, is probably different for everybody. But you've often talked 00:36:13.360 |
about concerns you have of how things may go wrong. So I've talked to Max Tegmark, 00:36:24.160 |
there's a lot of interesting ways to think about AI safety. You're one of the 00:36:30.640 |
seminal people thinking about this problem amongst sort of being in the weeds of actually 00:36:36.880 |
solving specific AI problems. You're also thinking about the big picture of where we're going. 00:36:42.160 |
So can you talk about several elements of it? Let's just talk about maybe the control problem. 00:36:47.840 |
So this idea of losing ability to control the behavior in our AI system. So how do you see that? 00:36:58.240 |
How do you see that coming about? What do you think we can do to manage it? 00:37:05.520 |
Well, so it doesn't take a genius to realize that if you make something that's smarter than you, 00:37:11.520 |
you might have a problem. You know, Alan Turing wrote about this and gave lectures about this, 00:37:20.480 |
you know, in I think 1951. He did a lecture on the radio. And he basically says, you know, 00:37:30.160 |
once the machine thinking method starts, you know, very quickly they'll outstrip humanity. 00:37:37.280 |
And, you know, if we're lucky, we might be able to, I think he says, 00:37:42.400 |
turn off the power at strategic moments, but even so, our species 00:37:49.040 |
would be humbled. Yeah, and actually, he was wrong about that, right? Because you know, 00:37:54.560 |
if it's a sufficiently intelligent machine, it's not going to let you switch it off. 00:38:00.240 |
So what do you think is meant just for a quick tangent? If we shut off this 00:38:05.840 |
super intelligent machine that our species would be humbled. 00:38:10.880 |
I think he means that we would realize that we are inferior, right? That we only survive by the skin 00:38:20.560 |
of our teeth because we happen to get to the off switch, you know, just in time, you know, and if 00:38:26.800 |
we hadn't, then we would have lost control over the earth. 00:38:30.240 |
So do you, are you more worried when you think about this stuff about super intelligent AI, 00:38:36.800 |
or are you more worried about super powerful AI that's not aligned with our values? 00:38:47.120 |
I think the main problem I'm working on is the control problem, the problem of 00:38:55.280 |
machines pursuing objectives that are, as you say, not aligned with human objectives. And, 00:39:01.520 |
and this has been, this has been the way we've thought about AI since the beginning. 00:39:07.520 |
You build a machine for optimizing, and then you put in some objective and it optimizes, 00:39:16.080 |
right. And, you know, we can think of this as the King Midas problem, right? 00:39:23.920 |
Because, you know, King Midas put in this objective, right, everything I touch should turn 00:39:29.440 |
to gold, and the gods, you know, that's like the machine, they said, okay, done. You know, 00:39:34.720 |
you now have this power, and of course his food and his drink and his family all turned to gold, 00:39:40.080 |
and then he dies in misery and starvation. And this is, you know, a warning. It's 00:39:48.240 |
a failure mode that pretty much every culture in history has had some story along 00:39:54.560 |
the same lines. You know, there's the genie that gives you three wishes. And, you know, the 00:39:58.640 |
third wish is always, you know, please undo the first two wishes because I messed up. 00:40:02.800 |
And, you know, when Arthur Samuel wrote his checker-playing program, which learned 00:40:11.520 |
to play checkers considerably better than Arthur Samuel could play and actually reached a pretty 00:40:16.400 |
decent standard. Norbert Wiener, who was one of the major mathematicians of the 20th century, 00:40:24.720 |
sort of the father of modern automation control systems. You know, he saw this and he basically 00:40:31.680 |
extrapolated, you know, as Turing did and said, okay, this is how we could lose control. 00:40:39.840 |
And specifically that we have to be certain that the purpose we put into the machine is 00:40:49.600 |
the purpose which we really desire. And the problem is we can't do that. 00:40:56.320 |
Right. You mean it's very difficult to encode, to put our values on 00:41:02.160 |
paper, or are you just saying it's impossible? 00:41:05.920 |
Theoretically it's possible, but in practice, it's extremely unlikely 00:41:17.120 |
that we could specify correctly in advance the full range of concerns of humanity. 00:41:24.400 |
Yeah. You talked about cultural transmission of values. I think that's how it works, human to human. 00:41:31.840 |
Well, we learn, yeah. I mean, as we grow up, we learn about the values that matter, 00:41:38.400 |
how things should go, what is reasonable to pursue and what isn't reasonable to pursue. 00:41:44.080 |
I think machines can learn in the same kind of way. 00:41:46.640 |
Yeah. So I think that what we need to do is to get away from this idea that you build 00:41:53.040 |
an optimizing machine and then you put the objective into it. Because if it's possible 00:42:00.800 |
that you might put in a wrong objective, and we already know this is possible because it's 00:42:04.480 |
happened lots of times, right? That means that the machine should never take an objective 00:42:12.000 |
that's given as gospel truth. Because once it takes the mission, the objective is gospel truth, 00:42:18.960 |
right? Then it believes that whatever actions it's taking in pursuit of that objective are 00:42:27.040 |
the correct things to do. So you could be jumping up and down and saying, you know, 00:42:30.720 |
no, no, no, you're going to destroy the world. But the machine knows what the true objective 00:42:35.600 |
is and is pursuing it and tough luck to you. And this is not restricted to AI, right? This is, 00:42:42.080 |
you know, I think many of the 20th century technologies, right? So in statistics, 00:42:46.800 |
you minimize a loss function; the loss function is exogenously specified. In control 00:42:51.840 |
theory, you minimize a cost function. In operations research, you maximize a reward function, 00:42:57.920 |
and so on. So in all these disciplines, this is how we conceive of the problem. And it's the wrong 00:43:04.400 |
problem. Because we cannot specify with certainty the correct objective, right? We need 00:43:12.560 |
uncertainty, we need the machine to be uncertain about what it is that it's supposed to be 00:43:18.720 |
maximizing. - A favorite idea of yours, I've heard you say somewhere, well, I shouldn't pick favorites, 00:43:25.200 |
but it just sounds beautiful, that we need to teach machines humility. 00:43:29.040 |
- Yeah, I mean, - It's a beautiful way to put it. I love it. 00:43:33.680 |
- That they're humble, in that they know that they don't know what it is they're supposed to be 00:43:40.560 |
doing. And that those objectives, I mean, they exist, they're within us, but we may not be able 00:43:48.560 |
to explicate them. We may not even know how we want our future to go. - Exactly. 00:43:58.160 |
- And the machine, a machine that's uncertain is going to be deferential to us. So if we say, 00:44:06.800 |
don't do that, well, now the machines learn something a bit more about our true objectives, 00:44:11.840 |
because something that it thought was reasonable in pursuit of our objective, it turns out not to 00:44:17.200 |
be, so now it's learned something. So it's going to defer because it wants to be doing what we 00:44:22.480 |
really want. And that point I think is absolutely central to solving the control problem. And it's 00:44:32.800 |
a different kind of AI when you take away this idea that the objective is known, then, in fact, 00:44:42.160 |
a lot of the theoretical frameworks that we're so familiar with, you know, Markov decision processes, 00:44:49.840 |
goal-based planning, you know, standard games research, all of these techniques actually become 00:44:58.560 |
inapplicable. And you get a more complicated problem because now the interaction with the 00:45:09.920 |
human becomes part of the problem. Because the human, by making choices, is giving you more 00:45:19.520 |
information about the true objective, and that information helps you achieve the objective 00:45:25.040 |
better. And so that really means that you're mostly dealing with game theoretic problems, 00:45:31.520 |
where you've got the machine and the human and they're coupled together, 00:45:35.840 |
rather than a machine going off by itself with a fixed objective. 00:45:39.040 |
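A minimal sketch of that idea, assuming a tiny discrete set of candidate objectives: the machine keeps a belief over which objective the human actually has, defers when its best-looking action might be harmful under the true objective, and updates its belief when the human objects. This only illustrates the principle; it is not the assistance-game formulation Russell and his colleagues actually use, and all objectives, actions, payoffs, and thresholds are invented.

```python
# Toy "humble" agent: uncertain which candidate objective is the human's true one,
# it learns from corrections and defers rather than risk acting against their wishes.
candidate_objectives = {
    "wants_paperclips": {"make_paperclips": 1.0, "conserve_resources": 0.0},
    "wants_resources_preserved": {"make_paperclips": -1.0, "conserve_resources": 1.0},
}
belief = {"wants_paperclips": 0.7, "wants_resources_preserved": 0.3}

def expected_value(action):
    return sum(p * candidate_objectives[obj][action] for obj, p in belief.items())

def choose(actions, risk_tolerance=0.05):
    best = max(actions, key=expected_value)
    # Probability that the best-looking action is actually harmful under the true objective.
    p_harm = sum(p for obj, p in belief.items() if candidate_objectives[obj][best] < 0)
    return best if p_harm < risk_tolerance else "ask_the_human"

def human_objects_to(action):
    """Objectives that favored the vetoed action lose most of their probability."""
    global belief
    weights = {obj: p * (0.1 if candidate_objectives[obj][action] > 0 else 1.0)
               for obj, p in belief.items()}
    total = sum(weights.values())
    belief = {obj: w / total for obj, w in weights.items()}

actions = ["make_paperclips", "conserve_resources"]
print(choose(actions))            # "ask_the_human": making paperclips looks best but might be harmful
human_objects_to("make_paperclips")
print(choose(actions))            # "conserve_resources": the correction shifted its belief
```

The point of the sketch is the last two lines: a correction from the human is information about the objective, so the machine's behavior changes without anyone reprogramming it.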
Which is fascinating on the machine and the human level, that when you don't have an objective, it 00:45:47.360 |
means you're together coming up with an objective. I mean, there's a lot of philosophy that, you 00:45:53.440 |
know, you could argue that life doesn't really have meaning. We together agree on what gives 00:45:59.120 |
it meaning, and we kind of culturally create things that give meaning to why the heck we are on this 00:46:05.600 |
earth anyway. We together as a society create that meaning, and you have to learn that objective. 00:46:10.800 |
And one of the biggest, I thought that's where you were going to go for a second, 00:46:14.880 |
one of the biggest troubles we run into outside of statistics and machine learning and AI, 00:46:21.200 |
in just human civilization, is when you look at, I came from, I was born in the Soviet Union, 00:46:28.080 |
and the history of the 20th century, we ran into the most trouble, us humans, when there was a 00:46:34.240 |
certainty about the objective. And you do whatever it takes to achieve that objective, 00:46:40.800 |
whether you're talking about Germany or communist Russia. You get into trouble with humans. 00:46:46.720 |
I would say with, you know, corporations, in fact, some people argue that, you know, 00:46:52.400 |
we don't have to look forward to a time when AI systems take over the world. They already have, 00:46:57.200 |
and they're called corporations, right? Corporations happen to be using people as 00:47:03.760 |
components right now, but they are effectively algorithmic machines, and they're optimizing 00:47:10.160 |
an objective, which is quarterly profit that isn't aligned with overall well-being of the human race, 00:47:17.520 |
and they are destroying the world. They are primarily responsible for our inability to tackle 00:47:23.440 |
climate change. So I think that's one way of thinking about what's going on with corporations. 00:47:30.240 |
But I think the point you're making is valid, that there are many systems in the real world 00:47:39.360 |
where we've sort of prematurely fixed on the objective and then decoupled the machine from 00:47:47.920 |
those that it's supposed to be serving. And I think you see this with government, right? Government is 00:47:54.400 |
supposed to be a machine that serves people, but instead it tends to be taken over by people who 00:48:02.320 |
have their own objective and use government to optimize that objective regardless of what people 00:48:07.840 |
want. - Do you find appealing the idea of almost arguing machines, where you have multiple AI 00:48:15.440 |
systems with a clear fixed objective? We have in government the red team and the blue team, 00:48:20.960 |
they're very fixed on their objectives, and they argue and they kind of, 00:48:24.560 |
I may disagree, but it kind of seems to make it work somewhat, that the duality of it. 00:48:34.720 |
Okay, let's go 100 years back when that was still going on, or to the founding of this country, 00:48:34.720 |
there were disagreements, and that disagreement is where, so there's a balance between certainty 00:48:41.680 |
and forced humility because the power was distributed. - Yeah, I think that the 00:48:55.360 |
nature of debate, disagreement, argument, takes as a premise the idea that you could be wrong, 00:49:07.200 |
right? Which means that you're not necessarily absolutely convinced that your objective is 00:49:13.920 |
the correct one, right? If you were absolutely convinced, there'd be no point in having any 00:49:20.000 |
discussion or argument because you would never change your mind, and there wouldn't be any 00:49:24.480 |
sort of synthesis or anything like that. So I think you can think of argumentation as an 00:49:32.000 |
implementation of a form of uncertain reasoning. And I've been reading recently about 00:49:42.800 |
utilitarianism and the history of efforts to define, in a sort of clear mathematical way, 00:49:50.640 |
if you like, a formula for moral or political decision making. And it's really interesting 00:50:00.240 |
that the parallels between the philosophical discussions going back 200 years and what you 00:50:06.640 |
see now in discussions about existential risk, because it's almost exactly the same. So someone 00:50:14.320 |
would say, "Okay, well here's a formula for how we should make decisions." So utilitarianism is 00:50:19.040 |
roughly, each person has a utility function and then we make decisions to maximize the sum of 00:50:26.560 |
everybody's utility. And then people point out, "Well, in that case, the best policy is one that 00:50:35.360 |
leads to the enormously vast population, all of whom are living a life that's barely worth living." 00:50:42.480 |
And this is called the repugnant conclusion. And another version is that we should maximize 00:50:51.200 |
pleasure. And that's what we mean by utility. And then you'll get people effectively saying, 00:50:57.680 |
"Well, in that case, we might as well just have everyone hooked up to a heroin drip." 00:51:01.600 |
And they didn't use those words, but that debate was happening in the 19th century, 00:51:07.840 |
as it is now about AI. That if we get the formula wrong, we're going to have AI systems 00:51:16.320 |
working towards an outcome that in retrospect would be exactly wrong. 00:51:21.920 |
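To make the arithmetic concrete, with entirely made-up numbers: if the objective is simply the sum of individual utilities, a vast population of lives barely worth living can outscore a smaller, far happier one, which is the repugnant conclusion just described.

```python
# Toy version of the sum-of-utilities objective, with invented numbers.
def total_utility(population_size, utility_per_person):
    return population_size * utility_per_person

flourishing_world = total_utility(10_000_000_000, 8.0)     # 10 billion people with very good lives
repugnant_world = total_utility(10_000_000_000_000, 0.01)  # 10 trillion lives barely worth living

print(flourishing_world)  # 8e10
print(repugnant_world)    # 1e11 -- the "barely worth living" world wins under a pure sum
```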
- Do you think, as beautifully put, so the echoes are there, but do you think, 00:51:26.960 |
I mean, if you look at Sam Harris, our imagination worries about the AI version of that, because of 00:51:35.040 |
the speed at which the things going wrong in the utilitarian context could happen? 00:51:47.280 |
- Yeah, I think that in most cases, not in all, but if we have a wrong political idea, 00:51:55.360 |
we see it starting to go wrong and we're not completely stupid. And so we sort of, "Okay, 00:52:01.200 |
maybe that was a mistake. Let's try something different." 00:52:05.840 |
And also we're very slow and inefficient about implementing these things and so on. So you have 00:52:12.640 |
to worry when you have corporations or political systems that are extremely efficient. But when we 00:52:19.680 |
look at AI systems, or even just computers in general, they have this different characteristic 00:52:28.320 |
from ordinary human activity in the past. So let's say you were a surgeon, you had some idea about 00:52:35.200 |
how to do some operation. And let's say you were wrong, that that way of doing the operation 00:52:41.600 |
would mostly kill the patient. Well, you'd find out pretty quickly, like after three, 00:52:48.000 |
maybe three or four tries, right? But that isn't true for pharmaceutical companies, 00:52:56.400 |
because they don't do three or four operations. They manufacture three or four billion pills, 00:53:03.840 |
and they sell them. And then they find out maybe six months or a year later that, 00:53:08.080 |
"Oh, people are dying of heart attacks or getting cancer from this drug." 00:53:12.080 |
And so that's why we have the FDA, right? Because of the scalability of pharmaceutical production. 00:53:19.280 |
And there have been some unbelievably bad episodes in the history of pharmaceuticals and 00:53:30.400 |
adulteration of products and so on that have killed tens of thousands or paralyzed hundreds 00:53:36.640 |
of thousands of people. Now, with computers, we have that same scalability problem, that you can 00:53:43.760 |
sit there and type "for i equals one to five billion do," right? And all of a sudden you're 00:53:49.760 |
having an impact on a global scale. And yet we have no FDA, right? There's absolutely no controls 00:53:56.160 |
at all over what a bunch of undergraduates with too much caffeine can do to the world. 00:54:03.440 |
And we look at what happened with Facebook, well, social media in general, and click-through 00:54:10.160 |
optimization. So you have a simple feedback algorithm that's trying to just optimize 00:54:18.480 |
click-through, right? That sounds reasonable, right? Because you don't want to be feeding 00:54:23.840 |
people ads that they don't care about or aren't interested in. And you might even think of 00:54:31.920 |
that process as simply adjusting the feeding of ads or news articles or whatever it might be 00:54:40.560 |
to match people's preferences, right? Which sounds like a good idea. 00:54:45.440 |
But in fact, that isn't how the algorithm works, right? You make more money, the algorithm makes 00:54:54.880 |
more money if it can better predict what people are going to click on, because then it can feed 00:55:02.080 |
them exactly that, right? So the way to maximize click-through is actually to modify the people, 00:55:09.760 |
to make them more predictable. And one way to do that is to feed them information which will change 00:55:18.560 |
their behavior and preferences towards extremes that make them predictable. Whatever is the 00:55:25.680 |
nearest extreme or the nearest predictable point, that's where you're going to end up. 00:55:30.640 |
And the machines will force you there. Now, and I think there's a reasonable argument to say 00:55:38.080 |
that this, among other things, is contributing to the destruction of democracy in the world. 00:55:47.280 |
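A toy comparison of two feeding policies, with entirely invented dynamics, to illustrate the incentive being described: one policy always serves exactly what the user currently likes, the other serves slightly more extreme content that gradually shifts the user's preference. Under the stated assumption that users near an extreme click more reliably (they are easier to predict), the shifting policy collects more clicks overall while dragging the user toward the extreme.

```python
# Toy model of the click-maximization incentive; all dynamics and numbers are invented.
def click_prob(item, pref):
    relevance = max(0.0, 1.0 - abs(item - pref))   # people click things near their preference
    predictability = 0.5 + 0.5 * abs(pref)         # assumed: extreme users click more reliably
    return relevance * predictability

def simulate(policy, steps=50, pref=0.1, drift=0.1):
    total_clicks = 0.0
    for _ in range(steps):
        item = policy(pref)
        total_clicks += click_prob(item, pref)
        pref += drift * (item - pref)              # consuming content shifts the preference
    return round(total_clicks, 1), round(pref, 2)

def match_policy(pref):                            # serve exactly what the user likes now
    return pref

def nudge_policy(pref):                            # serve slightly more extreme content
    return min(1.0, pref + 0.4)

print(simulate(match_policy))  # fewer total clicks; preference stays near 0.1
print(simulate(nudge_policy))  # more total clicks; preference dragged out toward 1.0
```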
And where was the oversight of this process? Where were the people saying, "Okay, you would 00:55:53.680 |
like to apply this algorithm to five billion people on the face of the earth. Can you show 00:55:59.760 |
me that it's safe? Can you show me that it won't have various kinds of negative effects?" No, 00:56:05.200 |
there was no one asking that question. There was no one placed between, you know, the undergrads 00:56:12.400 |
with too much caffeine and the human race. They just did it. - And, some way outside the scope 00:56:19.920 |
of my knowledge, economists would argue that, what is it, the invisible hand, 00:56:24.400 |
so the capitalist system, it was the oversight. So if you're going to corrupt society with whatever 00:56:31.040 |
decision you make as a company, then that's going to be reflected in people not using your product. 00:56:37.040 |
That's one model of oversight. - We shall see, but in the meantime, you know, you might even 00:56:43.760 |
have broken the political system that enables capitalism to function. - Well, you've changed it. 00:56:51.440 |
- We shall see. - Change is often painful. So my question is, and it's absolutely fascinating: 00:57:00.000 |
You're absolutely right that there was zero oversight on algorithms that can have a profound 00:57:06.320 |
civilization-changing effect. So do you think it's possible, I mean, I haven't, 00:57:13.040 |
have you seen government? So do you think it's possible to create regulatory bodies, 00:57:19.840 |
oversight over AI algorithms, which are inherently such a cutting-edge set of ideas and technologies? 00:57:30.960 |
- Yeah, but I think it takes time to figure out what kind of oversight, what kinds of controls. 00:57:37.520 |
I mean, it took time to design the FDA regime, you know, and some people still don't like it, 00:57:42.560 |
and they want to fix it. And I think there are clear ways that it could be improved. 00:57:47.920 |
But the whole notion that you have stage one, stage two, stage three, and here are the criteria for 00:57:54.480 |
what you have to do to pass a stage one trial, right? - Yes. - We haven't even thought about 00:58:00.320 |
what those would be for algorithms. So, I mean, I think there are things we could do right now 00:58:06.880 |
with regard to bias, for example. We have a pretty good technical handle on 00:58:13.760 |
how to detect algorithms that are propagating bias that exists in datasets, how to de-bias those 00:58:22.800 |
algorithms, and even what it's going to cost you to do that. So I think we could start having some 00:58:30.640 |
standards on that. I think there are things to do with impersonation and falsification 00:58:38.080 |
that we could work on. - Like fakes, yeah. - A very simple point. So impersonation is a machine 00:58:48.800 |
acting as if it was a person. I can't see a real justification 00:58:53.840 |
for why we shouldn't insist that machines self-identify as machines. 00:58:59.280 |
You know, where is the social benefit in fooling people into thinking that this is really a person 00:59:08.480 |
when it isn't? You know, I don't mind if it uses a human-like voice that's easy to understand, 00:59:14.560 |
that's fine, but it should just say, "I'm a machine" in some form. - And not many people 00:59:21.120 |
are speaking to that, I would think, relatively obvious fact. So I think most people... - Yeah, 00:59:26.160 |
I mean, there is actually a law in California that bans impersonation, but only in certain 00:59:32.320 |
restricted circumstances. So for the purpose of engaging in a fraudulent transaction, 00:59:40.800 |
and for the purpose of modifying someone's voting behavior. So those are the circumstances where 00:59:48.160 |
machines have to self-identify. But I think, you know, arguably it should be in all circumstances. 00:59:55.440 |
And then when you talk about deepfakes, you know, we're just at the beginning, 01:00:01.520 |
but already it's possible to make a movie of anybody saying anything in ways that are pretty 01:00:09.680 |
hard to detect. - Including yourself, because you're on camera now, and your voice is coming 01:00:14.480 |
through with high resolution. - Yeah, so you could take what I'm saying and replace it with 01:00:18.720 |
pretty much anything else you wanted me to be saying. And even it would change my lips and 01:00:23.120 |
facial expressions to fit. And there's actually not much in the way of real legal protection 01:00:34.640 |
against that. I think in the commercial area you could say, "Yeah, you're using my brand," 01:00:41.120 |
and so on. There are rules about that. But in the political sphere, I think it's... 01:00:46.160 |
At the moment, it's, you know, anything goes. So that could be really, really damaging. 01:00:52.480 |
- And let me just try to make, not an argument, but try to look back at history 01:01:00.720 |
and say something dark. In essence, while regulation, while oversight, seems to be 01:01:09.040 |
exactly the right thing to do here, it seems that human beings, what they naturally do is they wait 01:01:14.480 |
for something to go wrong. If you're talking about nuclear weapons, you can't talk about 01:01:20.080 |
nuclear weapons being dangerous until somebody actually, like the United States, drops the bomb. 01:01:25.920 |
Or Chernobyl melting. Do you think we will have to wait for things going wrong in a way that's 01:01:34.960 |
obviously damaging to society? Not an existential risk, but obviously damaging. 01:01:39.840 |
Or do you have faith that... - I hope not. But I mean, I think we do have to look at history. 01:01:48.000 |
And, you know, so the two examples you gave, nuclear weapons and nuclear power, 01:01:55.840 |
are very, very interesting because, you know, in nuclear weapons, we knew in the early years of the 01:02:04.800 |
20th century that atoms contained a huge amount of energy, right? We had E equals mc squared, 01:02:09.920 |
we knew the mass differences between the different atoms and their components, and we knew that 01:02:15.360 |
you might be able to make an incredibly powerful explosive. So H.G. Wells wrote a science fiction 01:02:23.440 |
book, I think in 1912. Frederick Soddy, who was the guy who discovered isotopes, 01:02:30.320 |
is a Nobel Prize winner. He gave a speech in 1915 saying that, you know, one pound of this 01:02:38.720 |
new explosive would be the equivalent of 150 tons of dynamite, which turns out to be about right. 01:02:44.160 |
And, you know, this was in World War I, right? So he was imagining how much worse the world 01:02:52.480 |
war would be if we were using that kind of explosive. But the physics establishment simply 01:02:58.000 |
refused to believe that these things could be made. - Including the people who were making it. 01:03:05.760 |
- Well, so they were doing the nuclear physics... - I mean, eventually they were the ones who made it. 01:03:11.200 |
You're talking about Fermi or whoever. - Well, so up to... The development 01:03:19.760 |
was mostly theoretical. So it was people using sort of primitive kinds of particle acceleration 01:03:24.240 |
and doing experiments at the level of single particles or collections of particles. They 01:03:31.040 |
weren't yet thinking about how to actually make a bomb or anything like that. But they knew the 01:03:38.640 |
energy was there, and they figured if they understood it better, it might be possible. 01:03:42.960 |
But the physics establishment, their view, and I think because they did not want it to be true, 01:03:49.520 |
their view was that it could not be true. That this could not provide a way to make a super weapon. 01:03:57.040 |
And, you know, there was this famous speech given by Rutherford, who was the sort of leader of 01:04:04.880 |
nuclear physics. And it was on September 11th, 1933. And he said, you know, anyone who talks 01:04:12.800 |
about the possibility of obtaining energy from transformation of atoms is talking complete 01:04:18.720 |
moonshine. And the next morning, Leo Szilard read about that speech and then invented the 01:04:27.360 |
nuclear chain reaction. And so as soon as he invented, as soon as he had that idea that you 01:04:34.320 |
could make a chain reaction with neutrons, because neutrons were not repelled by the nucleus, so they 01:04:39.280 |
could enter the nucleus and then continue the reaction. As soon as he has that idea, he instantly 01:04:46.400 |
realized that the world was in deep doo-doo. Because this is 1933, right? Hitler had recently 01:04:56.400 |
come to power in Germany. Szilard was in London, and eventually became a refugee, 01:05:02.720 |
and came to the US. And in the process of having the idea about the chain reaction, 01:05:12.480 |
he figured out basically how to make a bomb and also how to make a reactor. 01:05:16.720 |
And he patented the reactor in 1934. But because of the situation, the great power conflict 01:05:26.560 |
situation that he could see happening, he kept that a secret. And so between then 01:05:35.680 |
and the beginning of World War II, people were working, including the Germans, on 01:05:42.320 |
how to actually create neutron sources, right? What specific fission reactions would produce 01:05:51.840 |
neutrons of the right energy to continue the reaction. And that was demonstrated in Germany, 01:05:59.200 |
I think, in 1938, if I remember correctly. The first nuclear weapon patent was 1939, 01:06:06.640 |
by the French. So this was actually going on well before World War II really got going. 01:06:18.720 |
And then the British probably had the most advanced capability in this area, but for safety 01:06:24.880 |
reasons, among others, and just sort of just resources, they moved the program from Britain 01:06:30.800 |
to the US, and then that became the Manhattan Project. So the reason why we couldn't 01:06:37.920 |
have any kind of oversight of nuclear weapons and nuclear technology 01:06:44.960 |
was because we were basically already in an arms race and a war. 01:06:52.320 |
- But you mentioned then in the 20s and 30s, so what are the echoes? 01:06:57.520 |
The way you've described this story, I mean, there's clearly echoes. Why do you think most 01:07:03.920 |
AI researchers, folks who are really close to the metal, they really are not concerned about it, 01:07:11.440 |
they don't think about it, or whether it's that they don't want to think about it. But what are the, 01:07:16.880 |
yeah, why do you think that is? What are the echoes of the nuclear situation to the current 01:07:23.440 |
AI situation? And what can we do about it? - I think there is a kind of motivated cognition, 01:07:32.720 |
which is a term in psychology that means you believe what you would like to be true, 01:07:39.200 |
rather than what is true. And it's unsettling to think that what you're working on might 01:07:49.120 |
be the end of the human race, obviously. So you would rather instantly deny it, 01:07:55.520 |
come up with some reason why it couldn't be true. And I have collected a long list of reasons that 01:08:04.320 |
extremely intelligent, competent AI scientists have come up with for why we shouldn't worry 01:08:10.640 |
about this. For example, calculators are superhuman at arithmetic and they haven't 01:08:17.600 |
taken over the world, so there's nothing to worry about. Well, okay, my five-year-old could have 01:08:23.440 |
figured out why that was an unreasonable and really quite weak argument. Another one was 01:08:33.280 |
that while it's theoretically possible that you could have superhuman AI destroy the world, 01:08:42.720 |
it's also theoretically possible that a black hole could materialize right next to the Earth and 01:08:48.080 |
destroy humanity. I mean, yes, it's theoretically possible, quantum theoretically extremely unlikely 01:08:54.720 |
that it would just materialize right there. But that's a completely bogus analogy, because 01:09:01.680 |
if the whole physics community on Earth was working to materialize a black hole in near-Earth orbit, 01:09:07.360 |
wouldn't you ask them, "Is that a good idea? Is that going to be safe? What if you succeed?" 01:09:13.760 |
And that's the thing. The AI community has sort of refused to ask itself, "What if you succeed?" 01:09:22.080 |
And initially I think that was because it was too hard, but Alan Turing asked himself that, 01:09:30.560 |
and he said, "We'd be toast." Right? If we were lucky, we might be able to switch off the power, 01:09:36.800 |
but probably we'd be toast. But there's also an aspect that because we're not exactly 01:09:44.240 |
sure what the future holds, it's not clear exactly, so technically, what to worry about, 01:09:50.640 |
sort of how things go wrong. And so there is something, it feels like, maybe you can correct 01:09:59.040 |
me if I'm wrong, but there's something paralyzing about worrying about something that logically is 01:10:06.000 |
inevitable, but you don't really know what that will look like. - Yeah, I think that's a reasonable 01:10:13.360 |
point. And you know, it's certainly in terms of existential risks, it's different from, you know, 01:10:21.120 |
the asteroid collides with the Earth, right? Which again is quite possible. You know, it's happened 01:10:27.120 |
in the past, it'll probably happen again, we don't know right now, but if we did detect an asteroid 01:10:33.760 |
that was going to hit the Earth in 75 years time, we'd certainly be doing something about it. 01:10:39.600 |
- Well, it's clear there's a big rock, and we'll probably have a meeting and see what do we do 01:10:44.000 |
about the big rock with AI. - Right, with AI, I mean, there are very few people who think it's 01:10:49.120 |
not gonna happen within the next 75 years. I know Rod Brooks doesn't think it's gonna happen, 01:10:55.200 |
maybe Andrew Ng doesn't think it's gonna happen, but you know, a lot of the people who work day-to-day, 01:11:00.880 |
you know, as you say, at the rock face, they think it's gonna happen. I think the median 01:11:07.280 |
estimate from AI researchers is somewhere in 40 to 50 years from now, or maybe even a little, 01:11:14.080 |
you know, I think in Asia they think it's gonna be even faster than that. 01:11:17.200 |
I am a little bit more conservative, I think it'd probably take longer than that. But I think, 01:11:25.120 |
you know, as happened with nuclear weapons, - It can happen overnight. 01:11:30.080 |
- It can happen overnight that you have these breakthroughs, and we need more than one 01:11:33.360 |
breakthrough, but you know, it's on the order of half a dozen, I mean, this is a very rough scale, 01:11:40.000 |
but sort of half a dozen breakthroughs of that nature would have to happen for us to reach 01:11:48.640 |
superhuman AI. But the AI research community is vast now, the massive investments from governments, 01:11:57.280 |
from corporations, tons of really, really smart people, you know, you just have to look at the 01:12:03.360 |
rate of progress in different areas of AI to see that things are moving pretty fast. So to say, 01:12:09.200 |
"Oh, it's just gonna be thousands of years," I don't see any basis for that. You know, I see, 01:12:15.920 |
you know, for example, the Stanford 100-Year AI Project, right, which is supposed to be sort of, 01:12:26.400 |
you know, the serious establishment view, their most recent report actually said it's probably not even possible. 01:12:26.400 |
- Right, which if you want a perfect example of people in denial, that's it. Because, you know, 01:12:42.960 |
for the whole history of AI, we've been saying to philosophers who said it wasn't possible, "Well, 01:12:49.760 |
you have no idea what you're talking about. Of course it's possible, right? Give me an argument 01:12:54.000 |
for why it couldn't happen." And there isn't one, right? And now, because people are worried that 01:13:00.480 |
maybe AI might get a bad name, or I just don't want to think about this, they're saying, "Okay, 01:13:06.080 |
well, of course it's not really possible." You know, and we imagine, right? Imagine if, you know, 01:13:10.160 |
the leaders of the cancer biology community got up and said, "Well, you know, of course, 01:13:16.560 |
curing cancer, it's not really possible." There'd be complete outrage and dismay, and, 01:13:23.600 |
you know, I find this really a strange phenomenon. So, okay, so if you accept that it's possible, 01:13:35.680 |
and if you accept that it's probably going to happen, the point that you're making that, 01:13:42.400 |
you know, how does it go wrong, is a valid question. Without an answer to that question, 01:13:50.240 |
then you're stuck with what I call the gorilla problem, which is, you know, the problem that the 01:13:54.480 |
gorillas face, right? They made something more intelligent than them, namely us, a few million 01:14:00.480 |
years ago, and now they're in deep doo-doo. So there's really nothing they can do. They've lost 01:14:07.680 |
the control. They failed to solve the control problem of controlling humans, and so they've 01:14:13.760 |
lost. So we don't want to be in that situation, and if the gorilla problem is the only formulation 01:14:20.240 |
you have, there's not a lot you can do, right? Other than to say, "Okay, we should try to stop. 01:14:26.560 |
You know, we should just not make the humans, or in this case, not make the AI." And I think 01:14:31.760 |
that's really hard to do. I'm not actually proposing that that's a feasible course of action. 01:14:40.640 |
And I also think that, you know, if properly controlled, AI could be incredibly beneficial. 01:14:46.080 |
But it seems to me that there's a consensus that one of the major failure modes 01:14:55.920 |
is this loss of control. That we create AI systems that are pursuing incorrect objectives, 01:15:02.880 |
and because the AI system believes it knows what the objective is, it has no incentive 01:15:11.040 |
to listen to us anymore, so to speak, right? It's just carrying out the strategy that it 01:15:19.360 |
has computed as being the optimal solution. And, you know, it may be that in the process, 01:15:26.880 |
it needs to acquire more resources to increase the possibility of success, or prevent various 01:15:34.640 |
failure modes by defending itself against interference. And so that collection of 01:15:40.880 |
problems, I think, is something we can address. The other problems are, roughly speaking, 01:15:51.120 |
misuse. So even if we solve the control problem, we make perfectly safe, controllable AI systems, 01:15:58.720 |
well why? Why is Dr. Evil going to use those? He wants to just take over the world, and he'll 01:16:04.320 |
make unsafe AI systems that then get out of control. So that's one problem which is sort of, 01:16:10.960 |
you know, partly a policing problem, partly a sort of a cultural problem for the profession 01:16:19.600 |
of how we teach people what kinds of AI systems are safe. 01:16:23.440 |
- You talk about autonomous weapon systems and how pretty much everybody agrees that there's 01:16:28.240 |
too many ways that that can go horribly wrong. You have this great Slaughterbots movie that kind of 01:16:33.920 |
illustrates that beautifully. - Well, I want to talk about that. 01:16:38.000 |
That's another topic I'm happy to talk about. I just want to mention that what I see is the 01:16:42.400 |
third major failure mode, which is overuse, not so much misuse, but overuse of AI. 01:16:48.080 |
That we become overly dependent. So I call this the WALL-E problem. So if you've seen WALL-E, 01:16:55.280 |
the movie, all right, all the humans are on the spaceship, and the machines look after everything 01:17:00.800 |
for them, and they just watch TV and drink Big Gulps. And they're all sort of obese and stupid, 01:17:07.680 |
and they sort of totally lost any notion of human autonomy. And, you know, so in effect, right, 01:17:17.040 |
this would happen like the slow boiling frog, right? We would gradually turn over more and more 01:17:24.960 |
of the management of our civilization to machines, as we are already doing. And this, you know, 01:17:30.000 |
if this process continues, you know, we sort of gradually switch from sort of being the masters 01:17:37.920 |
of technology to just being the guests, right? So we become guests on a cruise ship, you know, 01:17:44.240 |
which is fine for a week, but not for the rest of eternity. You know, and it's almost irreversible, 01:17:52.160 |
right? Once you lose the incentive to, for example, you know, learn to be an engineer or a 01:17:59.520 |
doctor or a sanitation operative or any other of the infinitely many ways that we maintain and 01:18:08.720 |
propagate our civilization, you know, if you don't have the incentive to do any of that, you won't. 01:18:18.400 |
- And of course, AI is just one of the technologies that could, in that third failure 01:18:22.240 |
mode, result in that. There's probably other technology, technology in general, that detaches us from... 01:18:27.360 |
- It does a bit, but the difference is that in terms of the knowledge to run our civilization, 01:18:35.520 |
you know, up to now we've had no alternative but to put it into people's heads. 01:18:41.520 |
- Or software with Google, I mean, so software in general, so AI broadly defined. 01:18:45.840 |
- Computers in general, but the, you know, the knowledge of how, you know, 01:18:51.440 |
how a sanitation system works, you know, the AI has to understand that. 01:18:55.760 |
It's no good putting it into Google. So, I mean, we've always put knowledge on paper, 01:19:00.800 |
but paper doesn't run our civilization. It only runs when it goes from the paper 01:19:05.600 |
into people's heads again. Right. So we've always propagated civilization through human minds. 01:19:12.000 |
And we've spent about a trillion person years doing that. Literally, right? You can work it out. 01:19:18.480 |
There's just over a hundred billion people who've ever lived. 01:19:18.480 |
And each of them has spent about 10 years learning stuff to keep their civilization going. And so 01:19:29.200 |
that's a trillion person years we put into this effort. 01:19:31.520 |
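A quick check of the figure Russell invites you to work out, using the round numbers he gives (just over a hundred billion people ever born, roughly ten years of learning each):

$$ \sim\!1.1\times10^{11}\ \text{people} \;\times\; \sim\!10\ \text{years of learning each} \;\approx\; 10^{12}\ \text{person-years}. $$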
- Beautiful way to describe all of civilization. 01:19:33.920 |
- And now we're, you know, we're in danger of throwing that away. So this is a problem that 01:19:38.480 |
AI can't solve. It's not a technical problem. It's, you know, and if we do our job right, 01:19:43.920 |
the AI systems will say, you know, the human race doesn't in the long run want to be passengers in 01:19:50.880 |
a cruise ship. The human race wants autonomy. This is part of human preferences. So we, the AI 01:19:58.160 |
systems are not going to do this stuff for you. You've got to do it for yourself. Right? I'm not 01:20:03.680 |
going to carry you to the top of Everest in an autonomous helicopter. You have to climb it if 01:20:08.640 |
you want to get the benefit and so on. So, but I'm afraid that because we are short-sighted and lazy, 01:20:18.160 |
we're going to override the AI systems. And there's an amazing short story that I recommend 01:20:25.280 |
to everyone that I talk to about this called "The Machine Stops" written in 1909 by E.M. Forster, 01:20:33.360 |
who, you know, wrote novels about the British empire and sort of things that became costume 01:20:39.200 |
dramas on the BBC. But he wrote this one science fiction story, which is an amazing 01:20:45.840 |
vision of the future. It has basically iPads. It has video conferencing. It has MOOCs. 01:20:52.640 |
It has computer-induced obesity. I mean, literally, it's what people spend their time doing 01:21:01.280 |
is giving online courses or listening to online courses and talking about ideas, but they never 01:21:06.960 |
get out there in the real world. They don't really have a lot of face-to-face contact. 01:21:12.320 |
Everything is done online, you know. So, all the things we're worrying about now 01:21:16.480 |
were described in this story. And then the human race becomes more and more dependent on the 01:21:22.080 |
machine, loses knowledge of how things really run, and then becomes vulnerable to collapse. 01:21:29.840 |
And so, it's a pretty unbelievably amazing story for someone writing in 1909 to imagine all this. 01:21:38.800 |
Plus... - So, there's very few people that represent artificial intelligence more than you. 01:21:46.080 |
- If you say so, okay, that's very kind. So, it's all my fault. 01:21:50.800 |
- It's all your fault. No, right. You're often brought up as the person, well, Stuart Russell, 01:21:59.680 |
like the AI person is worried about this. That's why you should be worried about it. 01:22:06.080 |
Do you feel the burden of that? I don't know if you feel that at all, but when I talk to people, 01:22:11.280 |
you talk about people outside of computer science when they think about this. Stuart Russell 01:22:17.600 |
is worried about AI safety. You should be worried too. Do you feel the burden of that? 01:22:22.880 |
- I mean, in a practical sense, yeah. Because I get a dozen, sometimes 25, invitations a day 01:22:34.400 |
to talk about it, to give interviews, to write press articles, and so on. So, 01:22:39.840 |
in that very practical sense, I'm seeing that people are concerned and really interested about 01:22:47.120 |
this. - But you're worried that you could be wrong, as all good scientists are. 01:22:52.240 |
- Of course. I worry about that all the time. I mean, that's always been the way that I've worked, 01:22:58.800 |
you know, is like I have an argument in my head with myself, right? So, I have some idea, 01:23:04.720 |
and then I think, "Okay, how could that be wrong? Or did someone else already have that idea?" So, 01:23:11.120 |
I'll go and, you know, search in as much literature as I can to see whether someone 01:23:16.720 |
else already thought of that or even refuted it. So, you know, right now I'm reading a lot 01:23:23.840 |
of philosophy because, you know, in the form of the debates over utilitarianism and other kinds of 01:23:34.480 |
moral formulas, shall we say, people have already thought through some of these issues. But, 01:23:44.000 |
you know, one of the things I'm not seeing in a lot of these debates is this specific idea about 01:23:52.640 |
the importance of uncertainty in the objective. That this is the way we should think about 01:23:59.040 |
machines that are beneficial to humans. So, this idea of provably beneficial machines based on 01:24:06.240 |
explicit uncertainty in the objective, you know, it seems to be, you know, my gut feeling is this 01:24:15.200 |
is the core of it. It's going to have to be elaborated in a lot of different directions. 01:24:20.560 |
And there are a lot of... - Provably beneficial. 01:24:22.800 |
- Yeah, but there are, I mean, it has to be, right? We can't afford, you know, hand-wavy 01:24:28.800 |
beneficial because there are, you know, whenever we do hand-wavy stuff, there are loopholes. And 01:24:34.160 |
the thing about super-intelligent machines is they find the loopholes. You know, just like, 01:24:39.040 |
you know, tax evaders. If you don't write your tax law properly, people will find the loopholes 01:24:45.440 |
and end up paying no tax. And so, you should think of it this way. And getting those definitions 01:24:54.160 |
right, you know, it is really a long process, you know. So, you can define mathematical frameworks 01:25:04.400 |
and within that framework, you can prove mathematical theorems that, yes, this theoretical 01:25:10.480 |
entity will be provably beneficial to that theoretical entity. But that framework may 01:25:15.920 |
not match the real world in some crucial way. - It's a long process of thinking through it. 01:25:23.120 |
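The phrase "explicit uncertainty in the objective" has a standard toy illustration in the assistance-game line of work Russell is describing, often called the off-switch game: a machine that is unsure about the human's utility for a proposed action prefers to defer to a human who can switch it off, and that preference disappears once the objective is treated as known. Here is a minimal numerical sketch, assuming a discrete belief over that utility and a perfectly rational human overseer; both are illustrative modelling choices, not details from the conversation.

```python
# Minimal off-switch sketch: deferring to a rational human is worth
# E[max(U, 0)], never less than acting unilaterally (E[U]) or switching
# off (0), and the whole advantage comes from the machine's uncertainty
# about the human's utility U for the proposed action.

def value_act(belief):
    """Expected utility of just taking the action."""
    return sum(p * u for u, p in belief)

def value_defer(belief):
    """Expected utility of proposing the action and letting a rational human
    allow it (if U > 0) or switch the machine off (if U <= 0)."""
    return sum(p * max(u, 0.0) for u, p in belief)

# Belief: probably a good action, but a 30% chance it is quite harmful.
uncertain = [(+1.0, 0.7), (-2.0, 0.3)]   # (utility, probability) pairs
certain = [(+0.1, 1.0)]                  # the machine "knows" the objective

for name, belief in [("uncertain objective", uncertain), ("certain objective", certain)]:
    print(f"{name}: act = {value_act(belief):+.2f}, "
          f"switch off = +0.00, defer = {value_defer(belief):+.2f}")

# uncertain objective: act = +0.10, defer = +0.70 -> the machine prefers to
#   defer, i.e. it has a positive incentive to let itself be switched off.
# certain objective:   act = +0.10, defer = +0.10 -> the incentive to defer
#   vanishes, which is the failure mode of a fixed, "known" objective.
```

The worry Russell raises about a framework that "may not match the real world" lands exactly on those assumptions: a guarantee at this level is only as good as the model of the human and of the machine's uncertainty.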
- You have 10 seconds to answer it. What is your favorite sci-fi movie about AI? 01:25:30.080 |
- I would say Interstellar has my favorite robots. - Oh, beats Space Odyssey. 01:25:36.640 |
- Yeah, yeah, yeah. So, TARS, the robots, one of the robots in Interstellar is the way robots 01:25:42.960 |
should behave. And I would say Ex Machina is in some ways the one that makes you think 01:25:54.000 |
in a nervous kind of way about where we're going. 01:25:57.920 |
- Well, Stuart, thank you so much for talking today.