Jim Keller: Moore's Law, Microprocessors, and First Principles | Lex Fridman Podcast #70
Chapters
0:00 Introduction
2:12 Difference between a computer and a human brain
3:43 Computer abstraction layers and parallelism
17:53 If you run a program multiple times, do you always get the same answer?
20:43 Building computers and teams of people
22:41 Start from scratch every 5 years
30:05 Moore's law is not dead
55:47 Is superintelligence the next layer of abstraction?
60:02 Is the universe a computer?
63:00 Ray Kurzweil and exponential improvement in technology
64:33 Elon Musk and Tesla Autopilot
80:51 Lessons from working with Elon Musk
88:33 Existential threats from AI
92:38 Happiness and the meaning of life
The following is a conversation with Jim Keller, 00:00:05.560 |
who has worked at AMD, Apple, Tesla, and now Intel. 00:00:13.520 |
and Zen microarchitectures, Apple A4 and A5 processors, 00:00:30.040 |
and just an interesting and fun human being to talk to. 00:00:52.600 |
I'll do one or two minutes after introducing the episode 00:01:08.640 |
I personally use Cash App to send money to friends, 00:01:18.480 |
You can buy fractions of a stock, say $1 worth, 00:01:23.540 |
Brokerage services are provided by Cash App Investing, 00:01:32.040 |
to support one of my favorite organizations called FIRST, 00:01:35.440 |
best known for their FIRST Robotics and Lego competitions. 00:01:38.980 |
They educate and inspire hundreds of thousands of students 00:01:44.100 |
and have a perfect rating at Charity Navigator, 00:01:50.760 |
When you get Cash App from the App Store or Google Play 00:01:56.280 |
you'll get $10 and Cash App will also donate $10 to FIRST, 00:02:02.140 |
that I've personally seen inspire girls and boys 00:02:08.060 |
And now here's my conversation with Jim Keller. 00:02:19.260 |
Let's start with the philosophical question perhaps. 00:02:22.280 |
- Well, since people don't actually understand 00:02:32.600 |
Computers are, you know, there's really two things. 00:02:37.280 |
There's memory and there's computation, right? 00:02:40.480 |
And to date, almost all computer architectures 00:02:49.400 |
and you do relatively simple operations on it 00:02:59.840 |
everything's a mesh, a mess that's combined together. 00:03:09.120 |
And information is stored in some distributed fashion. 00:03:13.720 |
And people build things called neural networks in computers 00:03:25.520 |
I don't know that the understanding of that is super deep. 00:03:37.880 |
So to date, it's hard to compare them, I would say. 00:03:42.880 |
- So let's get into the basics before we zoom back out. 00:03:56.640 |
Maybe even as far back as what is a transistor? 00:03:59.460 |
- So the special charm of computer engineering 00:04:23.680 |
And then functional units, like an adder, a subtractor, 00:04:28.760 |
And then we assemble those into processing elements. 00:04:32.300 |
Modern computers are built out of probably 10 to 20 00:04:47.920 |
And then software, there's an instruction set. 00:04:50.840 |
You run, and then there's assembly language, C, 00:04:58.680 |
essentially from the atom to the data center, right? 00:05:13.800 |
And then in an organization of a thousand people 00:05:27.080 |
- So there's a bunch of levels of abstraction. 00:05:29.380 |
In an organization like Intel, and in your own vision, 00:05:39.680 |
Some of it is science, some of it is engineering, 00:05:43.320 |
What's the most, if you could pick favorites, 00:05:46.340 |
what's the most important, your favorite layer 00:05:57.080 |
That's the fun, you know, I'm somewhat agnostic to that. 00:06:00.720 |
So I would say, for relatively long periods of time, 00:06:08.020 |
So the x86 instruction set, the ARM instruction set. 00:06:13.320 |
- So it says, how do you encode the basic operations? 00:06:16.080 |
Load, store, multiply, add, subtract, conditional branch. 00:06:19.620 |
There aren't that many interesting instructions. 00:06:26.440 |
90% of the execution is on 25 opcodes, 25 instructions. 00:06:35.480 |
- Intel architecture has been around for 25 years. 00:06:39.800 |
And that's because the basics are defined a long time ago. 00:06:48.720 |
is you fetched instructions and you executed them in order. 00:06:58.880 |
is you fetch large numbers of instructions, say 500. 00:07:15.260 |
So a modern computer, like people like to say, 00:07:22.340 |
computers should be simple and clean, but it turns out the market for simple, clean, slow computers is zero, right? 00:07:29.500 |
Now you can, there's how you build it can be clean, 00:07:45.540 |
and then executes it in a way that gets the right answers. 00:07:53.460 |
And then there's semantics around how memory ordering works 00:07:58.340 |
So the computer sort of has a bunch of bookkeeping tables. 00:08:01.900 |
It says, what order should these operations finish in 00:08:07.740 |
But to go fast, you have to fetch a lot of instructions 00:08:26.020 |
the dependency graph and you issue instructions out of order. 00:08:29.340 |
That's because you have one serial narrative to execute, 00:08:42.980 |
There's a sentence after sentence after sentence 00:08:49.380 |
Imagine you diagrammed it properly and you said, 00:08:59.060 |
- That's a fascinating question to ask of a book, yeah. 00:09:08.420 |
You could say, he is tall and smart and X, right? 00:09:13.420 |
And it doesn't matter the order of tall and smart. 00:09:18.220 |
But if you say the tall man is wearing a red shirt, 00:09:22.940 |
what colors, you know, like you can create dependencies. 00:09:36.860 |
And the first order, the screen you're looking at 00:09:44.460 |
Simple narratives around the large numbers of things 00:09:52.300 |
- So found parallelism where the narrative is sequential 00:09:57.300 |
but you discover like little pockets of parallelism versus-- 00:10:11.140 |
here's how you fetch 10 instructions at a time. 00:10:13.460 |
Here's how you calculated the dependencies between them. 00:10:51.500 |
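As a sketch of that dependency bookkeeping (a toy model, not any real microarchitecture): each instruction names the register it writes and the registers it reads, and anything whose inputs are already known can issue in the same cycle.

```python
# Toy out-of-order issue over a window of fetched instructions.
window = [
    ("r1", ["r0"]),         # r1 = f(r0)
    ("r2", ["r0"]),         # r2 = g(r0)      -- independent of r1
    ("r3", ["r1", "r2"]),   # r3 = h(r1, r2)  -- must wait for both
    ("r4", ["r0"]),         # r4 = k(r0)      -- independent again
]

ready = {"r0"}              # values known at the start
pending = list(window)
cycle = 0
while pending:
    # Everything whose sources are ready issues this cycle, in parallel.
    issued = [ins for ins in pending if all(src in ready for src in ins[1])]
    print(f"cycle {cycle}: issue {[dst for dst, _ in issued]}")
    ready.update(dst for dst, _ in issued)
    pending = [ins for ins in pending if ins not in issued]
    cycle += 1
# cycle 0: issue ['r1', 'r2', 'r4']
# cycle 1: issue ['r3']
```

Four serial-looking instructions complete in two cycles; that is the "found parallelism."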
- You would get what's called cycles per instruction 00:10:53.900 |
and it would be about, you know, three instructions, 00:10:59.980 |
because of the latency of the operations and stuff. 00:11:04.460 |
execute it, but like 0.2, 0.25 cycles per instruction. 00:11:12.960 |
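Using the round numbers in this exchange, the jump from about 3 cycles per instruction to about 0.25 is roughly a 12x gain in instructions completed per cycle:

```python
old_cpi, new_cpi = 3.0, 0.25
print(1 / old_cpi)        # ~0.33 IPC: in-order, one instruction at a time
print(1 / new_cpi)        # 4 IPC: wide, out-of-order execution
print(old_cpi / new_cpi)  # 12.0 -- 12x more instructions per cycle
```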
One is the found parallelism in the narrative, right? 00:11:17.300 |
And the other is the predictability of the narrative, right? 00:11:21.320 |
So certain operations, they do a bunch of calculations 00:11:25.480 |
and if greater than one, do this, else do that. 00:11:29.740 |
That decision is predicted in modern computers 00:11:45.460 |
figure out the graph and execute them all in parallel. 00:11:51.580 |
if you fetch 600 instructions and it's every six, 00:11:56.060 |
you have to predict 99 out of 100 branches correctly 00:12:11.460 |
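A quick back-of-envelope check of that claim, assuming for simplicity that branch predictions are independent: a 600-instruction window with a branch every six instructions holds about 100 branches, so the chance the whole window is on the correct path is the per-branch accuracy raised to the 100th power.

```python
for accuracy in (0.90, 0.99, 0.999):
    print(f"{accuracy}: {accuracy ** 100:.3f}")
# 0.9:   0.000  -- hopeless, the window is almost always wrong
# 0.99:  0.366  -- marginal
# 0.999: 0.905  -- workable
```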
- So imagine you do a computation over and over, 00:12:19.420 |
And you go through that loop a million times. 00:12:22.640 |
you say, it's probably still greater than one. 00:12:25.740 |
- And you're saying you could do that accurately. 00:12:48.100 |
So then somebody said, hey, let's keep a couple of bits 00:13:00.760 |
And then you can use the top bit as the sign bit. 00:13:05.020 |
So if it's greater than one, you predict taken 00:13:07.460 |
and less than one, you predict not taken, right? 00:13:52.220 |
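That is the classic two-bit saturating counter predictor. A minimal sketch of the scheme as just described, for illustration only:

```python
class TwoBitPredictor:
    def __init__(self):
        self.counter = 2            # 0..3, start weakly "taken"

    def predict(self):
        return self.counter >= 2    # top bit set -> predict taken

    def update(self, taken):
        # Saturating: the counter never moves below 0 or above 3.
        self.counter = min(3, self.counter + 1) if taken else max(0, self.counter - 1)

p = TwoBitPredictor()
hits = 0
for taken in [True] * 999 + [False]:   # a loop branch: taken 999x, then exit
    hits += (p.predict() == taken)
    p.update(taken)
print(hits / 1000)                     # 0.999 -- wrong only on the loop exit
```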
and then you do basically deep pattern recognition 00:14:03.660 |
And you have something that chooses what the best result is. 00:14:06.660 |
There's a little supercomputer inside the computer. 00:14:14.260 |
So the effective window that it's worth finding graphs in 00:14:14.260 |
to get from a window of say 50 instructions to 500, 00:14:52.420 |
- Now, if you get the prediction of a branch wrong, 00:14:57.380 |
- You flush the pipe, so it's just the performance cost. 00:15:01.420 |
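The cost of those flushes is easy to model. A hypothetical effective-CPI calculation (the 15-cycle flush penalty here is an assumption for illustration, not a quoted figure):

```python
base_cpi    = 0.25    # ideal CPI with perfect prediction
branch_freq = 1 / 6   # a branch every ~6 instructions
flush_cost  = 15      # cycles lost per mispredict (illustrative)

for miss_rate in (0.10, 0.01, 0.001):
    cpi = base_cpi + branch_freq * miss_rate * flush_cost
    print(f"{miss_rate}: effective CPI {cpi:.4f}")
# 0.1   -> 0.5000  (half the performance gone to flushes)
# 0.01  -> 0.2750
# 0.001 -> 0.2525
```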
So we're starting to look at stuff that says, 00:15:14.660 |
So you took the wrong path, you executed a bunch of stuff. 00:15:20.580 |
Then you hit the misprediction, you backed it up, 00:15:22.420 |
but you remembered all the results you already calculated. 00:15:27.660 |
Like if you read a book and you misunderstand a paragraph, 00:15:32.500 |
sometimes it's invariant to their understanding. 00:15:37.580 |
- And you can kind of anticipate that invariance. 00:15:47.380 |
And so when you come back to a piece of code, 00:15:49.220 |
should you calculate it again or do the same thing? 00:16:05.140 |
And you have a bunch of knowledge about which way to go, 00:16:17.820 |
So imagine you're doing something complicated 00:16:27.720 |
And the ways you pick interact in a complicated way. 00:16:35.660 |
- Right, so that's-- - Or that's art or science, 00:16:57.940 |
It seems like there's combinations of things. 00:17:02.180 |
but they're really good at evaluating the alternatives. 00:17:05.580 |
Right, and everybody has a different way to do it. 00:17:14.300 |
So when you see computers are designed by teams of people 00:17:19.300 |
and a good team has lots of different kinds of people. 00:17:24.300 |
I suspect you would describe some of them as artistic. 00:18:14.460 |
- But that's a language definitional statement. 00:18:19.780 |
when we first did 3D acceleration of graphics, 00:18:29.740 |
- Right, and then some people thought that was okay, 00:18:34.540 |
And then when the HPC world used GPUs for calculations, 00:18:48.060 |
where the precision of the data is low enough 00:18:53.620 |
And the observation is the input data is unbelievably noisy. 00:19:02.140 |
that say can get faster answers by being noisy. 00:19:09.540 |
it starts out really wide and then it gets narrower. 00:19:12.140 |
And you can say, is that last little bit that important? 00:19:17.700 |
before we whittle it all the way down to the answer? 00:19:20.780 |
Right, so you can create algorithms that are noisy. 00:19:25.460 |
and every time you run it, you get a different answer, 00:19:33.940 |
every time you run the program, you get the same answer. 00:19:38.340 |
that's the formal definition of a programming language. 00:19:44.500 |
that don't get the same answer, but people who use those. 00:19:47.400 |
You always want something 'cause you get a bad answer 00:19:53.260 |
of something in the algorithm or because of this? 00:19:56.780 |
that says no matter what, do it deterministically. 00:20:00.260 |
And it's really weird 'cause almost everything 00:20:09.620 |
- I design computers for people who run programs. 00:20:12.500 |
So if somebody says, I want a deterministic answer, 00:20:24.380 |
What people don't realize is you get a deterministic answer 00:20:27.260 |
even though the execution flow is very nondeterministic. 00:20:36.100 |
And the answer, it arrives at the same answer. 00:20:42.020 |
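One concrete reason this takes deliberate work: floating-point addition is not associative, so if hardware sums the same values in a different order from run to run, the answer drifts unless the machine pins down the reduction order. A minimal illustration:

```python
import random

values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

a = sum(values)                 # one summation order
shuffled = values[:]
random.shuffle(shuffled)
b = sum(shuffled)               # same values, different order

print(a == b)                   # usually False
print(abs(a - b))               # tiny but nonzero rounding difference
```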
Okay, you've achieved in the eyes of many people, 00:20:56.420 |
Perhaps because it was challenging, because of its impact, 00:21:10.100 |
And I have two small children, and I promise you, 00:21:18.340 |
- I'm really interested in building computers. 00:21:22.420 |
And I've worked with really, really smart people. 00:21:32.100 |
both as a thing to do, and as an endeavor that people do. 00:21:40.020 |
- Yeah, like how people think and build a computer. 00:21:43.000 |
And I find sometimes that the best computer architects 00:21:53.260 |
- So the whole stack of human beings is fascinating. 00:22:05.180 |
logic gates, functional units, computational elements, 00:22:12.620 |
And then you could think of organizational design 00:22:20.680 |
just like the computational elements are all different. 00:22:41.620 |
So what have you learned about the human abstractions, 00:22:51.900 |
What does it take to create something special? 00:22:55.020 |
- Well, most people don't think simple enough. 00:23:04.140 |
There's probably a philosophical description of this. 00:23:09.180 |
So imagine you're gonna make a loaf of bread. 00:23:11.500 |
The recipe says, get some flour, add some water, 00:23:20.280 |
Understanding bread, you can understand biology, 00:23:34.380 |
There's so many levels of understanding there. 00:23:37.220 |
And then when people build and design things, 00:23:40.220 |
they frequently are executing some stack of recipes. 00:23:53.700 |
But if you have a deep understanding of cooking, 00:24:03.100 |
there's a different way of viewing everything. 00:24:07.740 |
And most people, when you get to be an expert at something, 00:24:12.260 |
you're hoping to achieve deeper understanding, 00:24:16.420 |
not just a large set of recipes to go execute. 00:24:19.960 |
And it's interesting to walk groups of people 00:24:22.820 |
because executing recipes is unbelievably efficient 00:24:29.180 |
If it's not what you want to do, you're really stuck. 00:24:40.940 |
And some people are really good at recognizing 00:24:43.740 |
when the problem is to understand something deeply. 00:24:55.540 |
- Well, this goes back to the art versus science question. 00:25:01.220 |
for deeper understanding, you never get anything done. 00:25:04.220 |
And if you don't unpack understanding when you need to, 00:25:08.460 |
And then at every juncture, like human beings 00:25:12.020 |
are these really weird things because everything you tell 00:25:17.100 |
And then they all interact in a hilarious way. 00:25:20.640 |
And then having some intuition about what do you tell them, 00:25:24.260 |
what do you do, when do you intervene, when do you not, 00:25:29.780 |
- It's essentially computationally unsolvable. 00:25:37.980 |
But with deep understanding, do you mean also 00:25:53.700 |
Like the why question is why are we even building this? 00:26:04.300 |
sort of really getting into the core of the science? 00:26:14.620 |
And then when somebody says, I want to make it 10% faster, 00:26:22.980 |
Or I have this thing that's three instructions wide, 00:26:45.360 |
And then somebody else will look at it and say, 00:26:46.900 |
well, actually the way you divided the problem up 00:26:49.380 |
and the way that different features are interacting 00:26:51.940 |
is limiting you and it has to be rethought, rewritten. 00:27:09.700 |
maybe more generally to just throw the whole thing out? 00:27:25.180 |
- If you want to really make a lot of progress 00:27:38.740 |
- I wrote the, I was the co-author of that spec in '98. 00:27:45.820 |
- The instruction set itself has been extended 00:27:57.500 |
Intel's designed a few, AMD's designed a few, 00:28:02.460 |
And I don't want to go into too much of the detail 00:28:15.100 |
- So you're saying you're an outlier in that sense? 00:28:25.180 |
- To everybody involved, because like you said, 00:28:33.820 |
well, no, individual engineers want to succeed, 00:28:52.380 |
And to get to the next level, you have to do a new one, 00:28:57.660 |
than the old optimization point, but it'll get higher. 00:29:10.460 |
- Right, like, you know, people with a quarter-by-quarter 00:29:13.820 |
business objective are terrified about changing everything. 00:29:21.080 |
or build a computer for a long-term objective 00:29:35.200 |
every time they saw that they had to redo something, 00:29:43.080 |
Like, you optimize the old one while you build a new one. 00:29:53.920 |
well, the new computer will be faster on the average. 00:29:56.740 |
But there's a distribution of results and performance, 00:29:59.480 |
and you'll have some outliers that are slower. 00:30:01.920 |
And that's very hard, 'cause they have one customer 00:30:05.320 |
- So speaking of the long-term, for over 50 years now, 00:30:09.000 |
Moore's Law has served, for me and millions of others, 00:30:12.960 |
as an inspiring beacon of what kind of amazing future 00:30:18.160 |
I'm just making your kids laugh all of today. 00:30:23.520 |
- So first, in your eyes, what is Moore's Law, 00:30:27.560 |
if you could define for people who don't know? 00:30:29.860 |
- Well, the simple statement was, from Gordon Moore, 00:30:34.300 |
was double the number of transistors every two years. 00:30:43.280 |
we increase the performance of computers by 2x 00:30:48.560 |
And it's wiggled around substantially over time. 00:30:51.480 |
And also, in how we deliver performance has changed. 00:30:59.000 |
- The foundational idea was 2x the transistors 00:31:29.160 |
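That compounding is worth seeing as arithmetic: doubling every two years means growth by a factor of 2^(years/2).

```python
# Moore's law compounding: transistor count doubles every two years.
for years in (10, 20, 50):
    print(f"{years} years -> {2 ** (years / 2):,.0f}x the transistors")
# 10 years -> 32x
# 20 years -> 1,024x
# 50 years -> 33,554,432x
```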
What's the broader, what do you think should be 00:31:33.920 |
When you mentioned how you think of performance, 00:31:37.920 |
just broadly, what's a good way to think about Moore's Law? 00:31:41.480 |
- Well, first of all, I've been aware of Moore's Law 00:31:49.080 |
- Well, I've been designing computers for 40. 00:31:52.920 |
- You're just watching it before your eyes, kind of thing. 00:31:55.440 |
- Well, and somewhere where I became aware of it, 00:31:58.160 |
I was also informed that Moore's Law was gonna die 00:32:07.480 |
And then at one point, it was gonna die in five years, 00:32:22.840 |
And I thought, that's sad, 'cause it's the Moore's Law 00:32:24.600 |
company, and it's not dead, and it's always been gonna die. 00:32:29.200 |
And humans, like these apocalyptic kind of statements, 00:32:33.360 |
like, we'll run out of food, or we'll run out of air, 00:32:39.960 |
- Right, but it's still incredible that it's lived 00:32:55.400 |
But why do you think, if you can try to understand it, 00:33:09.160 |
But actually, under the sheets, there's literally 00:33:10.760 |
thousands of innovations, and almost all those innovations 00:33:38.480 |
you will probably tell people, "Well, this is done." 00:33:55.760 |
a thousand by a thousand by a thousand atoms, right? 00:33:59.920 |
And you get quantum effects down around two to 10 atoms. 00:34:14.480 |
are working away at how to use quantum effects. 00:34:28.840 |
if you look at the fin, it's like 120 atoms wide, 00:34:28.840 |
could count the atoms in every single direction. 00:34:42.000 |
Like there's techniques now to already put down atoms 00:34:55.820 |
It's just, you know, from a manufacturing process, 00:35:01.300 |
and you need to put 10 to the 23rd atoms together 00:35:05.620 |
to make a computer, it would take a long time. 00:35:27.580 |
there's material science, there's metallurgy. 00:35:32.660 |
different materials together, how do they interact? 00:35:45.020 |
- But just for the shrinking, you don't think 00:35:46.980 |
we're quite yet close to the fundamental limits of physics. 00:35:52.540 |
for a roadmap to a path of 100x, and after two weeks, 00:36:05.120 |
Well, here's the thing about Moore's Law, right? 00:36:16.360 |
Now, as a computer designer, you have two stances. 00:36:20.940 |
You think it's going to shrink, in which case 00:36:23.040 |
you're designing and thinking about architecture 00:36:29.020 |
Or conversely, not be swamped by the complexity 00:36:39.300 |
- So you're open to the possibility and waiting 00:36:53.580 |
about design, how you think about architecture 00:36:57.200 |
Like imagine you build buildings out of bricks 00:37:05.860 |
Well, if you kept building bricks the same way, 00:37:24.540 |
of the smaller bricks, more strength, thinner walls, 00:37:27.500 |
you know, less material, efficiency out of that. 00:37:30.320 |
So once you have a roadmap with what's going to happen, 00:37:33.220 |
transistors, we're going to get more of them, 00:37:36.500 |
then you design all this collateral around it 00:37:38.740 |
to take advantage of it and also to cope with it. 00:37:42.420 |
Like that's the thing people don't understand, 00:37:50.500 |
- So what's the hardest part of this influx 00:38:03.700 |
what fundamentally changes when you add more transistors 00:38:17.300 |
that we do get smarter because of nutrition, whatever. 00:38:24.560 |
Nobody understands it, nobody knows if it's still going on. 00:38:40.900 |
- Right, so human beings, we're really good in teams of 10, 00:38:45.060 |
up to teams of 100, they can know each other. 00:38:48.140 |
Beyond that, you have to have organizational boundaries. 00:38:59.700 |
The power of abstraction layers is really high. 00:39:03.180 |
We used to build computers out of transistors. 00:39:06.100 |
Now we have a team that turns transistors into logic cells 00:39:08.860 |
and another team that turns them into functional units 00:39:10.660 |
and another one that turns them into computers, right? 00:39:16.040 |
And you have to think about when do you shift gears on that? 00:39:21.040 |
We also use faster computers to build faster computers. 00:39:24.280 |
So some algorithms run twice as fast on new computers, 00:39:30.420 |
So, you know, a computer with twice as many transistors 00:39:51.560 |
is shrinking the thing we've just been talking about, 00:40:03.900 |
Like in the direction of sort of enforcing given parallelism 00:40:15.020 |
you know, stacking CPUs on top of each other, 00:40:17.660 |
that kind of parallelism or any kind of parallelism? 00:40:30.580 |
And then we made faster computers with vector units 00:40:33.460 |
and you can do proper equations and matrices, right? 00:40:43.380 |
where you convolve one large data set against another. 00:40:47.060 |
And so there's sort of this hierarchy of mathematics, 00:40:51.100 |
you know, from simple equation to linear equations 00:40:54.020 |
to matrix equations to deeper kind of computation. 00:41:00.580 |
that people are thinking of data as a topology problem. 00:41:04.340 |
You know, data is organized in some immense shape. 00:41:09.340 |
which sort of wants to get data from an immense shape 00:41:21.380 |
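The scalar-to-vector-to-matrix hierarchy above maps directly onto code. A small illustrative sketch (NumPy is used here purely for convenience):

```python
import numpy as np

a, b = np.random.rand(1000), np.random.rand(1000)

# Scalar computation: one multiply-add per step, like early CPUs.
s = 0.0
for i in range(len(a)):
    s += a[i] * b[i]

# Vector computation: the whole array at once, like a vector unit.
v = np.dot(a, b)

# Matrix computation: one large data set convolved against another,
# the shape of work GPUs and AI accelerators are built for.
A, B = np.random.rand(256, 256), np.random.rand(256, 256)
M = A @ B
```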
So that paper you referenced, the Sutton paper, 00:41:26.620 |
they talked about, you know, like when AI started, 00:41:31.860 |
That's a very simple computational situation. 00:41:39.860 |
So have a huge database of moves and results, deep search, 00:41:56.260 |
It's a completely different kind of phenomena. 00:42:03.780 |
they're going up this mathematical graph, right? 00:42:07.540 |
And then computations, both computation and data sets 00:42:15.460 |
I mean, I would argue that all of it is still a search, 00:42:19.980 |
Just like you said, a topology problem of data sets, 00:42:22.780 |
you're searching the data sets for valuable data. 00:42:27.020 |
And also the actual optimization of neural networks 00:42:33.060 |
- I don't know, if you had looked at the inner layers 00:42:45.660 |
And then you can have a shadow of that on the something 00:42:55.540 |
And, but the computation to tease out the attributes 00:43:08.300 |
- And then in deep networks, they look at layers 00:43:13.140 |
And yet if you take the layers out, it doesn't work. 00:43:29.020 |
- I would say it's absolutely not semantics, but-- 00:43:49.020 |
and the space, the incredibly multi-dimensional, 00:43:54.020 |
100,000 dimensional space that neural networks 00:44:11.220 |
the funny thing is, is the difference between 00:44:17.340 |
- Yeah, maybe that's a different way to describe it. 00:44:21.700 |
in terms of the basic mathematical operations 00:44:38.540 |
Well, the operations continue to be add, subtract, 00:44:48.860 |
of computers or transistors, under that, atoms. 00:44:52.780 |
So you got atoms, transistors, logic gates, computers, 00:44:58.420 |
The building blocks of mathematics at some level 00:45:01.060 |
are things like adds and subtracts and multiplies, 00:45:31.660 |
So the data types in TensorFlow imply an optimization set, 00:45:36.660 |
but when you go right down and look at the computers, 00:45:40.460 |
it's AND and OR gates doing adds and multiplies. 00:45:49.980 |
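That bottom layer can be made concrete. Here is a sketch, for illustration only, of a full adder built from AND, OR, and XOR operations, then a ripple-carry adder built from full adders, the gates-to-functional-units layering described earlier:

```python
def full_adder(a, b, carry_in):
    s = a ^ b ^ carry_in                                     # XORs: sum bit
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)    # ANDs/OR: carry
    return s, carry_out

def ripple_add(x, y, width=8):
    carry, result = 0, 0
    for i in range(width):                  # one full adder per bit position
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

print(ripple_add(25, 17))  # 42 -- an adder built from nothing but gates
```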
and then there's people who think about analog computing, 00:46:10.980 |
and ability to hit mathematical abstractions. 00:46:30.700 |
And as we get the next two orders of magnitude, 00:46:36.620 |
every order of magnitude changes the computation. 00:46:40.140 |
- Fundamentally changes what the computation is doing. 00:46:45.660 |
the difference in quantity is the difference in kind. 00:46:48.300 |
You know, the difference between ant and anthill, right? 00:46:58.880 |
where the quantity changed the quality, right? 00:47:02.500 |
And we've seen that happen in mathematics multiple times, 00:47:09.980 |
if you focus head down on shrinking the transistor. 00:47:09.980 |
you know, you find adders and subtractors and multipliers. 00:48:00.040 |
- Well, software guys have a thing that they call it 00:48:12.380 |
the odds of that being the performance limiter is low. 00:48:16.900 |
can you make it 2X faster by optimizing the right things? 00:48:30.300 |
- But the whole time as you're doing the writing, 00:48:34.860 |
The hardware underneath gets faster and faster. 00:48:39.980 |
then your AI research should expect that to show up, 00:48:44.980 |
and then you make a slightly different set of choices, 00:48:47.900 |
then we've hit the wall, nothing's gonna happen, 00:48:51.380 |
and from here, it's just us rewriting algorithms. 00:48:56.500 |
for the last 30 years of Moore's Law's death. 00:49:07.300 |
So, why do you think Moore's Law is not going to die? 00:49:12.300 |
Which is the most promising, exciting possibility 00:49:15.740 |
of why it won't die in the next five, 10 years? 00:49:18.060 |
So, is it the continued shrinking of the transistor, 00:49:30.240 |
- Right, so there's stacks of S-curves in there. 00:49:47.460 |
so they took a fin which had a gate around it 00:49:55.380 |
and then from there, there's some obvious steps 00:49:59.380 |
So, the metallurgy around wire stacks and stuff 00:50:07.140 |
and there's a whole combination of things there to do. 00:50:45.380 |
And then, when we were thinking about Moore's Law, 00:50:49.540 |
Raja Koduri said, "Every 10x generates a new computation." 00:50:49.540 |
So, scalar, vector, matrix, topological computation. 00:51:03.860 |
there was mainframes and minicomputers and PCs, 00:51:14.780 |
And people are starting to think about the smart world 00:51:21.220 |
The transformations are gonna be unpredictable. 00:51:29.900 |
of the key architects of this kind of future? 00:51:37.180 |
of the high-level people who build the Angry Birds apps. 00:51:37.180 |
Maybe that's the whole point of the universe. 00:51:48.820 |
and the attention-distracting nature of mobile phones. 00:52:17.220 |
for talking to their friends all day on text. 00:52:38.100 |
so there's billions of people on this planet. 00:52:49.860 |
and getting paid for it, and there's an interest in it. 00:52:52.820 |
But there's so many things going on in parallel. 00:53:04.860 |
You know, there's a, I'm sure some philosopher 00:53:14.020 |
- So you can't deny the fact that these tools, 00:53:19.140 |
whether, that these tools are changing our world. 00:53:25.260 |
- So do you think it's changing for the better? 00:53:31.740 |
the two disciplines with the highest GRE scores 00:53:38.420 |
And they're both sort of trying to answer the question, 00:53:42.900 |
And the philosophers are on the kind of theological side, 00:53:47.740 |
and the physicists are obviously on the material side. 00:53:52.660 |
And there's 100 billion galaxies with 100 billion stars. 00:54:00.140 |
So, you know, there's, on our way to 10 billion people. 00:54:11.260 |
- Things do tend to significantly increase in complexity. 00:54:30.100 |
you get a surface that, you know, grows by R squared. 00:54:30.100 |
And computation has been, let's say, relatively pedestrian. 00:54:54.460 |
through the other realms of possibility, right? 00:55:01.820 |
mathematical computations that are sophisticated enough 00:55:06.540 |
that nobody understands how the answers came out, right? 00:55:18.900 |
if it's predictive of new functions, new data sets. 00:55:34.260 |
And it can arrive at results that I don't know 00:55:37.580 |
if they're completely mathematically describable. 00:55:39.980 |
So computation has kind of done something interesting 00:56:01.080 |
Do you think we're creating sort of the next step 00:56:03.460 |
in our evolution in creating artificial intelligence systems 00:56:09.260 |
I mean, there's so much in the universe already, 00:56:14.060 |
- Are human beings working on additional abstraction layers 00:56:20.300 |
Does that mean that human beings don't need dogs? 00:56:25.940 |
that are all simultaneously interesting and useful. 00:56:32.460 |
you've seen greater and greater level abstractions 00:56:41.260 |
do you think that the look of all life on Earth 00:56:46.860 |
this machine with greater and greater levels of abstraction, 00:56:58.380 |
Or do you think we're just somewhere in the middle? 00:57:00.500 |
Are we the basic functional operations of a CPU? 00:57:10.460 |
Like somebody's, you know, people have calculated 00:57:14.900 |
And something, you know, I've seen the number 10 00:57:17.020 |
to the 18th a bunch of times, arrived at in different ways. 00:57:32.980 |
You know, my personal experience is interesting 00:57:35.260 |
'cause, you know, you think you know how you think 00:57:44.100 |
like what you can be aware of is interesting. 00:57:48.660 |
So I don't know if brains are magical or not. 00:57:54.780 |
Lots of people's personal experience says yes. 00:57:57.820 |
So what would be funny is if brains are magical 00:58:01.300 |
and yet we can make brains with more computation. 00:58:04.620 |
You know, I don't know what to say about that, but. 00:58:07.060 |
- Well, do you think magic is an emergent phenomena? 00:58:20.620 |
- Yeah, like what, you know, consciousness, love, 00:58:29.560 |
Is that something that we'll be able to make, 00:58:41.020 |
- Can you summarize it in a couple of sentences? 00:58:44.020 |
- Many people have observed that organisms run 00:58:52.860 |
you'd have one sensory neuron and one motor neuron, right? 00:58:56.900 |
So we move towards things and away from things 00:58:58.820 |
and we have physical integrity and safety or not, right? 00:59:05.660 |
you can see brains that are a little more complicated 00:59:17.220 |
And then our brains have massive numbers of structures, 00:59:21.660 |
you know, like planning and movement and thinking 00:59:27.940 |
And we seem to have multiple layers of thinking systems. 00:59:37.500 |
And you can think in a way that those systems 00:59:46.540 |
you know, the different parts of yourself can observe them. 01:00:15.340 |
how much calculation it takes to describe quantum effects 01:00:29.580 |
But then the simulation guys have pointed out 01:00:32.700 |
Like when you look really close, it's uncertain. 01:00:35.100 |
And the speed of light says you can only look so far 01:00:45.100 |
And somebody said physics is like having 50 equations 01:00:59.020 |
It seems odd when you get to the corners of everything. 01:01:07.180 |
- It's almost like the designers of the simulation 01:01:09.380 |
are trying to prevent us from understanding it perfectly. 01:01:12.820 |
- But also the things that require calculations 01:01:17.740 |
that our idea of the universe as a computer is absurd 01:01:23.100 |
takes all the computation in the universe to figure out. 01:01:28.100 |
You know, you say the simulation is running in the computer 01:01:30.900 |
which has by definition infinite computation. 01:01:37.700 |
- Yeah, well, every little piece of our universe 01:01:40.700 |
seems to take infinite computation to figure out. 01:01:46.060 |
Compute this little teeny spot takes all the mass 01:01:50.340 |
in the local one light year by one light year space. 01:01:54.940 |
- Oh, it's a heck of a computer if it is one. 01:02:00.020 |
'cause the simulation description seems to break 01:02:04.940 |
But the rules of the universe seem to imply something's up. 01:02:10.900 |
- The universe, the whole thing, the laws of physics, 01:02:14.980 |
it just seems like how did it come out to be the way it is? 01:02:22.660 |
Like I said, the two smartest groups of humans 01:02:27.060 |
- Different aspects and they're both complete failures. 01:02:34.260 |
- Well, after 2,000 years, the trend isn't good. 01:02:43.380 |
- But the next 1,000 years doesn't look good either. 01:02:48.940 |
But with Moore's Law, as you've just described, 01:02:51.420 |
not being dead, the exponential growth of technology, 01:02:57.740 |
- Well, it'll be interesting, that's for sure. 01:03:00.460 |
So what are your thoughts on Ray Kurzweil's sense 01:03:16.900 |
has a way of stacking S-curves on top of each other 01:03:24.540 |
- What does an exponential of a million mean? 01:03:29.440 |
And that's just for a local little piece of silicon. 01:03:35.780 |
1,000 tons of silicon to collaborate in one computer 01:03:49.820 |
than our current already unbelievably fast computers. 01:04:31.140 |
and a little bit of ocean water into computers. 01:04:33.340 |
So all the cost is in the equipment to do it. 01:04:36.700 |
And the trend on equipment is once you figure out 01:04:39.420 |
how to build the equipment, the trend of cost is zero. 01:04:41.820 |
Elon said first you figure out what configuration 01:04:45.900 |
you want the atoms in and then how to put them there. 01:04:51.480 |
- But here's the, you know, his great insight is 01:04:58.700 |
And then little tweaks to that will generate something 01:05:21.640 |
Elon Musk believes that autopilot and vehicle autonomy, 01:05:26.700 |
can follow this kind of exponential improvement. 01:05:29.500 |
In terms of the how question that we're talking about, 01:05:34.700 |
What are your thoughts on this particular space 01:05:45.260 |
- Well, the computer you need to build is straightforward. 01:05:48.780 |
And you could argue, well, does it need to be 01:05:53.600 |
But that's just a matter of time or price in the short run. 01:06:00.240 |
You don't have to be especially smart to drive a car. 01:06:05.740 |
I mean, the big problem with safety is attention, 01:06:07.940 |
which computers are really good at, not skills. 01:06:30.620 |
and you can train a neural network to extract 01:06:33.060 |
the distance of any object and the shape of any surface 01:06:46.340 |
It's because it's not just detecting objects, 01:06:50.460 |
it's understanding the scene and it's being able to do it 01:06:56.580 |
So the beautiful thing about the human vision system 01:07:05.540 |
It's not just about perfectly detecting cars, 01:07:09.960 |
It's trying to, it's understanding the physics-- 01:07:20.740 |
- Well, there is a, you know, when you're driving a car 01:07:22.660 |
and somebody cuts you off, your brain has theories 01:07:26.140 |
You know, they're a bad person, they're distracted, 01:07:28.660 |
they're dumb, you know, you can listen to yourself. 01:07:32.820 |
- So, you know, if you think that narrative is important 01:07:44.360 |
and probabilistic changes of speed and direction, 01:07:56.340 |
You can place every object really thoroughly, right? 01:08:01.340 |
You can calculate trajectories of things really thoroughly. 01:08:06.900 |
- But everything you said about really thoroughly 01:08:15.100 |
computer autonomous systems will be way better 01:08:22.480 |
they'll always remember there was a pothole in the road 01:08:57.120 |
And autonomous systems are happily maximizing the givens. 01:09:04.120 |
you remember it 'cause you're processing it the whole time. 01:09:08.880 |
you get to work, you don't know how you got there. 01:09:17.720 |
But the cars have no theories about why they got cut off 01:09:25.520 |
So I tend to believe you do have to have theories, 01:09:32.800 |
So everything you said is actually essential to driving. 01:09:37.800 |
Driving is a lot more complicated than people realize, 01:09:50.120 |
You'd be surprised how simple a calculation for that is. 01:09:53.800 |
- I may be on that particular point, but there's-- 01:10:04.280 |
but I think you might be surprised how complicated it is. 01:10:09.280 |
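For flavor, the "simple calculation" side of that exchange is something like a constant-velocity extrapolation. The sketch below is purely illustrative, not anyone's production autopilot code:

```python
def predict_position(p0, p1, dt_observed, dt_ahead):
    # Estimate velocity from two timestamped positions, then extrapolate.
    vx = (p1[0] - p0[0]) / dt_observed
    vy = (p1[1] - p0[1]) / dt_observed
    return (p1[0] + vx * dt_ahead, p1[1] + vy * dt_ahead)

# An object moved from (0, 0) to (1.0, 0.2) meters in 0.1 s.
# Where will it be half a second from now?
print(predict_position((0.0, 0.0), (1.0, 0.2), 0.1, 0.5))  # (6.0, 1.2)
```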
- I tell people, it's like progress disappoints 01:10:11.960 |
in the short run and surprises in the long run. 01:10:15.760 |
- I suspect in 10 years, it'll be just taken for granted. 01:10:22.360 |
- It's gonna be a $50 solution that nobody cares about. 01:10:34.320 |
But I do think that systems that involve human behavior 01:10:38.880 |
are more complicated than we give them credit for. 01:10:40.800 |
So we can do incredible things with technology 01:10:45.600 |
- I think humans are less complicated than people 01:10:51.400 |
- We tend to operate out of large numbers of patterns 01:10:55.800 |
- But I can't trust you because you're a human. 01:11:13.440 |
Like you said, attention and things like that. 01:11:17.660 |
that the overall picture of safety and autonomy 01:11:26.360 |
I mean, there are already the current safety systems 01:11:29.600 |
like cruise control that doesn't let you run into people 01:11:33.320 |
There are so many features that you just look at the Pareto 01:11:36.280 |
of accidents and knocking off like 80% of them 01:11:45.820 |
it seems to be that there's a very intense scrutiny 01:11:51.680 |
by the media and the public in terms of safety, 01:11:54.280 |
the pressure, the bar put before autonomous vehicles. 01:12:01.720 |
working on the hardware and trying to build a system 01:12:21.200 |
would write into the rules technology solutions 01:12:25.080 |
like modern brake systems imply hydraulic brakes. 01:12:44.320 |
don't hit pedestrians, don't run into people, 01:12:47.040 |
don't leave the road, don't run a red light or a stoplight. 01:12:53.120 |
And they had all the data about which scenarios 01:12:59.280 |
And for the most part, those conversations were like, 01:13:04.000 |
what's the right thing to do to take the next step? 01:13:08.760 |
Now Elon's very interested also in the benefits 01:13:11.960 |
of autonomous driving or freeing people's time 01:13:16.480 |
And I think that's also an interesting thing, 01:13:20.320 |
but building autonomous systems so they're safe 01:13:27.360 |
since the goal is to be 10x safer than people, 01:13:32.160 |
and scrutinizing accidents seems philosophically correct. 01:13:40.760 |
- It's different than the things you worked at, 01:13:47.360 |
the Intel, AMD, Apple, with autopilot chip design 01:13:55.300 |
of building this specialized kind of computing system 01:14:02.740 |
One is the software team, the machine learning team 01:14:07.280 |
is developing algorithms that are changing fast. 01:14:24.560 |
which is if you build a really good general purpose computer 01:14:29.800 |
and then GPU guys will deliver about 5x the performance 01:14:39.200 |
And then special accelerators get another two to 5x 01:14:55.160 |
So, AI accelerators have a claim performance benefit 01:15:13.240 |
So there's a little creative tension there of, 01:15:17.240 |
I want the acceleration afforded by specialization 01:15:22.100 |
so that the new algorithm is so much more effective 01:15:29.920 |
To build a good computer for an application like automotive, 01:15:34.360 |
there's all kinds of sensor inputs and safety processors 01:15:39.060 |
So one of Elon's goals was to make it super affordable. 01:15:55.160 |
And Elon's constraint was, I'm gonna put one in every car, 01:15:58.720 |
whether people buy autonomous driving or not. 01:16:01.640 |
So the cost constraint he had in mind was great. 01:16:05.200 |
And to hit that, you had to think about the system design. 01:16:14.240 |
You can say Stradivarius is this incredible thing, 01:16:20.440 |
picked wood and sanded it, and then he cut it, 01:16:47.880 |
I used to, I dug ditches when I was in college. 01:16:56.920 |
So there's an expression called complex mastery behavior. 01:17:04.060 |
When you do something and it's rote and simple, 01:17:06.680 |
But if the steps that you have to do are complicated 01:17:10.360 |
and you're good at 'em, it's satisfying to do them. 01:17:16.840 |
as you're doing them, you sometimes learn new things 01:17:23.720 |
And engineers, like engineering is complicated enough 01:17:28.760 |
and then a lot of what you do is then craftsman's work, 01:17:41.080 |
that essentially boils down to craftsman's work. 01:17:45.880 |
- Yeah, you know, there's thoughtful decisions 01:17:47.660 |
and problems to solve and trade-offs to make. 01:17:52.480 |
You know, you're building for the current car 01:18:01.420 |
It's not like I'm building a new type of neural network 01:18:04.740 |
which has a new mathematics and a new computer to work. 01:18:08.020 |
You know, that's, like there's more invention than that. 01:18:14.100 |
once you pick the architecture, you look inside 01:18:17.060 |
Adders and multipliers and memories and, you know, 01:18:21.180 |
So computers is always this weird set of abstraction layers 01:18:25.580 |
of ideas and thinking that reduction to practice 01:18:29.300 |
is transistors and wires and, you know, pretty basic stuff. 01:18:44.140 |
Like the people who work there really like it. 01:18:50.860 |
And the car is moving and the parts are moving 01:18:54.940 |
and you have to coordinate putting all the stuff together 01:19:03.940 |
and some of the guys sitting around were really bummed 01:19:17.780 |
But what they did was complicated and you couldn't do it. 01:19:34.620 |
in a minute and a half is unbelievably complicated. 01:19:38.160 |
And human beings can do it, it's really good. 01:19:42.500 |
I think that's harder than driving a car, by the way. 01:19:56.980 |
- No, not for us humans driving a car is easy. 01:20:07.460 |
because we've been evolving for billions of years. 01:20:15.640 |
- Oh, now you join the rest of the internet in mocking me. 01:20:50.900 |
what have you learned, have taken away from your time 01:21:00.860 |
innovation, craftsmanship, and all those things. 01:21:17.420 |
that no matter what you do, it's a local maximum. 01:21:24.260 |
and it was a lot better than what we were using. 01:21:33.300 |
And I said, "You know, when the super intelligent aliens 01:21:43.220 |
But doing interesting work that's both innovative, 01:21:49.440 |
and let's say craftsman's work on the current thing, 01:21:55.140 |
And then Elon was good at taking everything apart. 01:22:03.980 |
You know, that ability to look at it without assumptions, 01:22:21.860 |
Like, when they first landed two SpaceX rockets, at Tesla, 01:22:44.580 |
You think that's not gonna be unbelievably painful? 01:22:57.440 |
"to go take apart that many layers of assumptions?" 01:23:05.360 |
So it could be emotionally and intellectually painful, 01:23:07.900 |
that whole process of just stripping away assumptions? 01:23:23.620 |
when you get back into that one bit that's useful, 01:23:44.200 |
Now for a long time I've suspected you could get better. 01:23:47.040 |
Like you can think better, you can think more clearly, 01:23:52.040 |
And there's lots of examples of that, people who do that. 01:24:14.600 |
Well, no, I've read a couple of books a week for 55 years. 01:24:19.600 |
Well, maybe 50, 'cause I didn't learn to read 01:24:39.800 |
who wrote the best books and who like, you know, 01:24:58.720 |
and basically compared to all the VPs running around, 01:25:01.400 |
I'd read 19 more management books than anybody else. 01:25:12.660 |
- But at the core of that is questioning the assumptions, 01:25:16.960 |
or sort of entering, thinking first principles thinking, 01:25:21.760 |
sort of looking at the reality of the situation, 01:25:24.880 |
and using that knowledge, applying that knowledge. 01:25:28.200 |
- Yeah, so I would say my brain has this idea 01:25:38.280 |
and you have to kind of circle back to that observation. 01:25:45.120 |
- Well, it's hard to just keep it front and center, 01:25:47.280 |
'cause you operate on so many levels all the time, 01:26:08.260 |
But you do for a while, and that's kind of cool. 01:26:16.200 |
from the big picture, from the first principles, 01:26:19.480 |
do you think, you kind of answered it already, 01:26:24.320 |
is something we can solve on a timeline of years? 01:26:42.600 |
the fundamentals of building the hardware and the software? 01:27:00.240 |
people are doing frequency-domain analysis, 01:27:38.560 |
that when you add human beings into the picture, 01:27:50.360 |
- Cars are highly damped in terms of rate of change. 01:27:57.640 |
The acceleration, the acceleration's really slow. 01:28:02.840 |
On a ballistics time scale, but human behavior, 01:28:29.680 |
there's gonna be pleasant surprises all over the place. 01:28:49.800 |
you know, beyond the point, there's no looking back. 01:28:53.320 |
Do you share this worry of existential threats 01:28:57.360 |
from computers becoming superhuman level intelligent? 01:29:03.400 |
You know, like we already have a very stratified society. 01:29:07.520 |
And then if you look at the whole animal kingdom 01:29:12.560 |
and, you know, smart people have their niche, 01:29:15.280 |
and, you know, normal people have their niche, 01:29:26.040 |
for things that, you know, astronomically different, 01:29:29.480 |
like the whole something got 10 times smarter than us 01:29:32.320 |
and wanted to track us all down because what? 01:29:42.560 |
where there's something way smarter than you, 01:29:48.920 |
Well, there's what, 0.1% of the population who thinks that? 01:29:52.560 |
'Cause the rest of the population's been dealing with it 01:30:03.680 |
And, you know, superintelligence seems likely, 01:30:09.840 |
although we still don't know if we're magical, 01:30:16.320 |
and it seems likely that it'll create possibilities 01:30:20.920 |
and its interests will be interesting for whatever it is. 01:30:28.920 |
would somehow wanna fight over some square foot of dirt 01:30:32.400 |
or, you know, whatever the usual fears are about. 01:30:41.320 |
- Depends on how you think reality's constructed. 01:30:45.240 |
So for whatever reason, human beings are in, let's say, 01:30:55.400 |
Like, there's lots of philosophical understanding of that. 01:31:03.200 |
- So you think the evil is necessary for the good? 01:31:11.640 |
where your good is somebody else's, you know, evil. 01:31:43.160 |
will leave humans behind in a way that's painful? 01:31:51.280 |
- Isn't it already painful for a large percentage 01:31:54.900 |
I mean, society does have a lot of stress in it, 01:31:57.880 |
about the 1% and about to this and about to that, 01:32:00.680 |
but you know, everybody has a lot of stress in their life 01:32:05.360 |
and you know, know yourself seems to be the proper dictum 01:32:09.760 |
and pursue something that makes your life meaningful 01:32:45.800 |
because there were the happiest times of your life 01:32:58.040 |
I like that situation where you have some amount of optimism 01:33:04.840 |
- So you love the unknown, the mystery of it. 01:33:12.940 |
- What do you think is the meaning of this whole thing? 01:33:29.280 |
makes atoms which makes us which we do stuff. 01:33:32.820 |
And we figure out things and we explore things. 01:33:43.520 |
Jim, I don't think there's a better place to end it 01:33:56.200 |
and thank you to our presenting sponsor, Cash App. 01:33:59.360 |
Download it, use code LexPodcast, you'll get $10 01:34:03.080 |
and $10 will go to FIRST, a STEM education nonprofit 01:34:06.440 |
that inspires hundreds of thousands of young minds 01:34:12.200 |
If you enjoy this podcast, subscribe on YouTube, 01:34:15.000 |
give it five stars on Apple Podcast, follow on Spotify, 01:34:18.280 |
support on Patreon or simply connect with me on Twitter. 01:34:22.320 |
And now let me leave you with some words of wisdom 01:34:26.880 |
If everything you try works, you aren't trying hard enough. 01:34:30.920 |
Thank you for listening and hope to see you next time.