back to index

David Patterson: Computer Architecture and Data Storage | Lex Fridman Podcast #104


Chapters

0:0 Introduction
3:28 How have computers changed?
4:22 What's inside a computer?
10:2 Layers of abstraction
13:5 RISC vs CISC computer architectures
28:18 Designing a good instruction set is an art
31:46 Measures of performance
36:2 RISC instruction set
39:39 RISC-V open standard instruction set architecture
51:12 Why do ARM implementations vary?
52:57 Simple is beautiful in instruction set design
58:9 How machine learning changed computers
68:18 Machine learning benchmarks
76:30 Quantum computing
79:41 Moore's law
88:22 RAID data storage
96:53 Teaching
100:59 Wrestling
105:26 Meaning of life

Whisper Transcript | Transcript Only Page

00:00:00.000 | The following is a conversation with David Patterson,
00:00:03.440 | Turing Award winner
00:00:04.800 | and professor of computer science at Berkeley.
00:00:07.520 | He's known for pioneering contributions
00:00:09.760 | to risk processor architecture
00:00:11.680 | used by 99% of new chips today
00:00:14.760 | and for co-creating RAID storage.
00:00:18.040 | The impact that these two lines of research
00:00:20.080 | and development have had in our world is immeasurable.
00:00:23.720 | He's also one of the great educators
00:00:26.240 | of computer science in the world.
00:00:28.240 | His book with John Hennessy is how I first learned about
00:00:31.520 | and was humbled by the inner workings of machines
00:00:34.320 | at the lowest level.
00:00:35.960 | Quick summary of the ads.
00:00:37.520 | Two sponsors, the Jordan Harbinger Show and Cash App.
00:00:42.000 | Please consider supporting the podcast
00:00:43.640 | by going to jordanharbinger.com/lex
00:00:46.840 | and downloading Cash App and using code LEXPODCAST.
00:00:51.000 | Click on the links, buy the stuff.
00:00:53.520 | It's the best way to support this podcast
00:00:55.680 | and in general, the journey I'm on
00:00:57.800 | in my research and startup.
00:00:59.400 | This is the Artificial Intelligence Podcast.
00:01:01.960 | If you enjoy it, subscribe on YouTube,
00:01:04.000 | review it, the five stars on Apple Podcast,
00:01:06.240 | support it on Patreon or connect with me on Twitter
00:01:09.200 | at Lex Friedman, spelled without the E, just F-R-I-D-M-A-N.
00:01:14.200 | As usual, I'll do a few minutes of ads now
00:01:17.440 | and never any ads in the middle
00:01:18.840 | that can break the flow of the conversation.
00:01:21.320 | This episode is supported by the Jordan Harbinger Show.
00:01:25.160 | Go to jordanharbinger.com/lex.
00:01:27.960 | It's how he knows I sent you.
00:01:29.760 | On that page, there's links to subscribe to it
00:01:31.720 | on Apple Podcast, Spotify and everywhere else.
00:01:34.840 | I've been binging on this podcast.
00:01:36.640 | It's amazing.
00:01:37.480 | Jordan is a great human being.
00:01:39.360 | He gets the best out of his guests,
00:01:40.920 | dives deep, calls them out when it's needed
00:01:43.400 | and makes the whole thing fun to listen to.
00:01:45.440 | He's interviewed Kobe Bryant, Mark Cuban,
00:01:48.320 | Neil deGrasse Tyson, Garry Kasparov and many more.
00:01:51.800 | I recently listened to his conversation
00:01:53.680 | with Frank Abagnale, author of "Catch Me If You Can"
00:01:57.480 | and one of the world's most famous con men.
00:02:00.440 | Perfect podcast length and topic
00:02:02.160 | for a recent long distance run that I did.
00:02:06.000 | Again, go to jordanharbinger.com/lex.
00:02:09.160 | To give him my love and to support this podcast,
00:02:13.600 | subscribe also on Apple Podcast, Spotify and everywhere else.
00:02:17.380 | This show is presented by Cash App,
00:02:21.080 | the greatest sponsor of this podcast ever
00:02:22.920 | and the number one finance app in the App Store.
00:02:26.400 | When you get it, use code LEXPODCAST.
00:02:29.120 | Cash App lets you send money to friends,
00:02:31.160 | buy Bitcoin and invest in the stock market
00:02:33.040 | with as little as $1.
00:02:34.280 | Since Cash App allows you to buy Bitcoin,
00:02:37.320 | let me mention that cryptocurrency
00:02:39.120 | in the context of the history of money is fascinating.
00:02:41.960 | I recommend "A Scent of Money"
00:02:43.440 | as a great book on this history.
00:02:45.280 | Also, the audio book is amazing.
00:02:47.720 | Debits and credits on Ledger
00:02:49.000 | started around 30,000 years ago.
00:02:51.520 | The US dollar created over 200 years ago
00:02:54.200 | and the first decentralized cryptocurrency
00:02:56.200 | released just over 10 years ago.
00:02:58.180 | So given that history,
00:02:59.460 | cryptocurrency is still very much
00:03:01.280 | in its early days of development,
00:03:03.120 | but it's still aiming to
00:03:04.480 | and just might redefine the nature of money.
00:03:08.000 | So again, if you get Cash App
00:03:09.640 | from the App Store or Google Play
00:03:11.240 | and use the code LEXPODCAST, you get $10
00:03:15.160 | and Cash App will also donate $10 to FIRST,
00:03:18.080 | an organization that is helping to advance robotics
00:03:20.480 | and STEM education for young people around the world.
00:03:23.400 | And now, here's my conversation with David Patterson.
00:03:27.980 | Let's start with the big historical question.
00:03:31.720 | How have computers changed in the past 50 years
00:03:34.120 | at both the fundamental architectural level
00:03:36.600 | and in general, in your eyes?
00:03:39.240 | - Well, the biggest thing that happened
00:03:40.440 | was the invention of the microprocessor.
00:03:43.040 | So computers that used to fill up several rooms
00:03:46.360 | could fit inside your cell phone.
00:03:49.200 | And not only did they get smaller,
00:03:53.400 | they got a lot faster.
00:03:54.740 | So they're a million times faster
00:03:57.240 | than they were 50 years ago,
00:03:59.640 | and they're much cheaper and they're ubiquitous.
00:04:02.560 | There's 7.8 billion people on this planet.
00:04:07.600 | Probably half of them have cell phones right now,
00:04:09.880 | which is remarkable.
00:04:11.340 | - There's probably more microprocessors
00:04:14.080 | than there are people.
00:04:15.280 | - Sure, I don't know what the ratio is,
00:04:16.980 | but I'm sure it's above one.
00:04:19.420 | Maybe it's 10 to one or some number like that.
00:04:22.100 | - What is a microprocessor?
00:04:24.580 | - So a way to say what a microprocessor is
00:04:27.740 | is to tell you what's inside a computer.
00:04:29.540 | So a computer forever has classically had five pieces.
00:04:34.040 | There's input and output,
00:04:35.380 | which kind of naturally, as you'd expect,
00:04:37.380 | is input is like speech or typing,
00:04:40.540 | and output is displays.
00:04:42.140 | There's a memory, and like the name sounds,
00:04:48.420 | it remembers things.
00:04:50.220 | So it's integrated circuits whose job is
00:04:53.260 | you put information in,
00:04:54.460 | then when you ask for it, it comes back out.
00:04:56.000 | That's memory.
00:04:57.380 | And then the third part is the processor,
00:04:59.580 | where the term microprocessor comes from.
00:05:01.900 | And that has two pieces as well,
00:05:04.580 | and that is the control,
00:05:06.140 | which is kind of the brain of the processor,
00:05:10.100 | and what's called the arithmetic unit.
00:05:13.700 | It's kind of the brawn of the computer.
00:05:15.460 | So if you think of the, as a human body,
00:05:17.720 | the arithmetic unit,
00:05:19.220 | the thing that does the number crunching
00:05:20.660 | is the body, and the control is the brain.
00:05:23.440 | So those five pieces, input, output, memory,
00:05:26.040 | arithmetic unit, and control
00:05:30.700 | have been in computers since the very dawn,
00:05:33.460 | and the last two are considered the processor.
00:05:36.540 | So a microprocessor simply means
00:05:38.820 | a processor that fits on a microchip,
00:05:40.980 | and that was invented about 40 years ago,
00:05:44.660 | was the first microprocessor.
00:05:46.460 | - It's interesting that you refer to the arithmetic unit
00:05:48.660 | as the, like you connect it to the body,
00:05:52.340 | and the control is the brain.
00:05:54.240 | So I guess, I never thought of it that way.
00:05:56.420 | It's a nice way to think of it,
00:05:57.800 | because most of the actions the microprocessor does
00:06:01.900 | in terms of literally sort of computation,
00:06:05.820 | microprocessor does computation.
00:06:07.500 | It processes information.
00:06:09.060 | And most of the thing it does
00:06:10.860 | is basic arithmetic operations.
00:06:14.480 | What are the operations, by the way?
00:06:16.360 | - It's a lot like a calculator.
00:06:17.900 | So there are add instructions,
00:06:21.820 | subtract instructions, multiply and divide.
00:06:24.340 | And kind of the brilliance of the invention
00:06:28.660 | of the computer or the processor
00:06:33.040 | is that it performs very trivial operations,
00:06:36.080 | but it just performs billions of them per second.
00:06:39.220 | And what we're capable of doing is writing software
00:06:42.740 | that can take these very trivial instructions
00:06:45.540 | and have them create tasks that can do things better
00:06:48.280 | than human beings can do today.
00:06:50.460 | - Just looking back through your career,
00:06:52.560 | did you anticipate the kind of how good
00:06:54.860 | we would be able to get
00:06:56.200 | at doing these small basic operations?
00:06:59.400 | How many surprises along the way
00:07:03.020 | where you just kind of sat back and said,
00:07:05.900 | wow, I didn't expect it to go this fast, this good?
00:07:09.840 | - Well, the fundamental driving force
00:07:12.740 | is what's called Moore's Law,
00:07:15.380 | which was named after Gordon Moore,
00:07:17.700 | who's a Berkeley alumnus.
00:07:20.020 | And he made this observation very early
00:07:22.820 | in what are called semiconductors.
00:07:24.420 | And semiconductors are these ideas,
00:07:26.300 | you can build these very simple switches
00:07:29.300 | and you can put them on these microchips.
00:07:31.260 | And he made this observation over 50 years ago.
00:07:34.420 | He looked at a few years and said,
00:07:36.060 | I think what's going to happen
00:07:37.460 | is the number of these little switches called transistors
00:07:40.660 | is going to double every year for the next decade.
00:07:44.300 | And he said this in 1965.
00:07:46.260 | And in 1975, he said,
00:07:47.660 | well, maybe it's going to double every two years.
00:07:50.440 | And that what other people since named that Moore's Law
00:07:55.440 | guided the industry.
00:07:57.640 | And when Gordon Moore made that prediction,
00:07:59.440 | he wrote a paper back in, I think in the 70s
00:08:04.440 | and said, not only did this going to happen,
00:08:08.900 | he wrote, what would be the implications of that?
00:08:10.860 | And in this article from 1965,
00:08:13.300 | he shows ideas like computers being in cars
00:08:17.700 | and computers being in something that you would buy
00:08:21.700 | in the grocery store and stuff like that.
00:08:23.220 | So he kind of not only called his shot,
00:08:26.460 | he called the implications of it.
00:08:28.340 | So if you were in the computing field
00:08:30.920 | and if you believed Moore's prediction,
00:08:33.260 | he kind of said what would be happening in the future.
00:08:36.580 | So it's not kind of, it's at one sense,
00:08:41.540 | this is what was predicted.
00:08:43.140 | And you could imagine,
00:08:44.860 | it was easy to believe that Moore's Law was going to continue
00:08:47.100 | and so this would be the implications.
00:08:49.540 | On the other side,
00:08:50.540 | there are these kind of shocking events in your life.
00:08:53.420 | Like I remember driving in a Marine
00:08:57.580 | across the bay in San Francisco
00:08:59.640 | and seeing a bulletin board at a local civic center
00:09:03.700 | and it had a URL on it.
00:09:05.220 | And it was like, for the people at the time,
00:09:09.580 | these first URLs and that's the www select stuff
00:09:13.620 | with the HGP, people thought it looked like alien writing.
00:09:18.620 | They'd see these advertisements and commercials
00:09:23.620 | or bulletin boards that had this alien writing on it.
00:09:25.540 | So for the lay people, it was like,
00:09:26.700 | what the hell is going on here?
00:09:28.380 | And for those people in industry, it was, oh my God,
00:09:31.240 | this stuff is getting so popular,
00:09:33.580 | it's actually leaking out of our nerdy world
00:09:36.560 | into the real world.
00:09:37.940 | So that, I mean, there was events like that.
00:09:39.620 | I think another one was,
00:09:41.060 | I remember in the early days of the personal computer,
00:09:44.080 | when we started seeing advertisements in magazines
00:09:46.940 | for personal computers,
00:09:48.180 | like it's so popular that it's made the newspapers.
00:09:51.140 | So at one hand, Gordon Moore predicted it
00:09:54.380 | and you kind of expected it to happen,
00:09:55.860 | but when it really hit and you saw it affecting society,
00:09:58.620 | it was shocking.
00:10:02.580 | - So maybe taking a step back and looking
00:10:04.940 | at both the engineering and philosophical perspective,
00:10:08.140 | what do you see as the layers of abstraction in a computer?
00:10:12.500 | Do you see a computer as a set of layers of abstractions?
00:10:16.460 | - Yeah, I think that's one of the things
00:10:18.440 | that computer science fundamentals,
00:10:21.820 | is these things are really complicated
00:10:24.380 | in the way we cope with complicated software
00:10:27.460 | and complicated hardware, is these layers of abstraction.
00:10:30.060 | And that simply means that we,
00:10:34.460 | suspend disbelief and pretend
00:10:37.100 | that the only thing you know is that layer
00:10:39.660 | and you don't know anything about the layer below it.
00:10:42.060 | And that's the way we can make very complicated things.
00:10:45.020 | And probably it started with hardware,
00:10:48.500 | that that's the way it was done,
00:10:49.900 | but it's been proven extremely useful.
00:10:52.540 | And I would think in a modern computer today,
00:10:55.620 | there might be 10 or 20 layers of abstraction.
00:10:59.140 | And they're all trying to kind of enforce this contract
00:11:01.740 | is all you know is this interface.
00:11:05.140 | There's a set of commands that you can,
00:11:08.980 | are allowed to use and you stick to those commands
00:11:11.220 | and we will faithfully execute that.
00:11:12.980 | And it's like peeling the layers of a onion,
00:11:16.340 | you get down, there's a new set of layers and so forth.
00:11:19.260 | So for people who wanna study computer science,
00:11:23.120 | the exciting part about it is you can
00:11:27.180 | keep peeling those layers.
00:11:28.420 | You take your first course and you might learn to program
00:11:31.420 | in Python and then you can take a follow on course
00:11:34.700 | and you can get it down to a lower level language like C
00:11:37.860 | and you can go and then you can,
00:11:40.100 | if you want to, you can start getting
00:11:41.380 | into the hardware layers and you keep getting down
00:11:44.380 | all the way to that transistor that I talked about
00:11:47.160 | that Gordon Moore predicted.
00:11:49.140 | And you can understand all those layers all the way up
00:11:52.380 | to the highest level application software.
00:11:54.820 | So it's a very kind of magnetic field.
00:12:00.620 | If you're interested, you can go into any depth
00:12:03.940 | and keep going.
00:12:05.180 | In particular, what's happening right now
00:12:07.220 | or it's happened in software last 20 years
00:12:09.700 | and recently in hardware,
00:12:11.120 | there's getting to be open source versions
00:12:13.100 | of all of these things.
00:12:14.280 | So what open source means is what the engineer,
00:12:18.260 | the programmer designs, it's not secret
00:12:22.380 | the belonging to a company,
00:12:24.140 | it's out there on the worldwide web so you can see it.
00:12:27.340 | So you can look at for lots of pieces of software
00:12:32.180 | that you use, you can see exactly what the programmer does
00:12:35.340 | if you want to get involved.
00:12:37.840 | That used to stop at the hardware.
00:12:39.980 | Recently, there's been an efforts to make
00:12:43.540 | open source hardware and those interfaces open
00:12:46.420 | so you can see that.
00:12:47.260 | So instead of before you had to stop at the hardware,
00:12:49.400 | you can now start going layer by layer below that
00:12:52.580 | and see what's inside there.
00:12:53.960 | So it's a remarkable time that for the interested
00:12:57.520 | individual can really see in great depth
00:13:00.160 | what's really going on in the computers
00:13:02.120 | that power everything that we see around us.
00:13:05.440 | - Are you thinking also when you say open source
00:13:07.720 | at the hardware level, is this going to the
00:13:10.800 | design architecture instruction set level
00:13:13.960 | or is it going to literally the
00:13:18.880 | manufacturer of the actual hardware,
00:13:23.200 | of the actual chips, whether that's ASICs,
00:13:25.120 | specialized to a particular domain or the general?
00:13:27.600 | - Yeah, so let's talk about that a little bit.
00:13:30.120 | So when you get down to the bottom layer of software,
00:13:35.120 | the way software talks to hardware is in a vocabulary.
00:13:40.480 | And what we call that vocabulary, we call that,
00:13:43.760 | the words of that vocabulary are called instructions.
00:13:47.640 | And the technical term for the vocabulary
00:13:50.120 | is instruction set.
00:13:51.960 | So those instructions are like what we talked about earlier,
00:13:54.400 | they can be instructions like add, subtract,
00:13:56.600 | and multiply, divide.
00:13:58.000 | There's instructions to put data into memory,
00:14:01.840 | which is called a store instruction,
00:14:03.400 | and to get data back, which is called a load instructions.
00:14:05.640 | And those simple instructions go back
00:14:08.800 | to the very dawn of computing.
00:14:10.240 | In 1950, the commercial computer had these instructions.
00:14:14.760 | So that's the instruction set that we're talking about.
00:14:17.720 | So up until I'd say 10 years ago,
00:14:20.640 | these instruction sets were all proprietary.
00:14:22.860 | So a very popular one is owned by Intel,
00:14:27.120 | the one that's in the cloud
00:14:28.760 | and in all the PCs in the world.
00:14:30.760 | Intel owns that instruction set.
00:14:32.480 | It's referred to as the x86.
00:14:35.560 | There've been a sequence of ones
00:14:37.160 | that the first number was called 8086.
00:14:39.800 | And since then, there's been a lot of numbers,
00:14:42.080 | but they all end in 86.
00:14:43.240 | So there's been that kind of family of instruction sets.
00:14:48.040 | - And that's proprietary.
00:14:49.520 | - And that's proprietary.
00:14:50.780 | The other one that's very popular is from ARM.
00:14:54.500 | That kind of powers all the cell phones in the world,
00:14:57.740 | all the iPads in the world,
00:14:59.500 | and a lot of things that are so-called
00:15:02.180 | internet of things devices.
00:15:04.700 | ARM and that one is also proprietary.
00:15:08.140 | ARM will license it to people for a fee, but they own that.
00:15:12.380 | So the new idea that got started at Berkeley
00:15:15.260 | kind of unintentionally 10 years ago
00:15:17.780 | is early in my career,
00:15:21.180 | we pioneered a way to do these vocabularies instruction sets
00:15:25.860 | that was very controversial at the time.
00:15:28.220 | At the time in the 1980s,
00:15:30.340 | conventional wisdom was these
00:15:32.460 | vocabularies instruction sets
00:15:33.900 | should have powerful instructions.
00:15:36.500 | So polysyllabic kind of words, you can think of that.
00:15:40.360 | And so that instead of just add, subtract and multiply,
00:15:43.660 | they would have polynomial divide or sort a list.
00:15:47.980 | And the hope was of those powerful vocabularies
00:15:51.340 | that make it easier for software.
00:15:52.980 | So we thought that didn't make sense for microprocessors.
00:15:57.740 | There was people at Berkeley and Stanford and IBM
00:16:00.620 | who argued the opposite.
00:16:02.060 | And we called that was a reduced instruction set computer.
00:16:06.340 | And the abbreviation was RISC
00:16:09.420 | and typical for computer people,
00:16:10.940 | we use the abbreviations that are pronouncing it.
00:16:13.300 | So RISC was the thing.
00:16:14.860 | So we said for microprocessors,
00:16:17.180 | which with Gordon's more is changing really fast.
00:16:20.340 | We think it's better to have a pretty simple
00:16:22.620 | set of instructions, reduced set of instructions.
00:16:26.400 | That that would be a better way to build microprocessors
00:16:29.700 | since they're gonna be changing so fast due to Moore's law.
00:16:32.620 | And then we'll just use standard software
00:16:36.740 | to generate more of those simple instructions.
00:16:40.980 | And one of the pieces of software
00:16:43.540 | that's in that software stack
00:16:45.280 | going between these layers of abstractions
00:16:47.100 | is called a compiler.
00:16:48.180 | And it's basically translates.
00:16:50.100 | It's a translator between levels.
00:16:51.420 | We said the translator will handle that.
00:16:53.380 | So the technical question was,
00:16:55.340 | well, since there are these reduced instructions,
00:16:59.320 | you have to execute more of them.
00:17:01.060 | Yeah, that's right.
00:17:02.340 | But maybe you execute them faster.
00:17:04.100 | Yeah, that's right.
00:17:04.940 | They're simpler so they could go faster,
00:17:06.500 | but you have to do more of them.
00:17:07.380 | So what's that trade off look like?
00:17:10.580 | And it ended up that we ended up executing
00:17:13.420 | maybe 50% more instructions,
00:17:16.300 | maybe a third more instructions,
00:17:17.940 | but they ran four times faster.
00:17:19.620 | So this risk, controversial risk ideas
00:17:23.900 | proved to be maybe factors of three or four better.
00:17:26.760 | - I love that this idea was controversial
00:17:29.980 | and almost kind of like rebellious.
00:17:32.640 | So that's in the context of what was more conventional
00:17:36.660 | is the complex instructional set computing.
00:17:39.180 | So how'd you pronounce that?
00:17:41.260 | - CISC.
00:17:42.100 | - CISC, which is risk.
00:17:43.060 | - Risk versus CISC.
00:17:44.060 | And believe it or not, this sounds very,
00:17:48.060 | who cares about this, right?
00:17:50.420 | It was violently debated at several conferences.
00:17:54.780 | It's like, what's the right way to go?
00:17:57.060 | And people thought risk was a de-evolution.
00:18:01.140 | We're gonna make software worse
00:18:02.500 | by making those instructions simpler.
00:18:04.580 | And there are fierce debates
00:18:06.820 | at several conferences in the 1980s.
00:18:09.340 | And then later in the '80s,
00:18:11.140 | it kind of settled to these benefits.
00:18:14.620 | - It's not completely intuitive to me
00:18:16.120 | why risk has, for the most part, won.
00:18:18.780 | - Yeah, so why did that happen?
00:18:21.780 | - Yeah, yeah, and maybe I can sort of say
00:18:23.380 | a bunch of dumb things that could lay the land
00:18:25.460 | for further commentary.
00:18:27.060 | So to me, this is kind of interesting thing.
00:18:30.780 | If you look at C++ versus C,
00:18:33.340 | with modern compilers,
00:18:34.940 | you really could write faster code with C++.
00:18:38.660 | So relying on the compiler to reduce your complicated code
00:18:42.940 | into something simple and fast.
00:18:44.900 | So to me, comparing risk,
00:18:48.580 | maybe this is a dumb question,
00:18:50.020 | but why is it that focusing the definition,
00:18:54.060 | the design of the instruction set
00:18:55.820 | on very few simple instructions
00:18:58.260 | in the long run provide faster execution
00:19:02.980 | versus coming up with, like you said,
00:19:06.260 | a ton of complicated instructions
00:19:10.060 | that over time, years, maybe decades,
00:19:13.940 | you come up with compilers
00:19:15.260 | that can reduce those into simple instructions for you?
00:19:19.180 | - Yeah, so let's try and split that into two pieces.
00:19:22.660 | So if the compiler can do that for you,
00:19:26.260 | if the compiler can take a complicated program
00:19:29.980 | and produce simpler instructions,
00:19:33.140 | then the programmer doesn't care, right?
00:19:35.780 | Programmer, I don't care just how fast is the computer
00:19:39.900 | I'm using, how much does it cost?
00:19:41.860 | And so what happened kind of in the software industry
00:19:46.380 | is right around before the 1980s,
00:19:48.700 | critical pieces of software were still written
00:19:51.660 | not in languages like C or C++.
00:19:55.540 | They were written in what's called assembly language,
00:19:58.180 | where there's this kind of humans writing exactly
00:20:01.660 | at the instructions at the level
00:20:03.900 | that a computer can understand.
00:20:05.900 | So they were writing add, subtract, multiply instructions.
00:20:10.460 | It's very tedious, but the belief was to write
00:20:14.060 | this lowest level of software that people use,
00:20:17.580 | which are called operating systems,
00:20:18.860 | they had to be written in assembly language
00:20:21.020 | because these high-level languages were just too inefficient.
00:20:24.340 | They were too slow or the programs would be too big.
00:20:29.620 | So that changed with a famous operating system called Unix,
00:20:34.020 | which is kind of the grandfather
00:20:36.340 | of all the operating systems today.
00:20:38.740 | So Unix demonstrated that you could write
00:20:41.940 | something as complicated as an operating system
00:20:44.060 | in a language like C.
00:20:46.040 | So once that was true,
00:20:48.580 | then that meant we could hide the instruction set
00:20:51.980 | from the programmer.
00:20:53.640 | And so that meant then it didn't really matter.
00:20:57.140 | The programmer didn't have to write
00:20:59.260 | lots of these simple instructions.
00:21:00.940 | That was up to the compiler.
00:21:02.260 | So that was part of our arguments for risk,
00:21:04.180 | is if you were still writing in assembly language,
00:21:06.580 | there's maybe a better case for CISC instructions.
00:21:09.540 | But if the compiler can do that,
00:21:11.260 | it's gonna be, that's done once.
00:21:14.140 | The computer translates it once,
00:21:15.940 | and then every time you run the program,
00:21:17.780 | it runs at this potentially simpler instructions.
00:21:21.060 | And so that was the debate, right?
00:21:25.540 | And people would acknowledge that the simpler instructions
00:21:29.340 | could lead to a faster computer.
00:21:30.980 | You can think of monosyllabic instructions.
00:21:33.580 | You could say them, if you think of reading,
00:21:35.660 | you could probably read them faster
00:21:36.900 | or say them faster than long instructions.
00:21:39.060 | The same thing, that analogy works pretty well for hardware.
00:21:42.700 | And as long as you didn't have to read
00:21:44.680 | a lot more of those instructions, you could win.
00:21:47.000 | So that's the basic idea for risk.
00:21:50.340 | - But it's interesting that in that discussion of Unix and C
00:21:54.340 | that there's only one step of levels of abstraction
00:21:59.020 | from the code that's really the closest to the machine
00:22:03.140 | to the code that's written by human.
00:22:05.500 | It's, at least to me again, perhaps a dumb intuition,
00:22:09.960 | but it feels like there might have been more layers,
00:22:13.440 | sort of different kinds of humans
00:22:15.340 | stacked on top of each other.
00:22:17.380 | - So what's true and not true about what you said
00:22:21.140 | is several of the layers of software,
00:22:26.140 | like, so if you, two layers would be,
00:22:31.380 | suppose we just talk about two layers.
00:22:32.700 | That would be the operating system,
00:22:34.060 | like you get from Microsoft or from Apple,
00:22:37.460 | like iOS or the Windows operating system.
00:22:41.260 | And let's say applications that run on top of it,
00:22:43.580 | like Word or Excel.
00:22:46.020 | So both the operating system could be written in C,
00:22:51.160 | and the application could be written in C.
00:22:53.440 | But you could construct those two layers
00:22:56.520 | and the applications absolutely do call up
00:22:58.840 | on the operating system.
00:23:00.400 | And the change was that both of them
00:23:03.000 | could be written in higher level languages.
00:23:04.940 | So it's one step of a translation,
00:23:07.080 | but you can still build many layers of abstraction
00:23:10.360 | of software on top of that.
00:23:11.760 | And that's how things are done today.
00:23:13.640 | So still today, many of the layers that you'll deal with,
00:23:19.480 | you may deal with debuggers, you may deal with linkers.
00:23:24.040 | There's libraries.
00:23:27.000 | Many of those today will be written in C++,
00:23:31.200 | say, even though that language is pretty ancient.
00:23:35.000 | And even the Python interpreter is probably written
00:23:38.880 | in C or C++.
00:23:40.280 | So lots of layers there are probably written in these,
00:23:44.360 | some old fashioned efficient languages
00:23:47.260 | that still take one step to produce these instructions,
00:23:52.260 | produce RISC instructions,
00:23:54.820 | but they're composed, each layer of software invokes
00:23:58.760 | one another through these interfaces,
00:24:01.040 | and you can get 10 layers of software that way.
00:24:04.360 | - So in general, the RISC was developed here at Berkeley?
00:24:07.520 | - It was kind of the three places that were these radicals
00:24:11.700 | that advocated for this against the rest of the community
00:24:14.600 | were IBM, Berkeley, and Stanford.
00:24:16.900 | - You're one of these radicals,
00:24:20.540 | and how radical did you feel?
00:24:24.460 | How confident did you feel?
00:24:26.460 | How doubtful were you that RISC might be the right approach?
00:24:31.460 | 'Cause it may, you can also, into it,
00:24:33.820 | that is kind of taking a step back into simplicity,
00:24:36.700 | not forward into simplicity.
00:24:38.780 | - Yeah, no, it was easy to make, yeah.
00:24:42.620 | It was easy to make the argument against it.
00:24:44.020 | Well, this was my colleague, John Hennessy at Stanford,
00:24:48.140 | and we were both assistant professors,
00:24:49.900 | and for me, I just believed in the power of our ideas.
00:24:54.900 | I thought what we were saying made sense.
00:24:57.140 | Moore's law is gonna move fast.
00:24:58.900 | The other thing that I didn't mention
00:25:01.700 | is one of the surprises of these complex instruction sets.
00:25:05.820 | You could certainly write these complex instructions
00:25:08.420 | if the programmer is writing them themselves.
00:25:11.020 | It turned out to be kind of difficult
00:25:13.260 | for the compiler to generate those complex instructions.
00:25:15.780 | Kind of ironically, you'd have to find
00:25:17.980 | the right circumstances that just exactly
00:25:20.740 | fit this complex instruction.
00:25:21.940 | It was actually easier for the compiler
00:25:23.740 | to generate these simple instructions.
00:25:25.300 | So not only did these complex instructions
00:25:28.620 | make the hardware more difficult to build,
00:25:31.680 | often the compiler wouldn't even use them.
00:25:33.900 | And so it's harder to build.
00:25:36.620 | The compiler doesn't use them that much.
00:25:39.180 | The simple instructions go better with Moore's law.
00:25:41.740 | The number of transistors is doubling every two years,
00:25:45.460 | so we're gonna have, you know,
00:25:47.780 | you wanna reduce the time to design the microprocessor,
00:25:50.280 | that may be more important than the number of instructions.
00:25:52.660 | So I think we believed in the,
00:25:55.980 | that we were right, that this was the best idea.
00:25:59.500 | Then the question became in these debates,
00:26:01.700 | well, yeah, that's a good technical idea,
00:26:04.020 | but in the business world, this doesn't matter.
00:26:06.260 | There's other things that matter.
00:26:07.740 | It's like arguing that if there's a standard
00:26:11.860 | with the railroad tracks,
00:26:13.540 | and you've come up with a better width,
00:26:15.300 | but the whole world is covered in railroad tracks,
00:26:17.420 | so your ideas have no chance of success, commercial success.
00:26:22.260 | It was technically right,
00:26:23.140 | but commercially, it'll be insignificant.
00:26:25.640 | - Yeah, it's kind of sad that this world,
00:26:28.660 | the history of human civilization is full of good ideas
00:26:32.540 | that lost because somebody else came along first
00:26:36.020 | with a worse idea.
00:26:37.700 | And it's good that in the computing world,
00:26:39.820 | at least some of these have, well, you could,
00:26:42.140 | I mean, there's probably still CISC people that say--
00:26:45.340 | - Yeah, there still are.
00:26:46.660 | (laughing)
00:26:47.860 | And what happened was, what was interesting,
00:26:50.180 | Intel, a bunch of the CISC,
00:26:51.740 | companies with CISC instruction sets of vocabulary,
00:26:56.440 | they gave up, but not Intel.
00:26:58.640 | What Intel did, to its credit,
00:27:01.100 | because Intel's vocabulary was in the personal computer,
00:27:07.260 | and so that was a very valuable vocabulary
00:27:09.500 | because the way we distribute software
00:27:12.380 | is in those actual instructions.
00:27:14.280 | It's in the instructions of that instruction set.
00:27:16.260 | So you don't get that source code,
00:27:19.260 | what the programmers wrote,
00:27:20.900 | you get, after it's been translated into the lowest level,
00:27:24.340 | that's, if you were to get a floppy disk
00:27:26.300 | or download software,
00:27:27.220 | it's in the instructions of that instruction set.
00:27:29.400 | So the x86 instruction set was very valuable.
00:27:33.420 | So what Intel did cleverly and amazingly
00:27:36.860 | is they had their chips in hardware do a translation step.
00:27:41.860 | They would take these complex instructions
00:27:43.940 | and translate them into essentially in RISC instructions
00:27:46.400 | in hardware on the fly, at gigahertz clock speeds,
00:27:51.300 | and then any good idea that RISC people had,
00:27:53.860 | they could use, and they could still be compatible
00:27:56.620 | with this really valuable PC software base,
00:28:01.620 | which also had very high volumes,
00:28:04.820 | 100 million personal computers per year.
00:28:07.140 | So the CISC architecture in the business world
00:28:11.460 | was actually one in this PC era.
00:28:15.300 | - So just going back to the time of designing RISC,
00:28:22.180 | when you design an instruction set architecture,
00:28:27.340 | do you think like a programmer?
00:28:29.120 | Do you think like a microprocessor engineer?
00:28:32.380 | Do you think like a artist, a philosopher?
00:28:36.700 | Do you think in software and hardware?
00:28:38.860 | I mean, is it art, is it science?
00:28:40.920 | - Yeah, I'd say, I think designing a good instruction set
00:28:44.340 | is an art, and I think you're trying to balance
00:28:48.640 | the simplicity and speed of execution
00:28:54.260 | with how well easy it will be for compilers to use it.
00:28:58.620 | You're trying to create an instruction set
00:29:00.920 | that everything in there can be used by compilers.
00:29:04.620 | There's not things that are missing
00:29:07.340 | that'll make it difficult for the program to run,
00:29:09.780 | they run efficiently,
00:29:11.740 | but you want it to be easy to build as well.
00:29:13.620 | So it's that kind of, so you're thinking,
00:29:15.540 | I'd say you're thinking hardware,
00:29:16.920 | trying to find a hardware software compromise
00:29:19.300 | that'll work well.
00:29:20.620 | And it's a matter of taste, right?
00:29:25.620 | It's kind of fun to build instruction sets.
00:29:29.180 | It's not that hard to build an instruction set,
00:29:31.280 | but to build one that catches on and people use,
00:29:35.480 | you have to be fortunate to be the right place
00:29:38.680 | at the right time,
00:29:39.520 | or have a design that people really like.
00:29:41.940 | - Are you using metrics?
00:29:43.320 | So is it quantifiable?
00:29:46.320 | Because you kind of have to anticipate
00:29:48.120 | the kind of programs that people write ahead of time.
00:29:50.880 | So is that, can you use numbers, can you use metrics,
00:29:54.400 | can you quantify something ahead of time,
00:29:56.720 | or is this, again, that's the art part
00:29:58.400 | where you're kind of anticipating?
00:29:59.240 | - No, it's a big change, kind of what happened,
00:30:03.400 | I think from Hennessy's and my perspective in the 1980s,
00:30:07.120 | what happened was going from kind of really,
00:30:10.920 | taste and hunches to quantifiable.
00:30:16.640 | And in fact, he and I wrote a textbook
00:30:19.720 | at the end of the 1980s called
00:30:21.280 | "Computer Architecture, A Quantitative Approach."
00:30:23.600 | - I heard of that.
00:30:24.440 | - And it's the thing,
00:30:27.080 | it had a pretty big impact in the field
00:30:30.160 | 'cause we went from textbooks that kind of listed,
00:30:33.920 | so here's what this computer does,
00:30:35.760 | and here's the pros and cons,
00:30:36.960 | and here's what this computer does and pros and cons,
00:30:38.640 | to something where there were formulas and equations
00:30:41.520 | where you could measure things.
00:30:42.440 | So specifically for instruction sets,
00:30:44.700 | what we do and some other fields do
00:30:49.640 | is we agree upon a set of programs,
00:30:51.960 | which we call benchmarks,
00:30:53.760 | and a suite of programs,
00:30:56.100 | and then you develop both the hardware and the compiler,
00:31:00.160 | and you get numbers on how well your computer does,
00:31:05.160 | given its instruction set,
00:31:07.720 | and how well you implemented it in your microprocessor,
00:31:10.560 | and how good your compilers are.
00:31:12.760 | And in computer architecture,
00:31:14.680 | using professors' terms,
00:31:16.720 | we grade on a curve rather than grade on an absolute scale.
00:31:19.240 | So when you say,
00:31:20.140 | these programs run this fast,
00:31:22.820 | well, that's kind of interesting,
00:31:23.940 | but how do you know it's better?
00:31:25.240 | Well, you compare it to other computers of the same time.
00:31:28.560 | So the best way we know how to make,
00:31:31.200 | turn it into a kind of more science
00:31:34.560 | and experimental and quantitative
00:31:36.320 | is to compare yourself to other computers of the same era
00:31:39.800 | that have the same access,
00:31:40.880 | the same kind of technology,
00:31:42.560 | on commonly agreed benchmark programs.
00:31:45.060 | - So maybe to toss up two possible directions we can go,
00:31:49.160 | one is what are the different trade-offs
00:31:51.560 | in designing architectures?
00:31:54.120 | We've been already talking about CISC and RISC,
00:31:56.000 | but maybe a little bit more detail
00:31:58.940 | in terms of specific features that you were thinking about.
00:32:02.200 | And the other side is,
00:32:03.760 | what are the metrics that you're thinking about
00:32:06.200 | when looking at these trade-offs?
00:32:08.280 | - Yeah, let's talk about the metrics.
00:32:10.040 | So during these debates,
00:32:12.820 | we actually had kind of a hard time explaining,
00:32:15.580 | convincing people the ideas,
00:32:17.040 | and partly we didn't have a formula to explain it.
00:32:20.360 | And a few years into it,
00:32:22.120 | we hit upon the formula that helped explain
00:32:24.760 | what was going on.
00:32:25.840 | And I think if we can do this,
00:32:28.760 | see how it works orally to do this.
00:32:30.480 | So, let's see if I can do a formula orally.
00:32:35.040 | So fundamentally, the way you measure performance
00:32:39.760 | is how long does it take a program to run?
00:32:42.440 | Program, if you have 10 programs,
00:32:45.600 | and typically these benchmarks were sweet
00:32:47.320 | 'cause you'd wanna have 10 programs
00:32:48.880 | so they could represent lots of different applications.
00:32:51.420 | So for these 10 programs, how long did it take to run?
00:32:53.960 | Well, now, when you're trying to explain
00:32:55.960 | why it took so long,
00:32:57.080 | you could factor how long it takes a program to run
00:33:00.000 | into three factors.
00:33:01.560 | One of the first one is how many instructions
00:33:06.040 | did it take to execute?
00:33:07.240 | So that's what we've been talking about,
00:33:09.960 | the instructions of the academy.
00:33:11.240 | How many did it take?
00:33:12.480 | All right.
00:33:14.000 | The next question is how long did each instruction
00:33:17.240 | take to run on average?
00:33:18.840 | So you'd multiply the number of instructions
00:33:21.500 | times how long it took to run,
00:33:23.420 | and that gets you a whole time.
00:33:24.620 | Okay, so that's, but now let's look at this metric
00:33:28.260 | of how long did it take the instruction to run?
00:33:29.980 | Well, it turns out the way we could build computers today
00:33:33.780 | is they all have a clock.
00:33:35.140 | And you've seen this, if you buy a microprocessor,
00:33:37.940 | it'll say 3.1 gigahertz or 2.5 gigahertz,
00:33:42.300 | and more gigahertz is good.
00:33:43.920 | Well, what that is is the speed of the clock.
00:33:46.620 | So 2.5 gigahertz turns out to be four billionths
00:33:50.920 | of instruction or four nanoseconds.
00:33:53.420 | So that's the clock cycle time.
00:33:55.740 | But there's another factor, which is what's the average
00:33:58.620 | number of clock cycles it takes per instruction?
00:34:01.260 | So it's number of instructions, average number
00:34:03.980 | of clock cycles and the clock cycle time.
00:34:06.420 | So in these RISC-Sys debates, they would concentrate on,
00:34:11.060 | but RISC needs to take more instructions.
00:34:14.140 | And we'd argue what maybe the clock cycle is faster,
00:34:16.820 | but what the real big difference was,
00:34:19.260 | was the number of clock cycles per instruction.
00:34:21.300 | - Per instruction, that's fascinating.
00:34:22.740 | What about the mess of, the beautiful mess of parallelism
00:34:25.780 | in the whole picture?
00:34:26.860 | - Parallelism, which has to do with say,
00:34:28.860 | how many instructions could execute in parallel
00:34:31.460 | and things like that.
00:34:32.720 | You could think of that as affecting the clock cycles
00:34:34.980 | per instruction, 'cause it's the average clock cycles
00:34:37.020 | per instruction.
00:34:38.100 | So when you're running a program,
00:34:39.380 | if it took a hundred billion instructions
00:34:42.860 | and on average, it took two clock cycles per instruction
00:34:46.020 | and they were four nanoseconds, you could multiply that out
00:34:48.140 | and see how long it took to run.
00:34:50.060 | And there's all kinds of tricks to try and reduce
00:34:51.940 | the number of clock cycles per instruction.
00:34:54.100 | But it turned out that the way they would do
00:34:58.120 | these complex instructions is they would actually build
00:35:00.820 | what we would call an interpreter in a simpler,
00:35:04.060 | a very simple hardware interpreter.
00:35:05.940 | But it turned out that for the SISC instructions,
00:35:08.900 | if you had to use one of those interpreters,
00:35:10.820 | it would be like 10 clock cycles per instruction
00:35:13.300 | where the RISC instructions could be two.
00:35:16.100 | So there'd be this factor of five advantage
00:35:18.300 | in clock cycles per instruction.
00:35:20.240 | We have to execute say 25 or 50% more instructions.
00:35:23.620 | So that's where the win would come.
00:35:25.140 | And then you could make an argument
00:35:26.340 | whether the clock cycle times are the same or not.
00:35:28.380 | But pointing out that we could divide the benchmark results
00:35:32.960 | time per program into three factors.
00:35:35.400 | And the biggest difference in RISC and SISC
00:35:37.820 | was the clock cycles per, you execute a few more instructions
00:35:40.780 | but the clock cycles per instruction is much less.
00:35:43.380 | And that was what this debate was.
00:35:45.020 | Once we made that argument, then people said,
00:35:48.460 | oh, okay, I get it.
00:35:49.900 | And so we went from, it was outrageously controversial
00:35:54.540 | in 1982 that maybe probably by 1984 or so,
00:35:58.660 | people said, oh yeah, technically,
00:36:00.820 | they've got a good argument.
00:36:02.180 | - What are the instructions in the RISC instruction set?
00:36:06.280 | Just to get an intuition.
00:36:08.620 | - Okay, 1995, I was asked to predict the future
00:36:13.380 | of what microprocessor future.
00:36:14.900 | So I'd seen these predictions
00:36:18.300 | and usually people predict something outrageous
00:36:20.900 | just to be entertaining, right?
00:36:22.940 | And so my prediction for 2020 was,
00:36:26.220 | things are gonna be pretty much,
00:36:27.740 | they're gonna look very familiar to what they are.
00:36:29.900 | And they are, if you were to read the article,
00:36:33.260 | the things I said are pretty much true.
00:36:34.820 | The instructions that have been around forever
00:36:37.260 | are kind of the same.
00:36:38.300 | - And that's the outrageous prediction actually,
00:36:40.740 | given how fast computers have been growing.
00:36:41.980 | - Well, and Moore's law was gonna go on,
00:36:44.120 | we thought for 25 more years, who knows?
00:36:47.800 | But kind of the surprising thing,
00:36:49.580 | in fact, Hennessy and I won the ACM AM Turing Award
00:36:54.580 | for both the RISC instruction set contributions
00:36:57.860 | and for that textbook I mentioned.
00:36:59.780 | But we are surprised that here we are 35,
00:37:03.820 | 40 years later after we did our work,
00:37:08.320 | and the conventional wisdom
00:37:10.480 | of the best way to do instruction sets
00:37:12.680 | is still those RISC instruction sets
00:37:14.520 | that look very similar to what we looked like
00:37:17.160 | we did in the 1980s.
00:37:18.400 | So those, surprisingly,
00:37:20.760 | there hasn't been some radical new idea,
00:37:23.640 | even though we have a million times as many transistors
00:37:26.840 | as we had back then.
00:37:28.620 | - But what are the basic instructions
00:37:31.640 | and how did they change over the years?
00:37:33.200 | So are we talking about addition, subtraction,
00:37:35.280 | these are the--
00:37:36.120 | - It's a specific, so the things that are in a calculator
00:37:40.680 | are in a computer.
00:37:41.680 | So any of the buttons that are in the calculator
00:37:43.840 | in the computer.
00:37:44.720 | So the-- - Nice way to put it.
00:37:46.120 | - So there's a memory function key,
00:37:48.280 | and like I said, those are turns into
00:37:50.080 | putting something in memory is called a store,
00:37:51.720 | bring something back is called a load.
00:37:53.080 | - Just a quick tangent, when you say memory,
00:37:55.780 | what does memory mean?
00:37:57.000 | - Well, I told you there were five pieces of a computer,
00:38:00.560 | and if you remember in a calculator, there's a memory key,
00:38:03.440 | so you wanna have intermediate calculation
00:38:05.520 | and bring it back later.
00:38:06.720 | So you'd hit the memory plus key, M plus maybe,
00:38:09.480 | and it would put that into memory,
00:38:10.920 | and then you'd hit an RM like current instruction,
00:38:13.680 | and it'd bring it back into display,
00:38:15.120 | so you don't have to type it,
00:38:16.280 | you don't have to write it down and bring it back again.
00:38:17.960 | So that's exactly what memory is,
00:38:19.760 | that you can put things into it as temporary storage
00:38:22.760 | and bring it back when you need it later.
00:38:24.760 | So that's memory and loads and stores.
00:38:27.400 | But the big thing, the difference between a computer
00:38:30.720 | and a calculator is that the computer can make decisions.
00:38:34.680 | And amazingly, decisions are as simple as,
00:38:38.360 | is this value less than zero,
00:38:40.560 | or is this value bigger than that value?
00:38:42.960 | So there's, and those instructions,
00:38:45.400 | which are called conditional branch instructions,
00:38:47.600 | is what give computers all its power.
00:38:50.280 | If you were in the early days of computing
00:38:52.440 | before what's called the general purpose microprocessor,
00:38:55.160 | people would write these instructions kind of in hardware,
00:39:00.160 | but it couldn't make decisions,
00:39:01.640 | it would just, it would do the same thing
00:39:03.520 | over and over again.
00:39:04.600 | With the power of having branch instructions,
00:39:08.040 | it can look at things and make decisions automatically.
00:39:10.760 | And it can make these decisions,
00:39:12.360 | billions of times per second.
00:39:13.840 | And amazingly enough, we can get,
00:39:16.520 | thanks to advanced machine learning,
00:39:18.080 | we can create programs that can do something
00:39:21.160 | smarter than human beings can do.
00:39:22.960 | But if you go down that very basic level,
00:39:24.680 | what's the instructions are the keys on the calculator,
00:39:28.120 | plus the ability to make decisions,
00:39:30.440 | these conditional branch instructions.
00:39:32.360 | - And all decisions fundamentally can be reduced
00:39:34.440 | down to these branch instructions.
00:39:36.760 | - Yeah, so in fact, and so,
00:39:39.160 | going way back in the stack, back to,
00:39:42.320 | we did four RISC projects at Berkeley in the 1980s,
00:39:45.600 | they did a couple at Stanford in the 1980s.
00:39:48.960 | In 2010, we decided we wanted to do a new instruction set,
00:39:53.920 | learning from the mistakes of those RISC architectures
00:39:56.560 | in the 1980s, and that was done here at Berkeley.
00:40:00.040 | Almost exactly 10 years ago,
00:40:01.600 | and the people who did it, I participated,
00:40:04.680 | but other, Krzysztof Sanovic and others drove it.
00:40:08.560 | They called it RISC-V to honor those RISC,
00:40:11.480 | the four RISC projects of the 1980s.
00:40:14.000 | - So what does RISC-V involve?
00:40:15.960 | - So RISC-V is another instruction set vocabulary.
00:40:20.040 | It's learned from the mistakes of the past,
00:40:22.200 | but it still has, if you look at the,
00:40:24.440 | there's a core set of instructions
00:40:25.800 | that's very similar to the simplest architectures
00:40:28.520 | from the 1980s, and the big difference
00:40:30.880 | about RISC-V is it's open.
00:40:33.320 | So I talked earlier about proprietary
00:40:35.280 | versus open, kind of software.
00:40:40.280 | So this is an instruction set, so it's a vocabulary.
00:40:43.480 | It's not hardware, but by having an open instruction set,
00:40:47.280 | we can have open source implementations,
00:40:50.200 | open source processors that people can use.
00:40:52.880 | - Where do you see that going?
00:40:56.360 | So it's a really exciting possibility,
00:40:58.080 | but you're just like in the scientific American,
00:41:00.200 | if you were to predict 10, 20, 30 years from now,
00:41:03.600 | that kind of ability to utilize open source
00:41:07.840 | instruction set architectures like RISC-V,
00:41:11.120 | what kind of possibilities might that unlock?
00:41:13.680 | - Yeah, and so just to make it clear,
00:41:16.040 | because this is confusing, the specification of RISC-V
00:41:20.320 | is something that's like in a textbook.
00:41:22.380 | There's books about it.
00:41:23.560 | So that's defining an interface.
00:41:27.640 | There's also the way you build hardware
00:41:29.880 | is you write it in languages.
00:41:31.960 | They're kind of like C, but they're specialized
00:41:34.560 | for hardware that gets translated into hardware.
00:41:38.180 | And so these implementations of this specification
00:41:42.360 | are what are the open source.
00:41:43.960 | So they're written in something that's called Verilog or VHDL,
00:41:47.520 | but it's put up on the web, just like you can see
00:41:50.600 | the C++ code for Linux on the web.
00:41:54.760 | So that's the open instruction set
00:41:56.800 | enables open source implementations of RISC-V.
00:42:00.720 | - So you can literally build a processor
00:42:02.320 | using this instruction set.
00:42:04.200 | - People are, people are.
00:42:05.540 | So what happened to us, the story was,
00:42:08.140 | this was developed here for our use to do our research.
00:42:11.760 | And we made it, we licensed under the Berkeley
00:42:14.600 | software distribution license,
00:42:16.080 | like a lot of things get licensed here.
00:42:18.020 | So other academics use it, they wouldn't be afraid to use it.
00:42:20.800 | And then about 2014, we started getting complaints
00:42:25.760 | that we were using it in our research and in our courses.
00:42:28.560 | And we got complaints from people in industries,
00:42:30.880 | why did you change your instruction set
00:42:33.880 | between the fall and the spring semester?
00:42:36.760 | And well, we get complaints from industrial time.
00:42:38.600 | Why the hell do you care
00:42:40.440 | what we do with our instructions?
00:42:42.020 | And then when we talked to them, we found out
00:42:44.000 | there was this thirst for this idea
00:42:46.040 | of an open instruction set architecture.
00:42:47.720 | And they had been looking for one,
00:42:49.360 | they stumbled upon ours at Berkeley,
00:42:51.320 | thought it was, boy, this looks great.
00:42:54.120 | We should use this one.
00:42:55.920 | And so once we realized there is this need
00:42:58.560 | for an open instruction set architecture,
00:43:00.480 | we thought that's a great idea.
00:43:02.080 | And then we started supporting it
00:43:03.680 | and tried to make it happen.
00:43:05.200 | So this was, we accidentally stumbled into this,
00:43:09.680 | into this need and our timing was good.
00:43:12.040 | And so it's really taking off.
00:43:14.640 | There's, you know, universities are good at starting things,
00:43:18.520 | but they're not good at sustaining things.
00:43:20.040 | So like Linux has a Linux foundation,
00:43:22.480 | there's a RISC-V foundation that we started.
00:43:25.360 | There's an annual conferences.
00:43:27.560 | And the first one was done, I think, January of 2015.
00:43:31.800 | And the one that was just last December,
00:43:33.480 | and it, you know, it had 50 people at it.
00:43:35.360 | And the one last December had, I don't know,
00:43:38.880 | 1700 people were at it
00:43:40.800 | and the companies excited all over the world.
00:43:43.660 | So if predicting into the future, you know,
00:43:46.640 | if we were doing 25 years, I would predict that RISC-V
00:43:49.800 | will be, you know, possibly the most popular
00:43:53.600 | instruction set architecture out there,
00:43:55.440 | because it's a pretty good instruction set architecture
00:43:58.600 | and it's open and free.
00:43:59.880 | And there's no reason lots of people shouldn't use it.
00:44:04.480 | And there's benefits, just like Linux is so popular today
00:44:09.160 | compared to 20 years ago.
00:44:10.680 | And, you know, the fact that you can get access to it
00:44:15.480 | for free, you can modify it, you can improve it
00:44:18.000 | for all those same arguments.
00:44:19.800 | And so people collaborate to make it a better system
00:44:22.600 | for everybody to use, and that works in software.
00:44:24.840 | And I expect the same thing will happen in hardware.
00:44:27.840 | - So if you look at ARM, Intel, MIPS,
00:44:31.240 | if you look at just the lay of the land,
00:44:34.200 | and what do you think, just for me,
00:44:38.240 | because I'm not familiar how difficult
00:44:41.240 | this kind of transition would,
00:44:44.720 | how much challenges this kind of transition would entail,
00:44:48.040 | do you see, let me ask my dumb question in another way.
00:44:52.400 | - No, that's, I know where you're headed.
00:44:54.560 | (laughing)
00:44:55.560 | Well, there's a bunch, I think the thing you point out,
00:44:57.400 | there's these very popular proprietary instruction sets,
00:45:01.160 | the x86 and ARM.
00:45:02.760 | - And so how do we move to RISC-V potentially
00:45:05.680 | in sort of, in the span of five, 10, 20 years,
00:45:09.200 | a kind of unification, given that the devices,
00:45:13.440 | the kind of way we use devices, IoT, mobile devices,
00:45:17.600 | and the cloud keeps changing?
00:45:20.240 | - Well, part of it, a big piece of it,
00:45:23.140 | is the software stack.
00:45:25.320 | And what, right now, looking forward,
00:45:28.080 | there seem to be three important markets.
00:45:31.080 | There's the cloud, and the cloud is simply
00:45:35.120 | companies like Alibaba and Amazon and Google,
00:45:40.420 | Microsoft, having these giant data centers
00:45:43.840 | with tens of thousands of servers
00:45:45.840 | and maybe a hundred of these data centers all over the world.
00:45:50.320 | And that's what the cloud is.
00:45:51.400 | So the computer that dominates the cloud
00:45:53.520 | is the x86 instruction set.
00:45:55.960 | So the instruction, or the instruction sets
00:45:58.280 | used in the cloud are the x86,
00:46:00.000 | almost 100% of that today is x86.
00:46:05.000 | The other big thing are cell phones and laptops.
00:46:09.760 | Those are the big things today.
00:46:10.860 | I mean, the PC is also dominated
00:46:13.820 | by the x86 instruction set,
00:46:15.080 | but those sales are dwindling.
00:46:17.140 | You know, there's maybe 200 million PCs a year,
00:46:20.460 | and there's, is there 1.5 billion phones a year?
00:46:24.060 | There's numbers like that.
00:46:25.380 | So for the phones, that's dominated by ARM.
00:46:29.160 | And now, and a reason that,
00:46:33.900 | I talked about the software stacks,
00:46:35.860 | and the third category is internet of things,
00:46:38.180 | which is basically embedded devices,
00:46:39.620 | things in your cars and your microwaves, everywhere.
00:46:43.180 | So what's different about those three categories
00:46:46.100 | is for the cloud, the software that runs in the cloud
00:46:49.420 | is determined by these companies,
00:46:51.140 | Alibaba, Amazon, Google, Microsoft.
00:46:53.980 | So they control that software stack.
00:46:56.820 | For the cell phones, there's both,
00:46:59.820 | for Android and Apple, the software they supply,
00:47:02.500 | but both of them have marketplaces
00:47:04.340 | where anybody in the world can build software.
00:47:07.040 | And that software is translated,
00:47:10.040 | or compiled down and shipped in the vocabulary of ARM.
00:47:15.040 | So that's what's referred to as binary compatible,
00:47:18.440 | because the actual, it's the instructions
00:47:21.540 | are turned into numbers, binary numbers,
00:47:24.100 | and shipped around the world.
00:47:25.020 | So-- - And so,
00:47:25.860 | just a quick interruption.
00:47:27.140 | So ARM, what is ARM?
00:47:28.800 | ARM is an instruction set, like a risk-based--
00:47:32.780 | - Yeah, it's a risk-based instruction set.
00:47:34.260 | It's a proprietary one.
00:47:35.380 | ARM stands for Advanced Risk Machine,
00:47:40.380 | ARM is the name where the company is.
00:47:42.420 | So it's a proprietary risk architecture.
00:47:44.600 | So, and it's been around for a while,
00:47:48.420 | and it's surely the most popular instruction set
00:47:50.940 | in the world right now.
00:47:52.200 | Every year, billions of chips are using the ARM design
00:47:56.260 | in this post-PC era.
00:47:58.660 | - Was it one of the early risk adopters of the risk idea?
00:48:01.860 | - Yeah.
00:48:02.700 | The first ARM goes back, I don't know, '86 or so.
00:48:05.580 | So Berkeley and Stanford did their work in the early '80s.
00:48:08.820 | Their ARM guys needed an instruction set,
00:48:11.660 | and they read our papers, and it heavily influenced them.
00:48:15.500 | So getting back to my story,
00:48:18.180 | what about Internet of Things?
00:48:19.140 | Well, software's not shipped in Internet of Things.
00:48:21.420 | It's the embedded device,
00:48:24.820 | people control that software stack.
00:48:26.580 | So the opportunities for RISC-V, everybody thinks,
00:48:31.120 | is in the Internet of Things embedded things,
00:48:33.500 | because there's no dominant player
00:48:35.620 | like there is in the cloud or the smartphones.
00:48:39.820 | And it doesn't have a lot of licenses associated with,
00:48:44.260 | and you can enhance the instruction set if you want.
00:48:46.940 | And people have looked at instruction sets
00:48:51.140 | and think it's a very good instruction set.
00:48:52.940 | So it appears to be very popular there.
00:48:55.460 | It's possible that in the cloud,
00:48:59.220 | those companies control their software stacks.
00:49:02.480 | So it's possible that they would decide to use RISC-V,
00:49:06.560 | if we're talking about 10 and 20 years in the future.
00:49:09.600 | The one that would be harder would be the cell phones,
00:49:11.920 | since people ship software in the ARM instruction set.
00:49:15.120 | That, you'd think, would be the more difficult one.
00:49:17.400 | But if RISC-V really catches on,
00:49:19.800 | and in a period of a decade,
00:49:22.280 | you can imagine that's changing over too.
00:49:24.320 | - Do you have a sense why RISC-V or ARM is dominated?
00:49:27.720 | You mentioned these three categories.
00:49:29.120 | Why did ARM dominate?
00:49:31.300 | Why does it dominate the mobile device space?
00:49:33.980 | And maybe my naive intuition is that there's some aspects
00:49:38.980 | of power efficiency that are important,
00:49:41.220 | that somehow come along with RISC.
00:49:43.140 | - Well, part of it is,
00:49:44.400 | for these old CISC instruction sets, like in the x86,
00:49:49.060 | it was more expensive to these, for, you know,
00:49:57.560 | they're older, so they have disadvantages in them
00:50:00.620 | because they were designed 40 years ago.
00:50:02.940 | But also, they have to translate in hardware
00:50:06.100 | from CISC instructions to RISC instructions on the fly.
00:50:08.540 | And that costs both silicon area,
00:50:11.780 | the chips are bigger to be able to do that,
00:50:14.100 | and it uses more power.
00:50:15.700 | So ARM has, which has, you know,
00:50:18.100 | followed this RISC philosophy,
00:50:19.420 | is seen to be much more energy efficient.
00:50:22.080 | And in today's computer world,
00:50:24.060 | both in the cloud and the cell phone and, you know, things,
00:50:28.820 | it isn't, the limiting resource
00:50:31.420 | isn't the number of transistors you can fit in the chip,
00:50:33.460 | it's what, how much power can you dissipate
00:50:36.400 | for your application?
00:50:37.400 | So by having a reduced instruction set,
00:50:41.260 | that's possible to have a simpler hardware,
00:50:44.080 | which is more energy efficient.
00:50:45.360 | And energy efficiency is incredibly important in the cloud.
00:50:48.560 | When you have tens of thousands of computers
00:50:50.860 | in a data center,
00:50:51.700 | you wanna have the most energy efficient ones there as well.
00:50:54.680 | And of course, for embedded things running off of batteries,
00:50:57.000 | you want those to be energy efficient,
00:50:58.520 | and the cell phones too.
00:50:59.940 | So I think it's believed that there's a energy disadvantage
00:51:04.940 | of using these more complex instruction set architectures.
00:51:09.720 | - So the other aspect of this is,
00:51:13.640 | if we look at Apple, Qualcomm, Samsung, Huawei,
00:51:16.360 | all use the ARM architecture.
00:51:19.800 | And yet the performance of the systems varies.
00:51:22.220 | I mean, I don't know whose opinion you take on,
00:51:24.700 | but, you know, Apple, for some reason,
00:51:26.700 | seems to perform better in terms of these implementations,
00:51:29.980 | these architectures.
00:51:30.820 | So where's the magic, enter the picture?
00:51:33.020 | - How's that happen?
00:51:33.860 | Yeah, so what ARM pioneered was a new business model.
00:51:36.900 | As they said, well,
00:51:38.100 | here's our proprietary instruction set,
00:51:40.020 | and we'll give you two ways to do it.
00:51:43.060 | We'll give you one of these implementations
00:51:47.020 | written in things like C called Verilog,
00:51:49.860 | and you can just use ours.
00:51:51.860 | You have to pay money for that.
00:51:53.700 | Not only will give you their, you know,
00:51:56.580 | we'll license you to do that, or you could design your own.
00:51:59.700 | And so we're talking about numbers like
00:52:02.820 | tens of millions of dollars
00:52:04.080 | to have the right to design your own,
00:52:05.500 | since the instruction set belongs to them.
00:52:08.940 | So Apple got one of those, the right to build their own.
00:52:13.220 | Most of the other people who build like Android phones
00:52:15.860 | just get one of the designs from ARM to do it themselves.
00:52:20.860 | So Apple developed a really good
00:52:24.580 | microprocessor design team.
00:52:26.740 | They, you know, acquired a very good team
00:52:29.980 | that was building other microprocessors
00:52:33.380 | and brought them into the company to build their designs.
00:52:36.220 | So the instruction sets are the same,
00:52:38.160 | the specifications are the same,
00:52:39.900 | but their hardware design is much more efficient
00:52:42.660 | than I think everybody else's.
00:52:45.340 | And that's given Apple an advantage in the marketplace
00:52:49.740 | in that the iPhones tend to be faster
00:52:54.260 | than most everybody else's phones that are there.
00:52:57.120 | - It'd be nice to be able to jump around
00:52:59.980 | and kind of explore different little sides of this.
00:53:02.660 | But let me ask one sort of romanticized question.
00:53:05.680 | What to you is the most beautiful aspect
00:53:08.740 | or idea of RISC instruction set or instruction sets
00:53:12.500 | or this work that you've done?
00:53:14.900 | - You know, I was always attracted to the idea of,
00:53:19.740 | you know, small is beautiful.
00:53:21.620 | Is that the temptation in engineering,
00:53:25.140 | it's kind of easy to make things more complicated.
00:53:27.980 | It's harder to come up with a,
00:53:30.160 | it's more difficult, surprisingly,
00:53:31.740 | to come up with a simple, elegant solution.
00:53:33.900 | And I think that there's a bunch of small features
00:53:37.020 | of RISC in general that, you know,
00:53:40.840 | where you can see this examples of keeping it simpler
00:53:44.260 | makes it more elegant.
00:53:45.580 | Specifically in RISC-V, which, you know,
00:53:47.980 | I was kind of the mentor in the program,
00:53:49.980 | but it was really driven by Krzysztof Sanovic
00:53:52.020 | and two grad students, Andrew Waterman and Yensip Lee,
00:53:55.940 | is they hit upon this idea of having
00:53:59.220 | a subset of instructions,
00:54:02.820 | a nice simple subset instructions,
00:54:05.300 | like 40-ish instructions that all software,
00:54:09.120 | the software stack for RISC-V
00:54:11.580 | can run just on those 40 instructions.
00:54:14.060 | And then they provide optional features
00:54:17.060 | that could accelerate the performance instructions
00:54:20.780 | that if you needed them could be very helpful,
00:54:22.720 | but you don't need to have them.
00:54:24.260 | And that's a new, really a new idea.
00:54:26.840 | So RISC-V has right now maybe five optional subsets
00:54:31.820 | that you can pull in, but the software runs without them.
00:54:34.500 | If you just want to build the,
00:54:36.260 | just the core 40 instructions, that's fine.
00:54:39.200 | You can do that.
00:54:40.040 | So this is fantastic for educationally
00:54:43.380 | is you can explain computers.
00:54:44.820 | You only have to explain 40 instructions
00:54:47.260 | and not thousands of them.
00:54:48.660 | Also, if you invent some wild and crazy new technology,
00:54:52.320 | like biological computing,
00:54:55.740 | you'd like a nice simple instruction set
00:54:58.580 | and you can RISC-V,
00:55:00.540 | if you implement those core instructions,
00:55:02.060 | you can run really interesting programs on top of that.
00:55:05.420 | So this idea of a core set of instructions
00:55:08.020 | that the software stack runs on,
00:55:10.000 | and then optional features that if you turn them on,
00:55:13.480 | the compilers were used, but you don't have to,
00:55:15.640 | I think is a powerful idea.
00:55:17.920 | What's happened in the past
00:55:19.920 | if for the proprietary instruction sets
00:55:22.500 | is when they add new instructions,
00:55:25.160 | it becomes required piece.
00:55:27.920 | And so that all microprocessors in the future
00:55:32.080 | have to use those instructions.
00:55:33.520 | So it's kind of like,
00:55:35.080 | for a lot of people as they get older,
00:55:36.280 | they gain weight, right?
00:55:38.080 | (laughing)
00:55:38.920 | That weight and age are correlated.
00:55:41.200 | And so you can see these instruction sets
00:55:43.120 | get getting bigger and bigger as they get older.
00:55:45.440 | So RISC-V, lets you be as slim as you as a teenager,
00:55:50.060 | and you only have to add these extra features
00:55:52.760 | if you're really gonna use them,
00:55:53.880 | rather than you have no choice,
00:55:55.680 | you have to keep growing with the instruction set.
00:55:58.320 | - I don't know if the analogy holds up,
00:55:59.680 | but that's a beautiful notion.
00:56:01.080 | (laughing)
00:56:02.560 | That there's, it's almost like a nudge towards,
00:56:04.560 | here's the simple core, that's the essential.
00:56:07.720 | - Yeah, I think the surprising thing is still,
00:56:10.240 | if we brought back the pioneers from the 1950s
00:56:13.800 | and showed them the instruction set architectures,
00:56:16.040 | they'd understand it.
00:56:16.920 | They'd say, "Wow, that doesn't look that different."
00:56:19.880 | Well, yeah, I'm surprised.
00:56:21.880 | And it's, there's, it may be something,
00:56:24.440 | to talk about philosophical things,
00:56:25.840 | I mean, there may be something powerful
00:56:29.240 | about those 40 or 50 instructions
00:56:33.560 | that all you need is these commands,
00:56:36.480 | like these instructions that we talked about,
00:56:38.840 | and that is sufficient to build,
00:56:41.760 | to bring about artificial intelligence.
00:56:45.320 | And so it's a remarkable, surprising to me
00:56:49.200 | that as complicated as it is to build these things,
00:56:54.200 | a microprocessor is where the line widths
00:56:58.760 | are narrower than the wavelength of light,
00:57:02.520 | is this amazing technology is at some fundamental level,
00:57:07.240 | the commands that software executes
00:57:08.880 | are really pretty straightforward
00:57:10.200 | and haven't changed that much in decades,
00:57:13.680 | which, what a surprising outcome.
00:57:16.000 | - So underlying all computation, all Turing machines,
00:57:19.320 | all artificial intelligence systems,
00:57:21.640 | perhaps might be a very simple instruction set,
00:57:24.160 | like a RISC-V, or it's--
00:57:26.560 | - Yeah, I mean, that's kind of what I said.
00:57:29.680 | I was interested to see,
00:57:30.960 | I had another more senior faculty colleague,
00:57:33.440 | and he had written something in Scientific American,
00:57:36.600 | and his 25 years in the future,
00:57:40.360 | and his turned out about when I was a young professor,
00:57:42.840 | and he said, "Yep, I checked it."
00:57:44.600 | And so I was interested to see
00:57:45.520 | how that was gonna turn out for me,
00:57:48.200 | and it's pretty, held up pretty well.
00:57:51.180 | But yeah, so there's probably,
00:57:52.840 | there must be something fundamental
00:57:56.520 | about those instructions that we're capable of,
00:58:01.140 | creating intelligence from pretty primitive operations,
00:58:06.140 | and just doing them really fast.
00:58:09.380 | - You kind of mentioned a different,
00:58:12.020 | maybe radical computational medium, like biological,
00:58:15.300 | and there's other ideas.
00:58:16.500 | So there's a lot of spaces in ASIC,
00:58:18.540 | so it's domain-specific,
00:58:20.620 | and then there could be quantum computers,
00:58:22.140 | and so we can think of all of those different mediums
00:58:25.780 | and types of computation.
00:58:27.420 | What's the connection between swapping out
00:58:30.780 | different hardware systems in the instruction set?
00:58:34.780 | Do you see those as disjoint,
00:58:36.100 | or are they fundamentally coupled?
00:58:37.620 | - Yeah, so what's, so kind of,
00:58:39.220 | if we go back to the history,
00:58:40.800 | you know, when Moore's Law's in full effect,
00:58:45.460 | and you're getting twice as many transistors
00:58:48.180 | every couple of years,
00:58:50.820 | you know, kind of the challenge for computer designers
00:58:53.020 | is how can we take advantage of that?
00:58:54.580 | How can we turn those transistors
00:58:56.140 | into better computers, faster, typically?
00:58:59.340 | And so there was an era, I guess, in the '80s and '90s,
00:59:04.100 | where computers were doubling performance every 18 months,
00:59:09.100 | and if you weren't around then,
00:59:11.700 | what would happen is you had your computer,
00:59:15.020 | and your friend's computer,
00:59:17.260 | which was like a year, year and a half newer,
00:59:19.740 | and it was much faster than your computer,
00:59:21.940 | and he or she could get their work done
00:59:24.860 | much faster than your,
00:59:25.700 | 'cause you were, so people took their computers,
00:59:27.820 | perfectly good computers,
00:59:29.420 | and threw them away to buy a newer computer
00:59:32.700 | because the computer, one or two years later,
00:59:35.260 | was so much faster.
00:59:36.500 | So that's what the world was like in the '80s and '90s.
00:59:39.660 | Well, with the slowing down of Moore's Law,
00:59:43.580 | that's no longer true, right?
00:59:45.340 | Now with, you know, not desk-side computers,
00:59:47.700 | but the laptops, I only get a new laptop when it breaks,
00:59:51.580 | right, oh, damn, the disk broke, or this display broke,
00:59:55.060 | you gotta buy a new computer,
00:59:56.020 | but before, you would throw them away
00:59:57.820 | because they were just so sluggish
01:00:00.500 | compared to the latest computers.
01:00:03.040 | So that's, you know,
01:00:04.220 | that's a huge change of what's gone on.
01:00:10.420 | So, but since this lasted for decades,
01:00:13.520 | kind of programmers, and maybe all of society,
01:00:16.840 | is used to computers getting faster regularly.
01:00:19.640 | We now believe, those of us who are in computer design,
01:00:24.100 | it's called computer architecture,
01:00:25.580 | that the path forward is instead,
01:00:28.780 | is to add accelerators that only work well
01:00:33.020 | for certain applications.
01:00:35.240 | So since Moore's Law is slowing down,
01:00:40.060 | we don't think general-purpose computers
01:00:42.220 | are gonna get a lot faster.
01:00:43.660 | So the Intel processors of the world are not gonna,
01:00:46.740 | haven't been getting a lot faster.
01:00:48.060 | They've been barely improving, like a few percent a year.
01:00:51.860 | It used to be doubling every 18 months,
01:00:54.000 | now it's doubling every 20 years.
01:00:56.060 | So it's just shocking.
01:00:57.800 | So to be able to deliver on what Moore's Law used to do,
01:01:00.680 | we think what's gonna happen,
01:01:02.580 | what is happening right now,
01:01:03.860 | is people adding accelerators to their microprocessors
01:01:08.800 | that only work well for some domains.
01:01:11.920 | And by sheer coincidence,
01:01:14.780 | at the same time that this is happening,
01:01:17.220 | has been this revolution in artificial intelligence
01:01:19.980 | called machine learning.
01:01:21.820 | So with, as I'm sure your other guests have said,
01:01:26.820 | AI had these two competing schools of thought,
01:01:30.920 | is that we could figure out artificial intelligence
01:01:33.580 | by just writing the rules top-down,
01:01:35.460 | or that was wrong, you had to look at data
01:01:38.700 | and infer what the rules are in machine learning,
01:01:41.320 | and what's happened in the last decade or eight years
01:01:45.140 | is machine learning has won.
01:01:47.260 | And it turns out that machine learning,
01:01:49.860 | the hardware you build for machine learning
01:01:52.620 | is pretty much multiply.
01:01:55.300 | The matrix multiply is a key feature
01:01:58.020 | for the way machine learning is done.
01:02:00.560 | So that's a godsend for computer designers.
01:02:04.080 | We know how to make matrix multiply run really fast.
01:02:07.540 | So general purpose microprocessors are slowing down,
01:02:10.180 | we're adding accelerators for machine learning
01:02:12.180 | that fundamentally are doing matrix multiplies
01:02:14.980 | much more efficiently
01:02:15.980 | than general purpose computers have done.
01:02:17.980 | So we have to come up with a new way to accelerate things.
01:02:21.580 | The danger of only accelerating one application
01:02:23.820 | is how important is that application.
01:02:25.700 | Turns out machine learning gets used
01:02:28.300 | for all kinds of things.
01:02:29.500 | So serendipitously, we found something to accelerate
01:02:34.500 | that's widely applicable.
01:02:37.140 | And we don't even, we're in the middle of this revolution
01:02:39.580 | of machine learning,
01:02:40.500 | we're not sure what the limits of machine learning are.
01:02:42.580 | So this has been kind of a godsend.
01:02:46.060 | If you're gonna be able to deliver on improved performance,
01:02:50.560 | as long as people are moving their programs
01:02:53.980 | to be embracing more machine learning,
01:02:56.300 | we know how to give them more performance
01:02:58.540 | even as Moore's Law is slowing down.
01:03:00.560 | - And counterintuitively,
01:03:02.780 | the machine learning mechanism,
01:03:05.780 | you can say is domain specific,
01:03:07.740 | but because it's leveraging data,
01:03:09.900 | it's actually could be very broad
01:03:12.700 | in terms of the domains it could be applied in.
01:03:17.700 | - Yeah, that's exactly right.
01:03:19.580 | - Sort of, it's almost,
01:03:21.100 | sort of people sometimes talk about
01:03:23.300 | the idea of software 2.0.
01:03:25.220 | We're almost taking another step up
01:03:27.900 | in the abstraction layer
01:03:29.180 | in designing machine learning systems,
01:03:33.000 | because now you're programming in the space of data,
01:03:35.400 | in the space of hyperparameters.
01:03:37.320 | It's changing fundamentally the nature of programming.
01:03:40.300 | And so the specialized devices that accelerate
01:03:44.220 | the performance, especially neural network based
01:03:46.300 | machine learning systems,
01:03:47.780 | might become the new general.
01:03:50.260 | - Yeah, so the thing that's interesting to point out,
01:03:53.620 | these are not tied together.
01:03:57.620 | The enthusiasm about machine learning,
01:04:00.620 | about creating programs driven from data,
01:04:03.660 | that we should figure out the answers from data
01:04:05.700 | rather than kind of top down,
01:04:07.180 | which classically the way most programming is done
01:04:10.300 | and the way artificial intelligence used to be done.
01:04:12.580 | That's a movement that's going on at the same time.
01:04:15.820 | Coincidentally, and the first word in machine learning
01:04:19.420 | is machines, right?
01:04:20.260 | So that's going to increase the demand for computing,
01:04:24.340 | because instead of programmers being smart,
01:04:26.840 | writing those things down,
01:04:28.660 | we're gonna instead use computers to examine a lot of data
01:04:31.460 | to kind of create the programs.
01:04:33.100 | That's the idea.
01:04:35.780 | And remarkably, this gets used for all kinds of things
01:04:39.060 | very successfully.
01:04:40.060 | The image recognition, the language translation,
01:04:42.540 | the game playing, and it gets into pieces of the software
01:04:47.540 | stack like databases and stuff like that.
01:04:50.420 | We're not quite sure how general purpose is,
01:04:52.540 | but that's going on independent of this hardware stuff.
01:04:55.100 | What's happening on the hardware side is Moore's law
01:04:57.220 | is slowing down right when we need a lot more cycles.
01:05:00.060 | It's failing us right when we need it,
01:05:03.020 | because there's gonna be a greater increase in computing.
01:05:06.940 | And then this idea that we're gonna do
01:05:09.140 | so-called domain specific.
01:05:10.500 | Here's a domain that your greatest fear
01:05:13.660 | is you'll make this one thing work,
01:05:16.380 | and that'll help 5% of the people in the world.
01:05:19.620 | Well, this looks like it's a very general purpose thing.
01:05:23.300 | So the timing is fortuitous that if we can,
01:05:26.220 | perhaps if we can keep building hardware
01:05:29.700 | that will accelerate machine learning, the neural networks,
01:05:34.060 | that'll beat the timing will be right,
01:05:36.900 | that neural network revolution will transform software,
01:05:41.460 | the so-called software 2.0.
01:05:43.260 | And the software of the future will be very different
01:05:45.820 | from the software of the past.
01:05:47.180 | And just as our microprocessors,
01:05:49.580 | even though we're still gonna have that same basic
01:05:51.700 | risk instructions to run a big pieces of the software stack
01:05:55.860 | like user interfaces and stuff like that,
01:05:58.220 | we can accelerate the kind of the small piece
01:06:01.100 | that's computationally intensive.
01:06:02.380 | It's not lots of lines of code,
01:06:04.140 | but it takes a lot of cycles to run that code,
01:06:07.180 | that that's gonna be the accelerator piece.
01:06:09.460 | So that's what makes this from a computer designers
01:06:13.820 | perspective, a really interesting decade.
01:06:16.700 | What Hennessy and I talked about in the title
01:06:19.220 | of our Turing-Warren speech is a new golden age.
01:06:21.820 | We see this as a very exciting decade,
01:06:26.020 | much like when we were assistant professors
01:06:28.900 | and the risk stuff was going on.
01:06:30.500 | That was a very exciting time,
01:06:32.020 | was where we were changing what was going on.
01:06:33.500 | We see this happening again,
01:06:35.740 | tremendous opportunities of people
01:06:37.980 | because we're fundamentally changing how software is built
01:06:41.500 | and how we're running it.
01:06:42.820 | - So which layer of the abstraction
01:06:44.500 | do you think most of the acceleration might be happening?
01:06:47.860 | If you look in the next 10 years,
01:06:50.340 | sort of Google is working on a lot of exciting stuff
01:06:52.500 | with the TPU, sort of there's a closer to the hardware
01:06:55.700 | that could be optimizations around the,
01:06:58.780 | a rut closer to the instruction set
01:07:00.820 | that could be optimization at the compiler level.
01:07:02.980 | It could be even at the higher level software stack.
01:07:06.220 | - Yeah, it's gotta be, I mean,
01:07:07.460 | if you think about the old RISC-Sys debate,
01:07:09.900 | it was both, it was software hardware.
01:07:13.140 | It was the compilers improving
01:07:15.820 | as well as the architecture improving.
01:07:18.220 | And that's likely to be the way things are now.
01:07:21.820 | With machine learning,
01:07:23.740 | they're using domain specific languages,
01:07:26.620 | the languages like TensorFlow and PyTorch
01:07:30.260 | are very popular with the machine learning people.
01:07:33.140 | Those are the raising the level of abstraction.
01:07:35.460 | It's easier for people to write machine learning
01:07:37.420 | in these domain specific languages
01:07:40.140 | like PyTorch and TensorFlow.
01:07:43.980 | - So where the most optimization might be happening?
01:07:45.860 | - Yeah, and so there'll be both the compiler piece
01:07:49.780 | and the hardware piece underneath it.
01:07:51.340 | So as you kind of, the fatal flaw for hardware people
01:07:54.700 | is to create really great hardware,
01:07:57.140 | but not have brought along the compilers.
01:07:59.420 | And what we're seeing right now in the marketplace,
01:08:01.980 | because of this enthusiasm around hardware
01:08:04.940 | for machine learning is getting,
01:08:07.460 | probably billions of dollars invested in startup companies.
01:08:10.940 | We're seeing startup companies go belly up
01:08:13.540 | because they focused on the hardware,
01:08:15.900 | but didn't bring the software stack along.
01:08:18.020 | We talked about benchmarks earlier.
01:08:20.660 | So I participated in machine learning,
01:08:23.980 | didn't really have a set of benchmarks.
01:08:26.500 | I think just two years ago,
01:08:27.460 | they didn't have a set of benchmarks
01:08:28.620 | and we've created something called MLPerf,
01:08:31.140 | which is machine learning benchmark suite.
01:08:33.940 | And pretty much the companies
01:08:37.100 | who didn't invest in the software stack
01:08:38.860 | couldn't run MLPerf very well.
01:08:40.900 | And the ones who did invest in software stack did,
01:08:43.820 | and we're seeing, like kind of in computer architecture,
01:08:46.500 | this is what happens.
01:08:47.340 | You have these arguments about risk versus sys.
01:08:49.380 | People spend billions of dollars in the marketplace
01:08:51.460 | to see who wins.
01:08:52.460 | It's not a perfect comparison,
01:08:54.900 | but it kind of sorts things out.
01:08:56.500 | And we're seeing companies go out of business.
01:08:58.940 | And then companies like,
01:09:00.280 | there's a company in Israel called Habana.
01:09:04.940 | They came up with machine learning accelerators.
01:09:08.060 | They had good MLPerf scores.
01:09:11.140 | Intel had acquired a company earlier
01:09:13.220 | called Nirvana a couple of years ago.
01:09:15.420 | They didn't reveal their MLPerf scores,
01:09:17.380 | which was suspicious.
01:09:18.980 | But a month ago,
01:09:20.900 | Intel announced that they're canceling
01:09:22.540 | the Nirvana product line,
01:09:24.340 | and they bought Habana for $2 billion.
01:09:26.740 | And Intel's gonna be shipping Habana chips,
01:09:30.140 | which have hardware and software
01:09:32.260 | and run the MLPerf programs pretty well.
01:09:34.420 | And that's gonna be their product line in the future.
01:09:36.940 | - Brilliant.
01:09:37.780 | So maybe just to linger briefly on MLPerf.
01:09:40.780 | I love metrics.
01:09:41.820 | I love standards that everyone can gather around.
01:09:44.420 | What are some interesting aspects
01:09:46.340 | to that portfolio of metrics?
01:09:49.020 | - Well, one of the interesting metrics is
01:09:51.020 | what we thought.
01:09:52.900 | I was involved in the start.
01:09:56.140 | But Peter Mattson is leading the effort from Google.
01:09:59.700 | Google got it off the ground,
01:10:01.100 | but we had to reach out to competitors and say,
01:10:03.500 | "There's no benchmarks here.
01:10:06.900 | "We think this is bad for the field.
01:10:08.260 | "It'll be much better if we look at examples."
01:10:10.180 | Like in the risk days,
01:10:11.380 | there was an effort to create a,
01:10:13.580 | for the people in the risk community got together,
01:10:16.620 | competitors got together,
01:10:17.580 | were building risk microprocessors
01:10:18.980 | to agree on a set of benchmarks that were called spec.
01:10:21.540 | And that was good for the industry.
01:10:23.460 | It's rather before the different risk architectures
01:10:26.380 | were arguing, "Well, you can believe my performance, others,
01:10:28.340 | "but those other guys are liars."
01:10:30.660 | And that didn't do any good.
01:10:32.260 | So we agreed on a set of benchmarks,
01:10:34.660 | and then we could figure out who was faster
01:10:36.540 | between the various risk architectures.
01:10:38.060 | But it was a little bit faster.
01:10:39.660 | But that grew the market
01:10:41.220 | rather than people were afraid to buy anything.
01:10:43.340 | So we argued the same thing would happen with ML Perf.
01:10:46.980 | Companies like Nvidia were maybe worried
01:10:49.460 | that it was some kind of trap,
01:10:50.620 | but eventually we all got together
01:10:52.860 | to create a set of benchmarks and do the right thing.
01:10:56.100 | And we agree on the results.
01:10:58.140 | And so we can see whether TPUs or GPUs or CPUs
01:11:03.140 | are really faster and how much the faster.
01:11:05.580 | And I think from an engineer's perspective,
01:11:08.300 | as long as the results are fair, you can live with it.
01:11:10.860 | Okay, you kind of tip your hat
01:11:12.500 | to your colleagues at another institution.
01:11:15.540 | Boy, they did a better job than us.
01:11:17.500 | What you hate is if it's false, right?
01:11:19.780 | They're making claims and it's just marketing bullshit,
01:11:22.460 | and that's affecting sales.
01:11:24.100 | So from an engineer's perspective,
01:11:26.020 | as long as it's a fair comparison
01:11:27.940 | and we don't come in first place, that's too bad,
01:11:30.060 | but it's fair.
01:11:31.180 | So we wanted to create that environment for ML Perf.
01:11:33.740 | And so now there's 10 companies,
01:11:37.740 | I mean, 10 universities and 50 companies involved.
01:11:40.620 | So pretty much ML Perf is the way you measure
01:11:45.620 | machine learning performance.
01:11:49.020 | And it didn't exist even two years ago.
01:11:52.140 | - One of the cool things that I enjoy about the internet,
01:11:54.900 | it has a few downsides, but one of the nice things
01:11:57.540 | is people can see through BS a little better
01:12:00.980 | with the presence of these kinds of metrics.
01:12:03.140 | So it's really nice, companies like Google
01:12:05.980 | and Facebook and Twitter.
01:12:07.580 | Now it's the cool thing to do
01:12:09.420 | is to put your engineers forward
01:12:10.940 | and to actually show off how well you do on these metrics.
01:12:13.860 | There's not sort of,
01:12:15.340 | there's less of a desire to do marketing, less so.
01:12:20.740 | Am I sort of naive?
01:12:22.620 | - No, I think, I was trying to understand
01:12:25.340 | what's changed from the '80s in this era.
01:12:27.220 | I think because of things like social networking,
01:12:30.460 | Twitter and stuff like that,
01:12:31.780 | if you put up bullshit stuff, right,
01:12:36.220 | that's just purposely misleading,
01:12:39.660 | you can get a violent reaction in social media
01:12:44.180 | pointing out the flaws in your arguments, right?
01:12:47.100 | And so from a marketing perspective,
01:12:48.980 | you have to be careful today
01:12:51.700 | that you didn't have to be careful,
01:12:53.260 | that there'll be people who put out the flaw.
01:12:56.620 | You can get the word out about the flaws
01:12:58.900 | in what you're saying much more easily today
01:13:01.220 | than in the past.
01:13:02.700 | It used to be easier to get away with it.
01:13:04.980 | And the other thing that's been happening
01:13:07.140 | in terms of showing off engineers is just,
01:13:09.380 | in the software side,
01:13:11.780 | people have largely embraced open source software.
01:13:15.660 | It was 20 years ago, it was a dirty word at Microsoft,
01:13:19.740 | and today Microsoft is one of the big proponents
01:13:22.140 | of open source software.
01:13:24.060 | The kind of, that's the standard way
01:13:25.420 | most software gets built,
01:13:26.820 | which really shows off your engineers
01:13:29.060 | because you can see, if you look at the source code,
01:13:31.580 | you can see who are making the commits,
01:13:34.900 | who's making the improvements,
01:13:36.100 | who are the engineers at all these companies
01:13:38.700 | who are really great programmers and engineers
01:13:43.700 | and making really solid contributions,
01:13:47.260 | which enhances their reputations
01:13:48.820 | and the reputation of the companies.
01:13:50.620 | - But that's, of course, not everywhere.
01:13:52.860 | Like in the space that I work more
01:13:55.540 | in is autonomous vehicles,
01:13:56.900 | and there's still, the machinery of hype and marketing
01:14:00.620 | is still very strong there,
01:14:02.060 | and there's less willingness to be open
01:14:04.460 | in this kind of open source way and sort of benchmark.
01:14:06.860 | So MLPerf represents the machine learning world
01:14:10.540 | is much better at being open source
01:14:12.060 | about holding itself to standards of different,
01:14:14.900 | the amount of incredible benchmarks
01:14:16.940 | in terms of the different computer vision,
01:14:19.740 | natural language processing tasks is incredible.
01:14:23.460 | - Historically, it wasn't always that way.
01:14:26.900 | I had a graduate student working with me, David Martin.
01:14:29.940 | So in computer, in some fields,
01:14:32.580 | benchmarking has been around forever.
01:14:34.900 | So computer architecture, databases,
01:14:39.340 | maybe operating systems,
01:14:41.700 | benchmarks are the way you measure progress.
01:14:45.420 | But he was working with me
01:14:47.700 | and then started working with Jitendra Malik,
01:14:49.900 | and Jitendra Malik in computer vision space,
01:14:53.140 | I guess you've interviewed Jitendra.
01:14:55.580 | And David Martin told me, "They don't have benchmarks.
01:14:59.180 | "Everybody has their own vision algorithm."
01:15:01.020 | And the way that, here's my image, look at how well I do.
01:15:04.540 | And everybody had their own image.
01:15:06.100 | So David Martin, back when he did his dissertation,
01:15:10.260 | figured out a way to do benchmarks.
01:15:11.420 | He had a bunch of graduate students identify images
01:15:15.620 | and then ran benchmarks to see which algorithms run well.
01:15:18.300 | And that was, as far as I know,
01:15:19.860 | kind of the first time people did benchmarks
01:15:23.580 | in computer vision, which was predated all the things
01:15:27.620 | that eventually led to ImageNet and stuff like that.
01:15:29.460 | But then the vision community got religion.
01:15:31.940 | And then once we got as far as ImageNet,
01:15:33.980 | then that let the guys in Toronto
01:15:38.620 | be able to win the ImageNet competition.
01:15:41.580 | And then that changed the whole world.
01:15:43.980 | - It's a scary step, actually,
01:15:45.120 | because when you enter the world of benchmarks,
01:15:48.340 | you actually have to be good to participate,
01:15:51.100 | as opposed to, yeah, you can just,
01:15:53.780 | you just believe you're the best in the world.
01:15:56.080 | (laughing)
01:15:57.020 | And I think the people,
01:15:58.780 | I think they weren't purposely misleading.
01:16:00.940 | I think if you don't have benchmarks,
01:16:02.980 | I mean, how do you know?
01:16:04.340 | You could have, your intuition,
01:16:06.420 | it's kind of like the way we used to do
01:16:07.660 | computer architecture.
01:16:08.700 | Your intuition is that this is the right instruction set
01:16:11.500 | to do this job.
01:16:12.340 | I believe, in my experience, my hunch is that's true.
01:16:16.820 | We had to get, to make things more quantitative
01:16:20.060 | to make progress.
01:16:21.060 | And so I just don't know how,
01:16:24.300 | in fields that don't have benchmarks,
01:16:25.700 | I don't understand how they figure out
01:16:27.620 | how they're making progress.
01:16:29.020 | - We're kind of in the vacuum tube days
01:16:33.020 | of quantum computing.
01:16:34.540 | What are your thoughts in this wholly different
01:16:36.900 | kind of space of architectures?
01:16:38.880 | - I actually, quantum computing,
01:16:43.100 | idea's been around for a while,
01:16:44.180 | and I actually thought, well, I sure hope
01:16:46.220 | I retire before I have to start teaching this.
01:16:49.460 | (laughing)
01:16:50.460 | I'd say, because I talk about,
01:16:52.660 | give these talks about the slowing of Moore's law,
01:16:55.140 | and when we need to change
01:16:58.580 | by doing domain-specific accelerators,
01:17:01.180 | common questions say, what about quantum computing?
01:17:03.820 | The reason that comes up, it's in the news all the time.
01:17:05.740 | So I think the third thing to keep in mind
01:17:08.780 | is quantum computing is not right around the corner.
01:17:12.140 | There've been two national reports,
01:17:14.260 | one by the National Academy of Engineering,
01:17:15.700 | another by the Computing Consortium,
01:17:17.900 | where they did a frank assessment of quantum computing.
01:17:21.680 | And both of those reports said,
01:17:25.580 | as far as we can tell,
01:17:27.100 | before you get error-corrected quantum computing,
01:17:30.180 | it's a decade away.
01:17:31.320 | So I think of it like nuclear fusion.
01:17:33.720 | There've been people who've been excited
01:17:35.660 | about nuclear fusion a long time.
01:17:36.960 | If we ever get nuclear fusion,
01:17:38.280 | it's gonna be fantastic for the world.
01:17:40.560 | I'm glad people are working on it,
01:17:41.880 | but it's not right around the corner.
01:17:44.120 | Those two reports, to me, say probably it'll be 2030
01:17:49.360 | before quantum computing is something that could happen.
01:17:54.080 | And when it does happen,
01:17:56.160 | this is gonna be big science stuff.
01:17:57.880 | This is micro-Kelvin, almost absolute zero things
01:18:02.100 | that if they vibrate, if a truck goes by, it won't work.
01:18:05.840 | So this'll be in data center stuff.
01:18:08.120 | We're not gonna have a quantum cell phone.
01:18:10.960 | And it's probably a 2030 kind of thing.
01:18:13.880 | So I'm happy that people are working on it,
01:18:16.240 | but just it's hard with all the news about it
01:18:19.680 | not to think that it's right around the corner.
01:18:22.640 | And that's why we need to do something
01:18:24.920 | as Moore's Law is slowing down to provide the computing,
01:18:28.440 | keep computing getting better for this next decade.
01:18:31.040 | And we shouldn't be betting on quantum computing
01:18:34.440 | or expecting quantum computing to deliver
01:18:38.840 | in the next few years.
01:18:40.160 | It's probably further off.
01:18:42.680 | I'd be happy to be wrong.
01:18:43.800 | It'd be great if quantum computing
01:18:45.040 | is gonna commercially viable,
01:18:46.600 | but it will be a set of applications.
01:18:48.960 | It's not a general purpose computation.
01:18:51.200 | So it's gonna do some amazing things,
01:18:53.560 | but there'll be a lot of things that probably,
01:18:55.920 | you know, the old fashioned computers
01:18:58.200 | are gonna keep doing better for quite a while.
01:19:01.280 | - And there'll be a teenager 50 years from now
01:19:03.480 | watching this video saying,
01:19:05.280 | look how silly David Patterson was saying.
01:19:07.880 | - No, I just said, I said 2030.
01:19:09.920 | I didn't say never.
01:19:12.280 | - We're not gonna have quantum cell phones.
01:19:14.160 | So he's gonna be watching it in a quantum cell phone.
01:19:16.080 | - I mean, I think this is such a, you know,
01:19:18.720 | given that we've had Moore's Law,
01:19:20.400 | I just, I feel comfortable trying to do projects
01:19:24.960 | that are thinking about the next decade.
01:19:26.720 | I admire people who are trying to do things
01:19:28.840 | that are 30 years out,
01:19:29.680 | but it's such a fast moving field.
01:19:32.640 | I just don't know how to,
01:19:34.160 | I'm not good enough to figure out
01:19:37.080 | what's the problem's gonna be in 30 years.
01:19:39.320 | You know, 10 years is hard enough for me.
01:19:41.680 | - So maybe if it's possible to untangle
01:19:43.640 | your intuition a little bit,
01:19:44.920 | I spoke with Jim Keller.
01:19:46.600 | I don't know if you're familiar with Jim.
01:19:48.600 | And he is trying to sort of be a little bit rebellious
01:19:53.000 | and to try to think that--
01:19:54.480 | - Yes, he quotes me as being wrong.
01:19:57.600 | - Yeah, so this is--
01:19:58.680 | - What are you, wait, wait, wait, wait,
01:19:59.920 | for the record, Jim talks about that he has an intuition
01:20:04.920 | that Moore's Law is not in fact dead yet
01:20:08.880 | and that it may continue for some time to come.
01:20:11.920 | What are your thoughts about Jim's ideas in this space?
01:20:14.920 | - Yeah, this is just marketing.
01:20:17.320 | So what Gordon Moore said is a quantitative prediction.
01:20:22.080 | We can check the facts, right?
01:20:23.640 | Which is doubling the number of transistors every two years.
01:20:27.800 | So we can look back at Intel for the last five years
01:20:31.040 | and ask him, let's look at DRAM chips six years ago.
01:20:36.840 | So that would be three two-year periods.
01:20:40.140 | So then our DRAM chips have eight times as many transistors
01:20:44.220 | as they did six years ago.
01:20:46.080 | We can look up Intel microprocessors six years ago.
01:20:50.040 | If Moore's Law is continuing,
01:20:51.600 | it should have eight times as many transistors
01:20:54.700 | as six years ago.
01:20:55.640 | The answer in both those cases is no.
01:20:58.760 | The problem has been because Moore's Law
01:21:03.560 | was kind of genuinely embraced
01:21:06.080 | by the semiconductor industry
01:21:07.800 | is they would make investments in similar equipment
01:21:10.640 | to make Moore's Law come true.
01:21:12.460 | Semiconductor improving and Moore's Law
01:21:17.200 | in many people's mind are the same thing.
01:21:19.840 | So when I say, and I'm factually correct,
01:21:23.160 | that Moore's Law is no longer holds,
01:21:26.400 | we are not doubling transistors every year's years,
01:21:29.720 | the downside for a company like Intel
01:21:31.800 | is people think that means it's stopped,
01:21:35.400 | that technology has no longer improved.
01:21:37.920 | And so Jim is trying to counteract the impression
01:21:42.920 | that semiconductors are frozen in 2019
01:21:49.880 | are never gonna get better.
01:21:51.300 | So I never said that.
01:21:53.200 | All I said was Moore's Law is no more.
01:21:56.600 | And I'm-- - Strictly look at
01:21:58.040 | the number of transistors.
01:21:59.560 | - Exactly, that's what Moore's Law is.
01:22:02.200 | There's the, I don't know,
01:22:04.040 | there's been this aura associated with Moore's Law
01:22:07.840 | that they've enjoyed for 50 years
01:22:10.720 | about look at the field we're in,
01:22:12.360 | we're doubling transistors every two years,
01:22:14.520 | what an amazing field,
01:22:15.440 | which is amazing thing that they were able to pull off.
01:22:18.080 | But even as Gordon Moore said,
01:22:19.760 | no exponential can last forever.
01:22:21.520 | It lasted for 50 years, which is amazing.
01:22:24.040 | And this is a huge impact on the industry
01:22:26.560 | because of these changes that we've been talking about.
01:22:29.640 | So he claims, because he's trying to act,
01:22:32.680 | he claims, Patterson says Moore's Law is no more
01:22:36.240 | and look at it, it's still going.
01:22:38.560 | And TSMC, they say it's no longer.
01:22:41.800 | But there's quantitative evidence
01:22:44.000 | that Moore's Law is not continuing.
01:22:45.400 | So what I say now to try and,
01:22:47.760 | okay, I understand the perception problem
01:22:51.040 | when I say Moore's Law has stopped.
01:22:53.460 | Okay, so now I say Moore's Law is slowing down
01:22:55.940 | and I think Jim, which is another way of,
01:22:59.760 | if it's predicting every two years
01:23:02.000 | and I say it's slowing down,
01:23:03.000 | then that's another way of saying it doesn't hold anymore.
01:23:05.720 | And I think Jim wouldn't disagree that it's slowing down
01:23:10.600 | because that sounds like it's,
01:23:12.480 | things are still getting better, just not as fast,
01:23:14.640 | which is another way of saying
01:23:16.640 | Moore's Law isn't working anymore.
01:23:18.560 | - It's still good for marketing.
01:23:19.880 | But what's your, you're not,
01:23:22.800 | you don't like expanding the definition of Moore's Law.
01:23:25.900 | Sort of naturally-- - Well, as an educator,
01:23:29.240 | it's just like modern politics.
01:23:32.440 | Does everybody get their own facts?
01:23:34.200 | Or do we have, Moore's Law was a crisp,
01:23:38.200 | Carver Mead looked at his Moore's Constructions
01:23:43.440 | drawing on a log-log scale, a straight line,
01:23:46.240 | and that's what the definition of Moore's Law is.
01:23:49.080 | There's this other, what Intel did for a while,
01:23:52.080 | interestingly, before Jim joined them,
01:23:55.120 | is they said, oh no, Moore's Law isn't the number of doubling
01:23:57.400 | isn't really doubling transistors every two years.
01:24:00.160 | Moore's Law is the cost of the individual transistor
01:24:03.280 | going down, cutting in half every two years.
01:24:08.060 | Now, that's not what he said, but they reinterpreted it
01:24:10.240 | because they believed that the cost of transistors
01:24:14.440 | was continuing to drop,
01:24:15.640 | even if they couldn't get twice as many chips.
01:24:18.760 | Many people in industry have told me
01:24:20.200 | that's not true anymore, that basically,
01:24:22.760 | in more recent technologies, it got more complicated,
01:24:26.200 | the actual cost of transistor went up.
01:24:28.200 | So even a corollary might not be true,
01:24:32.580 | but certainly, Moore's Law,
01:24:35.680 | that was the beauty of Moore's Law.
01:24:37.120 | It was a very simple, it's like E equals MC squared, right?
01:24:40.680 | It was like, wow, what an amazing prediction.
01:24:43.540 | It's so easy to understand, the implications are amazing,
01:24:46.560 | and that's why it was so famous as a prediction,
01:24:50.120 | and this reinterpretation of what it meant
01:24:52.800 | and changing is revisionist history,
01:24:56.240 | and I'd be happy,
01:24:59.400 | and they're not claiming there's a new Moore's Law.
01:25:02.800 | They're not saying, by the way,
01:25:05.160 | it's instead of every two years, it's every three years.
01:25:08.760 | I don't think they wanna say that.
01:25:11.280 | I think what's gonna happen is new technology revisions,
01:25:13.800 | each one's gonna get a little bit slower.
01:25:15.520 | So it is slowing down,
01:25:18.560 | the improvements won't be as great,
01:25:21.320 | and that's why we need to do new things.
01:25:23.200 | - Yeah, I don't like that the idea of Moore's Law
01:25:26.200 | is tied up with marketing.
01:25:28.280 | It would be nice if--
01:25:29.920 | - Whether it's marketing or it's,
01:25:31.520 | well, it could be affecting business,
01:25:34.720 | but it could also be affecting the imagination of engineers.
01:25:37.720 | If Intel employees actually believe
01:25:40.880 | that we're frozen in 2019,
01:25:42.800 | well, that would be bad for Intel.
01:25:46.620 | - Not just Intel, but everybody.
01:25:48.080 | - Yeah.
01:25:49.200 | - Moore's Law is inspiring to everybody.
01:25:53.000 | - But what's happening right now,
01:25:55.640 | talking to people who have working in national offices
01:25:59.880 | and stuff like that,
01:26:00.700 | a lot of the computer science community
01:26:02.920 | is unaware that this is going on,
01:26:05.400 | that we are in an era that's gonna need radical change
01:26:08.960 | at lower levels that could affect the whole software stack.
01:26:16.600 | If you're using cloud stuff
01:26:18.040 | and the servers that you get next year
01:26:20.360 | are basically only a little bit faster
01:26:22.260 | than the servers you got this year,
01:26:24.100 | you need to know that,
01:26:25.120 | and we need to start innovating
01:26:26.720 | to start delivering on it.
01:26:30.020 | If you're counting on your software
01:26:32.160 | gonna have a lot more features,
01:26:33.280 | assuming the computers are gonna get faster,
01:26:34.760 | that's not true.
01:26:35.880 | So are you gonna have to start
01:26:36.920 | making your software stack more efficient,
01:26:38.760 | or are you gonna have to start learning
01:26:40.040 | about machine learning?
01:26:41.160 | So it's kind of a warning or call for arms
01:26:46.120 | that the world is changing right now,
01:26:47.840 | and a lot of people,
01:26:49.120 | a lot of computer science PhDs are unaware of that.
01:26:51.720 | So a way to try and get their attention
01:26:54.400 | is to say that Moore's law is slowing down,
01:26:56.840 | and that's gonna affect your assumptions.
01:26:59.220 | And we're trying to get the word out.
01:27:01.560 | And when companies like TSMC and Intel say,
01:27:04.240 | "Oh, no, no, no, Moore's law is fine."
01:27:06.600 | Then people think, "Okay,
01:27:07.920 | I don't have to change my behavior.
01:27:09.920 | I'll just get the next servers."
01:27:11.200 | And if they start doing measurements,
01:27:13.720 | they'll realize what's going on.
01:27:15.240 | - It'd be nice to have some transparency
01:27:16.920 | and metrics for the layperson
01:27:19.640 | to be able to know if computers are getting faster
01:27:23.360 | and not to forget Moore's law.
01:27:24.800 | - Yeah, there are a bunch of,
01:27:26.480 | most people kind of use clock rate
01:27:28.760 | as a measure of performance.
01:27:31.800 | You know, it's not a perfect one,
01:27:33.400 | but if you've noticed,
01:27:34.520 | clock rates are more or less the same
01:27:36.240 | as they were five years ago.
01:27:38.760 | Computers are a little better than they are.
01:27:40.960 | They haven't made zero progress,
01:27:43.080 | but they've made small progress.
01:27:44.360 | So there's some indications out there.
01:27:46.480 | And then our behavior, right?
01:27:47.520 | Nobody buys the next laptop
01:27:49.960 | because it's so much faster
01:27:51.320 | than the laptop from the past.
01:27:53.400 | For cell phones, I think,
01:27:55.960 | I don't know why people buy new cell phones,
01:28:00.600 | because a new one's announced.
01:28:02.560 | The cameras are better,
01:28:03.480 | but that's kind of domain specific, right?
01:28:05.120 | They're putting special purpose hardware
01:28:07.080 | to make the processing of images go much better.
01:28:10.240 | So that's the way they're doing it.
01:28:12.400 | They're not particularly,
01:28:14.040 | it's not that the ARM processor in there
01:28:16.080 | is twice as fast as much as they've added accelerators
01:28:19.520 | to help the experience of the phone.
01:28:22.920 | - Can we talk a little bit about one other exciting space,
01:28:27.080 | arguably the same level of impact
01:28:29.680 | as your work with RISC is RAID.
01:28:33.280 | In 1988, you co-authored a paper,
01:28:38.320 | "A Case for Redundant Arrays of Inexpensive Disks,"
01:28:42.080 | hence R-A-I-D, RAID.
01:28:45.000 | So that's where you introduced the idea of RAID.
01:28:47.440 | Incredible that that little,
01:28:49.080 | I mean little, that paper kind of had this ripple effect
01:28:53.600 | and had a really a revolutionary effect.
01:28:55.920 | So first, what is RAID?
01:28:58.080 | - What is RAID?
01:28:58.920 | So this is work I did with my colleague, Randy Katz,
01:29:01.800 | and a star graduate student, Garth Gibson.
01:29:05.080 | So we had just done the fourth generation RISC project,
01:29:09.480 | and Randy Katz, which had an early Apple Macintosh computer,
01:29:14.480 | at this time, everything was done with floppy disks,
01:29:20.240 | which are old technologies that could store things
01:29:25.240 | that didn't have much capacity,
01:29:27.600 | and you had to, to get any work done,
01:29:29.760 | you're always sticking your little floppy disk in and out
01:29:32.480 | 'cause they didn't have much capacity.
01:29:33.880 | But they started building what are called hard disk drives,
01:29:37.480 | which is magnetic material
01:29:39.480 | that can remember information storage for the Mac.
01:29:44.000 | And Randy asked the question when he saw this disk
01:29:49.000 | next to his Mac, "Gee, these are brand new small things."
01:29:51.960 | Before that, for the big computers,
01:29:54.240 | the disk would be the size of washing machines.
01:29:57.680 | And here's something the size of a,
01:30:00.080 | kind of the size of a book or so.
01:30:02.520 | He says, "I wonder what we could do with that."
01:30:03.760 | Well, we, Randy was involved in the fourth generation
01:30:08.760 | RISC project here at Brooklyn in the '80s.
01:30:11.720 | So we'd figured out a way how to make the computation part,
01:30:14.720 | the processor part, go a lot faster.
01:30:16.800 | But what about the storage part?
01:30:19.320 | Can we do something to make it faster?
01:30:20.780 | So we hit upon the idea of taking a lot of these disks
01:30:25.380 | developed for personal computers and Macintoshes
01:30:27.400 | and putting many of them together
01:30:29.400 | instead of one of these washing machine-sized things.
01:30:31.720 | And so we wrote the first draft of the paper
01:30:34.660 | and we'd have 40 of these little PC disks
01:30:37.280 | instead of one of these washing machine-sized things.
01:30:40.680 | And they would be much cheaper 'cause they're made for PCs.
01:30:43.860 | And they could actually kind of be faster
01:30:45.520 | 'cause there was 40 of them rather than one of them.
01:30:48.240 | And so we wrote a paper like that
01:30:49.520 | and sent it to one of our former Berkeley students at IBM.
01:30:52.480 | And he said, "Well, this is all great and good,
01:30:53.800 | "but what about the reliability of these things?"
01:30:56.240 | Now you have 40 of these devices,
01:30:59.080 | each of which is kind of PC quality,
01:31:01.240 | so they're not as good as these IBM washing machines.
01:31:03.960 | IBM dominated the storage.
01:31:08.440 | So the reliability's gonna be awful.
01:31:10.600 | And so when we calculated it out,
01:31:12.400 | instead of it breaking on average once a year,
01:31:15.580 | it would break every two weeks.
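(Aside: a back-of-the-envelope sketch of that reliability arithmetic, assuming independent failures and using the anecdote's rough numbers; the exact figures are illustrative only.)

```python
# Back-of-the-envelope reliability estimate for an array of disks,
# assuming failures are independent and using the anecdote's rough numbers.

SINGLE_DISK_MTTF_DAYS = 365   # assume one disk alone averages a failure a year
NUM_DISKS = 40                # the array size from the first draft of the paper

# With independent failures, the array's mean time to first failure is
# roughly the single-disk MTTF divided by the number of disks.
array_mttf_days = SINGLE_DISK_MTTF_DAYS / NUM_DISKS

print(f"~{array_mttf_days:.0f} days between failures")  # ~9 days: every week or two
```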
01:31:17.640 | So we thought about the idea and said,
01:31:20.560 | "Well, we gotta address the reliability."
01:31:22.760 | So we did it originally for performance,
01:31:24.400 | but we added the reliability.
01:31:25.760 | So the name, Redundant Array of Inexpensive Disks,
01:31:29.360 | is array of these disks, inexpensive like for PCs,
01:31:32.980 | but we have extra copies.
01:31:35.000 | So if one breaks, we won't lose all the information.
01:31:38.480 | We'll have enough redundancy that we could let some break
01:31:41.640 | and we can still preserve the information.
01:31:43.200 | So the name is an Array of Inexpensive Disks.
01:31:45.520 | This is a collection of these PC disks.
01:31:48.200 | And the R part of the name was the redundancy
01:31:51.280 | so they'd be reliable.
01:31:52.280 | And it turns out if you put a modest number of extra disks
01:31:55.560 | in one of these arrays,
01:31:56.480 | it could actually not only be faster and cheaper
01:32:00.120 | than one of these washing machine disks,
01:32:01.520 | it could be actually more reliable
01:32:03.480 | because you could have a couple of breaks
01:32:05.340 | even with these cheap disks,
01:32:06.800 | whereas one failure with the washing machine thing
01:32:09.200 | would knock it out.
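(Aside: to make the redundancy idea concrete, here is a toy sketch of single-parity striping in the spirit of RAID levels 4/5. The 1988 paper describes several such "levels"; this is only one illustrative scheme, with made-up block contents.)

```python
# Toy single-parity redundancy in the spirit of RAID 4/5: one extra "disk"
# holds the XOR of the data disks, so any one failed disk can be rebuilt.
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four data "disks", each holding one block of a stripe, plus one parity disk.
data = [b"RISC", b"RAID", b"Macs", b"IBM."]
parity = xor_blocks(data)

# Disk 2 fails and loses its block; XOR of the survivors and the parity
# reconstructs exactly the missing data.
lost = data.pop(2)
rebuilt = xor_blocks(data + [parity])
assert rebuilt == lost  # b"Macs" recovered
```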
01:32:10.640 | - Did you have a sense, just like with RISC,
01:32:13.400 | that in the 30 years that followed,
01:32:17.400 | RAID would take over as a mechanism for storage?
01:32:22.040 | - I'd say, I think I'm naturally an optimist,
01:32:27.040 | but I thought our ideas were right.
01:32:30.480 | I thought kind of like Moore's law,
01:32:32.960 | it seemed to me if you looked at the history
01:32:34.960 | of the disk drives,
01:32:36.180 | they went from washing machine size things
01:32:38.120 | and they were getting smaller and smaller
01:32:40.200 | and the volumes were with the smaller disk drives
01:32:43.320 | because that's where the PCs were.
01:32:45.280 | So we thought that was a technological trend
01:32:48.280 | that disk drives, the volume of disk drives
01:32:51.560 | was gonna be getting smaller and smaller devices,
01:32:54.360 | which was true, they were the size of a,
01:32:56.620 | I don't know, eight inches diameter,
01:32:58.560 | then five inches, then three inches diameters.
01:33:01.480 | And so it made sense to figure out
01:33:04.000 | how to do things with an array of disks.
01:33:06.280 | So I think it was one of those things where logically,
01:33:09.220 | we think the technological forces were on our side,
01:33:13.600 | that it made sense.
01:33:14.740 | So we expected it to catch on,
01:33:17.040 | but there was that same kind of business question.
01:33:20.040 | IBM was the big pusher of these disk drives.
01:33:23.720 | In the real world, would the technical advantage
01:33:25.960 | get turned into a business advantage or not?
01:33:28.880 | It proved to be true, it did.
01:33:30.200 | And so we thought we were sound technically
01:33:33.800 | and it was unclear whether the business side would follow,
01:33:36.700 | but we kind of, as academics,
01:33:38.440 | we believe that technology should win and it did.
01:33:41.480 | - And if you look at those 30 years,
01:33:44.840 | just from your perspective,
01:33:46.060 | are there interesting developments in the space of storage
01:33:49.000 | that have happened in that time?
01:33:50.440 | - Yeah, the big thing that happened,
01:33:52.900 | well, a couple of things that happened.
01:33:54.520 | What we did had a modest amount of storage
01:33:56.880 | as redundancy;
01:33:59.360 | as people built bigger and bigger storage systems,
01:34:02.380 | they've added more redundancy
01:34:04.400 | so they could tolerate more failures.
01:34:05.880 | And the biggest thing that happened in storage
01:34:07.960 | is for decades, it was based on things physically spinning
01:34:12.960 | called hard disk drives,
01:34:15.600 | where you used to turn on your computer
01:34:17.240 | and it would make a noise.
01:34:18.600 | What that noise was, was the disk drive spinning
01:34:21.560 | and they were rotating at like 60 revolutions per second.
01:34:25.720 | And it's like, if you remember the vinyl records,
01:34:30.560 | if you've ever seen those, that's what it looked like.
01:34:32.880 | And there was like a needle
01:34:34.280 | like on a vinyl record that was reading it.
01:34:36.320 | So the big drive change is switching that over
01:34:38.880 | to a semiconductor technology called flash.
01:34:41.540 | So within the last, I'd say about decade,
01:34:44.840 | an increasing fraction of all the computers in the world
01:34:47.720 | are using semiconductor for storage.
01:34:51.000 | The flash drives, instead of being magnetic,
01:34:54.960 | they're, well, they're semiconductor,
01:34:59.560 | writing information very densely.
01:35:02.900 | And that's been a huge difference.
01:35:05.600 | So all the cell phones in the world use flash.
01:35:08.040 | Most of the laptops use flash.
01:35:09.960 | All the embedded devices use flash for storage.
01:35:12.840 | Still in the cloud, magnetic disks
01:35:16.400 | are more economical than flash,
01:35:18.320 | but they use both in the cloud.
01:35:20.240 | So it's been a huge change in the storage industry.
01:35:23.160 | Switching from primarily disk to being
01:35:27.360 | primarily semiconductor.
01:35:28.520 | - For the individual disk, but still the RAID mechanism
01:35:31.120 | applies to those different kinds of disks.
01:35:32.680 | - Yes, people will still use RAID ideas.
01:35:36.040 | What's different is kind of interesting,
01:35:38.480 | kind of psychological,
01:35:41.260 | if you think about it.
01:35:42.680 | People have always worried about the reliability
01:35:45.360 | of computing since the earliest days.
01:35:47.080 | But if we're talking about computation,
01:35:51.040 | if your computer makes a mistake,
01:35:56.040 | the computer has ways to check and say,
01:35:57.920 | oh, we screwed up, we made a mistake.
01:36:00.720 | What happens is that program that was running,
01:36:03.360 | you have to redo it, which is a hassle.
01:36:06.000 | For storage, if you've sent important information away,
01:36:11.580 | and it loses that information, you go nuts.
01:36:14.740 | This is the worst, oh my God.
01:36:16.660 | So if you have a laptop and you're not backing it up
01:36:19.700 | on the cloud or something like this,
01:36:21.220 | and your disk drive breaks, which it can do,
01:36:24.980 | you'll lose all that information
01:36:26.500 | and you just go crazy, right?
01:36:27.740 | So the importance of reliability for storage
01:36:30.660 | is tremendously higher than the importance
01:36:32.660 | of reliability for computation
01:36:34.340 | because of the consequences of it.
01:36:36.260 | So yes, so RAID ideas are still very popular,
01:36:39.180 | even with the switch of the technology.
01:36:41.020 | Although flash drives are more reliable,
01:36:43.640 | if you're not doing anything like backing them up
01:36:47.180 | to get some redundancy so they can handle a failure,
01:36:49.140 | you're taking great risks.
01:36:51.740 | - You said that for you and possibly for many others,
01:36:56.460 | teaching and research don't conflict with each other
01:37:00.060 | as one might suspect, and in fact,
01:37:02.000 | they kind of complement each other.
01:37:03.440 | So maybe a question I have is,
01:37:05.980 | how has teaching helped you in your research
01:37:08.260 | or just in your entirety as a person
01:37:12.300 | who both teaches and does research
01:37:14.460 | and just thinks and creates new ideas in this world?
01:37:18.160 | - Yes, I think what happens is,
01:37:20.740 | when you're a college student,
01:37:22.320 | you know there's this kind of tenure system
01:37:24.080 | in doing research.
01:37:24.920 | So kind of this model that is popular in America,
01:37:29.920 | I think America really made it happen,
01:37:31.780 | is we can attract these really great faculty
01:37:34.360 | to research universities because they get to do research
01:37:37.500 | as well as teach.
01:37:38.380 | And that, especially in fast moving fields,
01:37:40.780 | this means people are up to date
01:37:42.260 | and they're teaching those kinds of things.
01:37:44.260 | But when you run into a really bad professor,
01:37:46.420 | a really bad teacher, I think the students think,
01:37:49.220 | well this guy must be a great researcher
01:37:51.460 | 'cause why else could he be here?
01:37:53.500 | So as I, you know, after 40 years at Berkeley,
01:37:57.140 | we had a retirement party and I got a chance to reflect
01:37:59.500 | and I looked back at some things.
01:38:01.380 | That is not my experience.
01:38:03.500 | There's a, I saw a photograph of five of us
01:38:07.400 | in the department who won the Distinguished Teaching Award
01:38:10.100 | from campus, a very high honor.
01:38:11.800 | You know, I've got one of those, one of the highest honors.
01:38:14.220 | So there are five of us on that picture.
01:38:16.300 | There's Manuel Blum, Richard Karp, me, Randy Katz
01:38:21.300 | and John Ousterhout, contemporaries of mine.
01:38:24.420 | I mentioned Randy already.
01:38:26.360 | All of us are in the National Academy of Engineering.
01:38:29.180 | We've all won the Distinguished Teaching Award.
01:38:32.140 | Blum, Karp and I all have Turing Awards.
01:38:35.060 | - Turing Awards, right.
01:38:36.540 | - You know, the highest award in computing.
01:38:38.860 | So that's the opposite, right?
01:38:43.100 | What happens is, they're highly correlated.
01:38:46.020 | So probably, the other way to think of it,
01:38:48.900 | is that very successful people
01:38:50.860 | are maybe successful at everything they do;
01:38:52.780 | it's not an either-or.
01:38:54.420 | - But it's an interesting question whether specifically,
01:38:57.580 | that's probably true, but specifically for teaching,
01:39:00.180 | if there's something in teaching that,
01:39:02.540 | it's the Richard Feynman, right, idea.
01:39:04.980 | Is there something about teaching
01:39:06.380 | that actually makes your research,
01:39:08.380 | makes you think deeper and more outside the box
01:39:11.500 | and more insightful?
01:39:12.780 | - Absolutely, I was gonna bring up Feynman.
01:39:14.260 | I mean, he criticized the Institute for Advanced Study.
01:39:17.180 | So the Institute for Advanced Study
01:39:19.940 | was this thing that was created near Princeton
01:39:21.940 | where Einstein and all these smart people went.
01:39:24.260 | And when he was invited, he thought it was a terrible idea.
01:39:27.420 | This is a university, it was supposed to be heaven, right?
01:39:30.460 | A university without any teaching.
01:39:32.700 | But he thought it was a mistake.
01:39:33.940 | It's getting up in the classroom
01:39:35.620 | and having to explain things to students
01:39:37.660 | and having them ask questions.
01:39:39.140 | Like, well, why is that true?
01:39:40.740 | Makes you stop and think.
01:39:41.860 | So he thought, and I agree,
01:39:45.660 | I think that interaction between a research university
01:39:48.740 | and having bright young students
01:39:51.140 | asking hard questions the whole time is synergistic.
01:39:54.620 | And a university without teaching
01:39:58.060 | wouldn't be as vital and exciting a place.
01:40:02.020 | And I think it helps stimulate the research.
01:40:05.220 | - Another romanticized question,
01:40:08.060 | but what's your favorite concept or idea to teach?
01:40:12.380 | What inspires you or you see inspire the students?
01:40:15.860 | Is there something that pops to mind?
01:40:17.100 | Or puts the fear of God in them, I don't know.
01:40:19.540 | Whichever is most effective.
01:40:20.940 | - I mean, in general, I think people are surprised.
01:40:25.380 | I've seen a lot of people
01:40:26.340 | who don't think they like teaching
01:40:28.860 | come give guest lectures or teach a course
01:40:31.820 | and get hooked on seeing the lights turn on, right?
01:40:34.860 | That is, you can explain something to people
01:40:37.620 | that they don't understand,
01:40:39.140 | and suddenly they get something, you know,
01:40:41.500 | that's important and difficult.
01:40:44.340 | And just seeing the lights turn on
01:40:45.900 | is a real satisfaction there.
01:40:49.140 | I don't think there's any specific example of that.
01:40:53.980 | It's just the general joy of seeing them understand.
01:40:59.300 | - I have to talk about this, because I've wrestled.
01:41:02.580 | I do martial arts.
01:41:03.820 | Yeah, of course I love wrestling.
01:41:05.260 | I'm a huge, I'm Russian, so.
01:41:06.900 | - Ah, oh sure.
01:41:07.780 | - I have talked to Dan Gable on the podcast.
01:41:10.900 | (laughing)
01:41:12.420 | - Dan Gable was my era kind of guy.
01:41:14.860 | - So you wrestled at UCLA,
01:41:16.820 | among many other things you've done in your life,
01:41:19.580 | competitively in sports and science and so on.
01:41:21.620 | You've wrestled, maybe, again,
01:41:25.220 | continuing with the romanticized questions,
01:41:26.940 | but what have you learned about life
01:41:29.780 | and maybe even science from wrestling or from?
01:41:32.300 | - Yeah, that's, in fact, I wrestled at UCLA,
01:41:36.020 | but also at El Camino Community College.
01:41:38.340 | And just right now, we were, in the state of California,
01:41:42.140 | we were state champions at El Camino.
01:41:43.780 | And in fact, I was talking to my mom,
01:41:45.900 | and I got into UCLA,
01:41:48.100 | but I decided to go to the community college,
01:41:50.500 | which is, it's much harder to go to UCLA
01:41:53.060 | than the community college.
01:41:54.620 | And I asked, why did I make that decision?
01:41:56.220 | 'Cause I thought it was because of my girlfriend.
01:41:57.940 | She said, "Well, it was the girlfriend,
01:41:59.500 | "and you thought the wrestling team was really good."
01:42:01.220 | (laughing)
01:42:02.180 | And we were right, we had a great wrestling team.
01:42:04.140 | We actually wrestled against UCLA at a tournament,
01:42:08.460 | and we beat UCLA.
01:42:10.060 | It's a community college,
01:42:11.620 | which is just freshmen and sophomores.
01:42:13.940 | And part of the reason I brought this up
01:42:15.700 | is I'm gonna go, they've invited me back at El Camino
01:42:18.540 | to give a lecture next month.
01:42:22.180 | And so, my friend who was on the wrestling team,
01:42:27.180 | that we're still together,
01:42:28.740 | we're right now reaching out to other members
01:42:30.620 | of the wrestling team,
01:42:31.460 | so we can get together for a reunion.
01:42:33.460 | But in terms of me, it was a huge difference.
01:42:36.300 | I was both, I was kind of,
01:42:39.300 | the age cutoff, it was December 1st,
01:42:41.340 | and so I was almost always the youngest person in my class.
01:42:44.960 | And I matured later, our family matured later,
01:42:49.560 | so I was almost always the smallest guy.
01:42:51.580 | So, I took kind of nerdy courses, but I was wrestling,
01:42:56.580 | so wrestling was huge for my self-confidence in high school.
01:43:01.860 | And then, I kind of got bigger at El Camino and in college,
01:43:06.060 | and so I had this kind of physical self-confidence.
01:43:11.060 | And it's translated into research self-confidence.
01:43:16.140 | And also kind of, I've had this feeling even today,
01:43:21.980 | in my 70s, if something's going on in the streets
01:43:26.700 | that is bad physically, I'm not gonna ignore it, right?
01:43:29.340 | I'm gonna stand up and try and straighten that out.
01:43:32.380 | - And that kind of confidence just carries
01:43:34.120 | through the entirety of your life.
01:43:35.460 | - Yeah, and the same things happens intellectually.
01:43:37.820 | If there's something going on
01:43:39.220 | where people are saying something that's not true,
01:43:41.680 | I feel it's my job to stand up,
01:43:43.740 | just like I would in the street,
01:43:45.300 | if there's something going on.
01:43:47.060 | Somebody attacking some woman or something,
01:43:48.980 | I'm not standing by and letting that get away.
01:43:51.420 | So I feel it's my job to stand up,
01:43:53.580 | so it kind of ironically translates.
01:43:55.900 | The other things that turned out,
01:43:57.380 | for both, I had really great college and high school coaches
01:44:01.460 | and they believed, even though wrestling's
01:44:03.580 | an individual sport, that we'd be more successful
01:44:06.340 | as a team if we bonded together,
01:44:08.540 | we'd do things where we would support each other,
01:44:11.140 | rather than everybody, you know,
01:44:12.140 | in wrestling it's one-on-one,
01:44:13.660 | and it could be everybody's on their own.
01:44:15.460 | But he felt if we bonded as a team, we'd succeed.
01:44:18.540 | So I kind of picked up those skills
01:44:20.620 | of how to form successful teams from wrestling.
01:44:24.820 | And so I think, most people would say,
01:44:27.500 | one of my strengths is I can create teams of faculty,
01:44:31.740 | large teams of faculty, grad students,
01:44:32.980 | pull them all together for a common goal
01:44:34.900 | and often be successful at it.
01:44:38.500 | But I got both of those things from wrestling.
01:44:41.860 | Also, I think, I heard this line about
01:44:44.300 | if people are in kind of collision sports,
01:44:48.180 | with physical contact like wrestling
01:44:50.700 | or football and stuff like that,
01:44:51.860 | people are a little bit more assertive or something.
01:44:56.020 | And so I think that also comes through.
01:45:03.740 | I didn't shy away from the RISC-CISC debates.
01:45:03.740 | I enjoyed taking on the arguments and stuff like that.
01:45:07.340 | So it was, I'm really glad I did wrestling.
01:45:10.620 | I think it was really good for my self-image
01:45:12.860 | and I learned a lot from it.
01:45:14.020 | So I think that's, sports done well,
01:45:17.420 | there's really lots of positives you can take about it,
01:45:20.540 | of leadership, how to form teams and how to be successful.
01:45:25.540 | - So we've talked about metrics a lot.
01:45:28.580 | There's a really cool,
01:45:29.860 | in terms of bench press and weightlifting,
01:45:31.580 | pioneering metric that you've developed
01:45:33.300 | that we don't have time to talk about,
01:45:34.820 | but it's a really cool one that people should look into.
01:45:37.420 | It's rethinking the way we think
01:45:38.900 | about metrics and weightlifting.
01:45:40.500 | But let me talk about metrics more broadly,
01:45:42.500 | since that appeals to you in all forms.
01:45:45.420 | Let's look at the most ridiculous,
01:45:47.700 | the biggest question of the meaning of life.
01:45:50.600 | If you were to try to put metrics on a life well-lived,
01:45:54.040 | what would those metrics be?
01:45:55.440 | - Yeah, a friend of mine, Randy Katz said this.
01:45:59.740 | He said, "When it's time to sign off,
01:46:04.340 | "the measure isn't the number of zeros in your bank account,
01:46:08.580 | "it's the number of inches in the obituary
01:46:10.880 | "in the New York Times."
01:46:12.340 | (Dave laughs)
01:46:13.180 | That's what he said.
01:46:14.000 | I think having,
01:46:15.680 | and this is a cliche,
01:46:18.940 | is that people don't die
01:46:20.620 | wishing they'd spent more time in the office, right?
01:46:23.420 | As I reflect upon my career,
01:46:25.740 | there have been a half a dozen,
01:46:28.620 | or a dozen things, say, I've been proud of.
01:46:30.900 | A lot of them aren't papers or scientific results.
01:46:33.780 | Certainly my family, my wife,
01:46:36.060 | we've been married more than 50 years,
01:46:39.220 | kids and grandkids, that's really precious.
01:46:42.480 | The education things I've done, I'm very proud of.
01:46:46.680 | Books and courses.
01:46:49.400 | I did some help with underrepresented groups
01:46:51.840 | that was effective.
01:46:52.840 | So it was interesting to see
01:46:54.360 | what were the things I reflected on.
01:46:56.200 | I had hundreds of papers,
01:46:58.560 | but some of them were the papers,
01:47:00.480 | like the RISC and RAID stuff that I'm proud of,
01:47:02.060 | but a lot of them were not those things.
01:47:04.520 | So people who just spend their lives
01:47:08.040 | going after the dollars
01:47:09.520 | or going after all the papers in the world,
01:47:12.280 | that's probably not the things
01:47:14.000 | that afterwards you're gonna care about.
01:47:16.080 | When I got the offer from Berkeley before I showed up,
01:47:21.520 | I read a book where they interviewed a lot of people
01:47:24.520 | in all walks of life,
01:47:25.480 | and what I got out of that book
01:47:26.980 | was that the people who felt good about what they did
01:47:28.760 | were the people who affected people,
01:47:30.760 | as opposed to things that were more transitory.
01:47:32.920 | So I came into this job
01:47:34.680 | assuming that it wasn't gonna be the papers,
01:47:36.560 | it was gonna be relationships with the people over time
01:47:39.000 | that I would value,
01:47:41.320 | and that was a correct assessment.
01:47:43.600 | It's the people you work with,
01:47:45.920 | the people you can influence,
01:47:47.080 | the people you can help,
01:47:48.000 | is the things that you feel good about
01:47:49.680 | towards the end of your career.
01:47:50.520 | It's not the stuff that's more transitory.
01:47:53.500 | - I don't think there's a better way to end it
01:47:56.340 | than talking about your family,
01:47:58.640 | the over 50 years of being married
01:48:01.080 | to your childhood sweetheart.
01:48:02.880 | - What I think I could add is,
01:48:04.400 | when you tell people you've been married 50 years,
01:48:06.800 | they wanna know why.
01:48:08.200 | - How, why?
01:48:09.200 | - Yeah, I can tell you the nine magic words
01:48:11.600 | that you need to say to your partner
01:48:13.800 | to keep a good relationship.
01:48:15.960 | And the nine magic words are,
01:48:17.680 | I was wrong, you were right, I love you.
01:48:21.200 | - Okay.
01:48:22.040 | - And you gotta say all nine.
01:48:22.860 | You can't say, I was wrong, you were right, you're a jerk.
01:48:25.480 | You know, you can't say that.
01:48:27.120 | So yeah, freely acknowledging that you made a mistake,
01:48:30.080 | the other person was right,
01:48:31.160 | and that you love them
01:48:32.920 | really gets over a lot of bumps in the road.
01:48:36.480 | So that's what I pass along.
01:48:38.520 | - Beautifully put.
01:48:39.640 | David, it is a huge honor.
01:48:41.360 | Thank you so much for the book you've written,
01:48:42.920 | for the research you've done, for changing the world.
01:48:45.120 | Thank you for talking today.
01:48:46.000 | - Oh, thanks for the interview.
01:48:48.080 | - Thanks for listening to this conversation
01:48:49.560 | with David Patterson,
01:48:50.800 | and thank you to our sponsors,
01:48:53.280 | the Jordan Harbinger Show and Cash App.
01:48:56.740 | Please consider supporting this podcast
01:48:58.520 | by going to jordanharbinger.com/lex
01:49:01.780 | and downloading Cash App and using code LEXPODCAST.
01:49:06.000 | Click the links, buy the stuff.
01:49:08.200 | It's the best way to support this podcast
01:49:10.040 | and the journey I'm on.
01:49:12.000 | If you enjoy this thing, subscribe on YouTube,
01:49:15.000 | review it with five stars on Apple Podcast,
01:49:17.160 | support it on Patreon,
01:49:18.480 | or connect with me on Twitter at Lex Friedman,
01:49:20.680 | spelled without the E,
01:49:23.360 | try to figure out how to do that.
01:49:24.720 | It's just F-R-I-D-M-A-N.
01:49:27.480 | And now let me leave you with some words
01:49:29.640 | from Henry David Thoreau.
01:49:32.400 | Our life is frittered away by detail.
01:49:35.600 | Simplify, simplify.
01:49:38.040 | Thank you for listening and hope to see you next time.
01:49:42.440 | (upbeat music)
01:49:45.020 | (upbeat music)