
Chris Lattner: The Future of Computing and Programming Languages | Lex Fridman Podcast #131


Chapters

0:00 Introduction
2:25 Working with Elon Musk, Steve Jobs, Jeff Dean
7:55 Why do programming languages matter?
13:55 Python vs Swift
24:48 Design decisions
30:06 Types
33:54 Programming languages are a bicycle for the mind
36:26 Picking what language to learn
42:25 Most beautiful feature of a programming language
51:50 Walrus operator
61:16 LLVM
66:28 MLIR compiler framework
70:35 SiFive semiconductor design
83:09 Moore's Law
86:22 Parallelization
90:50 Swift concurrency manifesto
101:39 Running a neural network fast
107:16 Is the universe a quantum computer?
112:57 Effects of the pandemic on society
130:09 GPT-3
134:28 Software 2.0
147:54 Advice for young people
152:37 Meaning of life

Whisper Transcript

00:00:00.000 | The following is a conversation with Chris Lattner,
00:00:02.640 | his second time on the podcast.
00:00:04.680 | He's one of the most brilliant engineers
00:00:06.600 | in modern computing,
00:00:07.800 | having created the LLVM compiler infrastructure project,
00:00:11.480 | the Clang compiler, the Swift programming language,
00:00:14.640 | a lot of key contributions to TensorFlow and TPUs
00:00:17.640 | as part of Google.
00:00:19.080 | He served as vice president of autopilot software at Tesla,
00:00:23.520 | was a software innovator and leader at Apple,
00:00:26.200 | and now is at SiFive
00:00:28.280 | as senior vice president of platform engineering,
00:00:30.920 | looking to revolutionize chip design
00:00:33.520 | to make it faster, better, and cheaper.
00:00:36.560 | Quick mention of each sponsor,
00:00:38.240 | followed by some thoughts related to the episode.
00:00:40.920 | First sponsor is Blinkist,
00:00:42.400 | an app that summarizes key ideas from thousands of books.
00:00:45.400 | I use it almost every day to learn new things
00:00:48.040 | or to pick which books I want to read or listen to next.
00:00:52.280 | Second is Neuro,
00:00:53.920 | the maker of functional sugar-free gum and mints
00:00:56.480 | that I use to supercharge my mind
00:00:58.520 | with caffeine, L-theanine, and B vitamins.
00:01:01.640 | Third is Masterclass,
00:01:03.240 | online courses from the best people in the world
00:01:06.680 | on each of the topics covered
00:01:08.360 | from rockets to game design, to poker,
00:01:11.140 | to writing, and to guitar.
00:01:13.920 | And finally, Cash App,
00:01:15.680 | the app I use to send money to friends for food, drinks,
00:01:19.360 | and unfortunately, lost bets.
00:01:21.800 | Please check out the sponsors in the description
00:01:23.740 | to get a discount and to support this podcast.
00:01:27.320 | As a side note, let me say that Chris
00:01:29.320 | has been an inspiration to me on a human level
00:01:32.560 | because he is so damn good as an engineer
00:01:35.240 | and leader of engineers,
00:01:36.720 | and yet he's able to stay humble,
00:01:38.600 | especially humble enough to hear the voices of disagreement
00:01:42.120 | and to learn from them.
00:01:43.800 | He was supportive of me and this podcast
00:01:46.080 | from the early days, and for that, I'm forever grateful.
00:01:49.520 | To be honest, most of my life,
00:01:51.180 | no one really believed that I would amount to much.
00:01:53.920 | So when another human being looks at me,
00:01:56.520 | it makes me feel like I might be someone special.
00:01:58.920 | It can be truly inspiring.
00:02:00.840 | That's a lesson for educators.
00:02:02.780 | The weird kid in the corner with a dream
00:02:05.640 | is someone who might need your love and support
00:02:08.160 | in order for that dream to flourish.
00:02:10.060 | If you enjoy this thing, subscribe on YouTube,
00:02:13.320 | review it with 5 Stars on Apple Podcasts,
00:02:15.480 | follow on Spotify, support on Patreon,
00:02:17.960 | or connect with me on Twitter @lexfridman.
00:02:21.300 | And now, here's my conversation with Chris Lattner.
00:02:24.780 | - What are the strongest qualities of Steve Jobs,
00:02:28.940 | Elon Musk, and the great and powerful Jeff Dean
00:02:32.980 | since you've gotten the chance to work with each?
00:02:36.020 | - You're starting with an easy question there.
00:02:38.580 | These are three very different people.
00:02:40.700 | I guess you could do maybe a pairwise comparison
00:02:43.860 | between them instead of a group comparison.
00:02:45.740 | So if you look at Steve Jobs and Elon,
00:02:48.200 | I worked a lot more with Elon than I did with Steve.
00:02:51.040 | They have a lot of commonality.
00:02:52.400 | They're both visionary in their own way.
00:02:55.400 | They're both very demanding in their own way.
00:02:57.640 | My sense is Steve is much more human factor focused,
00:03:02.440 | where Elon is more technology focused.
00:03:04.640 | - What does human factor mean?
00:03:06.000 | - Steve's trying to build things that feel good,
00:03:08.480 | that people love, that affect people's lives, how they live.
00:03:11.600 | He's looking into the future a little bit
00:03:14.680 | in terms of what people want,
00:03:17.800 | where I think that Elon focuses more on
00:03:20.240 | learning how exponentials work
00:03:21.560 | and predicting the development of those.
00:03:24.120 | - Steve worked with a lot of engineers.
00:03:26.280 | That was one of the things that stood out reading the biography.
00:03:29.520 | How can a designer essentially talk to engineers
00:03:33.320 | and get their respect?
00:03:35.640 | - I think, so I did not work very closely with Steve.
00:03:37.800 | I'm not an expert at all.
00:03:38.640 | My sense is that he pushed people really hard,
00:03:41.860 | but then when he got an explanation that made sense to him,
00:03:44.480 | then he would let go.
00:03:45.760 | And he did actually have a lot of respect for engineering,
00:03:49.200 | but he also knew when to push.
00:03:51.480 | And when you can read people well,
00:03:54.160 | you can know when they're holding back
00:03:56.880 | and when you can get a little bit more out of them.
00:03:58.440 | And I think he was very good at that.
00:04:00.320 | I mean, if you compare the other folks,
00:04:03.240 | so Jeff Dean, right?
00:04:05.200 | Jeff Dean's an amazing guy.
00:04:06.280 | He's super smart, as are the other guys.
00:04:09.080 | Jeff is a really, really, really nice guy.
00:04:13.200 | Well-meaning, he's a classic Googler.
00:04:15.280 | He wants people to be happy.
00:04:17.720 | He combines it with brilliance,
00:04:19.760 | so he can pull people together in a really great way.
00:04:22.600 | He's definitely not a CEO type.
00:04:24.640 | I don't think he would even want to be that.
00:04:28.040 | - Do you know if he still programs?
00:04:29.280 | - Oh yeah, he definitely programs.
00:04:30.560 | Jeff is an amazing engineer today, right?
00:04:32.840 | And that has never changed.
00:04:34.080 | So it's really hard to compare Jeff to either of those two.
00:04:40.320 | I think that Jeff leads through technology
00:04:43.640 | and building it himself
00:04:44.880 | and then pulling people in and inspiring them.
00:04:46.760 | And so I think that that's one of the amazing things
00:04:50.040 | about Jeff, but each of these people,
00:04:51.880 | with their pros and cons, all are really inspirational
00:04:55.000 | and have achieved amazing things.
00:04:56.600 | I've been very fortunate to get to work with these guys.
00:05:00.760 | - For yourself, you've led large teams,
00:05:03.880 | you've done so many incredible,
00:05:06.240 | difficult technical challenges.
00:05:08.480 | Is there something you've picked up
00:05:10.400 | from them about how to lead?
00:05:12.560 | - Yeah, I think leadership is really hard.
00:05:14.720 | It really depends on what you're looking for there.
00:05:17.200 | I think you really need to know what you're talking about.
00:05:20.220 | So being grounded on the product, on the technology,
00:05:23.000 | on the business, on the mission is really important.
00:05:26.320 | Understanding what people are looking for,
00:05:29.840 | why they're there.
00:05:30.760 | One of the most amazing things about Tesla
00:05:32.400 | is the unifying vision, right?
00:05:34.640 | People are there because they believe in clean energy
00:05:37.240 | and electrification, all these kinds of things.
00:05:39.640 | The other is to understand what really motivates people,
00:05:44.720 | how to get the best people,
00:05:45.800 | how to build a plan that actually can be executed, right?
00:05:48.920 | There's so many different aspects of leadership
00:05:50.480 | and it really depends on the time, the place, the problems.
00:05:53.680 | There's a lot of issues that don't need to be solved.
00:05:56.920 | And so if you focus on the right things and prioritize well,
00:05:59.880 | that can really help move things.
00:06:01.440 | - Two interesting things you mentioned.
00:06:03.240 | One is you really have to know what you're talking about.
00:06:06.120 | Now, you've worked on a lot of
00:06:10.160 | very challenging technical things.
00:06:11.960 | - Sure.
00:06:12.800 | - So I kind of assume you were born technically savvy,
00:06:17.800 | but assuming that's not the case,
00:06:20.680 | how did you develop technical expertise?
00:06:24.920 | Like even at Google, you worked on,
00:06:27.320 | I don't know how many projects,
00:06:28.920 | but really challenging, very varied.
00:06:32.200 | - Compilers, TPUs, hardware, cloud stuff,
00:06:34.600 | a bunch of different things.
00:06:36.440 | The thing that I've become comfortable with,
00:06:38.760 | more comfortable with as I've gained experience,
00:06:42.280 | is being okay with not knowing.
00:06:45.080 | And so a major part of leadership is actually,
00:06:49.120 | it's not about having the right answer,
00:06:50.840 | it's about getting the right answer.
00:06:52.840 | And so if you're working in a team of amazing people,
00:06:56.000 | right, in many of these places,
00:06:57.520 | many of these companies all have amazing people.
00:07:00.320 | It's the question of how do you get people together?
00:07:02.120 | How do you build trust?
00:07:04.160 | How do you get people to open up?
00:07:05.920 | How do you get people to be vulnerable sometimes
00:07:10.000 | with an idea that maybe isn't good enough,
00:07:11.760 | but it's the start of something beautiful?
00:07:14.000 | How do you provide an environment
00:07:17.400 | where you're not just like top-down,
00:07:18.840 | thou shalt do the thing that I tell you to do, right?
00:07:21.120 | But you're encouraging people to be part of the solution
00:07:23.720 | and providing a safe space
00:07:26.400 | where if you're not doing the right thing,
00:07:27.880 | they're willing to tell you about it.
00:07:29.640 | - So you're asking dumb questions?
00:07:31.440 | - Yeah, dumb questions are my specialty.
00:07:32.960 | Yeah.
00:07:34.000 | So I've been in the hardware realm recently
00:07:35.840 | and I don't know much at all about how chips are designed.
00:07:39.040 | I know a lot about using them.
00:07:40.040 | I know some of the principles
00:07:41.120 | and the art at a technical level of this,
00:07:43.280 | but it turns out that if you ask a lot of dumb questions,
00:07:47.240 | you get smarter really quick.
00:07:48.920 | And when you're surrounded by people that wanna teach
00:07:51.040 | and learn themselves, it can be a beautiful thing.
00:07:54.080 | - So let's talk about programming languages, if it's okay.
00:07:58.840 | At the highest absurd philosophical level, 'cause I-
00:08:02.080 | - Don't get romantic on me, Lex.
00:08:03.640 | - I will forever get romantic and torture you, I apologize.
00:08:08.640 | Why do programming languages even matter?
00:08:14.160 | - Okay, well, thank you very much.
00:08:15.680 | So you're saying why should you care
00:08:17.440 | about any one programming language
00:08:18.640 | or why do we care about programming computers or?
00:08:20.920 | - No, why do we care about programming language design,
00:08:25.200 | creating effective programming languages,
00:08:27.960 | choosing one programming language
00:08:32.600 | versus another programming language,
00:08:34.560 | why we keep struggling and improving
00:08:37.840 | through the evolution of these programming languages.
00:08:39.840 | - Sure, sure, sure, okay.
00:08:40.680 | So, I mean, I think you have to come back
00:08:42.080 | to what are we trying to do here, right?
00:08:43.640 | So we have these beasts called computers
00:08:47.120 | that are very good at specific kinds of things
00:08:48.840 | and we think it's useful to have them do it for us, right?
00:08:52.000 | Now you have this question of how best to express that
00:08:55.560 | because you have a human brain still
00:08:57.200 | that has an idea in its head
00:08:58.840 | and you wanna achieve something, right?
00:09:00.560 | So, well, there's lots of ways of doing this.
00:09:03.200 | You can go directly to the machine
00:09:04.720 | and speak assembly language
00:09:06.000 | and then you can express directly
00:09:07.640 | what the computer understands, that's fine.
00:09:09.800 | You can then have higher and higher and higher levels
00:09:12.800 | of abstraction up until machine learning
00:09:14.880 | and you're designing a neural net to do the work for you.
00:09:18.040 | The question is where along this way do you want to stop
00:09:21.200 | and what benefits do you get out of doing so?
00:09:23.440 | And so programming languages in general,
00:09:25.280 | you have C, you have Fortran, Java,
00:09:28.000 | and Ada, Pascal, Swift, you have lots of different things.
00:09:33.000 | They all have different trade-offs
00:09:34.360 | and they're tackling different parts of the problems.
00:09:36.520 | Now, one of the things that most programming languages do
00:09:39.960 | is they're trying to make it so that you have
00:09:41.920 | pretty basic things like portability
00:09:43.600 | across different hardware.
00:09:45.080 | So you've got, I'm gonna run on an Intel PC,
00:09:47.640 | I'm gonna run on a RISC-V PC,
00:09:49.240 | I'm gonna run on an ARM phone or something like that, fine.
00:09:53.480 | I wanna write one program and have it portable
00:09:55.520 | and this is something that assembly doesn't do.
00:09:57.760 | Now, when you start looking at the space
00:09:59.720 | of programming languages,
00:10:00.880 | this is where I think it's fun
00:10:02.400 | because programming languages all have trade-offs
00:10:06.160 | and most people will walk up to them
00:10:07.920 | and they look at the surface level of syntax
00:10:10.440 | and say, oh, I like curly braces,
00:10:12.400 | or I like tabs, or I like, you know,
00:10:15.600 | semi-colons or not or whatever, right?
00:10:17.120 | Subjective, fairly subjective, very shallow things.
00:10:21.240 | But programming languages when done right
00:10:23.140 | can actually be very powerful.
00:10:24.600 | And the benefit they bring is expression.
00:10:29.600 | Okay, and if you look at programming languages,
00:10:32.560 | there's really kind of two different levels to them.
00:10:34.400 | One is the down in the dirt, nuts and bolts
00:10:37.920 | of how do you get the computer to be efficient,
00:10:39.340 | stuff like that, how they work, type systems,
00:10:41.640 | compiler stuff, things like that.
00:10:43.480 | The other is the UI.
00:10:44.980 | And the UI for a programming language
00:10:47.160 | is really a design problem
00:10:48.560 | and a lot of people don't think about it that way.
00:10:50.600 | - And the UI, you mean all that stuff with the braces
00:10:53.400 | and the action. - Yeah, all that stuff's the UI
00:10:55.240 | and what it is, and UI means user interface.
00:10:58.200 | And so what's really going on is
00:11:00.400 | it's the interface between the guts and the human.
00:11:03.260 | And humans are hard, right?
00:11:05.880 | Humans have feelings, they have things they like,
00:11:09.520 | they have things they don't like.
00:11:10.720 | And a lot of people treat programming languages
00:11:12.720 | as though humans are just kind of abstract creatures
00:11:16.320 | that cannot be predicted.
00:11:17.520 | But it turns out that actually there is better and worse.
00:11:21.640 | Like people can tell when a programming language is good
00:11:24.960 | or when it was an accident, right?
00:11:26.880 | And one of the things with Swift in particular
00:11:29.360 | is that a tremendous amount of time
00:11:30.960 | by a tremendous number of people
00:11:33.240 | have been put into really polishing and making it feel good.
00:11:36.660 | But it also has really good nuts and bolts underneath it.
00:11:39.080 | - You said that Swift makes a lot of people feel good.
00:11:42.480 | How do you get to that point?
00:11:45.480 | So how do you predict that,
00:11:50.840 | tens of thousands, hundreds of thousands of people
00:11:52.800 | are going to enjoy using this,
00:11:55.000 | the user experience of this programming language?
00:11:57.160 | - Well, you can look at it in terms of better and worse.
00:11:59.540 | So if you have to write lots of boilerplate
00:12:01.320 | or something like that, you will feel unproductive.
00:12:03.520 | And so that's a bad thing.
00:12:05.040 | You can look at it in terms of safety.
00:12:06.680 | If like C, for example,
00:12:08.120 | is what's called a memory unsafe language.
00:12:10.040 | And so you get dangling pointers
00:12:11.560 | and you get all these kind of bugs
00:12:13.320 | that then you have spent tons of time debugging
00:12:15.000 | and it's a real pain in the butt and you feel unproductive.
00:12:17.760 | And so by subtracting these things from the experience,
00:12:19.940 | you get happier people.
00:12:22.600 | - But again, keep interrupting.
00:12:25.360 | I'm sorry.
00:12:26.200 | - It's so hard to deal with.
00:12:27.640 | (laughing)
00:12:29.200 | - If you look at the people,
00:12:30.560 | people that are most productive on Stack Overflow,
00:12:33.100 | they have a set of priorities
00:12:37.440 | that may not always correlate perfectly
00:12:39.840 | with the experience of the majority of users.
00:12:43.120 | You know, if you look at the most upvoted,
00:12:46.280 | quote unquote, correct answer on Stack Overflow,
00:12:49.120 | it usually really sort of prioritizes like safe code,
00:12:54.120 | proper code, stable code, you know, that kind of stuff.
00:13:01.860 | As opposed to like,
00:13:02.980 | if I want to use go-to statements in my BASIC, right?
00:13:07.060 | I want to use go-to statements.
00:13:09.860 | Like what if 99% of people want to use go-to statements?
00:13:12.700 | So you use completely improper, you know, unsafe syntax.
00:13:16.620 | - I don't think that people actually,
00:13:17.900 | like if you boil it down and you get below the surface level
00:13:20.120 | people don't actually care about go-tos or if statements
00:13:23.340 | or things like this.
00:13:24.180 | They care about achieving a goal.
00:13:26.420 | - Yeah.
00:13:27.260 | - Right, so the real question is,
00:13:28.260 | I want to set up a web server and I want to do a thing,
00:13:30.580 | I want to do whatever.
00:13:32.260 | Like how quickly can I achieve that, right?
00:13:34.260 | And so from a programming language perspective,
00:13:36.420 | there's really two things that matter there.
00:13:39.020 | One is what libraries exist
00:13:41.920 | and then how quickly can you put it together
00:13:44.460 | and what are the tools around that look like, right?
00:13:47.260 | And when you want to build a library that's missing,
00:13:49.740 | what do you do?
00:13:50.580 | Okay, now this is where you see huge divergence
00:13:53.280 | in the force between worlds, okay?
00:13:55.820 | And so you look at Python, for example,
00:13:57.340 | Python is really good at assembling things,
00:13:59.220 | but it's not so great at building all the libraries.
00:14:02.500 | And so what you get because of performance reasons,
00:14:04.340 | other things like this,
00:14:05.560 | is you get Python layered on top of C, for example.
00:14:09.260 | And that means that doing certain kinds of things,
00:14:11.540 | well, it doesn't really make sense to do in Python.
00:14:13.340 | Instead you do it in C and then you wrap it
00:14:15.580 | and then you're living in two worlds
00:14:17.660 | and two worlds never is really great
00:14:19.300 | because tooling and the debugger doesn't work right
00:14:21.900 | and like all these kinds of things.
00:14:23.800 | - Can you clarify a little bit what you mean by
00:14:26.460 | Python is not good at building libraries,
00:14:28.580 | meaning it doesn't make it conducive?
00:14:30.460 | - Certain kinds of libraries.
00:14:31.540 | - No, but just the actual meaning of the sentence.
00:14:34.860 | - Yeah.
00:14:35.900 | - Meaning like it's not conducive to developers
00:14:38.380 | to come in and add libraries
00:14:40.500 | or is it the duality of the,
00:14:44.760 | it's a dance between Python and C and you can never.
00:14:48.100 | - Well, so Python's amazing.
00:14:49.460 | Python's a great language.
00:14:50.420 | I did not mean to say that Python is bad for libraries.
00:14:53.420 | What I meant to say is there are libraries
00:14:56.820 | that Python's really good at,
00:14:58.580 | that you can write in Python,
00:15:00.420 | but there are other things,
00:15:01.300 | like if you wanna build a machine learning framework,
00:15:03.620 | you're not gonna build a machine learning framework
00:15:05.020 | in Python because of performance, for example,
00:15:07.380 | or you want GPU acceleration or things like this.
00:15:10.180 | Instead what you do is you write a bunch of C
00:15:13.260 | or C++ code or something like that
00:15:15.300 | and then you talk to it from Python, right?
00:15:18.460 | And so this is because of decisions
00:15:21.100 | that were made in the Python design
00:15:23.140 | and those decisions have other counterbalancing forces,
00:15:27.140 | but the trick when you start looking at this
00:15:29.860 | from a programming language perspective
00:15:31.300 | is you start to say, okay, cool,
00:15:33.180 | how do I build this catalog of libraries
00:15:36.340 | that are really powerful?
00:15:37.820 | And how do I make it so that then they can be assembled
00:15:40.500 | into ways that feel good
00:15:42.080 | and they generally work the first time?
00:15:44.020 | Because when you're talking about building a thing,
00:15:46.900 | you have to include the debugging, the fixing,
00:15:50.220 | the turnaround cycle, the development cycle,
00:15:51.900 | all that kind of stuff into the process
00:15:55.140 | of building the thing.
00:15:56.060 | It's not just about pounding out the code.
00:15:58.300 | And so this is where things like catching bugs
00:16:01.300 | at compile time is valuable, for example.
00:16:03.300 | But if you dive into the details in this,
00:16:07.600 | Swift, for example, has certain things like value semantics,
00:16:10.560 | which is this fancy way of saying
00:16:11.960 | that when you treat a variable like a value,
00:16:16.420 | it acts like a mathematical object would.
00:16:21.480 | Okay, so you have used PyTorch a little bit.
00:16:25.200 | In PyTorch, you have tensors.
00:16:26.640 | Tensors are n-dimensional grids of numbers.
00:16:31.280 | Very simple.
00:16:32.120 | You can do plus and other operators on them.
00:16:34.640 | It's all totally fine.
00:16:35.840 | But why do you need to clone a tensor sometimes?
00:16:38.240 | Have you ever run into that?
00:16:40.840 | - Yeah.
00:16:41.660 | - Okay, and so why is that?
00:16:42.760 | Why do you need to clone a tensor?
00:16:43.920 | - It's the usual object thing that's in Python.
00:16:46.800 | - So in Python, and just like with Java
00:16:49.280 | and many other languages, this isn't unique to Python.
00:16:51.520 | In Python, it has a thing called reference semantics,
00:16:53.760 | which is the nerdy way of explaining this.
00:16:55.680 | And what that means is you actually have a pointer
00:16:58.080 | to a thing instead of the thing.
00:16:59.960 | Now, this is due to a bunch of implementation details
00:17:05.240 | that you don't wanna go into.
00:17:06.800 | But in Swift, you have this thing called value semantics.
00:17:09.560 | And so when you have a tensor in Swift, it is a value.
00:17:12.160 | If you copy it, it looks like you have a unique copy.
00:17:15.080 | And if you go change one of those copies,
00:17:16.800 | then it doesn't update the other one
00:17:19.320 | 'cause you just made a copy of this thing.
00:17:21.400 | - So that's highly error-prone in at least computer science,
00:17:26.400 | math-centric disciplines about Python.
00:17:32.120 | - The thing you would expect to behave-
00:17:34.680 | - Like math.
00:17:35.520 | - Like math, it doesn't behave like math.
00:17:38.280 | And in fact, quietly doesn't behave like math
00:17:41.680 | and then can ruin the entirety of your math thing.
00:17:43.320 | - Exactly.
00:17:44.160 | Well, and then it puts you in debugging land again.
00:17:46.040 | - Yeah.
00:17:46.880 | - Right, now you just wanna get something done
00:17:48.600 | and you're like, wait a second,
00:17:50.080 | where do I need to put clone?
00:17:51.520 | And what level of the stack, which is very complicated,
00:17:54.200 | which I thought I was reusing somebody's library
00:17:56.800 | and now I need to understand it
00:17:57.880 | to know where to clone a thing, right?
00:17:59.640 | - And hard to debug, by the way.
00:18:01.320 | - Exactly, right?
00:18:02.160 | And so this is where programming languages really matter.
00:18:04.320 | Right, so in Swift, having value semantics
00:18:06.280 | so that both you get the benefit of math working like math,
00:18:11.280 | right, but also the efficiency
00:18:13.680 | that comes with certain advantages there,
00:18:15.920 | certain implementation details there
00:18:17.320 | really benefit you as a programmer, right?
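
To make that distinction concrete, here is a minimal Swift sketch (my illustration, not something from the conversation; the Buffer class is a made-up stand-in for a reference-semantics tensor): Swift's Array is a value type, so a copy behaves like an independent mathematical object, while a class instance is shared through a reference and changes "underneath the covers."

```swift
// Value semantics: Array is a struct, so assignment gives an independent copy.
var a = [1, 2, 3]
var b = a
b.append(4)
print(a)  // [1, 2, 3]     -- unchanged, like a math object
print(b)  // [1, 2, 3, 4]

// Reference semantics: a class instance is shared through a pointer,
// which is the behavior that forces explicit clone() calls elsewhere.
final class Buffer {                 // hypothetical stand-in type
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

let x = Buffer([1, 2, 3])
let y = x                            // x and y point at the same object
y.values.append(4)
print(x.values)                      // [1, 2, 3, 4] -- changed through y
```
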
00:18:18.920 | - Can you clarify the value semantics?
00:18:20.640 | Like how do you know that a thing
00:18:22.320 | should be treated like a value?
00:18:23.720 | - Yeah, so Swift has a pretty strong culture
00:18:27.720 | and good language support for defining values.
00:18:30.400 | And so if you have an array,
00:18:31.960 | so tensors are one example
00:18:33.400 | that the machine learning folks are very used to.
00:18:36.480 | Just think about arrays, same thing,
00:18:38.280 | where you have an array, you put, you create an array,
00:18:41.640 | you put two or three or four things into it
00:18:43.920 | and then you pass it off to another function.
00:18:46.920 | What happens if that function adds some more things to it?
00:18:51.360 | Well, you'll see it on the side that you pass it in, right?
00:18:54.320 | This is called reference semantics.
00:18:56.680 | Now, what if you pass an array off to a function,
00:19:01.240 | it squirrels it away in some dictionary
00:19:02.880 | or some other data structure somewhere, right?
00:19:04.880 | Well, it thought that you just handed it that array,
00:19:07.960 | then you return back and that reference to that array
00:19:10.800 | still exists in the caller
00:19:12.800 | and they go and put more stuff in it, right?
00:19:15.760 | The person you handed it off to
00:19:17.840 | may have thought they had the only reference to that.
00:19:20.240 | And so they didn't know
00:19:21.680 | that this was gonna change underneath the covers.
00:19:23.960 | And so this is where you end up having to do a clone.
00:19:26.200 | So like I was past a thing,
00:19:27.800 | I'm not sure if I have the only version of it.
00:19:30.240 | So now I have to clone it.
00:19:32.280 | So what value semantics does is it allows you to say,
00:19:34.680 | hey, I have a, so in Swift, it defaults to value semantics.
00:19:38.560 | - Oh, so it defaults to value semantics
00:19:40.240 | and then because most things should be true values,
00:19:44.120 | then it makes sense for that to be the default.
00:19:46.080 | - And one of the important things about that
00:19:47.240 | is that arrays and dictionaries
00:19:48.720 | and all these other collections
00:19:49.960 | that are aggregations of other things
00:19:51.280 | also have value semantics.
00:19:53.040 | And so when you pass this around
00:19:55.040 | to different parts of your program,
00:19:56.680 | you don't have to do these defensive copies.
00:19:59.200 | And so this is great for two sides, right?
00:20:01.280 | It's great because you define away the bug,
00:20:04.200 | which is a big deal for productivity,
00:20:05.960 | the number one thing most people care about,
00:20:08.200 | but it's also good for performance
00:20:09.720 | because when you're doing a clone,
00:20:11.600 | so you pass the array down to the thing,
00:20:13.440 | it was like, I don't know if anybody else has it,
00:20:15.400 | I have to clone it.
00:20:16.640 | Well, you just did a copy of a bunch of data.
00:20:18.480 | It could be big.
00:20:19.960 | And then it could be that the thing that called you
00:20:21.960 | is not keeping track of the old thing.
00:20:24.040 | So you just made a copy of it and you may not have had to.
00:20:27.800 | And so the way the value semantics work in Swift
00:20:30.160 | is it uses this thing called copy on write,
00:20:32.060 | which means that you get the benefit of safety
00:20:35.520 | and performance.
00:20:36.400 | And it has another special trick
00:20:38.360 | because if you think certain languages like Java,
00:20:41.200 | for example, they have immutable strings.
00:20:43.960 | And so what they're trying to do
00:20:44.920 | is they provide value semantics by having pure immutability.
00:20:49.000 | Functional languages have pure immutability
00:20:51.040 | in lots of different places.
00:20:52.280 | And this provides a much safer model
00:20:53.960 | and it provides value semantics.
00:20:56.160 | The problem with this is if you have immutability,
00:20:58.400 | everything is expensive.
00:20:59.480 | Everything requires a copy.
00:21:00.980 | For example, in Java, if you have a string X
00:21:05.440 | and a string Y, you append them together,
00:21:07.880 | we have to allocate a new string to hold XY.
00:21:11.040 | - If they're immutable.
00:21:13.720 | - Well, and strings in Java are immutable.
00:21:16.920 | And if there's optimizations for short ones,
00:21:19.320 | and it's complicated,
00:21:20.960 | but generally think about them as a separate allocation.
00:21:24.560 | And so when you append them together,
00:21:26.640 | you have to go allocate a third thing
00:21:28.560 | because somebody might have a pointer
00:21:29.680 | to either of the other ones, right?
00:21:31.080 | And you can't go change them.
00:21:32.060 | So you have to go allocate a third thing.
00:21:34.720 | Because of the beauty of how the Swift value semantics
00:21:36.760 | system works out,
00:21:37.760 | if you have a string in Swift and you say,
00:21:38.960 | "Hey, put in X," right?
00:21:41.000 | And they say, "Append on Y, Z, W, W."
00:21:44.880 | It knows that there's only one reference to that.
00:21:47.480 | And so it can do an in-place update.
00:21:50.240 | And so you're not allocating tons of stuff on the side.
00:21:53.440 | You don't have all those problems.
00:21:54.620 | When you pass it off,
00:21:56.040 | you can know you have the only reference.
00:21:57.520 | If you pass it off to multiple different people,
00:21:59.340 | but nobody changes it, they can all share the same thing.
00:22:02.600 | So you get a lot of the benefit of a purely immutable design.
00:22:05.800 | And so you get a really nice sweet spot
00:22:07.640 | that I haven't seen in other languages.
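
The in-place update described here relies on Swift's copy-on-write pattern. Below is a rough sketch of how a value type can implement it with the standard-library check isKnownUniquelyReferenced; the Storage and IntList names are invented for illustration, and the real String and Array implementations are more involved.

```swift
// Storage is a class (a reference); IntList is the public value type.
final class Storage {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

struct IntList {
    private var storage: Storage
    init(_ elements: [Int] = []) { storage = Storage(elements) }

    var elements: [Int] { storage.elements }

    mutating func append(_ value: Int) {
        // Unique owner: mutate in place, no allocation.
        // Shared storage: copy first so other logical values stay unchanged.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.elements)
        }
        storage.elements.append(value)
    }
}

var p = IntList([1, 2])
let q = p           // no data copied yet; both share the same storage
p.append(3)         // storage is shared, so p copies before mutating
print(p.elements)   // [1, 2, 3]
print(q.elements)   // [1, 2]
```
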
00:22:09.280 | - Yeah, that's interesting.
00:22:10.560 | I thought there was going to be a philosophical
00:22:14.680 | like narrative here that you're gonna have to pay
00:22:17.560 | a cost for it.
00:22:19.760 | It sounds like, I think value semantics
00:22:24.480 | is beneficial for easing of debugging
00:22:27.440 | or minimizing the risk of errors,
00:22:30.980 | like bringing the errors closer to the source,
00:22:34.200 | bringing the symptom of the error closer
00:22:38.160 | to the source of the error, however you say that.
00:22:40.840 | But you're saying there's not a performance cost either
00:22:45.000 | if you implement it correctly.
00:22:46.320 | - Well, so there's trade-offs with everything.
00:22:48.280 | And so if you are doing very low level stuff,
00:22:51.880 | then sometimes you can notice the cost,
00:22:53.160 | but then what you're doing is you're saying,
00:22:54.880 | what is the right default?
00:22:56.520 | So coming back to user interface,
00:22:59.120 | when you talk about programming languages,
00:23:00.760 | one of the major things that Swift does
00:23:03.000 | that makes people love it, that is not obvious
00:23:06.900 | when it comes to designing a language
00:23:08.200 | is this UI principle of progressive disclosure of complexity.
00:23:12.280 | So Swift, like many languages, is very powerful.
00:23:16.720 | The question is, when do you have to learn the power
00:23:18.840 | as a user?
00:23:19.680 | So Swift, like Python, allows you to start
00:23:22.640 | with print hello world.
00:23:23.960 | Certain other languages start with public static void main,
00:23:28.280 | class, zzzzzzzz, like all the ceremony, right?
00:23:32.120 | And so you go to teach a new person,
00:23:34.640 | hey, welcome to this new thing.
00:23:36.760 | Let's talk about public, access control classes.
00:23:40.300 | Wait, what's that?
00:23:41.140 | String, System.out.println, like packages, like, ah!
00:23:46.080 | Right, and so instead, if you take this and you say,
00:23:48.720 | hey, we need packages, you know, modules.
00:23:51.720 | We need powerful things like classes.
00:23:54.220 | We need data structures.
00:23:55.720 | We need like all these things.
00:23:57.360 | The question is, how do you factor the complexity?
00:23:59.440 | And how do you make it so that the normal case scenario
00:24:02.840 | is that you're dealing with things
00:24:04.600 | that work the right way,
00:24:06.320 | give you good performance by default.
00:24:09.360 | But then as a power user, if you want to dive down to it,
00:24:12.360 | you have full C performance,
00:24:14.600 | full control over low-level pointers.
00:24:16.000 | You can call malloc if you want to call malloc.
00:24:18.320 | This is not recommended on the first page of every tutorial,
00:24:20.800 | but it's actually really important
00:24:22.280 | when you want to get work done, right?
00:24:23.760 | And so being able to have that is really the design
00:24:27.480 | in programming language design.
00:24:28.840 | And design is really, really hard.
00:24:31.300 | It's something that I think a lot of people
00:24:33.600 | kind of outside of UI, again,
00:24:36.760 | a lot of people just think is subjective.
00:24:39.360 | Like there's nothing, you know,
00:24:41.320 | it's just like curly braces or whatever.
00:24:43.600 | It's just like somebody's preference,
00:24:45.320 | but actually good design is something that you can feel.
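
As a small illustration of that progressive disclosure idea (my sketch, not an example from the episode): a Swift program can start as one line, and the malloc-level layer is still reachable later when you actually need it.

```swift
import Foundation  // brings malloc/free into scope (Darwin or Glibc underneath)

// Day one: this is the entire program.
print("Hello, world!")

// Much later, if you need it: manual allocation and raw pointers.
// As noted above, not something for the first page of a tutorial.
if let raw = malloc(4 * MemoryLayout<Int32>.stride) {
    let ints = raw.bindMemory(to: Int32.self, capacity: 4)
    ints[0] = 42
    print(ints[0])  // 42
    free(raw)
}
```
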
00:24:48.720 | - And how many people are involved with good design?
00:24:52.080 | So if we looked at Swift, but look at historically,
00:24:54.840 | I mean, this might touch like,
00:24:57.320 | it's almost like a Steve Jobs question too.
00:24:59.680 | Like how much dictatorial decision-making
00:25:03.360 | is required versus collaborative.
00:25:08.320 | And we'll talk about how all that can go wrong or right.
00:25:12.000 | - Yeah, well, Swift, so I can't speak to in general,
00:25:14.400 | all design everywhere.
00:25:15.600 | So the way it works with Swift is that there's a core team.
00:25:19.800 | And so core team is six or seven people-ish,
00:25:22.480 | something like that,
00:25:23.320 | that is people that have been working with Swift
00:25:25.440 | since very early days.
00:25:26.640 | And so-
00:25:27.480 | - And by early days is not that long ago.
00:25:30.120 | - Okay, yeah.
00:25:30.960 | So it became public in 2014.
00:25:33.640 | So it's been six years public now,
00:25:35.520 | but still that's enough time that there's a story arc there.
00:25:38.840 | (laughs)
00:25:39.680 | - Okay, yeah.
00:25:40.520 | - And there's mistakes have been made that then get fixed
00:25:42.800 | and you learn something and then you, you know,
00:25:44.680 | and so what the core team does is it provides continuity.
00:25:48.400 | And so you wanna have a,
00:25:50.400 | okay, well, there's a big hole that we wanna fill.
00:25:54.020 | We know we wanna fill it.
00:25:55.280 | So don't do other things that invade that space
00:25:58.080 | until we fill the hole, right?
00:25:59.920 | There's a boulder that's missing here.
00:26:01.120 | We wanna do, we will do that boulder,
00:26:03.040 | even though it's not today, keep out of that space.
00:26:06.080 | - And the whole team remembers the myth of the boulder
00:26:10.360 | that's there.
00:26:11.200 | - Yeah, yeah.
00:26:12.020 | There's a general sense of what the future looks like
00:26:13.520 | in broad strokes and a shared understanding of that
00:26:16.440 | combined with a shared understanding of what has happened
00:26:18.780 | in the past that worked out well and didn't work out well.
00:26:22.080 | The next level out is you have the,
00:26:24.280 | what's called the Swift Evolution Community.
00:26:25.800 | And you've got, in that case, hundreds of people
00:26:27.680 | that really care passionately about the way Swift evolves.
00:26:31.000 | And that's like an amazing thing to, again,
00:26:33.880 | the core team doesn't necessarily need to come up
00:26:35.520 | with all the good ideas.
00:26:36.760 | You got hundreds of people out there
00:26:38.000 | that care about something and they come up
00:26:39.540 | with really good ideas too.
00:26:41.040 | And that provides this like tumbling,
00:26:42.960 | rock tumbler for ideas.
00:26:45.120 | And so the evolution process is, you know,
00:26:48.720 | a lot of people in a discourse forum,
00:26:50.320 | they're like hashing it out and trying to like talk about,
00:26:52.040 | okay, well, should we go left or right?
00:26:54.080 | Or if we did this, what would be good?
00:26:55.640 | And, you know, here you're talking about hundreds of people.
00:26:57.680 | So you're not gonna get consensus necessarily.
00:27:00.320 | You're not obvious consensus.
00:27:01.920 | And so there's a proposal process that then allows
00:27:06.280 | the core team and the community to work this out.
00:27:08.360 | And what the core team does is it aims to get consensus
00:27:12.120 | out of the community and provide guardrails,
00:27:14.960 | but also provide long-term,
00:27:17.400 | make sure we're going the right direction kind of things.
00:27:20.360 | - So does that group represent like the,
00:27:23.520 | how much people will love the user interface?
00:27:27.400 | Like do you think they're able to capture that?
00:27:29.400 | - Well, I mean, it's something we talk about a lot.
00:27:31.040 | It's something we care about.
00:27:32.320 | How well we do that, it's up for debate,
00:27:34.760 | but I think that we've done pretty well so far.
00:27:36.800 | - Is the beginner in mind?
00:27:38.560 | - Yeah.
00:27:39.400 | - 'Cause you said the progressive disclosure.
00:27:40.800 | - Yeah, so we care a lot about that, a lot about power,
00:27:45.080 | a lot about efficiency, a lot about,
00:27:46.920 | there are many factors to good design
00:27:48.680 | and you have to figure out a way to kind of
00:27:51.680 | work your way through that.
00:27:53.320 | - So if you like think about like a language I love is Lisp,
00:27:57.560 | probably still because I use Emacs,
00:27:59.360 | but I haven't done anything, any serious work in Lisp,
00:28:02.160 | but it has a ridiculous amount of parentheses.
00:28:06.520 | I've also, with Java and C++, the braces,
00:28:11.520 | I like, I enjoyed the comfort of being between braces.
00:28:19.800 | - Yeah, yeah, well let's talk--
00:28:20.960 | - And then Python is, sorry to interrupt,
00:28:23.120 | just like, and last thing to me, as a designer,
00:28:25.760 | if I was a language designer, God forbid,
00:28:28.720 | is I would be very surprised that Python
00:28:32.600 | with no braces would nevertheless somehow
00:28:36.680 | be comforting also.
00:28:38.160 | So like, I could see arguments for all of these.
00:28:40.600 | - But look at this, this is evidence
00:28:41.880 | that it's not about braces versus tabs.
00:28:44.200 | - Right, exactly, you're good, it's a good point.
00:28:46.960 | - Right, so like, you know, there's evidence that--
00:28:49.960 | - But see, like, it's one of the most argued about things.
00:28:52.320 | - Oh yeah, of course, just like tabs and spaces,
00:28:54.080 | which it doesn't, I mean, there's one obvious right answer,
00:28:57.160 | but it doesn't actually matter.
00:28:59.120 | - What's that?
00:28:59.960 | - Let's not, come on, we're friends.
00:29:01.760 | Like, come on, what are you trying to do to me here?
00:29:03.480 | - People are gonna, yeah, half the people
00:29:04.840 | are gonna tune out, yeah.
00:29:06.140 | - So-- - So at least you're able
00:29:08.520 | to identify things that don't really matter
00:29:11.040 | for the experience.
00:29:12.600 | - Well, no, no, no, it's always a really hard,
00:29:14.760 | so the easy decisions are easy, right?
00:29:16.880 | I mean, fine, those are not the interesting ones.
00:29:19.520 | The hard ones are the ones that are most interesting, right?
00:29:21.760 | The hard ones are the places where,
00:29:23.560 | hey, we wanna do a thing, everybody agrees we should do it,
00:29:27.000 | there's one proposal on the table,
00:29:28.880 | but it has all these bad things associated with it.
00:29:31.560 | Well, okay, what are we gonna do about that?
00:29:33.720 | Do we just take it?
00:29:34.980 | Do we delay it?
00:29:36.260 | Do we say, hey, well, maybe there's this other feature
00:29:38.520 | that if we do that first, this will work out better?
00:29:41.520 | How does this, if we do this,
00:29:44.080 | are we painting ourselves into a corner, right?
00:29:46.160 | And so this is where, again,
00:29:47.320 | you're having that core team of people
00:29:48.600 | that has some continuity and has perspective,
00:29:51.680 | has some of the historical understanding,
00:29:53.640 | is really valuable because you get,
00:29:56.120 | it's not just like one brain,
00:29:57.200 | you get the power of multiple people coming together
00:29:59.200 | to make good decisions,
00:30:00.120 | and then you get the best out of all these people,
00:30:02.520 | and you also can harness the community around it.
00:30:06.280 | - And what about the decision of whether,
00:30:08.520 | like in Python, having one type,
00:30:10.920 | or having strict typing?
00:30:14.120 | - Yeah, okay. - Many types.
00:30:15.080 | - Yeah, let's talk about this.
00:30:16.080 | So I like how you put that, by the way.
00:30:19.600 | So many people would say that Python doesn't have types.
00:30:21.920 | - Doesn't have types, yeah.
00:30:22.920 | - But you're right. - Well, I've listened
00:30:23.880 | to you enough to where, (laughs)
00:30:26.880 | I'm a fan of yours,
00:30:27.840 | and I've listened to way too many podcasts and videos
00:30:31.120 | of you talking about this.
00:30:32.440 | - Oh yeah, so I would argue that Python has one type,
00:30:34.760 | and so when you import Python into Swift,
00:30:38.160 | which, by the way, works really well,
00:30:39.760 | you have everything comes in as a Python object.
00:30:41.800 | Now, here there are trade-offs because,
00:30:44.040 | you know, it depends on what you're optimizing for,
00:30:47.440 | and Python is a super successful language
00:30:49.240 | for a really good reason.
00:30:51.040 | Because it has one type,
00:30:52.720 | you get duck typing for free and things like this,
00:30:55.320 | but also, you're pushing,
00:30:56.920 | you're making it very easy to pound out code on one hand,
00:31:00.600 | but you're also making it very easy
00:31:01.840 | to introduce complicated bugs that you have to debug,
00:31:05.280 | and you pass a string into something
00:31:07.280 | that expects an integer,
00:31:08.200 | and it doesn't immediately die,
00:31:10.200 | it goes all the way down the stack trace,
00:31:12.080 | and you find yourself in the middle of some code
00:31:13.480 | that you really didn't wanna know anything about,
00:31:14.920 | and it blows up, and you're just saying,
00:31:16.400 | well, what did I do wrong, right?
00:31:18.200 | And so types are good and bad,
00:31:20.840 | and they have trade-offs, they're good for performance,
00:31:22.720 | and certain other things,
00:31:23.600 | depending on where you're coming from,
00:31:24.720 | but it's all about trade-offs.
00:31:26.360 | And so this is what design is, right?
00:31:28.600 | Design is about weighing trade-offs
00:31:30.240 | and trying to understand the ramifications
00:31:32.600 | of the things that you're weighing,
00:31:34.280 | like types or not, or one type or many types.
00:31:37.280 | But also, within many types,
00:31:39.820 | how powerful do you make that type system
00:31:41.720 | is another very complicated question
00:31:44.480 | with lots of trade-offs.
00:31:45.400 | It's very interesting, by the way.
00:31:47.560 | But that's like one dimension.
00:31:50.840 | And there's a bunch of other dimensions.
00:31:53.400 | JIT compiled versus static compiled,
00:31:55.240 | garbage collected versus reference counted,
00:31:57.800 | versus manual memory management,
00:32:00.000 | versus, you know, like,
00:32:01.160 | and like all these different trade-offs
00:32:03.000 | and how you balance them
00:32:03.840 | are what make a program language good.
00:32:05.600 | - Concurrency. - Yep.
00:32:07.160 | - So in all those things, I guess,
00:32:08.960 | when you're designing the language,
00:32:11.320 | you also have to think of how that's gonna get
00:32:13.040 | all compiled down to--
00:32:15.200 | - If you care about performance, yeah.
00:32:17.420 | Well, and go back to Lisp, right?
00:32:18.760 | So Lisp, also I would say JavaScript
00:32:20.920 | is another example of a very simple language, right?
00:32:24.120 | And so one of the, so I also love Lisp.
00:32:27.200 | I don't use it as much as maybe you do or you did.
00:32:29.760 | - No, I think we're both, everyone who loves Lisp,
00:32:32.480 | it's like, you love, it's like, I don't know,
00:32:35.120 | I love Frank Sinatra,
00:32:36.240 | but like how often do I seriously listen to Frank Sinatra?
00:32:39.200 | - Sure, sure.
00:32:40.040 | But you look at that or you look at JavaScript,
00:32:42.760 | which is another very different
00:32:44.080 | but relatively simple language,
00:32:45.960 | and there's certain things that don't exist in the language,
00:32:49.100 | but there is inherent complexity
00:32:51.240 | to the problems that we're trying to model.
00:32:53.120 | And so what happens to the complexity?
00:32:54.640 | In the case of both of them, for example, you say,
00:32:57.440 | well, what about large-scale software development?
00:33:00.080 | Okay, well, you need something like packages.
00:33:02.360 | Neither language has a like language affordance for packages.
00:33:05.760 | And so what you get is patterns.
00:33:07.400 | You get things like npm, you get things like,
00:33:09.720 | you know, like these ecosystems that get built around.
00:33:12.040 | And I'm a believer that if you don't model
00:33:15.120 | at least the most important inherent complexity
00:33:17.760 | in the language, then what ends up happening
00:33:19.600 | is that complexity gets pushed elsewhere.
00:33:22.760 | And when it gets pushed elsewhere,
00:33:24.120 | sometimes that's great because often building things
00:33:26.600 | as libraries is very flexible and very powerful
00:33:28.920 | and allows you to evolve and things like that.
00:33:30.720 | But often it leads to a lot of unnecessary divergence
00:33:34.040 | in the force and fragmentation.
00:33:35.600 | And when that happens, you just get kind of a mess.
00:33:39.560 | And so the question is, how do you balance that?
00:33:42.960 | Don't put too much stuff in the language
00:33:44.280 | 'cause that's really expensive and it makes things complicated
00:33:46.760 | but how do you model enough of the inherent complexity
00:33:49.640 | of the problem that you provide the framework
00:33:52.400 | and the structure for people to think about?
00:33:54.880 | Also, so the key thing to think about
00:33:57.240 | with programming languages,
00:33:59.080 | and you think about what a programming language is there for
00:34:01.360 | is it's about making a human more productive, right?
00:34:04.240 | And so like there's an old,
00:34:05.360 | I think it's a Steve Jobs quote about,
00:34:08.160 | it's a bicycle for the mind, right?
00:34:10.680 | You can definitely walk,
00:34:13.000 | but you'll get there a lot faster
00:34:15.280 | if you can bicycle on your way.
00:34:17.540 | - And a programming language is a bicycle for the mind?
00:34:20.160 | - Yeah.
00:34:21.000 | - Crazy, wow, that's a really interesting way
00:34:23.040 | to think about it.
00:34:23.920 | - By raising the level of abstraction,
00:34:25.520 | now you can fit more things in your head.
00:34:27.400 | By being able to just directly leverage somebody's library,
00:34:30.080 | you can now get something done quickly.
00:34:33.420 | In the case of Swift, SwiftUI is this new framework
00:34:36.160 | that Apple has released recently for doing UI programming.
00:34:39.760 | And it has this declarative programming model
00:34:43.000 | which defines away entire classes of bugs.
00:34:45.160 | It builds on value semantics
00:34:47.040 | and many other nice Swift things.
00:34:48.820 | And what this does is it allows you to get way more done
00:34:51.600 | with way less code.
00:34:53.260 | And now your productivity as a developer is much higher.
00:34:56.580 | Right?
00:34:57.420 | And so that's really what programming languages
00:34:59.420 | should be about,
00:35:00.260 | is it's not about tabs versus spaces
00:35:01.780 | or curly braces or whatever.
00:35:03.300 | It's about how productive do you make the person?
00:35:05.380 | And you can only see that when you have libraries
00:35:08.980 | that were built with the right intention
00:35:11.100 | that the language was designed for.
00:35:13.760 | And with Swift, I think we're still a little bit early,
00:35:16.640 | but SwiftUI and many other things that are coming out now
00:35:19.500 | are really showing that.
00:35:20.340 | And I think that they're opening people's eyes.
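
A tiny example of that declarative style (a generic SwiftUI sketch, not something specific from the episode): the view is a description of the UI for the current state, and the framework re-renders when the state changes, which is where whole classes of manual-update bugs disappear.

```swift
import SwiftUI

struct CounterView: View {
    @State private var count = 0    // the only mutable state

    var body: some View {           // a description, not imperative updates
        VStack(spacing: 12) {
            Text("Count: \(count)")
            Button("Increment") {
                count += 1          // change the state; SwiftUI refreshes the view
            }
        }
        .padding()
    }
}
```
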
00:35:22.520 | - It's kind of interesting to think about like how that,
00:35:27.020 | you know, the knowledge of something
00:35:29.640 | of how good the bicycle is,
00:35:31.640 | how people learn about that, you know?
00:35:33.740 | So I've used C++.
00:35:36.060 | Now this is not going to be a trash talking session
00:35:38.960 | about C++, but I used C++ for a really long time.
00:35:41.880 | - You can go there if you want.
00:35:42.720 | (laughing)
00:35:43.540 | I have the scars.
00:35:44.380 | (laughing)
00:35:45.220 | - I feel like I spent many years without realizing
00:35:49.620 | like there's languages that could,
00:35:51.540 | for my particular lifestyle, brain style, thinking style,
00:35:56.540 | there's languages that could make me a lot more productive
00:36:00.340 | in the debugging stage, in the, just the development stage
00:36:04.380 | and thinking like the bicycle for the mind
00:36:05.980 | that I could fit more stuff into my-
00:36:07.780 | - Python's a great example of that, right?
00:36:09.260 | I mean, a machine learning framework in Python
00:36:10.980 | is a great example of that.
00:36:12.300 | It's just very high abstraction level.
00:36:14.700 | And so you can be thinking about things
00:36:15.900 | on a like very high level algorithmic level
00:36:19.060 | instead of thinking about, okay, well,
00:36:20.460 | am I copying this tensor to a GPU or not?
00:36:22.940 | Right?
00:36:23.820 | It's not what you want to be thinking about.
00:36:25.540 | - And as I was telling you, I mean,
00:36:26.660 | I guess the question I had is, you know,
00:36:29.760 | how does a person like me or in general people
00:36:31.780 | discover more productive, you know, languages?
00:36:36.780 | Like how, as I've been telling you offline,
00:36:39.960 | I've been looking for like a project to work on in Swift
00:36:43.220 | so I can really try it out.
00:36:45.580 | I mean, my intuition was like doing a hello world
00:36:48.620 | is not going to get me there.
00:36:50.460 | To get me to experience the power of the language.
00:36:53.820 | - You need a few weeks of change in metabolism.
00:36:55.980 | - Exactly.
00:36:56.820 | I think that's beautifully put.
00:36:58.260 | That's one of the problems with people with diets.
00:37:01.500 | Like I'm actually currently, to go in parallel,
00:37:05.300 | but in a small tangent is I've been recently
00:37:07.820 | eating only meat.
00:37:09.500 | Okay?
00:37:10.340 | - Okay.
00:37:11.180 | - Okay.
00:37:12.000 | - And most people are like,
00:37:13.260 | they think that's horribly unhealthy or whatever.
00:37:16.900 | You have like a million, whatever the science is,
00:37:20.660 | it just doesn't sound right.
00:37:22.540 | - Well, so back when I was in college,
00:37:24.180 | we did the Atkins diet.
00:37:25.220 | That was a thing.
00:37:26.540 | - Similar.
00:37:27.380 | And, but if you, you have to always give these things
00:37:29.700 | a chance.
00:37:30.740 | I mean, with dieting, always not dieting,
00:37:33.340 | but just the things that you like.
00:37:35.780 | If I eat personally, if I eat meat,
00:37:38.180 | just everything, I can be super focused,
00:37:40.220 | or more focused than usual.
00:37:42.060 | I just feel great.
00:37:44.040 | I mean, I've been running a lot,
00:37:46.320 | doing pushups and pulls and so on.
00:37:48.040 | I mean, Python is similar in that sense for me.
00:37:50.720 | - Where are you going with this?
00:37:52.280 | (laughing)
00:37:53.640 | - I mean, literally, I just felt,
00:37:55.800 | I had like a stupid smile on my face
00:37:58.040 | when I first started using Python.
00:38:00.800 | I could code up really quick things.
00:38:03.000 | Like I would see the world.
00:38:05.800 | I'll be empowered to write a script to,
00:38:10.200 | you know, to do some basic data processing,
00:38:11.860 | to rename files on my computer.
00:38:13.860 | Right?
00:38:14.700 | And like Perl didn't do that for me.
00:38:16.320 | It kind of, a little bit.
00:38:19.320 | - Well, and again, none of these are about
00:38:21.220 | which is best or something like that,
00:38:23.360 | but there's definitely better and worse here.
00:38:25.080 | - But it clicks.
00:38:26.120 | Well, yeah.
00:38:26.960 | - And if you look at Perl, for example,
00:38:29.340 | you get bogged down in scalars versus arrays
00:38:32.700 | versus hashes versus type globs
00:38:34.420 | and like all that kind of stuff.
00:38:35.780 | And Python's like, yeah, let's not do this.
00:38:38.640 | Right?
00:38:39.480 | - It's debugging.
00:38:40.300 | Like everyone has different priorities,
00:38:41.560 | but for me, it's, can I create systems for myself
00:38:45.020 | that empower me to debug quickly?
00:38:47.880 | Like I've always been a big fan,
00:38:50.440 | even just crude, like asserts,
00:38:52.120 | like always stating things that should be true,
00:38:57.120 | which in Python, I found myself doing more
00:38:59.840 | because of type, all these kinds of stuff.
00:39:02.400 | - Well, you could think of types in a programming language
00:39:04.600 | as being kind of assert.
00:39:05.880 | - Yeah.
00:39:06.720 | - They get checked at compile time.
00:39:07.600 | Right?
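
A quick sketch of that idea (the function names here are made up for illustration): the runtime assert and the parameter type state the same fact, but the type version is checked before the program ever runs.

```swift
// Runtime check: only fires if this code path actually executes.
func renameDynamic(_ path: Any) {
    assert(path is String, "expected a file path")
    // ...
}

// Compile-time check: the type annotation is the same assertion,
// but a bad call is rejected before the program runs.
func renameStatic(_ path: String) {
    // ...
}

renameStatic("notes.txt")    // fine
// renameStatic(42)          // compile error: Int is not a String
```
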
00:39:08.920 | So how do you learn a new thing?
00:39:11.040 | Well, so this, or how do people learn new things?
00:39:13.520 | Right?
00:39:14.360 | This is hard.
00:39:15.320 | People don't like to change.
00:39:17.200 | People generally don't like change around them either.
00:39:19.320 | And so we're all very slow to adapt and change.
00:39:22.880 | And usually there's a catalyst that's required
00:39:25.480 | to force yourself over this.
00:39:28.000 | So for learning a programming language
00:39:30.040 | is really comes down to finding an excuse,
00:39:32.720 | like build a thing that the language is actually good for,
00:39:36.320 | that the ecosystem's ready for.
00:39:38.840 | And so if you were to write an iOS app, for example,
00:39:43.000 | that'd be the easy case.
00:39:44.240 | Obviously you would use Swift for that.
00:39:46.080 | Right?
00:39:46.920 | There are other--
00:39:47.760 | - Android.
00:39:48.600 | - So Swift runs on Android.
00:39:50.520 | - Oh, does it?
00:39:51.360 | - Oh yeah.
00:39:52.200 | Yeah, Swift runs in lots of places.
00:39:53.040 | - How does that work?
00:39:55.560 | - Okay, so Swift is built on top of LLVM.
00:39:58.600 | LLVM runs everywhere.
00:40:00.400 | LLVM, for example, builds the Android kernel.
00:40:03.200 | - Oh, wow.
00:40:04.040 | Okay.
00:40:04.880 | - So yeah.
00:40:05.720 | - I didn't realize this.
00:40:06.800 | - Yeah, so Swift is very portable,
00:40:08.760 | runs on Windows.
00:40:09.920 | There's, it runs on lots of different things.
00:40:12.600 | - And Swift, sorry to interrupt.
00:40:14.120 | Swift UI, and then there's a thing called UIKit.
00:40:17.920 | So can I build an app with Swift?
00:40:20.200 | - Well, so that's the thing,
00:40:22.160 | is the ecosystem is what matters there.
00:40:23.880 | So Swift UI and UIKit are Apple technologies.
00:40:27.040 | - Okay, got it.
00:40:27.880 | - And so they happen to,
00:40:28.720 | like Swift UI happens to be written in Swift,
00:40:30.520 | but it's an Apple proprietary framework
00:40:32.880 | that Apple loves and wants to keep on its platform,
00:40:35.560 | which makes total sense.
00:40:36.920 | You go to Android and you don't have that library.
00:40:39.000 | - Yeah.
00:40:39.840 | - Right, and so Android has a different ecosystem of things
00:40:42.880 | that hasn't been built out
00:40:44.080 | and doesn't work as well with Swift.
00:40:45.400 | And so you can totally use Swift to do like arithmetic
00:40:48.880 | and things like this,
00:40:49.720 | but building a UI with Swift on Android
00:40:51.720 | is not a great experience right now.
00:40:54.600 | - So if I wanted to learn Swift,
00:40:57.320 | what's the, I mean,
00:40:59.080 | the one practical different version of that
00:41:01.840 | is Swift for TensorFlow, for example.
00:41:05.560 | And one of the inspiring things for me
00:41:08.400 | with both TensorFlow and PyTorch
00:41:10.440 | is how quickly the community can like switch
00:41:13.080 | from different libraries.
00:41:14.680 | - Yeah.
00:41:15.520 | - Like you could see some of the communities
00:41:17.720 | switching to PyTorch now,
00:41:19.680 | but it's very easy to see.
00:41:21.920 | And then TensorFlow is really stepping up its game.
00:41:24.480 | And then there's no reason why,
00:41:26.120 | I think the way it works is basically
00:41:27.840 | it has to be one GitHub repo,
00:41:29.560 | like one paper steps up.
00:41:31.000 | - It gets people excited.
00:41:32.360 | - It gets people excited.
00:41:33.240 | And they're like, "Oh, I have to learn this."
00:41:36.040 | Swift for, what's Swift again?
00:41:39.520 | And then they learn and they fall in love with it.
00:41:41.200 | I mean, that's what happened with PyTorch.
00:41:43.080 | - There has to be a reason, a catalyst.
00:41:44.400 | - Yeah.
00:41:45.240 | - And so, and there, I mean, people don't like change,
00:41:48.680 | but it turns out that once you've worked
00:41:50.400 | with one or two programming languages,
00:41:52.640 | the basics are pretty similar.
00:41:54.080 | And so one of the fun things
00:41:55.720 | about learning programming languages,
00:41:57.320 | even maybe Lisp, I don't know if you agree with this,
00:41:59.840 | is that when you start doing that,
00:42:01.400 | you start learning new things.
00:42:03.200 | (laughing)
00:42:04.040 | 'Cause you have a new way to do things
00:42:05.640 | and you're forced to do them.
00:42:06.800 | And that forces you to explore
00:42:09.240 | and it puts you in learning mode.
00:42:10.280 | And when you get in learning mode,
00:42:11.360 | your mind kind of opens a little bit
00:42:12.760 | and you can see things in a new way,
00:42:15.280 | even when you go back to the old place.
00:42:17.040 | - Right.
00:42:17.880 | Yeah, so with Lisp, it's functional stuff.
00:42:21.160 | But I wish there was a kind of window,
00:42:23.640 | maybe you can tell me if there is, there you go.
00:42:26.080 | This is a question to ask,
00:42:28.280 | what is the most beautiful feature
00:42:29.680 | in a programming language?
00:42:30.960 | - Before I ask it, let me say like with Python,
00:42:33.320 | I remember when I saw list comprehensions.
00:42:36.720 | - Yeah.
00:42:37.560 | - Was like, when I really took it in.
00:42:40.840 | - Yeah.
00:42:42.000 | - I don't know, I just loved it.
00:42:43.720 | It was like fun to do.
00:42:45.280 | Like it was fun to do that kind of,
00:42:47.560 | it was something about it,
00:42:50.800 | to be able to filter through a list
00:42:52.920 | and to create a new list on a single line was elegant.
00:42:56.320 | It could all fit into my head
00:42:58.240 | and it just made me fall in love with the language.
00:43:01.920 | - Yeah.
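
The feature being described here is Python's list comprehension: filtering a list and building a new one in a single expression. For comparison, a rough Swift sketch of the same idea (the names and values are made up for illustration):

```swift
let numbers = Array(1...10)

// Filter a list and build a new one in a single expression,
// the same shape as Python's [x * x for x in numbers if x % 2 == 0].
let evenSquares = numbers.filter { $0 % 2 == 0 }.map { $0 * $0 }
// [4, 16, 36, 64, 100]
```
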
00:43:02.760 | - So is there, let me ask you a question.
00:43:04.880 | What do you think is the most beautiful feature
00:43:07.600 | in a programming language that you've ever encountered?
00:43:11.760 | In Swift maybe, and then outside of Swift?
00:43:15.160 | - I think the thing that I like the most
00:43:17.440 | from a programming language,
00:43:18.840 | so I think the thing you have to think about
00:43:21.240 | with a programming language, again, what is the goal?
00:43:23.600 | You're trying to get people to get things done quickly.
00:43:27.160 | And so you need libraries, you need high quality libraries,
00:43:30.480 | and then you need a user base around them
00:43:32.600 | that can assemble them and do cool things with them.
00:43:35.040 | And so to me, the question is,
00:43:36.200 | what enables high quality libraries?
00:43:38.240 | Okay.
00:43:40.680 | - Yeah.
00:43:41.520 | - And there's a huge divide in the world
00:43:43.400 | between languages that enable high quality libraries
00:43:48.320 | versus the ones that put special stuff in the language.
00:43:52.800 | - So programming languages that enable
00:43:55.240 | - High quality libraries.
00:43:56.080 | - High quality libraries, got it.
00:43:57.400 | - So, and what I mean by that is expressive libraries
00:44:00.840 | that then feel like a natural integrated part
00:44:03.720 | of the language itself.
00:44:05.560 | So an example of this in Swift is the int and float
00:44:09.880 | and also array and string, things like this.
00:44:12.080 | These are all part of the library.
00:44:13.720 | Like int is not hard-coded into Swift.
00:44:16.160 | And so what that means is that because int
00:44:19.880 | is just a library thing defined in the standard library,
00:44:22.600 | along with strings and arrays and all the other things
00:44:24.640 | that come with the standard library.
00:44:26.240 | Well, hopefully you do like int,
00:44:29.240 | but any language features
00:44:31.960 | that you needed to define int,
00:44:33.920 | you can also use in your own types.
00:44:36.080 | So if you wanted to define a quaternion
00:44:39.560 | or something like this, right?
00:44:41.440 | Well, it doesn't come in the standard library.
00:44:43.560 | There's a very special set of people
00:44:45.640 | that care a lot about this,
00:44:47.200 | but those people are also important.
00:44:49.400 | It's not about classism, right?
00:44:51.120 | It's not about the people who care about ints and floats
00:44:53.480 | are more important than the people care about quaternions.
00:44:55.760 | And so to me, the beautiful things
00:44:56.920 | about programming languages is when you allow
00:44:58.960 | those communities to build high quality libraries
00:45:02.280 | that feel native, that feel like they're built
00:45:03.760 | into the compiler without having to be.
00:45:06.840 | - What does it mean for the int to be part
00:45:11.120 | of the library and not hard-coded in?
00:45:13.200 | So is it like, how, so what is an int?
00:45:18.200 | - Okay, int is just an integer.
00:45:20.800 | In this case, it's like a 64 bit integer
00:45:23.560 | or something like this.
00:45:24.400 | - But so like the 64 bit is hard-coded or no?
00:45:28.120 | - No, none of that's hard-coded.
00:45:29.400 | So int, if you go look at how it's implemented,
00:45:32.160 | it's just a struct in Swift.
00:45:34.760 | And so it's a struct.
00:45:35.880 | And then how do you add two structs?
00:45:37.440 | Well, you define plus.
00:45:38.720 | And so you can define plus on int.
00:45:41.800 | Well, you can define plus on your thing too.
00:45:43.560 | You can define, like, an isOdd method
00:45:45.800 | or something like that on int.
00:45:47.800 | And so, yeah, you can add methods on the things.
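
A small Swift sketch of what Chris is describing, with a made-up Complex type standing in for the quaternion example: Int lives in the standard library as a struct, and the same machinery (structs, operator definitions, extensions) is available to any user-defined type.

```swift
// A user-defined type built with the same tools the standard library uses for Int.
struct Complex: Equatable {
    var real: Double
    var imaginary: Double

    // Defining + for Complex, just as the standard library defines + for Int.
    static func + (lhs: Complex, rhs: Complex) -> Complex {
        Complex(real: lhs.real + rhs.real,
                imaginary: lhs.imaginary + rhs.imaginary)
    }
}

// You can also add methods to Int itself, like the isOdd example.
extension Int {
    var isOdd: Bool { self % 2 != 0 }
}

let z = Complex(real: 1, imaginary: 2) + Complex(real: 3, imaginary: -1)
let sevenIsOdd = 7.isOdd   // true
```
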
00:45:50.400 | - Yeah.
00:45:51.320 | - So you can define operators, like how it behaves.
00:45:55.360 | That to you is beautiful.
00:45:56.360 | When there's something about the language
00:45:58.240 | which enables others to create libraries,
00:46:01.920 | which are not hacky.
00:46:05.360 | - Yeah, they feel native.
00:46:07.200 | And so one of the best examples of this is Lisp, right?
00:46:10.840 | Because in Lisp, all the libraries
00:46:13.800 | are basically part of the language, right?
00:46:15.440 | You write term rewrite systems and things like this.
00:46:18.120 | - Can you, as a counter example,
00:46:20.040 | provide what makes it difficult
00:46:22.400 | to write a library that's native?
00:46:23.880 | Is it the Python C?
00:46:25.520 | - Well, so one example, I'll give you two examples,
00:46:29.000 | Java and C++, or Java and C.
00:46:31.600 | They both allow you to define your own types,
00:46:35.760 | but int is hard-coded in the language.
00:46:38.440 | Okay, well, why?
00:46:39.360 | Well, in Java, for example,
00:46:41.200 | coming back to this whole reference,
00:46:42.480 | semantic value, semantic thing,
00:46:45.160 | int gets passed around by value.
00:46:47.440 | - Yeah, that.
00:46:49.720 | - But if you make a pair or something like that,
00:46:53.760 | a complex number, right?
00:46:55.120 | It's a class in Java,
00:46:56.840 | and now it gets passed around by reference, by pointer.
00:46:59.920 | And so now you lose value semantics, right?
00:47:02.600 | You lost math.
00:47:04.200 | Okay, well, that's not great, right?
00:47:06.880 | If you can do something with int,
00:47:08.160 | why can't I do it with my type?
00:47:09.640 | - Yeah.
00:47:10.480 | - Right, so that's the negative side
00:47:13.720 | of the thing I find beautiful,
00:47:15.320 | is when you can solve that,
00:47:17.320 | when you can have full expressivity,
00:47:19.240 | where you as a user of the language
00:47:21.680 | have as much or almost as much power
00:47:24.160 | as the people who implemented
00:47:25.480 | all the standard built-in stuff,
00:47:27.240 | because what that enables
00:47:28.440 | is that enables truly beautiful libraries.
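
A hedged Swift sketch of the value-versus-reference distinction being discussed (the type names are invented for illustration): in Swift the author of a type chooses value semantics (struct) or reference semantics (class), whereas in Java any user-defined type is forced into the reference-semantics bucket.

```swift
// Value semantics: a struct is copied on assignment, like Java's primitive int.
struct Pair { var x: Int; var y: Int }

var a = Pair(x: 1, y: 2)
var b = a        // b is an independent copy
b.x = 100
print(a.x)       // 1 -- the original is untouched

// Reference semantics: a class instance is shared through references,
// like a Java object, so mutations are visible through every reference.
final class PairRef { var x = 1; var y = 2 }

let p = PairRef()
let q = p        // q points at the same object
q.x = 100
print(p.x)       // 100 -- the "copy" was never a copy
```
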
00:47:31.400 | - You know, it's kind of weird,
00:47:32.560 | 'cause I've gotten used to that.
00:47:34.840 | That's one, I guess,
00:47:37.080 | other aspect of programming language design.
00:47:39.040 | You have to think, you know,
00:47:41.080 | the old first principles thinking,
00:47:43.480 | like, why are we doing it this way?
00:47:45.520 | By the way, I mean, I remember,
00:47:47.840 | 'cause I was thinking about the Walrus operator,
00:47:50.800 | and I'll ask you about it later,
00:47:53.240 | but it hit me that like the equal sign for assignment,
00:47:57.760 | like, why are we using the equal sign for assignment?
00:48:01.600 | - It's wrong, and that's not the only solution, right?
00:48:04.480 | So if you look at Pascal,
00:48:05.440 | they use colon equals for assignment
00:48:07.760 | and equals for equality,
00:48:11.440 | and they use, like, less-than greater-than (<>)
00:48:12.960 | instead of the not-equal thing.
00:48:14.560 | - Yeah.
00:48:15.400 | - Like, there are other answers here.
00:48:16.360 | - So, but like, and yeah, like, I ask you all,
00:48:19.920 | but how do you then decide to break convention?
00:48:24.880 | To say, you know what?
00:48:26.120 | Everybody's doing it wrong.
00:48:29.720 | We're gonna do it right.
00:48:30.960 | - Yeah.
00:48:31.960 | So it's like an ROI,
00:48:33.720 | like return on investment trade-off, right?
00:48:35.480 | So if you do something weird,
00:48:37.320 | let's just say like not,
00:48:38.840 | like colon equal instead of equal for assignment,
00:48:40.960 | that would be weird with today's aesthetic, right?
00:48:44.920 | And so you'd say, cool, this is theoretically better,
00:48:47.480 | but is it better in which ways?
00:48:49.640 | Like, what do I get out of that?
00:48:50.760 | Do I define away class of bugs?
00:48:52.360 | Well, one of the class of bugs that C has
00:48:54.280 | is that you can use like, you know,
00:48:55.880 | if X equals without equals equals,
00:48:58.840 | if X equals Y, right?
00:49:01.760 | Well, it turns out you can solve that problem
00:49:04.040 | in lots of ways.
00:49:05.240 | Clang, for example, GCC,
00:49:06.960 | all these compilers will detect that as a likely bug,
00:49:09.840 | produce a warning.
00:49:10.800 | Do they?
00:49:11.640 | - Yeah.
00:49:12.480 | - I feel like they didn't, or Clang does.
00:49:13.800 | GCC didn't.
00:49:15.960 | It's like, one of the important things
00:49:17.880 | about programming language design
00:49:19.280 | is like you're literally creating suffering in the world.
00:49:22.920 | (laughing)
00:49:23.960 | - Okay.
00:49:24.920 | - Like, I feel like,
00:49:26.720 | I mean, one way to see it is the bicycle for the mind,
00:49:29.200 | but the other way is to like minimizing suffering.
00:49:32.200 | - Well, you have to decide if it's worth it, right?
00:49:33.640 | And so let's come back to that.
00:49:35.560 | - Okay.
00:49:36.400 | - But if you look at this,
00:49:38.040 | and again, this is where there's a lot of detail
00:49:40.080 | that goes into each of these things,
00:49:41.840 | equal in C returns a value.
00:49:45.120 | - Yep.
00:49:47.600 | - That's messed up.
00:49:48.920 | That allows you to say X equals Y equals Z.
00:49:51.120 | Like that works in C.
00:49:52.440 | - Yeah.
00:49:53.440 | Is it messed up?
00:49:54.600 | You know, most people think it's messed up, I think.
00:49:57.560 | - It is very, by messed up,
00:50:00.080 | what I mean is it is very rarely used for good,
00:50:03.520 | and it's often used for bugs.
00:50:05.520 | - Yeah.
00:50:06.360 | - Right, and so.
00:50:07.200 | - That's a good definition of messed up, yeah.
00:50:09.400 | - You could use, you know, it's a, in hindsight,
00:50:12.080 | this was not such a great idea, right?
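
To ground the point, a small sketch assuming current Swift behavior: in C, `x = y` is an expression that yields a value, which is exactly what makes both `if (x = y)` and `x = y = z` legal; Swift's assignment deliberately produces no usable value, so the classic typo becomes a compile error instead of a lurking bug.

```swift
var x = 0
let y = 42

// In C, `if (x = y)` compiles because assignment yields a value -- a classic bug.
// In Swift, assignment evaluates to the empty tuple (), not a Bool, so this is rejected:
// if x = y { ... }       // compile error; the compiler points you toward '=='
if x == y {               // equality has to be spelled explicitly
    print("equal")
}

// Chaining like `x = y = z` is likewise meaningless in Swift,
// since the inner assignment has no value worth propagating.
x = y
```
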
00:50:13.520 | Now, one of the things with Swift that is really powerful,
00:50:16.160 | and one of the reasons it's actually good,
00:50:18.400 | versus it being full of good ideas,
00:50:20.240 | is that when we launched Swift 1,
00:50:23.400 | we announced that it was public, people could use it,
00:50:26.800 | people could build apps,
00:50:27.880 | but it was gonna change and break, okay?
00:50:30.920 | When Swift 2 came out, we said, "Hey, it's open source,
00:50:33.160 | "and there's this open process
00:50:34.360 | "which people can help evolve and direct the language."
00:50:37.880 | So the community at large, like Swift users,
00:50:40.120 | can now help shape the language as it is,
00:50:43.120 | and what happened is that, as part of that process is,
00:50:46.120 | a lot of really bad mistakes got taken out.
00:50:48.680 | So for example, Swift used to have the C style
00:50:52.480 | plus plus and minus minus operators.
00:50:55.040 | Like, what does it mean when you put it before
00:50:56.560 | versus after, right?
00:50:59.320 | Well, that got cargo-culted from C into Swift early on.
00:51:02.600 | - What's cargo-culted?
00:51:03.720 | - Cargo-culted means brought forward
00:51:05.320 | without really considering it.
00:51:07.760 | - Okay.
00:51:08.600 | - This is maybe not the most PC term, but--
00:51:11.880 | - You have to look it up in Urban Dictionary, yeah.
00:51:13.600 | - Yeah, so it got pulled into C without,
00:51:17.520 | or it got pulled into Swift without very good consideration,
00:51:20.520 | and we went through this process,
00:51:22.200 | and one of the first things got ripped out
00:51:23.720 | was plus plus and minus minus,
00:51:25.620 | because they lead to confusion,
00:51:27.760 | they have very little value over saying, you know,
00:51:29.960 | X plus equals one, and X plus equals one is way more clear,
00:51:34.240 | and so when you're optimizing for teachability
00:51:36.360 | and clarity and bugs and this multidimensional space
00:51:39.600 | that you're looking at, things like that really matter,
00:51:42.340 | and so being first principles on where you're coming from
00:51:45.560 | and what you're trying to achieve
00:51:46.520 | and being anchored on the objective is really important.
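
A tiny illustration of the change Chris describes, assuming the behavior of current Swift: the C-style increment operators were removed early in Swift's open evolution process, leaving the more explicit spelling.

```swift
var count = 0

// count++    // no longer valid Swift; pre- vs post-increment was a common
//            // source of confusion for very little expressive gain
count += 1    // the clearer form the language kept
print(count)  // 1
```
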
00:51:50.160 | - Well, let me ask you about the most,
00:51:53.280 | sort of this podcast isn't about information,
00:51:58.160 | it's about drama.
00:51:59.320 | - Okay.
00:52:00.160 | - Let me talk to you about some drama.
00:52:01.360 | So you mentioned Pascal and colon equals,
00:52:06.320 | there's something that's called the Walrus operator,
00:52:09.560 | and Python 3.8 added the Walrus operator,
00:52:14.560 | and the reason I think it's interesting
00:52:17.600 | is not just 'cause of the feature,
00:52:20.400 | it has the same kind of expression feature
00:52:23.480 | you can imagine to see
00:52:24.320 | that it returns the value of the assignment,
00:52:27.240 | and maybe you can comment on that in general,
00:52:29.680 | but on the other side of it,
00:52:31.240 | it's also the thing that toppled the dictator.
00:52:36.240 | - So, okay.
00:52:37.960 | - It finally drove Guido to step down from BDFL,
00:52:41.280 | the toxicity of the community.
00:52:42.880 | So maybe, what do you think about the Walrus operator
00:52:46.000 | in Python, is there an equivalent thing in Swift
00:52:50.080 | that really stress tested the community,
00:52:54.200 | and then on the flip side,
00:52:56.680 | what do you think about Guido stepping down over it?
00:52:58.720 | - Yeah, well, if I look past the details
00:53:01.160 | of the Walrus operator,
00:53:02.400 | one of the things that makes it most polarizing
00:53:04.160 | is that it's syntactic sugar.
00:53:05.720 | - Okay, what do you mean by syntactic sugar?
00:53:09.120 | - It means you can take something
00:53:10.520 | that already exists in language
00:53:11.760 | and you can express it in a more concise way.
00:53:14.400 | - So, okay, I'm gonna play devil's advocate.
00:53:15.960 | So, this is great.
00:53:17.760 | Is that a objective or subjective statement?
00:53:21.560 | Like, can you argue that basically anything
00:53:24.400 | isn't syntactic sugar or not?
00:53:26.560 | - No, not everything is syntactic sugar.
00:53:30.320 | So, for example, the type system,
00:53:32.720 | like can you have classes versus,
00:53:35.680 | like, do you have types or not, right?
00:53:39.960 | So, one type versus many types
00:53:42.160 | is not a matter of syntactic sugar.
00:53:44.760 | And so, if you say, I wanna have the ability to define types,
00:53:48.240 | I have to have all this like language mechanics
00:53:49.960 | to define classes, and oh, now I have to have inheritance,
00:53:53.320 | and I have like, I have all this stuff,
00:53:55.040 | that's just making language more complicated.
00:53:57.080 | That's not about sugaring it.
00:53:59.240 | Swift has sugar.
00:54:02.360 | So, like Swift has this thing called if let,
00:54:04.320 | and it has various operators
00:54:06.560 | that are used to concisify specific use cases.
00:54:10.480 | So, the problem with syntactic sugar,
00:54:12.840 | when you're talking about, hey, I have a thing
00:54:15.000 | that takes a lot to write, and I have a new way to write it,
00:54:17.720 | you have this like horrible trade-off,
00:54:19.920 | which becomes almost completely subjective,
00:54:22.400 | which is how often does this happen
00:54:24.680 | and does it matter?
00:54:26.320 | And one of the things that is true about human psychology,
00:54:28.520 | particularly when you're talking about
00:54:29.360 | introducing a new thing,
00:54:30.400 | is that people overestimate the burden of learning something
00:54:35.400 | and so it looks foreign when you haven't gotten used to it.
00:54:38.920 | But if it was there from the beginning,
00:54:40.440 | of course, it's just part of Python.
00:54:42.080 | Like unquestionably, like this is just the thing I know,
00:54:45.160 | and it's not a new thing
00:54:46.760 | that you're worried about learning,
00:54:47.720 | it's just part of the deal.
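
For a concrete sense of the kind of sugar being debated, here is a minimal Swift sketch of the `if let` shorthand Chris mentions: it binds and tests in one step, which is the same flavor of concision the walrus operator brought to Python.

```swift
let input: String? = "42"

// Desugared: check for nil, then force-unwrap.
if input != nil {
    let value = input!
    print("got \(value)")
}

// Sugared: `if let` binds and tests in a single line.
if let value = input {
    print("got \(value)")
}
```
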
00:54:49.480 | Now, with Guido, I don't know Guido well.
00:54:55.480 | - Yeah, have you crossed paths much?
00:54:56.920 | - Yeah, I've met him a couple of times,
00:54:58.200 | but I don't know Guido well.
00:55:00.000 | But the sense that I got out of that whole dynamic
00:55:03.280 | was that he had put not just the decision-maker weight
00:55:07.760 | on his shoulders, but it was so tied to his personal identity
00:55:11.920 | that he took it personally and he felt the need
00:55:15.040 | and he kind of put himself in the situation
00:55:16.520 | of being the person,
00:55:18.160 | instead of building a base of support around him.
00:55:20.920 | I mean, this is probably not quite literally true.
00:55:23.920 | But by--
00:55:24.960 | - Too much, so there's too much--
00:55:26.640 | - Too much concentrated on him, right?
00:55:28.800 | And so--
00:55:29.640 | - And that can wear you down.
00:55:31.320 | - Well, yeah, particularly because people then say,
00:55:33.720 | Guido, you're a horrible person, I hate this thing,
00:55:36.120 | blah, blah, blah, blah, blah, blah, blah.
00:55:37.520 | And sure, it's like, you know,
00:55:38.640 | maybe 1% of the community that's doing that.
00:55:41.160 | But Python's got a big community,
00:55:43.520 | and 1% of millions of people is a lot of hate mail.
00:55:46.600 | And that just from human factor will just wear on you.
00:55:49.440 | - Well, to clarify, it looked from just what I saw
00:55:52.520 | in the messaging for the,
00:55:53.960 | let's not look at the million Python users,
00:55:55.800 | but at the Python core developers,
00:55:58.360 | it feels like the majority,
00:56:00.080 | the big majority on a vote were opposed to it.
00:56:03.680 | - Okay, I'm not that close to it, so I don't know.
00:56:06.400 | - So this, okay, so the situation is like literally,
00:56:09.240 | yeah, I mean, the majority, the core developers,
00:56:13.120 | again, it's--
00:56:13.960 | - Were opposed to it.
00:56:14.800 | - So, and they weren't,
00:56:17.480 | they weren't even like against it.
00:56:19.800 | It was, there was a few,
00:56:22.240 | well, they were against it,
00:56:23.120 | but the against it wasn't like, this is a bad idea.
00:56:27.840 | They were more like, we don't see why this is a good idea.
00:56:31.280 | And what that results in is there's a stalling feeling,
00:56:35.200 | like you just slow things down.
00:56:38.040 | Now, from my perspective,
00:56:40.040 | now you could argue this,
00:56:41.640 | and I think it's very interesting
00:56:44.640 | if we look at politics today and the way Congress works,
00:56:47.640 | it's slowed down everything.
00:56:49.600 | - It's a dampener.
00:56:50.440 | - Yeah, it's a dampener,
00:56:51.380 | but that's a dangerous thing too,
00:56:53.680 | because if it dampens things,
00:56:55.480 | if the dampening results--
00:56:58.440 | - What are you talking about?
00:56:59.400 | Like, it's a low-pass filter,
00:57:00.560 | but if you need billions of dollars
00:57:02.360 | injected into the economy, or trillions of dollars,
00:57:05.100 | then suddenly stuff happens, right?
00:57:06.880 | And so--
00:57:07.720 | - For sure.
00:57:09.400 | So you're talking about--
00:57:10.480 | - I'm not defending our political situation,
00:57:12.000 | just to be clear.
00:57:13.360 | - But you're talking about like a global pandemic.
00:57:17.200 | I was hoping we could fix the healthcare system
00:57:20.600 | and the education system.
00:57:21.900 | - I'm not a politics person, I don't know.
00:57:26.260 | When it comes to languages,
00:57:28.160 | the community's kind of right,
00:57:29.600 | in terms of it's a very high burden
00:57:31.680 | to add something to a language.
00:57:33.240 | So as soon as you add something,
00:57:34.440 | you have a community of people building on it,
00:57:35.760 | and you can't remove it, okay?
00:57:38.120 | And if there's a community of people
00:57:39.660 | that feel really uncomfortable with it,
00:57:41.680 | then taking it slow, I think, is an important thing to do,
00:57:45.640 | and there's no rush,
00:57:46.720 | particularly if it's something that's 25 years old
00:57:49.200 | and is very established,
00:57:50.360 | and it's not like coming into its own.
00:57:53.520 | - What about features?
00:57:55.840 | - Well, so I think that the issue with Guido
00:57:58.800 | is that maybe this is a case
00:58:00.360 | where he realized it had outgrown him,
00:58:03.600 | and it went from being--
00:58:04.440 | - The feature or the language?
00:58:05.360 | - The language.
00:58:06.240 | So Python, I mean, Guido's amazing,
00:58:09.660 | but Python isn't about Guido anymore.
00:58:12.260 | It's about the users,
00:58:13.520 | and to a certain extent, the users own it.
00:58:15.320 | And Guido spent years of his life,
00:58:19.720 | a significant fraction of his career on Python,
00:58:22.880 | and from his perspective, I imagine he's like,
00:58:24.640 | "Well, this is my thing.
00:58:25.720 | "I should be able to do the thing I think is right."
00:58:28.220 | But you can also understand the users,
00:58:30.320 | where they feel like, "This is my thing.
00:58:33.020 | "I use this."
00:58:34.040 | And I don't know, it's a hard thing.
00:58:38.280 | - But if we could talk about leadership in this,
00:58:41.360 | 'cause it's so interesting.
00:58:42.200 | To me, I'm gonna work.
00:58:44.400 | Hopefully somebody makes it.
00:58:45.480 | If not, I'll make it, the Walrus operator, I'm pretty sure,
00:58:47.640 | because I think it represents to me,
00:58:50.340 | maybe it's my Russian roots or something,
00:58:52.440 | it's the burden of leadership.
00:58:56.160 | I feel like to push back,
00:58:59.200 | I feel like progress can only,
00:59:02.960 | like most difficult decisions, just like you said,
00:59:06.240 | there'll be a lot of divisiveness over,
00:59:09.080 | especially in a passionate community.
00:59:12.180 | It just feels like leaders need to take
00:59:14.540 | those risky decisions that if you listen,
00:59:19.540 | that with some non-zero probability,
00:59:23.020 | maybe even a high probability,
00:59:24.420 | would be the wrong decision.
00:59:26.100 | But they have to use their gut and make that decision.
00:59:29.260 | - Well, this is one of the things
00:59:30.940 | where you see amazing founders.
00:59:34.180 | The founders understand exactly what's happened
00:59:36.220 | and how the company got there,
00:59:37.500 | and are willing to say, "We have been doing thing X
00:59:40.860 | "the last 20 years, but today we're gonna do thing Y."
00:59:45.460 | And they make a major pivot for the whole company,
00:59:47.380 | the company lines up behind them,
00:59:48.580 | they move, and it's the right thing.
00:59:50.540 | But then when the founder dies,
00:59:52.380 | the successor doesn't always feel that agency
00:59:57.060 | to be able to make those kinds of decisions.
00:59:59.140 | Even though they're a CEO,
01:00:00.020 | they could theoretically do whatever.
01:00:02.180 | There's two reasons for that, in my opinion.
01:00:04.420 | Or in many cases, it's always different.
01:00:07.340 | But one of which is, they weren't there
01:00:09.760 | for all the decisions that were made,
01:00:11.620 | and so they don't know the principles
01:00:13.360 | in which those decisions were made.
01:00:15.340 | And once the principles change,
01:00:17.100 | you should be obligated to change what you're doing
01:00:20.740 | and change direction.
01:00:21.900 | And so, if you don't know how you got to where you are,
01:00:25.860 | it just seems like gospel.
01:00:27.420 | And you're not gonna question it.
01:00:29.860 | You may not understand that it really is
01:00:31.780 | the right thing to do, so you just may not see it.
01:00:33.460 | - That's so brilliant.
01:00:34.300 | I never thought of it that way.
01:00:36.460 | It's so much higher burden when, as a leader,
01:00:39.340 | you step into a thing that's already worked for a long time.
01:00:41.740 | - Yeah, yeah.
01:00:42.580 | Well, and if you change it and it doesn't work out,
01:00:44.100 | now you're the person who screwed it up.
01:00:46.340 | People always second-guess that.
01:00:48.420 | And the second thing is that even if you decide
01:00:49.980 | to make a change, even if you're theoretically in charge,
01:00:53.500 | you're just a person that thinks they're in charge.
01:00:57.460 | Meanwhile, you have to motivate the troops.
01:00:58.860 | You have to explain it to them in terms they'll understand.
01:01:00.540 | You have to get them to buy into it and believe in it,
01:01:02.140 | because if they don't, then they're not gonna be able
01:01:05.260 | to make the turn, even if you tell them
01:01:07.180 | their bonuses are gonna be curtailed.
01:01:08.420 | They're just not gonna buy into it.
01:01:10.700 | And so there's only so much power you have as a leader.
01:01:13.460 | You have to understand what those limitations are.
01:01:16.380 | - Are you still BDFL?
01:01:18.220 | You've been BDFL of some stuff.
01:01:20.300 | You're very heavy on the B, the benevolent,
01:01:25.940 | benevolent dictator for life.
01:01:27.900 | I guess LLVM?
01:01:29.140 | - Yeah, so I still lead the LLVM world.
01:01:32.560 | - I mean, what's the role of, so then on Swift,
01:01:36.420 | you said that there's a group of people.
01:01:38.460 | - Yeah, so if you contrast Python with Swift, right?
01:01:41.620 | One of the reasons, so everybody on the core team
01:01:44.820 | takes the role really seriously,
01:01:46.380 | and I think we all really care about where Swift goes,
01:01:49.260 | but you're almost delegating the final decision-making
01:01:52.980 | to the wisdom of the group,
01:01:54.980 | and so it doesn't become personal.
01:01:56.720 | And also, when you're talking with the community,
01:01:59.680 | so yeah, some people are very annoyed
01:02:02.100 | at certain decisions that get made.
01:02:04.380 | There's a certain faith in the process,
01:02:06.320 | because it's a very transparent process,
01:02:08.140 | and when a decision gets made,
01:02:09.980 | a full rationale is provided, things like this.
01:02:12.220 | These are almost defense mechanisms
01:02:14.460 | to help both guide future discussions
01:02:16.540 | and provide case law, kind of like the Supreme Court does:
01:02:18.860 | this decision was made for this reason,
01:02:21.020 | and here's the rationale
01:02:21.940 | and what we wanna see more of or less of.
01:02:24.160 | But it's also a way to provide a defense mechanism
01:02:27.620 | so that when somebody's griping about it,
01:02:29.020 | they're not saying that person did the wrong thing.
01:02:32.020 | They're saying, well, this thing sucks,
01:02:34.020 | and (growls) and later they move on and they get over it.
01:02:38.580 | - Yeah, the analogy of the Supreme Court,
01:02:40.140 | I think, is really good, but then, okay,
01:02:43.820 | not to get personal on the Swift team,
01:02:45.680 | but it just seems like it's impossible
01:02:50.020 | for division not to emerge.
01:02:52.800 | - Well, each of the humans on the Swift core team,
01:02:55.320 | for example, are different,
01:02:56.980 | and the membership of the Swift core team
01:02:58.380 | changes slowly over time,
01:03:00.520 | which is, I think, a healthy thing.
01:03:02.540 | And so each of these different humans
01:03:04.020 | have different opinions.
01:03:05.220 | Trust me, it's not a singular consciousness
01:03:09.380 | by any stretch of the imagination.
01:03:11.000 | You've got three major organizations,
01:03:12.840 | including Apple, Google, and SiFive
01:03:14.540 | all kind of working together,
01:03:16.380 | and it's a small group of people, but you need high trust.
01:03:20.140 | You need, again, it comes back to the principles
01:03:21.900 | of what you're trying to achieve
01:03:23.360 | and understanding what you're optimizing for.
01:03:27.460 | And I think that starting with strong principles
01:03:30.500 | and working towards decisions is always a good way
01:03:33.280 | to both make wise decisions in general,
01:03:36.260 | but then be able to communicate them to people
01:03:37.900 | so that they can buy into them, and that is hard.
01:03:41.380 | And so you mentioned LLVM.
01:03:42.660 | LLVM is gonna be 20 years old this December,
01:03:46.740 | so it's showing its own age.
01:03:49.460 | - Do you have like a dragon cake plan?
01:03:53.540 | - Oh, we should definitely do that.
01:03:54.700 | Yeah, if we can have a pandemic cake.
01:03:57.740 | - Pandemic cake.
01:03:58.940 | - Everybody gets a slice of cake
01:04:00.380 | and it gets sent through email.
01:04:02.260 | But LLVM has had tons of its own challenges over time too,
01:04:08.940 | right, and one of the challenges
01:04:10.340 | that the LLVM community has, in my opinion,
01:04:13.300 | is that it has a whole bunch of people
01:04:15.260 | that have been working on LLVM for 10 years, right,
01:04:19.100 | 'cause this happened somehow,
01:04:20.940 | and LLVM's always been one way,
01:04:22.780 | but it needs to be a different way, right?
01:04:25.100 | And they've worked on it for like 10 years
01:04:26.660 | is a long time to work on something,
01:04:28.580 | and you suddenly can't see the faults
01:04:32.180 | in the thing that you're working on,
01:04:33.500 | and LLVM has lots of problems,
01:04:34.900 | and we need to address them, and we need to make it better,
01:04:36.740 | and if we don't make it better,
01:04:37.740 | then somebody else will come up with a better idea, right?
01:04:40.300 | And so it's just kind of of that age
01:04:42.540 | where the community is in danger of getting too calcified,
01:04:46.620 | and so I'm happy to see new projects joining
01:04:50.460 | and new things mixing it up.
01:04:52.020 | Fortran is now a new thing in the LLVM community,
01:04:54.540 | which is hilarious and good.
01:04:56.340 | - I've been trying to find, on this little tangent,
01:04:59.020 | find people who program in COBOL or Fortran,
01:05:02.380 | Fortran especially, to talk to, they're hard to find.
01:05:06.500 | - Yeah, look to the scientific community.
01:05:09.860 | They still use Fortran quite a bit.
01:05:11.700 | - Well, interesting thing you kind of mentioned with LLVM,
01:05:14.300 | or just in general, that if something evolved,
01:05:17.020 | you're not able to see the faults.
01:05:19.740 | So do you fall in love with the thing over time,
01:05:23.140 | or do you start hating everything about the thing over time?
01:05:26.340 | - Well, so my personal folly is that I see,
01:05:31.060 | maybe not all, but many of the faults,
01:05:33.500 | and they grate on me, and I don't have time to go fix 'em.
01:05:35.620 | - Yeah, and they get magnified over time.
01:05:37.580 | - Well, and they may not get magnified,
01:05:38.940 | but they never get fixed, and it's like sand underneath,
01:05:42.060 | it's just like grating against you,
01:05:43.620 | and it's like sand underneath your fingernails or something.
01:05:45.860 | It's just like you know it's there, you can't get rid of it.
01:05:49.700 | And so the problem is that if other people don't see it,
01:05:53.060 | nobody ever, like I can't go,
01:05:55.700 | I don't have time to go write the code and fix it anymore,
01:05:58.460 | but then people are resistant to change,
01:06:01.460 | and so you say, "Hey, we should go fix this thing."
01:06:03.060 | They're like, "Oh yeah, that sounds risky."
01:06:05.300 | It's like, well, is it the right thing or not?
01:06:07.220 | - Are the challenges the group dynamics,
01:06:10.220 | or is it also just technical?
01:06:11.660 | I mean, some of these features,
01:06:13.260 | I think as an observer, as almost like a fan in the,
01:06:19.660 | as a spectator of the whole thing,
01:06:21.420 | I don't often think about,
01:06:23.860 | some things might actually be
01:06:25.060 | technically difficult to implement.
01:06:27.580 | - An example of this is we built
01:06:29.060 | this new compiler framework called MLIR.
01:06:31.660 | MLIR is a whole new framework.
01:06:34.220 | It's not, many people think it's about machine learning.
01:06:37.340 | The ML stands for multi-level,
01:06:39.180 | because compiler people can't name things very well, I guess.
01:06:41.780 | - Can we dig into what MLIR is?
01:06:45.260 | - Yeah, so when you look at compilers,
01:06:47.740 | compilers have historically been
01:06:49.900 | solutions for a given space.
01:06:51.700 | So LLVM is a, it's really good for dealing with CPUs,
01:06:56.580 | let's just say, at a high level.
01:06:58.100 | You look at Java, Java has a JVM.
01:07:01.620 | The JVM is very good for garbage collected languages
01:07:04.300 | that need dynamic compilation,
01:07:05.540 | and it's very optimized for a specific space.
01:07:08.420 | And so Hotspot is one of the compilers
01:07:09.980 | that gets used in that space,
01:07:11.020 | and that compiler is really good at that kind of stuff.
01:07:14.080 | Usually when you build these domain-specific compilers,
01:07:16.740 | you end up building the whole thing from scratch
01:07:19.620 | for each domain.
01:07:20.540 | - What's a domain?
01:07:23.380 | So what's the scope of a domain?
01:07:26.660 | - Well, so here I would say, like, if you look at Swift,
01:07:29.180 | there's several different parts to the Swift compiler,
01:07:31.940 | one of which is covered by the LLVM part of it.
01:07:36.100 | There's also a high-level piece that's specific to Swift,
01:07:39.420 | and there's a huge amount of redundancy
01:07:41.540 | between those two different infrastructures,
01:07:44.060 | and a lot of re-implemented stuff
01:07:46.380 | that is similar but different.
01:07:48.340 | - What does LLVM define?
01:07:50.020 | - LLVM is effectively an infrastructure,
01:07:53.020 | so you can mix and match it in different ways.
01:07:55.140 | It's built out of libraries.
01:07:56.060 | You can use it for different things,
01:07:57.620 | but it's really good at CPUs and GPUs.
01:07:59.820 | CPUs and, like, the tip of the iceberg on GPUs.
01:08:02.500 | It's not really great at GPUs.
01:08:04.340 | Okay.
01:08:05.660 | But it turns out-- - And a bunch of languages
01:08:07.060 | that-- - That then use it
01:08:08.900 | to talk to CPUs. - Got it.
01:08:11.060 | - And so it turns out there's a lot of hardware out there
01:08:13.100 | that is custom accelerators,
01:08:14.820 | so machine learning, for example.
01:08:16.140 | There are a lot of matrix multiply accelerators
01:08:18.780 | and things like this.
01:08:20.580 | There's a whole world of hardware synthesis,
01:08:22.820 | so we're using MLIR to build circuits, okay?
01:08:27.180 | And so you're compiling for a domain of transistors,
01:08:30.860 | and so what MLIR does is it provides
01:08:32.460 | a tremendous amount of compiler infrastructure
01:08:34.460 | that allows you to build these domain-specific compilers
01:08:37.500 | in a much faster way and have the result be good.
01:08:41.900 | - If we're thinking about the future,
01:08:44.380 | now we're talking about, like, ASICs, so anything?
01:08:46.900 | - Yeah, yeah.
01:08:47.740 | - So if we project into the future,
01:08:50.540 | it's very possible that the number of these kinds of ASICs,
01:08:54.460 | very specific infrastructure thing,
01:08:59.460 | the architecture things, like, multiplies exponentially.
01:09:04.740 | - I hope so, yeah.
01:09:06.340 | - So that's MLIR--
01:09:08.620 | - So what MLIR does is it allows you
01:09:10.780 | to build these compilers very efficiently, right?
01:09:13.500 | Now, one of the things that, coming back to the LLVM thing,
01:09:16.740 | and then we'll go to hardware,
01:09:17.980 | is LLVM is a specific compiler for a specific domain.
01:09:22.980 | MLIR is now this very general, very flexible thing
01:09:26.900 | that can solve lots of different kinds of problems.
01:09:29.300 | So LLVM is a subset of what MLIR does.
01:09:32.420 | - So MLIR is, I mean, it's an ambitious project then.
01:09:35.380 | - Yeah, it's a very ambitious project, yeah.
01:09:37.020 | And so to make it even more confusing,
01:09:39.860 | MLIR has joined the LLVM Umbrella Project,
01:09:42.460 | so it's part of the LLVM family.
01:09:45.140 | But where this comes full circle is now folks
01:09:47.620 | that work on the LLVM part,
01:09:49.380 | the classic part that's 20 years old,
01:09:51.980 | aren't aware of all the cool new things
01:09:54.100 | that have been done in the new thing,
01:09:56.140 | that MLIR was built by me and many other people
01:09:59.620 | that knew a lot about LLVM,
01:10:01.860 | and so we fixed a lot of the mistakes that lived in LLVM.
01:10:05.140 | And so now you have this community dynamic
01:10:07.100 | where it's like, well, there's this new thing,
01:10:08.540 | but it's not familiar, nobody knows it,
01:10:10.340 | it feels like it's new, and so let's not trust it.
01:10:12.860 | And so it's just really interesting
01:10:13.980 | to see the cultural social dynamic that comes out of that.
01:10:16.900 | And I think it's super healthy
01:10:19.500 | because we're seeing the ideas percolate
01:10:21.540 | and we're seeing the technology diffusion happen
01:10:23.980 | as people get more comfortable with it,
01:10:25.260 | they start to understand things in their own terms.
01:10:27.220 | And this just gets to the,
01:10:28.820 | it takes a while for ideas to propagate,
01:10:31.220 | even though they may be very different
01:10:33.980 | than what people are used to.
01:10:35.260 | - Maybe let's talk about that a little bit,
01:10:37.220 | the world of ASICs.
01:10:38.740 | Well, actually, you have a new role at SiFive.
01:10:43.740 | What's that place about?
01:10:47.420 | What is the vision? - Sure.
01:10:49.460 | - For their vision for, I would say,
01:10:51.820 | the future of computing.
01:10:53.220 | - So I lead the engineering and product teams at SiFive.
01:10:55.940 | SiFive is a company that was founded
01:10:59.700 | with this architecture called RISC-V.
01:11:02.620 | RISC-V is a new instruction set.
01:11:04.420 | Instruction sets are the things inside of your computer
01:11:06.300 | that tell it how to run things.
01:11:08.420 | x86 from Intel and ARM from the ARM company
01:11:12.060 | and things like this are other instruction sets.
01:11:13.900 | - I've talked to, sorry to interrupt,
01:11:15.020 | I've talked to Dave Patterson,
01:11:16.020 | who's super excited about RISC-V.
01:11:17.980 | - Dave is awesome.
01:11:18.900 | - Yeah, he's brilliant.
01:11:20.540 | - The RISC-V is distinguished by not being proprietary.
01:11:23.700 | And so x86 can only be made by Intel and AMD,
01:11:28.820 | ARM can only be made by ARM,
01:11:30.380 | they sell licenses to build ARM chips to other companies,
01:11:33.340 | things like this.
01:11:34.460 | MIPS is another instruction set
01:11:35.540 | that is owned by the MIPS company, now Wave,
01:11:38.300 | and then it gets licensed out, things like that.
01:11:40.860 | And so RISC-V is an open standard
01:11:43.340 | that anybody can build chips for.
01:11:45.140 | And so SiFive was founded by three of the founders
01:11:48.220 | of RISC-V that designed and built it in Berkeley,
01:11:51.580 | working with Dave.
01:11:52.860 | And so that was the genesis of the company.
01:11:55.780 | SiFive today has some of the world's best RISC-V cores
01:11:59.060 | and we're selling them and that's really great.
01:12:01.420 | They're going to tons of products, it's very exciting.
01:12:04.060 | - So they're taking this thing that's open source
01:12:06.100 | and just trying to be or are the best in the world
01:12:09.620 | at building these things.
01:12:10.780 | - Yeah, so here, it's the specification that's open source.
01:12:13.260 | It's like saying TCP/IP is an open standard
01:12:15.940 | or C is an open standard,
01:12:18.020 | but then you have to build an implementation
01:12:19.620 | of the standard.
01:12:20.780 | And so SiFive, on the one hand, pushes forward
01:12:23.660 | and defined and pushes forward the standard.
01:12:26.260 | On the other hand, we have implementations
01:12:28.100 | that are best in class for different points in the space,
01:12:30.980 | depending on if you want a really tiny CPU
01:12:33.620 | or if you want a really big beefy one that is faster,
01:12:36.940 | but it uses more area and things like this.
01:12:38.820 | - What about the actual manufacturer?
01:12:41.220 | So like what, where does that all fit?
01:12:43.540 | I'm gonna ask a bunch of dumb questions.
01:12:45.300 | - That's okay, this is how we learn, right?
01:12:48.140 | And so the way this works is that there's generally
01:12:52.500 | a separation of the people who design the circuits
01:12:55.100 | and then the people who manufacture them.
01:12:56.820 | And so you'll hear about fabs like TSMC and Samsung
01:13:00.740 | and things like this that actually produce the chips,
01:13:03.780 | but they take a design coming in
01:13:05.820 | and that design specifies how the, you know,
01:13:09.940 | you turn code for the chip into little rectangles
01:13:14.940 | that then use photolithography to make mask sets
01:13:20.260 | and then burn transistors onto a chip
01:13:22.260 | or onto silicon rather.
01:13:24.700 | - So, and we're talking about mass manufacturing, so.
01:13:28.340 | - Yeah, they're talking about making hundreds
01:13:29.580 | of millions of parts and things like that, yeah.
01:13:31.340 | And so the fab handles the volume production,
01:13:33.540 | things like that.
01:13:34.620 | But when you look at this problem,
01:13:36.340 | the interesting thing about the space when you look at it
01:13:39.700 | is that these, the steps that you go from designing a chip
01:13:44.340 | and writing the quote unquote code for it
01:13:46.260 | and things like Verilog and languages like that,
01:13:49.180 | down to what you hand off to the fab
01:13:51.620 | is a really well-studied, really old problem, okay?
01:13:56.200 | Tons of people have worked on it.
01:13:57.540 | Lots of smart people have built systems and tools.
01:14:00.540 | These tools then have generally gone through acquisitions.
01:14:03.460 | And so they've ended up at three different major companies
01:14:06.140 | that build and sell these tools.
01:14:07.740 | They're called EDA tools,
01:14:08.940 | like for electronic design automation.
01:14:11.620 | The problem with this is you have huge amounts
01:14:13.180 | of fragmentation, you have loose standards,
01:14:17.140 | and the tools don't really work together.
01:14:20.020 | So you have tons of duct tape
01:14:21.300 | and you have tons of lost productivity.
01:14:24.220 | - Now, these are tools for designing.
01:14:26.700 | So the RISC-V is a instruction, like what is RISC-V?
01:14:31.700 | Like how deep does it go?
01:14:33.300 | How much does it touch the hardware?
01:14:35.860 | How much does it define how much of the hardware is?
01:14:38.420 | - Yeah, so RISC-V is all about, given a CPU,
01:14:41.860 | so the processor in your computer,
01:14:44.860 | how does the compiler, like the Swift compiler,
01:14:47.380 | the C compiler, things like this, how does it make it work?
01:14:50.460 | So it's what is the assembly code?
01:14:52.660 | And so you write RISC-V assembly
01:14:54.140 | instead of x86 assembly, for example.
01:14:57.060 | - But it's a set of instructions as opposed to--
01:14:59.020 | - A set of instructions, yeah.
01:15:00.060 | - What do you say, it tells you how the compiler works?
01:15:03.660 | - Sorry, it's what the compiler talks to.
01:15:05.380 | - Okay. - Yeah.
01:15:06.220 | - And then the tooling you mentioned,
01:15:08.500 | the disparate tools are for what?
01:15:10.700 | - For when you're building a specific chip.
01:15:13.340 | So RISC-V-- - In hardware.
01:15:14.900 | - In hardware, yeah.
01:15:15.740 | So RISC-V, you can buy a RISC-V core from SiFive
01:15:19.140 | and say, "Hey, I wanna have a certain number of,
01:15:21.580 | "run a certain number of gigahertz.
01:15:23.300 | "I want it to be this big.
01:15:24.580 | "I want it to have these features.
01:15:26.740 | "I wanna have, like, I want floating point or not,"
01:15:29.860 | for example.
01:15:30.820 | And then what you get is you get a description of a CPU
01:15:34.700 | with those characteristics.
01:15:36.620 | Now, if you wanna make a chip,
01:15:38.140 | you wanna build like an iPhone chip or something like that,
01:15:41.020 | right, you have to take both the CPU,
01:15:42.740 | but then you have to talk to memory,
01:15:44.420 | you have to have timers, IOs, a GPU, other components.
01:15:49.300 | And so you need to pull all those things together
01:15:51.380 | into what's called an ASIC,
01:15:53.860 | an application-specific integrated circuit,
01:15:55.500 | so a custom chip.
01:15:56.860 | And then you take that design,
01:15:58.980 | and then you have to transform it into something
01:16:00.860 | that the fabs, like TSMC, for example,
01:16:03.940 | know how to take to production.
01:16:06.740 | - Got it.
01:16:07.580 | So, but yeah, okay.
01:16:08.580 | - And so that process, I will,
01:16:10.580 | I can't help but see it as, is a big compiler.
01:16:15.380 | (Dave laughs)
01:16:16.940 | It's a whole bunch of compilers written
01:16:18.820 | without thinking about it through that lens.
01:16:21.380 | - Isn't the universe a compiler?
01:16:23.740 | (laughs)
01:16:24.580 | - Yeah, like compilers do two things.
01:16:26.820 | They represent things and transform them.
01:16:29.100 | And so there's a lot of things that end up being compilers.
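
As a toy illustration of "represent and transform" (entirely made up, just to show the shape of the idea in Swift): a compiler holds a representation of the program and applies transformations over it, here a tiny expression tree and a constant-folding pass.

```swift
// A representation: a tiny expression tree.
indirect enum Expr {
    case constant(Int)
    case add(Expr, Expr)
}

// A transformation over that representation: constant folding.
func fold(_ expr: Expr) -> Expr {
    switch expr {
    case .constant:
        return expr
    case let .add(lhs, rhs):
        let (l, r) = (fold(lhs), fold(rhs))
        if case let (.constant(a), .constant(b)) = (l, r) {
            return .constant(a + b)
        }
        return .add(l, r)
    }
}

let program: Expr = .add(.constant(2), .add(.constant(3), .constant(4)))
let optimized = fold(program)   // .constant(9)
```
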
01:16:31.820 | But this is a space where we're talking about design
01:16:34.700 | and usability and the way you think about things,
01:16:37.460 | the way things compose correctly, it matters a lot.
01:16:40.900 | And so SiFive is investing a lot into that space.
01:16:43.460 | And we think that there's a lot of benefit
01:16:45.900 | that can be made by allowing people
01:16:47.460 | to design chips faster, get them to market quicker
01:16:50.340 | and scale out because, you know,
01:16:53.860 | the alleged end of Moore's law,
01:16:56.420 | you've got this problem of you're not getting
01:16:59.260 | free performance just by waiting another year
01:17:01.980 | for a faster CPU.
01:17:03.540 | And so you have to find performance in other ways.
01:17:06.540 | And one of the ways to do that is with custom accelerators
01:17:09.060 | and other things in hardware.
01:17:10.660 | - And so, well, we'll talk a little bit about,
01:17:16.180 | a little more about ASICs,
01:17:17.420 | but do you see that a lot of people,
01:17:21.980 | a lot of companies will try to have
01:17:25.020 | like different sets of requirements
01:17:26.980 | that this whole process to go for?
01:17:28.380 | So like almost different car companies might use different
01:17:32.620 | and like different PC manufacturers.
01:17:35.220 | Like, so is this, like is RISC-V in this whole process,
01:17:40.220 | is it potentially the future of all computing devices?
01:17:44.820 | - Yeah, I think that, so if you look at RISC-V
01:17:47.420 | and step back from the Silicon side of things,
01:17:49.620 | RISC-V is an open standard.
01:17:51.540 | And one of the things that has happened
01:17:53.860 | over the course of decades,
01:17:55.420 | if you look over the long arc of computing,
01:17:57.780 | somehow became decades old.
01:17:59.220 | - Yeah.
01:18:00.060 | - Is that you have companies that come and go
01:18:02.660 | and you have instruction sets that come and go.
01:18:04.860 | Like one example of this out of many is Sun with SPARC.
01:18:09.860 | - Yeah.
01:18:10.740 | - Sun went away, SPARC still lives on at Fujitsu,
01:18:12.980 | but, you know, HP had this instruction set called PA-RISC.
01:18:17.300 | So PA-RISC was its big server business
01:18:21.020 | and had tons of customers.
01:18:22.900 | They decided to move to this architecture
01:18:25.100 | called Itanium from Intel.
01:18:27.180 | - Yeah.
01:18:28.020 | - This didn't work out so well.
01:18:29.620 | - Yeah.
01:18:30.460 | - Right, and so you have this issue
01:18:32.180 | of you're making many billion dollar investments
01:18:35.380 | on instruction sets that are owned by a company.
01:18:38.220 | And even companies as big as Intel
01:18:39.740 | don't always execute as well as they could.
01:18:42.460 | They have their own issues.
01:18:43.860 | HP, for example, decided that it wasn't
01:18:46.700 | in their best interest to continue investing in the space
01:18:48.620 | 'cause it was very expensive.
01:18:49.700 | And so they make technology decisions
01:18:52.180 | or they make their own business decisions.
01:18:54.180 | And this means that as a customer, what do you do?
01:18:57.860 | You've sunk all this time, all this engineering,
01:18:59.680 | all this software work, all these,
01:19:01.300 | you've built other products around them
01:19:02.540 | and now you're stuck, right?
01:19:05.020 | What RISC-V does is it provides you
01:19:06.700 | more optionality in the space,
01:19:08.260 | because if you buy an implementation of RISC-V from SiFive,
01:19:12.620 | and you should, they're the best ones.
01:19:14.380 | - Yeah.
01:19:15.200 | - But if something bad happens to SiFive in 20 years, right?
01:19:19.420 | Well, great, you can turn around
01:19:21.180 | and buy a RISC-V core from somebody else.
01:19:23.300 | And there's an ecosystem of people
01:19:24.980 | that are all making different RISC-V cores
01:19:26.580 | with different trade-offs,
01:19:28.100 | which means that if you have more than one requirement,
01:19:30.580 | if you have a family of products,
01:19:31.900 | you can probably find something in the RISC-V space
01:19:34.660 | that fits your needs.
01:19:35.980 | Whereas if you're talking about x86, for example,
01:19:39.340 | Intel's only gonna bother to make
01:19:41.700 | certain classes of devices, right?
01:19:45.020 | - I see, so maybe a weird question,
01:19:47.700 | but if SiFive is infinitely successful
01:19:52.700 | in the next 20, 30 years, what does the world look like?
01:19:58.060 | So how does the world of computing change?
01:20:01.860 | - So too much diversity in hardware instruction sets,
01:20:05.300 | I think is bad.
01:20:06.540 | Like we have a lot of people that are using
01:20:08.660 | lots of different instruction sets,
01:20:10.980 | particularly in the embedded,
01:20:12.220 | the like very tiny microcontroller space,
01:20:14.340 | the thing in your toaster,
01:20:15.580 | that are just weird and different for historical reasons.
01:20:21.060 | And so the compilers and the tool chains
01:20:23.100 | and the languages on top of them aren't there, right?
01:20:27.220 | And so the developers for that software
01:20:29.220 | have to use really weird tools
01:20:31.060 | because the ecosystem that supports is not big enough.
01:20:34.220 | So I expect that will change, right?
01:20:35.460 | People will have better tools and better languages,
01:20:38.020 | better features everywhere that then can service
01:20:40.340 | many different points in the space.
01:20:42.100 | And I think RISC-V will progressively
01:20:46.300 | eat more of the ecosystem because it can scale up,
01:20:49.420 | it can scale down, sideways, left, right.
01:20:51.580 | It's very flexible and very well considered
01:20:53.820 | and well-designed instruction set.
01:20:56.380 | I think when you look at SiFive tackling silicon
01:20:58.780 | and how people build chips,
01:20:59.980 | which is a very different space,
01:21:03.940 | that's where you say,
01:21:05.140 | I think we'll see a lot more custom chips.
01:21:07.500 | And that means that you get much more battery life,
01:21:09.780 | you get better tuned solutions for your IoT thingy.
01:21:14.780 | So you get people that move faster,
01:21:18.220 | you get the ability to have faster time to market,
01:21:20.660 | for example.
01:21:21.500 | - So how many custom, so first of all,
01:21:23.660 | on the IoT side of things,
01:21:24.980 | do you see the number of smart toasters
01:21:29.020 | increasing exponentially?
01:21:30.220 | So, (laughs)
01:21:32.420 | and if you do, how much customization per toaster is there?
01:21:38.820 | Do all toasters in the world run the same silicon,
01:21:42.660 | like the same design?
01:21:44.020 | Or is it different companies have different design?
01:21:46.020 | Like how much customization is possible here?
01:21:49.700 | - Well, a lot of it comes down to cost.
01:21:52.220 | Right, and so the way that chips work
01:21:54.180 | is you end up paying by the,
01:21:56.100 | one of the factors is the size of the chip.
01:21:59.700 | And so what ends up happening
01:22:01.400 | just from an economic perspective is,
01:22:03.200 | there's only so many chips that get made in a year
01:22:05.860 | of a given design.
01:22:07.340 | And so often what customers end up having to do
01:22:10.260 | is they end up having to pick up a chip that exists
01:22:12.260 | that was built for somebody else
01:22:14.140 | so that they can then ship their product.
01:22:16.540 | And the reason for that is they don't have the volume
01:22:18.340 | of the iPhone, they can't afford to build a custom chip.
01:22:21.700 | However, what that means is they're now buying
01:22:23.820 | an off the shelf chip that isn't really good,
01:22:26.900 | isn't a perfect fit for their needs,
01:22:28.260 | and so they're paying a lot of money for it
01:22:30.060 | because they're buying silicon that they're not using.
01:22:33.500 | Well, if you now reduce the cost of designing the chip,
01:22:36.600 | now you get a lot more chips.
01:22:37.780 | And the more you reduce it, the easier it is to design chips.
01:22:41.500 | The more the world keeps evolving
01:22:44.300 | and we get more AI accelerators, we get more other things,
01:22:46.740 | we get more standards to talk to, we get 6G, right?
01:22:50.940 | You get changes in the world that you wanna be able
01:22:53.820 | to talk to these different things.
01:22:54.780 | There's more diversity in the cross product of features
01:22:57.220 | that people want, and that drives differentiated chips
01:23:01.620 | in another direction.
01:23:03.300 | And so nobody really knows what the future looks like,
01:23:05.620 | but I think that there's a lot of silicon in the future.
01:23:09.780 | - Speaking of the future,
01:23:11.180 | you said Moore's law allegedly is dead.
01:23:13.740 | So do you agree with Dave Patterson and many folks
01:23:18.740 | that Moore's law is dead?
01:23:22.080 | Or do you agree with Jim Keller,
01:23:26.180 | who's standing at the helm of the pirate ship
01:23:28.660 | saying it's-- - Still alive.
01:23:30.740 | - It's still alive.
01:23:31.660 | - Yeah, well, so I agree with what they're saying
01:23:35.740 | and different people are interpreting
01:23:37.820 | the end of Moore's law in different ways.
01:23:39.740 | So Jim would say, there's another 1000X left in physics
01:23:44.220 | and we can continue to squeeze the stone
01:23:46.940 | and make it faster and smaller and smaller geometries
01:23:50.100 | and all that kind of stuff.
01:23:51.420 | He's right.
01:23:53.540 | So Jim is absolutely right
01:23:55.260 | that there's a ton of progress left
01:23:57.860 | and we're not at the limit of physics yet.
01:23:59.960 | That's not really what Moore's law is though.
01:24:03.980 | If you look at what Moore's law is,
01:24:06.660 | is that it's a very simple evaluation of,
01:24:10.700 | okay, well, you look at the cost for,
01:24:13.620 | I think it was cost per area
01:24:15.020 | and the most economic point in that space.
01:24:17.060 | And if you go look at the now quite old paper
01:24:20.060 | that describes this,
01:24:21.900 | Moore's law has a specific economic aspect to it.
01:24:25.500 | And I think this is something that Dave
01:24:26.780 | and others often point out.
01:24:28.260 | And so on a technicality, that's right.
01:24:30.580 | I look at it from,
01:24:33.340 | so I can acknowledge both of those viewpoints.
01:24:35.020 | - They're both right.
01:24:35.860 | - They're both right.
01:24:36.700 | I'll give you a third wrong viewpoint
01:24:39.220 | that may be right in its own way,
01:24:40.340 | which is single threaded performance
01:24:43.060 | doesn't improve like it used to.
01:24:46.060 | And it used to be back when you got a,
01:24:48.500 | you know, a Pentium 66 or something.
01:24:50.640 | And the year before you had a Pentium 33
01:24:53.820 | and now it's twice as fast, right?
01:24:56.740 | Well, it was twice as fast at doing exactly the same thing.
01:25:00.380 | Okay.
01:25:01.220 | Like literally the same program ran twice as fast.
01:25:03.820 | You just wrote a check and waited a year, year and a half.
01:25:07.020 | Well, so that's what a lot of people think about Moore's law.
01:25:10.100 | And I think that is dead.
01:25:11.820 | And so what we're seeing instead is we're pushing,
01:25:15.260 | we're pushing people to write software in different ways.
01:25:17.260 | And so we're pushing people to write CUDA
01:25:19.060 | so they can get GPU compute
01:25:20.960 | and the thousands of cores on GPU.
01:25:23.400 | We're talking about C programmers having to use pthreads
01:25:26.360 | because they now have, you know, a hundred threads
01:25:29.120 | or 50 cores in a machine or something like that.
01:25:31.960 | You're now talking about machine learning accelerators.
01:25:33.680 | They're now domain specific.
01:25:35.080 | And when you look at these kinds of use cases,
01:25:38.460 | you can still get performance.
01:25:40.440 | And Jim will come up with cool things
01:25:42.640 | that utilize the Silicon in new ways for sure.
01:25:45.760 | But you're also gonna change the programming model.
01:25:48.400 | - Right.
01:25:49.240 | - And now when you start talking about
01:25:50.060 | changing the programming model,
01:25:50.940 | that's when you come back to languages
01:25:53.060 | and things like this too,
01:25:54.020 | because often what you see is like,
01:25:58.020 | you take the C programming language, right?
01:25:59.820 | The C programming language is designed for CPUs.
01:26:02.220 | And so if you wanna talk to a GPU,
01:26:04.980 | now you're talking to its cousin CUDA.
01:26:08.140 | Okay, CUDA is a different thing
01:26:10.540 | with a different set of tools, a different world,
01:26:12.860 | a different way of thinking.
01:26:14.380 | And we don't have one world that scales.
01:26:16.940 | And I think that we can get there.
01:26:18.440 | We can have one world that scales in a much better way.
01:26:21.040 | - On a small tangent then,
01:26:22.480 | I think most programming languages are designed
01:26:24.720 | for CPUs for a single core, even just in their spirit,
01:26:28.920 | even if they allow for parallelization.
01:26:30.480 | So what does it look like for a programming language
01:26:34.160 | to have parallelization or massive parallelization
01:26:38.660 | as its like first principle?
01:26:41.320 | - So the canonical example of this
01:26:43.520 | is the hardware design world.
01:26:46.400 | So Verilog, VHDL, these kinds of languages,
01:26:50.020 | they're what's called a high-level synthesis language.
01:26:53.500 | This is the thing people design chips in.
01:26:56.860 | And when you're designing a chip,
01:26:58.140 | it's kind of like a brain
01:26:59.780 | where you have infinite parallelism.
01:27:01.860 | Like you've got, you're like laying down transistors.
01:27:05.580 | Transistors are always running, okay?
01:27:08.380 | And so you're not saying run this transistor,
01:27:10.260 | then this transistor, then this transistor.
01:27:12.340 | It's like your brain,
01:27:13.180 | like your neurons are always just doing something.
01:27:15.140 | They're not clocked, right?
01:27:16.820 | They're just doing their thing.
01:27:20.200 | And so when you design a chip or when you design a CPU,
01:27:23.560 | when you design a GPU,
01:27:24.540 | when you design, when you're laying down the transistors,
01:27:27.280 | similarly, you're talking about, well, okay,
01:27:29.000 | well, how do these things communicate?
01:27:31.120 | And so these languages exist.
01:27:32.760 | Verilog is a kind of mixed example of that.
01:27:36.160 | None of these languages are really great.
01:27:37.600 | - Yeah, they're very low level, yeah.
01:27:39.560 | - Yeah, they're very low level
01:27:40.680 | and abstraction is necessary here.
01:27:42.520 | And there's different approaches with that.
01:27:44.520 | And it's itself a very complicated world,
01:27:47.360 | but it's implicitly parallel.
01:27:50.640 | And so having that as the domain that you program towards
01:27:55.640 | makes it so that by default, you get parallel systems.
01:27:59.480 | If you look at CUDA,
01:28:00.320 | CUDA is a point halfway in the space where in CUDA,
01:28:03.680 | when you write a CUDA kernel for your GPU,
01:28:05.940 | it feels like you're writing a scalar program.
01:28:08.100 | So you're like, you have ifs, you have for loops,
01:28:10.000 | stuff like this, you're just writing normal code.
01:28:12.600 | But what happens outside of that in your driver
01:28:14.840 | is that it actually is running you on like
01:28:16.800 | a thousand things at once, right?
01:28:18.880 | And so it's parallel,
01:28:20.560 | but it has pulled it out of the programming model.
01:28:23.060 | And so now you as a programmer are working in a simpler
01:28:27.760 | world and it's solved that for you, right?
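As a rough illustration of that "write scalar code, let the runtime fan it out" model, here is a minimal Swift sketch (an analogy, not actual CUDA): the body only ever talks about one index `i`, and `DispatchQueue.concurrentPerform` decides how to spread the iterations across cores, loosely the way a GPU driver launches the same kernel body over thousands of threads. The SAXPY kernel itself is just an illustrative example.

```swift
import Dispatch

// saxpy: y[i] = a * x[i] + y[i], written so the body reads like scalar code.
func parallelSAXPY(a: Float, x: [Float], y: inout [Float]) {
    precondition(x.count == y.count)
    y.withUnsafeMutableBufferPointer { yBuf in
        // Each iteration writes only its own index, so the concurrent writes
        // never overlap. The runtime, not this code, chooses how to split the
        // iterations across the available cores.
        DispatchQueue.concurrentPerform(iterations: x.count) { i in
            yBuf[i] = a * x[i] + yBuf[i]
        }
    }
}

let xs = [Float](repeating: 2.0, count: 1_000)
var ys = [Float](repeating: 1.0, count: 1_000)
parallelSAXPY(a: 3.0, x: xs, y: &ys)
print(ys[0])  // 7.0
```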
01:28:31.520 | - How do you take the language like Swift?
01:28:33.760 | You know, if we think about GPUs, but also ASICs,
01:28:39.040 | maybe if we can dance back and forth
01:28:40.920 | between hardware and software.
01:28:42.520 | (laughs)
01:28:43.680 | Is, you know, how do you design for these features
01:28:46.720 | to be able to program, make it a first class citizen
01:28:50.000 | to be able to do like Swift for TensorFlow,
01:28:53.080 | to be able to do machine learning on current hardware,
01:28:56.640 | but also future hardware like TPUs and all kinds of ASICs
01:29:00.600 | that I'm sure will be popping up more and more.
01:29:02.200 | - Yeah, well, so a lot of this comes down to this whole idea
01:29:05.360 | of having the nuts and bolts underneath the covers
01:29:07.360 | that work really well.
01:29:08.600 | So you need, if you're talking to TPUs,
01:29:10.400 | you need, you know, MLIR, XLA, or one of these compilers
01:29:13.760 | that talks to TPUs to build on top of, okay?
01:29:17.400 | And if you're talking to circuits,
01:29:19.320 | you need to figure out how to lay down the transistors
01:29:21.520 | and how to organize it and how to set up clocking
01:29:23.280 | and like all the domain problems that you get with circuits.
01:29:26.280 | Then you have to decide how to explain it to a human.
01:29:29.800 | What is the UI, right?
01:29:31.840 | And if you do it right, that's a library problem,
01:29:34.480 | not a language problem.
01:29:36.440 | And that works if you have a library or a language
01:29:39.080 | which allows your library to write things
01:29:42.120 | that feel native in the language by implementing libraries,
01:29:45.840 | because then you can innovate in programming models
01:29:49.240 | without having to change your syntax again.
01:29:51.200 | And like you have to invent new code formatting tools
01:29:54.880 | and like all the other things that languages come with.
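A small Swift sketch of what "innovate in libraries without changing the syntax" looks like in practice: a perfectly ordinary library type (this `Complex` is hand-written here for illustration, not the one from Swift Numerics) picks up operators and even literal syntax, so call sites read as if the feature were built into the language.

```swift
// A library-defined type that reads like a built-in; nothing here required
// changing the Swift language itself.
struct Complex {
    var real: Double
    var imaginary: Double

    static func + (lhs: Complex, rhs: Complex) -> Complex {
        Complex(real: lhs.real + rhs.real,
                imaginary: lhs.imaginary + rhs.imaginary)
    }

    static func * (lhs: Complex, rhs: Complex) -> Complex {
        Complex(real: lhs.real * rhs.real - lhs.imaginary * rhs.imaginary,
                imaginary: lhs.real * rhs.imaginary + lhs.imaginary * rhs.real)
    }
}

// Even literal syntax is something a library can opt into.
extension Complex: ExpressibleByIntegerLiteral {
    init(integerLiteral value: Int) {
        self.init(real: Double(value), imaginary: 0)
    }
}

let z = (Complex(real: 1, imaginary: 2) + 3) * Complex(real: 0, imaginary: 1)
print(z)  // Complex(real: -2.0, imaginary: 4.0)
```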
01:29:57.520 | And this gets really interesting.
01:29:59.920 | And so if you look at the space, the interesting thing,
01:30:03.400 | once you separate out syntax,
01:30:05.840 | becomes what is that programming model?
01:30:07.840 | And so do you want the CUDA style,
01:30:10.240 | I write one program and it runs many places.
01:30:12.760 | Do you want the implicitly parallel model?
01:30:16.800 | How do you reason about that?
01:30:17.760 | How do you give developers, chip architects,
01:30:20.800 | the ability to express their intent?
01:30:24.080 | And that comes into this whole design question
01:30:26.280 | of how do you detect bugs quickly?
01:30:29.160 | So you don't have to tape out a chip
01:30:30.240 | to find out what's wrong, ideally, right?
01:30:32.600 | How do you, and this is a spectrum,
01:30:35.520 | how do you make it so that people feel productive?
01:30:38.520 | So their turnaround time is very quick.
01:30:40.480 | All these things are really hard problems.
01:30:42.440 | And in this world, I think that not a lot of effort
01:30:46.120 | has been put into that design problem
01:30:48.080 | and thinking about the layering and other pieces.
01:30:51.160 | - Well, you've, on the topic of concurrency,
01:30:53.520 | you've written the Swift concurrency manifesto.
01:30:55.600 | I think it's kind of interesting.
01:30:57.640 | Anything that has the word manifesto in it
01:31:00.640 | is very interesting.
01:31:02.400 | Can you summarize the key ideas
01:31:04.080 | of each of the five parts you've written about?
01:31:07.400 | - So what is a manifesto?
01:31:08.880 | - Yes.
01:31:09.720 | - How about we start there?
01:31:10.920 | So in the Swift community, we have this problem,
01:31:15.160 | which is on the one hand,
01:31:16.120 | you wanna have relatively small proposals
01:31:19.320 | that you can kind of fit in your head,
01:31:21.440 | you can understand the details at a very fine-grained level
01:31:24.080 | that move the world forward.
01:31:26.000 | But then you also have these big arcs, okay?
01:31:28.880 | And often when you're working on something
01:31:30.800 | that is a big arc, but you're tackling it in small pieces,
01:31:34.080 | you have this question of,
01:31:35.160 | how do I know I'm not doing a random walk?
01:31:37.400 | Where are we going?
01:31:38.720 | Like, how does this add up?
01:31:39.760 | Furthermore, when you start that first,
01:31:42.120 | the first small step, what terminology do you use?
01:31:45.280 | How do we think about it?
01:31:46.560 | What is better and worse in the space?
01:31:47.920 | What are the principles?
01:31:48.760 | What are we trying to achieve?
01:31:50.080 | And so what a manifesto in the Swift community does
01:31:52.080 | is it starts to say, hey, well,
01:31:53.920 | let's step back from the details of everything.
01:31:56.640 | Let's paint a broad picture to talk about how,
01:31:59.720 | what we're trying to achieve.
01:32:01.280 | Let's give an example design point.
01:32:02.760 | Let's try to paint the big picture
01:32:05.280 | so that then we can zero in on the individual steps
01:32:07.400 | and make sure that we're making good progress.
01:32:09.680 | And so the Swift Concurrency Manifesto
01:32:11.240 | is something I wrote three years ago.
01:32:13.880 | It's been a while, maybe more.
01:32:16.240 | Trying to do that for Swift and concurrency.
01:32:18.660 | And it starts with some fairly simple things,
01:32:22.400 | like making the observation that
01:32:24.000 | when you have multiple different computers
01:32:26.720 | or multiple different threads that are communicating,
01:32:28.920 | it's best for them to be asynchronous.
01:32:30.800 | And so you need things to be able to run separately
01:32:34.520 | and then communicate with each other.
01:32:35.840 | And this means asynchrony.
01:32:37.440 | And this means that you need a way
01:32:39.000 | of modeling asynchronous communication.
01:32:41.720 | Many languages have features like this.
01:32:43.720 | Async/await is a popular one.
01:32:45.400 | And so that's what I think is very likely in Swift.
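At the time of this conversation async/await was still a proposal for Swift; it later shipped in Swift 5.5. A minimal sketch of the style is below; `fetchTemperature` is a made-up function for illustration.

```swift
// A hypothetical asynchronous operation: the caller suspends at `await`
// instead of blocking a thread, and resumes when the result is ready.
func fetchTemperature(for city: String) async throws -> Double {
    // Stand-in for real work such as a network call.
    try await Task.sleep(nanoseconds: 100_000_000)
    return 21.5
}

func report() async {
    do {
        // Reads like straight-line code, but each `await` is a point where
        // this task can be suspended while other work makes progress.
        let boston = try await fetchTemperature(for: "Boston")
        let sf = try await fetchTemperature(for: "San Francisco")
        print("Boston: \(boston), SF: \(sf)")
    } catch {
        print("failed: \(error)")
    }
}
```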
01:32:48.220 | But as you start building this tower of abstractions,
01:32:51.380 | it's not just about how do you write this?
01:32:53.640 | You then reach into the, how do you get memory safety?
01:32:57.460 | Because you want correctness.
01:32:58.360 | You want debuggability and sanity for developers.
01:33:01.680 | And how do you get that memory safety into the language?
01:33:06.620 | So if you take a language like Go or C
01:33:09.000 | or any of these languages,
01:33:10.420 | you get what's called a race condition
01:33:11.920 | when two different threads or Go routines or whatever
01:33:14.920 | touch the same point in memory.
01:33:16.480 | This is a huge, maddening problem to debug
01:33:21.280 | because it's not reproducible generally.
01:33:24.500 | And so there's tools, there's a whole ecosystem
01:33:26.480 | of solutions that built up around this.
01:33:28.320 | But it's a huge problem when you're writing concurrent code.
01:33:31.040 | And so with Swift, this whole value semantics thing
01:33:34.160 | is really powerful there because it turns out
01:33:36.160 | that math and copies actually work
01:33:39.080 | even in concurrent worlds.
01:33:40.680 | And so you get a lot of safety just out of the box,
01:33:43.280 | but there are also some hard problems
01:33:44.640 | and it talks about some of that.
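To see why "math and copies work even in concurrent worlds", here is a tiny Swift sketch contrasting a value type with a reference type; the type names are illustrative.

```swift
// Value type: each holder gets its own logically independent copy.
struct Settings {
    var volume = 5
}

// Reference type: every holder shares one underlying object, which is
// exactly the sharing that makes data races possible.
final class SharedSettings {
    var volume = 5
}

let a = Settings()
var b = a          // an independent copy
b.volume = 11
print(a.volume)    // 5  -- nothing can mutate `a` behind its back

let c = SharedSettings()
let d = c          // just another reference to the same object
d.volume = 11
print(c.volume)    // 11 -- two "owners" observe each other's writes
```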
01:33:47.040 | When you start building up to the next level up
01:33:48.840 | and you start talking beyond memory safety,
01:33:50.520 | you have to talk about what is the programmer model?
01:33:52.960 | How does a human think about this?
01:33:54.240 | So a developer that's trying to build a program
01:33:56.760 | think about this and it proposes a really old model
01:34:00.160 | with a new spin called actors.
01:34:02.040 | Actors are about saying we have islands
01:34:05.360 | of single threadedness logically.
01:34:08.120 | So you write something that feels like it's one programming,
01:34:10.660 | one program running in a unit,
01:34:13.200 | and then it communicates asynchronously with other things.
01:34:16.720 | And so making that expressive and natural feel good
01:34:20.840 | be the first thing you reach for and being safe by default
01:34:23.480 | is a big part of the design of that proposal.
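Actors later landed in Swift 5.5 in roughly the shape the manifesto describes. A minimal sketch of the "island of single-threadedness you communicate with asynchronously" idea; `Account` is just an illustrative name.

```swift
// All access to `balance` is serialized by the actor, so there is no data
// race even when many tasks talk to it at once.
actor Account {
    private var balance = 0

    func deposit(_ amount: Int) {
        balance += amount
    }

    func currentBalance() -> Int {
        balance
    }
}

func demo() async {
    let account = Account()
    // 1,000 concurrent deposits; each `await` marks the asynchronous hop
    // onto the actor's island.
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<1_000 {
            group.addTask { await account.deposit(1) }
        }
    }
    print(await account.currentBalance())  // always 1000
}
```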
01:34:26.600 | When you start going beyond that,
01:34:27.680 | now you start to say, cool,
01:34:28.680 | well, these things that communicate asynchronously,
01:34:31.080 | they don't have to share memory.
01:34:32.980 | Well, if they don't have to share memory
01:34:34.240 | and they're sending messages to each other,
01:34:36.080 | why do they have to be in the same process?
01:34:38.240 | These things should be able to be in different processes
01:34:41.480 | on your machine and why just processes?
01:34:44.040 | Well, why not different machines?
01:34:45.680 | And so now you have a very nice gradual transition
01:34:49.600 | towards distributed programming.
01:34:51.760 | And of course, when you start talking about the big future,
01:34:54.800 | the manifesto doesn't go into it,
01:34:56.980 | but accelerators are things you talk to asynchronously
01:35:01.980 | by sending messages to them.
01:35:03.560 | How do you program those?
01:35:05.820 | Well, that gets very interesting.
01:35:07.720 | That's not in the proposal.
01:35:09.400 | - So, and how much do you wanna make that explicit,
01:35:14.400 | like the control of that whole process
01:35:17.040 | explicit to the programmer?
01:35:18.120 | - Yeah, good question.
01:35:19.240 | So when you're designing any of these kinds of features
01:35:22.880 | or language features or even libraries,
01:35:25.320 | you have this really hard trade-off you have to make,
01:35:27.720 | which is how much is it magic
01:35:29.800 | or how much is it in the human's control?
01:35:32.120 | How much can they predict and control it?
01:35:34.720 | What do you do when the default case is the wrong case?
01:35:38.680 | And so when you're designing a system,
01:35:42.180 | I won't name names, but there are systems
01:35:46.520 | where it's really easy to get started
01:35:51.000 | and then you jump, so let's pick like Logo, okay?
01:35:54.360 | So something like this.
01:35:55.560 | So it's really easy to get started.
01:35:57.120 | It's really designed for teaching kids,
01:35:59.520 | but as you get into it, you hit a ceiling
01:36:02.040 | and then you can't go any higher.
01:36:03.200 | And then what do you do?
01:36:04.080 | Well, you have to go switch to a different world
01:36:05.560 | and rewrite all your code.
01:36:07.160 | And this Logo is a silly example here.
01:36:09.120 | This exists in many other languages.
01:36:11.360 | With Python, you would say like concurrency, right?
01:36:15.260 | So Python has the global interpreter lock,
01:36:17.320 | so threading is challenging in Python.
01:36:19.480 | And so if you start writing a large-scale application
01:36:22.600 | in Python and then suddenly you need concurrency,
01:36:25.140 | you're kind of stuck with a series of bad trade-offs, right?
01:36:28.420 | There's other ways to go where you say like,
01:36:32.240 | foist all the complexity on the user all at once, right?
01:36:37.040 | And that's also bad in a different way.
01:36:38.800 | And so what I prefer is building a simple model
01:36:43.480 | that you can explain that then has an escape hatch.
01:36:46.960 | So you get in, you have guardrails,
01:36:49.440 | memory safety works like this in Swift
01:36:52.120 | where you can start with,
01:36:53.960 | like by default, if you use all the standard things,
01:36:56.400 | it's memory safe, you're not gonna shoot your foot off.
01:36:58.640 | But if you wanna get a C-level pointer to something,
01:37:02.320 | you can explicitly do that.
01:37:04.320 | - But by default, there's guardrails.
01:37:07.760 | - There's guardrails.
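Concretely, in Swift the ordinary path is bounds-checked and memory-safe, and the C-level escape hatch is spelled out explicitly through the `withUnsafe...` family of APIs. A minimal sketch:

```swift
let numbers = [10, 20, 30, 40]

// Safe by default: an out-of-bounds index traps deterministically instead
// of silently corrupting memory.
// numbers[99]   // would crash, not scribble over someone else's data

// Explicit escape hatch: temporarily view the array's storage as a raw,
// C-level buffer. The word "unsafe" is part of the name on purpose.
let total = numbers.withUnsafeBufferPointer { buffer -> Int in
    var sum = 0
    for i in 0..<buffer.count {
        sum += buffer[i]   // you, not the language, are responsible here
    }
    return sum
}
print(total)  // 100
```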
01:37:08.880 | - Okay, so, but like, you know,
01:37:11.120 | whose job is it to figure out
01:37:14.320 | which part of the code is parallelizable?
01:37:17.360 | - So in the case of the proposal, it is the human's job.
01:37:20.960 | So they decide how to architect their application.
01:37:24.200 | And then the runtime and the compiler is very predictable.
01:37:28.120 | And so this is in contrast to,
01:37:31.560 | like there's a long body of work,
01:37:32.920 | including on Fortran for auto-parallelizing compilers.
01:37:35.980 | And this is an example of a bad thing.
01:37:40.160 | So as a compiler person, I can rag on compiler people.
01:37:43.480 | Often compiler people will say,
01:37:45.600 | "Cool, since I can't change the code,
01:37:47.280 | I'm gonna write my compiler
01:37:48.480 | that then takes this unmodified code
01:37:50.040 | and makes it go way faster on this machine."
01:37:52.680 | Okay, application, and so it does pattern matching.
01:37:56.280 | It does like really deep analysis.
01:37:58.320 | Compiler people are really smart.
01:37:59.560 | And so they like wanna like do something
01:38:01.380 | really clever and tricky.
01:38:02.400 | And you get like 10X speed up
01:38:04.120 | by taking like an array of structures
01:38:06.100 | and turn it into a structure of arrays or something,
01:38:08.040 | because it's so much better for memory.
01:38:09.280 | Like there's bodies, like tons of tricks.
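For readers who haven't seen it, the "array of structures into a structure of arrays" trick looks like this when done by hand; the `Particle` example is made up for illustration.

```swift
// Array of structures (AoS): each element carries all of its fields, so a
// loop that only needs `x` still drags `y` and `mass` through the cache.
struct Particle {
    var x: Float
    var y: Float
    var mass: Float
}
let aos: [Particle] = (0..<4).map { Particle(x: Float($0), y: 0, mass: 1) }
let sumXAoS = aos.reduce(0) { $0 + $1.x }

// Structure of arrays (SoA): each field is contiguous in memory, which is
// the layout vectorizing and parallelizing compilers love.
struct Particles {
    var xs: [Float]
    var ys: [Float]
    var masses: [Float]
}
let soa = Particles(xs: [0, 1, 2, 3], ys: [0, 0, 0, 0], masses: [1, 1, 1, 1])
let sumXSoA = soa.xs.reduce(0, +)

print(sumXAoS, sumXSoA)  // 6.0 6.0
```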
01:38:11.640 | - Yeah, they love optimization.
01:38:13.800 | - Yeah, you love optimization.
01:38:14.640 | - Everyone loves optimization.
01:38:15.720 | - Everyone loves it.
01:38:16.560 | Well, and it's this promise of build with my compiler
01:38:19.080 | and your thing goes fast.
01:38:20.720 | Right, but here's the problem.
01:38:22.520 | Lex, you write a program.
01:38:24.680 | You run it with my compiler, it goes fast.
01:38:26.560 | You're very happy.
01:38:27.400 | Wow, it's so much faster than the other compiler.
01:38:29.480 | Then you go and you add a feature to your program
01:38:31.220 | or you refactor some code.
01:38:32.680 | And suddenly you got a 10X loss in performance.
01:38:35.720 | Well, why?
01:38:36.560 | What just happened there?
01:38:37.560 | What just happened there is the heuristic,
01:38:39.840 | the pattern matching, the compiler,
01:38:41.960 | whatever analysis it was doing just got defeated
01:38:43.920 | because you didn't inline a function or something, right?
01:38:48.200 | As a user, you don't know, you don't wanna know.
01:38:50.240 | That was the whole point.
01:38:51.080 | You don't wanna know how the compiler works.
01:38:52.780 | You don't wanna know how the memory hierarchy works.
01:38:54.560 | You don't wanna know how it got parallelized
01:38:56.040 | across all these things.
01:38:57.340 | You wanted that abstract away from you.
01:38:59.840 | But then the magic is lost as soon as you did something
01:39:02.840 | and you fall off a performance cliff.
01:39:05.000 | And now you're in this funny position where,
01:39:07.520 | what do I do?
01:39:08.360 | I don't change my code.
01:39:09.180 | I don't fix that bug.
01:39:10.840 | It costs 10X performance.
01:39:12.280 | Now what do I do?
01:39:13.580 | Well, this is the problem with unpredictable performance.
01:39:15.960 | If you care about performance,
01:39:17.320 | predictability is a very important thing.
01:39:19.480 | And so what the proposal does is it provides
01:39:23.760 | architectural patterns for being able to lay out your code,
01:39:26.680 | gives you full control over that,
01:39:28.320 | makes it really simple so you can explain it.
01:39:30.120 | And then if you wanna scale out in different ways,
01:39:34.720 | you have full control over that.
01:39:36.520 | - So in your sense, the intuition is for a compiler,
01:39:39.400 | it's too hard to do automated parallelization.
01:39:42.520 | Like, you know, 'cause the compilers do stuff automatically
01:39:47.520 | that's incredibly impressive for other things.
01:39:49.860 | - Right.
01:39:50.700 | - But for parallelization, we're not even,
01:39:53.420 | we're not close to there.
01:39:54.580 | - Well, it depends on the programming model.
01:39:56.220 | So there's many different kinds of compilers.
01:39:58.420 | And so if you talk about like a C compiler,
01:40:00.340 | a Swift compiler, something like that,
01:40:01.900 | where you're writing imperative code,
01:40:04.940 | parallelizing that and reasoning about all the pointers
01:40:07.100 | and stuff like that is a very difficult problem.
01:40:10.100 | Now, if you switch domains,
01:40:12.220 | so there's this cool thing called machine learning, right?
01:40:15.540 | So the machine learning nerds,
01:40:17.620 | among other endearing things like, you know,
01:40:19.420 | solving cat detectors and other things like that,
01:40:22.080 | have done this amazing breakthrough
01:40:25.380 | of producing a programming model,
01:40:27.520 | operations that you compose together,
01:40:29.380 | that has raised the levels of abstraction high enough
01:40:33.160 | that suddenly you can have auto-parallelizing compilers.
01:40:36.740 | You can write a model using TensorFlow
01:40:39.580 | and have it run on 1,024 nodes of a TPU.
01:40:43.420 | - Yeah, that's true.
01:40:44.260 | I didn't even think about like, you know,
01:40:46.860 | 'cause there's so much flexibility
01:40:48.180 | in the design of architectures
01:40:49.580 | that ultimately boil down to a graph
01:40:51.420 | that's parallelizable for you, parallelized for you.
01:40:54.160 | - And if you think about it, that's pretty cool.
01:40:56.620 | - That's pretty cool, yeah.
01:40:57.620 | - And you think about batching, for example,
01:40:59.740 | as a way of being able to exploit more parallelism.
01:41:02.180 | - Yeah.
01:41:03.020 | - Like that's a very simple thing
01:41:03.860 | that now is very powerful.
01:41:05.380 | That didn't come out of the programming language nerds,
01:41:07.740 | right, those people.
01:41:08.880 | Like that came out of people
01:41:10.100 | that are just looking to solve a problem
01:41:11.440 | and use a few GPUs and organically developed
01:41:14.020 | by the community of people focusing on machine learning.
01:41:16.860 | And it's an incredibly powerful abstraction layer
01:41:19.860 | that enables the compiler people to go and exploit that.
01:41:22.780 | And now you can drive supercomputers from Python.
01:41:26.380 | That's pretty cool.
01:41:27.500 | - That's amazing.
01:41:28.340 | So just to pause on that,
01:41:29.420 | 'cause I'm not sufficiently low level,
01:41:32.260 | I forget to admire the beauty and power of that.
01:41:35.360 | But maybe just to linger on it,
01:41:38.500 | like what does it take to run a neural network fast?
01:41:42.620 | Like how hard is that compilation?
01:41:44.060 | - It's really hard.
01:41:45.660 | - So we just skipped,
01:41:46.900 | you said like it's amazing that that's a thing,
01:41:49.600 | but how hard is that of a thing?
01:41:51.540 | - It's hard, and I would say that
01:41:53.660 | not all of the systems are really great,
01:41:57.180 | including the ones I helped build.
01:41:58.620 | So there's a lot of work left to be done there.
01:42:00.740 | - Is it the compiler nerds working on that
01:42:02.340 | or is it a whole new group of people?
01:42:04.620 | - Well, it's a full stack problem,
01:42:05.900 | including compiler people,
01:42:07.900 | including APIs, so like Keras
01:42:10.140 | and the module API in PyTorch and Jax.
01:42:14.620 | And there's a bunch of people pushing
01:42:15.980 | on all the different parts of these things.
01:42:17.460 | Because when you look at it,
01:42:18.860 | it's both how do I express the computation?
01:42:21.300 | Do I stack up layers?
01:42:22.940 | Well, cool, like setting up a linear sequence of layers
01:42:25.660 | is great for the simple case,
01:42:26.780 | but how do I do the hard case?
01:42:28.260 | How do I do reinforcement learning?
01:42:29.540 | Well, now I need to integrate
01:42:30.380 | my application logic in this, right?
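To make the "linear sequence of layers" point concrete, here is a deliberately tiny, hypothetical sketch in Swift; none of these types come from Keras, PyTorch, or Swift for TensorFlow. The easy case is just function composition, and that regular shape is what compilers and runtimes can parallelize; the hard cases he mentions (reinforcement learning loops, arbitrary application logic) break out of it.

```swift
// A deliberately tiny, hypothetical "layer" abstraction -- not a real library.
protocol Layer {
    func apply(_ input: [Float]) -> [Float]
}

struct Dense: Layer {
    var weight: Float
    var bias: Float
    func apply(_ input: [Float]) -> [Float] {
        input.map { $0 * weight + bias }
    }
}

struct ReLU: Layer {
    func apply(_ input: [Float]) -> [Float] {
        input.map { max(0, $0) }
    }
}

// The simple case: a linear stack of layers is just composition.
struct Sequential: Layer {
    var layers: [any Layer]
    func apply(_ input: [Float]) -> [Float] {
        layers.reduce(input) { x, layer in layer.apply(x) }
    }
}

let model = Sequential(layers: [Dense(weight: 2, bias: 1), ReLU()])
print(model.apply([-3, 0, 4]))  // [0.0, 1.0, 9.0]
```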
01:42:32.700 | Then it's the next level down of
01:42:34.700 | how do you represent that for the runtime?
01:42:36.700 | How do you get hardware abstraction?
01:42:39.100 | And then you get to the next level down of saying like,
01:42:40.780 | forget about abstraction,
01:42:41.860 | how do I get the peak performance out of my TPU
01:42:44.540 | or my iPhone accelerator or whatever, right?
01:42:47.620 | And all these different things.
01:42:48.980 | And so this is a layered problem
01:42:50.260 | with a lot of really interesting design
01:42:53.100 | and work going on in the space
01:42:54.540 | and a lot of really smart people working on it.
01:42:56.940 | Machine learning is a very well-funded area
01:42:59.460 | of investment right now.
01:43:00.820 | And so there's a lot of progress being made.
01:43:02.940 | - So how much innovation is there on the lower level?
01:43:05.900 | So closer to the ASIC.
01:43:08.220 | So redesigning the hardware
01:43:09.780 | or redesigning concurrently compilers with that hardware.
01:43:13.140 | Is that, if you were to predict the biggest,
01:43:16.060 | the equivalent of Moore's law improvements
01:43:20.540 | in the inference, in the training of neural networks,
01:43:24.620 | in just all of that, where is that gonna come from?
01:43:26.660 | You think?
01:43:27.500 | - Sure, you get scalability, you have different things.
01:43:28.900 | And so you get Jim Keller shrinking process technology,
01:43:33.620 | you get three nanometer instead of five or seven
01:43:36.260 | or 10 or 28 or whatever.
01:43:38.100 | And so that marches forward and that provides improvements.
01:43:41.300 | You get architectural level performance.
01:43:44.060 | And so the TPU with a matrix multiply unit
01:43:47.660 | and a systolic array is much more efficient
01:43:49.620 | than having a scalar core doing multiplies
01:43:52.780 | and adds and things like that.
01:43:54.380 | You then get system level improvements.
01:43:58.620 | So how you talk to memory,
01:43:59.860 | how you talk across a cluster of machines,
01:44:02.340 | how you scale out, how you have fast interconnects
01:44:04.820 | between machines.
01:44:06.060 | You then get system level programming models.
01:44:08.780 | So now that you have all this hardware, how do you utilize it?
01:44:11.300 | You then have algorithmic breakthroughs
01:44:12.860 | where you say, "Hey, wow, cool.
01:44:14.380 | Instead of training a ResNet-50 in a week,
01:44:18.900 | I'm now training it in 25 seconds."
01:44:21.580 | - Yeah, and opening that--
01:44:22.420 | - And it's a combination of new optimizers
01:44:27.020 | and new just training regimens
01:44:29.700 | and different approaches to train.
01:44:32.180 | And all of these things come together
01:44:34.060 | to push the world forward.
01:44:36.100 | - That was a beautiful exposition.
01:44:39.140 | But if you were to force to bet all your money
01:44:42.820 | on one of these, would you?
01:44:45.300 | - Why do we have to?
01:44:46.300 | Unfortunately, we have people working on all this.
01:44:50.780 | It's an exciting time, right?
01:44:52.260 | - So, I mean, you know, OpenAI did this little paper
01:44:56.180 | showing the algorithmic improvement you can get
01:44:58.060 | has been improving exponentially.
01:45:00.940 | I haven't quite seen the same kind of analysis
01:45:04.260 | on other layers of the stack.
01:45:06.740 | I'm sure it's also improving significantly.
01:45:09.340 | I just, it's a nice intuition builder.
01:45:12.340 | I mean, there's a reason why Moore's law,
01:45:16.140 | that's the beauty of Moore's law,
01:45:17.340 | is somebody writes a paper
01:45:19.500 | that makes a ridiculous prediction.
01:45:21.420 | And it, you know, becomes reality in a sense.
01:45:27.180 | There's something about these narratives
01:45:28.900 | when you, when Chris Lattner on a silly little podcast
01:45:33.620 | makes, bets all his money on a particular thing,
01:45:37.260 | somehow it can have a ripple effect
01:45:39.260 | of actually becoming real.
01:45:40.780 | That's an interesting aspect of it.
01:45:43.300 | 'Cause like, it might've been, you know,
01:45:46.060 | we focus with Moore's law,
01:45:47.540 | most of the computing industry really, really focused
01:45:51.460 | on the hardware.
01:45:52.740 | I mean, software innovation,
01:45:55.100 | I don't know how much software innovation
01:45:56.500 | there was in terms of efficiency.
01:45:57.340 | - Yeah, Intel giveth, Bill takes away.
01:45:59.140 | (laughing)
01:46:00.180 | - Yeah.
01:46:01.300 | I mean, compilers improved significantly also.
01:46:04.100 | - Well, not really.
01:46:04.940 | So actually, I mean, I'm joking
01:46:06.900 | about how software's gotten slower
01:46:09.020 | pretty much as fast as hardware got better,
01:46:11.620 | at least through the nineties.
01:46:13.260 | There's another joke, another law in compilers,
01:46:18.260 | which is called, I think it's called Proebsting's law,
01:46:18.260 | which is compilers double the performance
01:46:21.700 | of any given code every 18 years.
01:46:23.820 | (laughing)
01:46:26.300 | - So they move slowly.
01:46:27.980 | - Yeah, well, so-
01:46:28.820 | - Well, yeah, it's exponential also.
01:46:31.060 | - Yeah, but you're making progress.
01:46:32.500 | But there, again, it's not about,
01:46:34.300 | the power of compilers is not just about
01:46:37.820 | how do you make the same thing go faster?
01:46:39.180 | It's how do you unlock the new hardware?
01:46:41.900 | A new chip came out, how do you utilize it?
01:46:43.660 | You say, oh, the programming model,
01:46:45.260 | how do we make people more productive?
01:46:47.100 | How do we, like, have better error messages?
01:46:52.060 | Even such mundane things, like,
01:46:54.220 | how do I generate a very specific error message
01:46:57.300 | about your code, actually makes people happy
01:46:59.900 | because then they know how to fix it, right?
01:47:01.940 | And it comes back to how do you help people
01:47:03.660 | get their job done?
01:47:04.660 | - Yeah, and yeah, and then in this world
01:47:06.820 | of exponentially increasing smart toasters,
01:47:10.340 | how do you expand computing to all these kinds of devices?
01:47:15.340 | Do you see this world where just everything's
01:47:18.580 | a computing surface?
01:47:20.460 | You see that possibility?
01:47:22.180 | Just everything's a computer?
01:47:24.020 | - Yeah, I don't see any reason
01:47:25.140 | that that couldn't be achieved.
01:47:27.020 | It turns out that sand goes into glass
01:47:30.500 | and glass is pretty useful too.
01:47:32.700 | And, you know, like, why not?
01:47:35.220 | - Why not?
01:47:36.060 | So, very important question then,
01:47:39.580 | if we're living in a simulation
01:47:44.580 | and the simulation is running a computer,
01:47:47.420 | like, what's the architecture of that computer,
01:47:50.020 | do you think?
01:47:51.900 | - So you're saying, is it a quantum system?
01:47:54.540 | Is it a--
01:47:55.380 | - Yeah, like this whole quantum discussion,
01:47:56.700 | is it needed or can we run it on a,
01:47:59.420 | you know, with a RISC-V architecture,
01:48:03.260 | a bunch of CPUs?
01:48:05.300 | - I think it comes down to the right tool for the job.
01:48:07.580 | Okay, and so--
01:48:08.660 | - And what's the compiler?
01:48:10.140 | - Yeah, exactly, that's my question.
01:48:12.580 | How do I get that job?
01:48:13.700 | Be the universe compiler.
01:48:14.940 | And so there, as far as we know,
01:48:19.740 | quantum systems are the bottom of the pile of turtles
01:48:23.700 | so far.
01:48:24.540 | And so we don't know efficient ways
01:48:28.300 | to implement quantum systems
01:48:29.660 | without using quantum computers.
01:48:31.260 | - Yeah, and that's totally outside
01:48:33.580 | of everything we've talked about.
01:48:35.180 | - But who runs that quantum computer?
01:48:37.060 | - Yeah.
01:48:37.900 | - Right, so if we really are living in a simulation,
01:48:41.460 | then is it bigger quantum computers?
01:48:44.420 | Is it different ones?
01:48:45.260 | Like, how does that work out?
01:48:46.580 | How does that scale?
01:48:47.700 | - Well, it's the same size.
01:48:49.940 | It's the same size.
01:48:50.780 | But then the thought of the simulation
01:48:52.660 | is that you don't have to run the whole thing,
01:48:54.220 | that, you know, we humans are cognitively very limited.
01:48:56.860 | - You do checkpoints.
01:48:57.700 | - You do checkpoints, yeah.
01:48:59.420 | And if we, the point at which we human,
01:49:02.980 | so you basically do minimal amount of,
01:49:06.820 | what is it, Swift does on write?
01:49:10.980 | Copy on write. - Copy on write, yeah.
01:49:12.180 | - So you only adjust the simulation.
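Copy-on-write, which they're alluding to here, is the trick Swift's Array, String, and Dictionary use: values behave like independent copies, but the storage is only duplicated at the moment a shared buffer is actually written to. A hand-rolled sketch of the pattern (illustrative names; `isKnownUniquelyReferenced` is the real standard-library hook):

```swift
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct COWBuffer {
    private var storage: Storage
    init(_ values: [Int]) { storage = Storage(values) }

    subscript(i: Int) -> Int {
        get { storage.values[i] }
        set {
            // Only duplicate the underlying storage if someone else shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(storage.values)
            }
            storage.values[i] = newValue
        }
    }
}

let a = COWBuffer([1, 2, 3])
var b = a          // cheap: shares storage, nothing copied yet
b[0] = 99          // first write triggers the copy, so `a` is untouched
print(a[0], b[0])  // 1 99
```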
01:49:15.500 | - Parallel universe theories, right?
01:49:17.020 | And so every time a decision's made,
01:49:20.540 | somebody opens the Schrodinger box,
01:49:22.060 | then there's a fork, this could happen.
01:49:24.980 | - And then, thank you for considering the possibility.
01:49:29.980 | But yeah, so it may not require, you know,
01:49:32.780 | the entirety of the universe to simulate it.
01:49:34.700 | But it's interesting to think about
01:49:38.900 | as we create these higher and higher fidelity systems.
01:49:43.340 | But I do wanna ask on the quantum computer side,
01:49:46.620 | 'cause everything we've talked about with,
01:49:49.220 | you work with Sci-5, with compilers,
01:49:52.060 | none of that includes quantum computers, right?
01:49:55.060 | - That's true.
01:49:56.060 | - So have you ever thought about what, you know,
01:50:01.060 | this whole serious engineering work
01:50:05.420 | of quantum computers looks like, of compilers,
01:50:08.100 | of architectures, all of that kind of stuff?
01:50:10.660 | - So I've looked at it a little bit.
01:50:11.820 | I know almost nothing about it,
01:50:14.300 | which means that at some point,
01:50:15.540 | I will have to find an excuse to get involved,
01:50:17.860 | 'cause that's how I work.
01:50:18.700 | - But do you think that's a thing to be,
01:50:21.140 | like, with your little tingly senses
01:50:23.420 | of the timing of one to be involved, is it not yet?
01:50:26.860 | - Well, so the thing I do really well
01:50:28.820 | is I jump into messy systems
01:50:31.660 | and figure out how to make them,
01:50:33.700 | figure out what the truth in the situation is,
01:50:35.540 | try to figure out what the unifying theory is,
01:50:39.100 | how to, like, factor the complexity,
01:50:40.980 | how to find a beautiful answer to a problem
01:50:42.860 | that has been well-studied
01:50:44.860 | and lots of people have bashed their heads against it.
01:50:47.060 | I don't know that quantum computers are mature enough
01:50:49.300 | and accessible enough to be figured out yet, right?
01:50:53.740 | And I think the open question with quantum computers is,
01:50:58.580 | is there a useful problem that gets solved
01:51:00.900 | with a quantum computer that makes it worth
01:51:04.100 | the economic cost of, like, having one of these things
01:51:06.740 | and having legions of people that set it up?
01:51:11.500 | You go back to the '50s, right,
01:51:12.780 | and there's the projections of,
01:51:13.980 | the world will only need seven computers, right?
01:51:18.220 | Well, and part of that was that people hadn't figured out
01:51:20.740 | what they're useful for.
01:51:21.980 | What are the algorithms we wanna run?
01:51:23.260 | What are the problems that get solved?
01:51:24.340 | And this comes back to, how do we make the world better,
01:51:27.620 | either economically or making somebody's life better
01:51:29.900 | or, like, solving a problem that wasn't solved before,
01:51:31.940 | things like this.
01:51:33.140 | And I think that just we're a little bit too early
01:51:36.020 | in that development cycle
01:51:36.860 | because it's still, like, literally a science project,
01:51:39.380 | not in a negative connotation, right?
01:51:41.540 | It's literally a science project
01:51:42.860 | and the progress there's amazing.
01:51:45.420 | And so I don't know if it's 10 years away,
01:51:48.900 | if it's two years away,
01:51:50.100 | exactly where that breakthrough happens,
01:51:51.660 | but you look at machine learning,
01:51:54.540 | we went through a few winters
01:51:58.420 | before the AlexNet transition,
01:52:00.180 | and then suddenly it had its breakout moment,
01:52:02.980 | and that was the catalyst that then drove
01:52:05.860 | the talent flocking into it.
01:52:07.580 | That's what drove the economic applications of it.
01:52:10.180 | That's what drove the technology to go faster
01:52:13.420 | because you now have more minds thrown at the problem.
01:52:15.940 | This is what caused a serious knee in deep learning
01:52:20.180 | and the algorithms that we're using.
01:52:22.100 | And so I think that's what quantum needs to go through.
01:52:25.540 | And so right now it's in that formative stage, finding itself,
01:52:28.820 | getting literally the physics figured out.
01:52:32.700 | - And then it has to figure out the application
01:52:36.100 | that makes this useful.
01:52:37.580 | - Yeah, but I'm not skeptical of that.
01:52:39.860 | I think that will happen.
01:52:40.860 | I think it's just 10 years away, something like that.
01:52:43.500 | - I forgot to ask, what programming language
01:52:46.100 | do you think the simulation is written in?
01:52:48.700 | - Ooh, probably Lisp.
01:52:50.300 | (laughing)
01:52:52.060 | - So not Swift.
01:52:53.020 | Like if you were to bet,
01:52:54.220 | I'll just leave it at that.
01:52:58.100 | So, I mean, we've mentioned that you worked
01:53:00.460 | at all these companies,
01:53:01.460 | we've talked about all these projects.
01:53:03.940 | It's kind of like if we just step back and zoom out
01:53:07.260 | about the way you did that work,
01:53:10.100 | and we look at COVID times,
01:53:12.220 | this pandemic we're living through,
01:53:13.780 | that may, if I look at the way Silicon Valley folks
01:53:17.020 | are talking about it, the way MIT's talking about it,
01:53:19.860 | this might last for a long time.
01:53:23.060 | Not just the virus, but the remote nature.
01:53:28.060 | - The economic impact.
01:53:29.660 | I mean, yeah, it's gonna be a mess.
01:53:32.140 | - Do you think, what's your prediction?
01:53:34.500 | I mean, from Sci-Fi to Google to just all the places
01:53:39.500 | you worked in just Silicon Valley,
01:53:43.380 | you're in the middle of it.
01:53:44.260 | What do you think is, how is this whole place gonna change?
01:53:46.620 | - Yeah, so, I mean, I really can only speak
01:53:49.060 | to the tech perspective.
01:53:50.460 | I am in that bubble.
01:53:52.820 | I think it's gonna be really interesting
01:53:55.700 | because the Zoom culture of being remote
01:53:58.780 | and on video chat all the time
01:54:00.260 | has really interesting effects on people.
01:54:01.980 | So on the one hand, it's a great normalizer.
01:54:05.020 | It's a normalizer that I think will help communities
01:54:09.060 | of people that have traditionally been underrepresented
01:54:12.580 | because now you're taking, in some cases, a face-off
01:54:16.340 | 'cause you don't have to have a camera going, right?
01:54:18.740 | And so you can have conversations
01:54:19.980 | without physical appearance being part of the dynamic,
01:54:22.740 | which is pretty powerful.
01:54:24.500 | You're taking remote employees that have already been remote
01:54:27.020 | and you're saying you're now on the same level
01:54:29.900 | and footing as everybody else.
01:54:31.380 | Nobody gets whiteboards.
01:54:33.460 | You're not gonna be the one person
01:54:34.580 | that doesn't get to be participating
01:54:35.980 | in the whiteboard conversation,
01:54:37.180 | and that's pretty powerful.
01:54:39.300 | You've got, you're forcing people to think asynchronously
01:54:44.100 | in some cases because it's hard
01:54:45.660 | to just get people physically together
01:54:48.140 | and the bumping into each other
01:54:49.380 | forces people to find new ways to solve those problems.
01:54:52.740 | And I think that that leads to more inclusive behavior,
01:54:55.220 | which is good.
01:54:56.740 | On the other hand, it's also, it just sucks, right?
01:55:00.740 | And so-
01:55:02.580 | - The nature, the actual communication,
01:55:05.300 | or it just sucks being not with people
01:55:08.700 | like on a daily basis and collaborating with them?
01:55:11.380 | - Yeah, all of that, right?
01:55:13.060 | I mean, everything, this whole situation is terrible.
01:55:15.620 | What I meant primarily was the,
01:55:17.580 | I think that most humans like working physically with humans.
01:55:22.940 | I think this is something that not everybody,
01:55:24.620 | but many people are programmed to do.
01:55:27.060 | And I think that we get something out of that
01:55:29.180 | that is very hard to express, at least for me.
01:55:31.420 | And so maybe this isn't true of everybody.
01:55:33.100 | But, and so the question to me is,
01:55:36.780 | when you get through that time of adaptation,
01:55:38.980 | you get out of March and April and you get into December
01:55:43.100 | and you get into next March, if it's not changed, right?
01:55:46.500 | - It's already terrifying.
01:55:47.740 | - Well, you think about that and you think about
01:55:49.540 | what is the nature of work and how do we adapt?
01:55:52.620 | And humans are very adaptable species, right?
01:55:54.980 | We can learn things.
01:55:57.100 | And when we're forced to,
01:55:58.140 | and there's a catalyst to make that happen.
01:56:00.500 | And so what is it that comes out of this
01:56:02.620 | and are we better or worse off, right?
01:56:04.660 | I think that, you look at the Bay Area,
01:56:07.100 | housing prices are insane.
01:56:08.860 | Well, why?
01:56:09.820 | Well, there's a high incentive to be physically located
01:56:12.420 | because if you don't have proximity,
01:56:14.980 | you end up paying for it in commute, right?
01:56:18.380 | And there has been huge social pressure
01:56:21.020 | in terms of like, you will be there for the meeting, right?
01:56:24.620 | Or whatever scenario it is.
01:56:26.900 | And I think that's gonna be way better.
01:56:28.220 | I think it's gonna be much more of the norm
01:56:29.980 | to have remote employees.
01:56:31.620 | And I think this is gonna be really great.
01:56:33.180 | - Do you have friends or do you hear of people moving?
01:56:36.500 | - Yeah, I know one family friend that moved.
01:56:40.740 | They moved back to Michigan and, you know,
01:56:43.620 | they were a family with three kids
01:56:45.580 | living in a small apartment and like, we're going insane.
01:56:48.900 | (laughing)
01:56:50.460 | Right, and they're in tech, husband works for Google.
01:56:54.260 | So first of all, friends of mine are in the process of,
01:56:58.100 | or have already lost the business.
01:57:00.580 | The thing that represents their passion, their dream.
01:57:03.180 | It could be small entrepreneur projects,
01:57:05.300 | but it can be large businesses like people that run gyms.
01:57:07.900 | - Oh, restaurants, like tons of things, yeah.
01:57:10.820 | - But also, people like look at themselves in the mirror
01:57:14.140 | and ask the question of like,
01:57:16.180 | what do I wanna do in life?
01:57:17.580 | For some reason, they haven't done it until COVID.
01:57:20.900 | They really ask that question
01:57:22.060 | and that results often in moving or leaving the company
01:57:26.300 | or with starting your own business
01:57:28.100 | or transitioning to different company.
01:57:30.620 | Do you think we're gonna see that a lot?
01:57:33.600 | - Well, I can't speak to that.
01:57:36.780 | I mean, we're definitely gonna see it at a higher frequency
01:57:38.500 | than we did before, just because I think what you're trying
01:57:41.900 | to say is there are decisions that you make yourself
01:57:45.820 | and big life decisions that you make yourself.
01:57:47.860 | And like, I'm gonna like quit my job and start a new thing.
01:57:50.440 | There's also decisions that get made for you.
01:57:52.880 | Like I got fired from my job.
01:57:54.560 | What am I gonna do, right?
01:57:55.860 | And that's not a decision that you think about,
01:57:58.580 | but you're forced to act, okay?
01:58:00.880 | And so I think that those you're forced to act
01:58:03.460 | kind of moments where like, you know,
01:58:05.140 | global pandemic comes and wipes out the economy
01:58:07.220 | and now your business doesn't exist.
01:58:10.400 | I think that does lead to more reflection, right?
01:58:12.340 | Because you're less anchored on what you have
01:58:14.980 | and it's not a, what do I have to lose
01:58:17.580 | versus what do I have to gain, AB comparison.
01:58:20.480 | It's more of a fresh slate.
01:58:22.440 | Cool, I could do anything now.
01:58:24.380 | Do I wanna do the same thing I was doing?
01:58:26.860 | Did that make me happy?
01:58:28.320 | Is this now time to go back to college
01:58:30.000 | and take a class and learn a new skill?
01:58:33.120 | Is this a time to spend time with family?
01:58:36.600 | If you can afford to do that, is this time to like,
01:58:39.000 | you know, literally move in with the parents, right?
01:58:41.000 | I mean, all these things that were not normative before
01:58:43.880 | suddenly become, I think, very, the value systems change.
01:58:48.880 | And I think that's actually a good thing
01:58:50.800 | in the short term at least, because it leads to, you know,
01:58:55.800 | there's kind of been an over-optimization
01:58:58.400 | along one set of priorities for the world.
01:59:01.540 | And now maybe we'll get to a more balanced
01:59:03.520 | and more interesting world
01:59:05.180 | where people are doing different things.
01:59:06.760 | I think it could be good.
01:59:07.640 | I think there could be more innovation
01:59:09.000 | that comes out of it, for example.
01:59:10.120 | - What do you think about all the social chaos
01:59:12.760 | we're in the middle of?
01:59:13.920 | - It sucks.
01:59:14.760 | (laughing)
01:59:17.520 | - Let me ask you, you think it's all gonna be okay?
01:59:21.080 | - Well, I think humanity will survive.
01:59:25.400 | - In the form of an existential threat,
01:59:25.400 | like we're not all gonna kill, yeah, well.
01:59:27.280 | - Yeah, I don't think the virus
01:59:28.120 | is gonna kill all the humans.
01:59:30.360 | I don't think all the humans are gonna kill all the humans.
01:59:32.000 | I think that's unlikely.
01:59:32.880 | But I look at it as
01:59:35.560 | progress requires a catalyst, right?
01:59:42.160 | So you need a reason for people
01:59:44.760 | to be willing to do things that are uncomfortable.
01:59:47.720 | I think that the US at least,
01:59:50.720 | but I think the world in general
01:59:51.760 | is a pretty unoptimal place to live in for a lot of people.
01:59:56.760 | And I think that what we're seeing right now
01:59:58.880 | is we're seeing a lot of unhappiness.
02:00:00.440 | And because of all the pressure,
02:00:03.560 | because of all the badness in the world
02:00:05.520 | that's coming together,
02:00:06.340 | it's really kind of igniting some of that debate
02:00:07.840 | that should have happened a long time ago, right?
02:00:10.120 | I mean, I think that we'll see more progress.
02:00:11.600 | If you're asking about,
02:00:12.880 | offline you're asking about politics
02:00:14.240 | and wouldn't it be great if politics moved faster
02:00:15.760 | because there's all these problems in the world
02:00:16.600 | and we can move it.
02:00:18.160 | Well, people are inherently conservative.
02:00:22.320 | And so if you're talking about conservative people,
02:00:25.040 | particularly if they have heavy burdens on their shoulders
02:00:27.480 | 'cause they represent literally thousands of people,
02:00:30.080 | it makes sense to be conservative.
02:00:33.240 | But on the other hand, when you need change,
02:00:35.360 | how do you get it?
02:00:36.240 | The global pandemic will probably lead to some change.
02:00:40.560 | And it's not a directed plan,
02:00:44.320 | but I think that it leads to people
02:00:45.920 | asking really interesting questions.
02:00:47.400 | And some of those questions
02:00:48.240 | should have been asked a long time ago.
02:00:50.120 | - Well, let me know if you've observed this as well.
02:00:53.320 | Something that's bothering me
02:00:54.840 | in the machine learning community,
02:00:56.160 | I'm guessing it might be prevalent in other places,
02:00:59.680 | is something that feels like in 2020
02:01:02.520 | increased level of toxicity.
02:01:05.280 | Like people are just quicker to pile on
02:01:09.720 | to just be harsh on each other,
02:01:13.280 | to like mob, pick a person that screwed up
02:01:18.280 | and like make it a big thing.
02:01:22.080 | And is there something that we can like,
02:01:25.520 | have you observed that in other places?
02:01:28.240 | Is there some way out of this?
02:01:30.200 | - I think there's an inherent thing in humanity
02:01:32.200 | that's kind of an us versus them thing,
02:01:34.480 | which is that you wanna succeed.
02:01:36.240 | And how do you succeed?
02:01:37.160 | Well, it's relative to somebody else.
02:01:39.640 | And so what's happening, at least in some part,
02:01:43.160 | is that with the internet and with online communication,
02:01:47.160 | the world's getting smaller.
02:01:48.560 | Right, and so we're having some of the social ties
02:01:53.080 | of like my town versus your town's football team,
02:01:56.480 | right, turn into much larger and yet shallower problems.
02:02:02.360 | And people don't have time, the incentives,
02:02:05.680 | the clickbait and like all these things
02:02:08.080 | kind of really, really feed into this machine.
02:02:10.520 | And I don't know where that goes.
02:02:12.480 | - Yeah, I mean, the reason I think about that,
02:02:14.760 | I mentioned to you this offline a little bit,
02:02:17.520 | but I have a few difficult conversations scheduled,
02:02:22.520 | some of them political related,
02:02:25.120 | some of them within the community,
02:02:27.320 | difficult personalities that went through some stuff.
02:02:30.640 | I mean, one of them I've talked before,
02:02:32.160 | I will talk again is Yann LeCun.
02:02:34.320 | He got a little bit of crap on Twitter
02:02:37.200 | for talking about a particular paper
02:02:41.000 | and the bias within a dataset.
02:02:42.800 | And then there's been a huge, in my view,
02:02:45.960 | and I'm willing, comfortable saying it,
02:02:49.800 | irrational, over-exaggerated pile on on his comments
02:02:54.440 | because he made pretty basic comments
02:02:57.160 | about the fact that if there's bias in the data,
02:02:59.920 | there's going to be bias in the results.
02:03:02.440 | So we should not have bias in the data,
02:03:04.640 | but people piled on to him
02:03:06.600 | because he said he trivialized the problem of bias.
02:03:10.080 | Like it's a lot more than just bias in the data.
02:03:13.240 | But like, yes, that's a very good point,
02:03:16.600 | but that's- - That's not what he was saying.
02:03:19.000 | - That's not what he was saying.
02:03:19.840 | And the response, like the implied response
02:03:23.160 | that he's basically sexist and racist
02:03:26.720 | is something that completely drives away
02:03:30.480 | the possibility of nuanced discussion.
02:03:32.920 | One nice thing about like a podcast long form conversation
02:03:37.920 | is you can talk it out,
02:03:40.320 | you can lay your reasoning out.
02:03:42.880 | And even if you're wrong,
02:03:44.560 | you can still show that you're a good human being
02:03:47.200 | underneath it.
02:03:48.280 | - You know, your point about
02:03:49.160 | you can't have a productive discussion.
02:03:51.040 | Well, how do you get to that point where people can turn?
02:03:53.920 | They can learn, they can listen, they can think,
02:03:56.360 | they can engage versus just being a shallow,
02:03:59.240 | like, and then keep moving, right?
02:04:02.600 | - And I don't think that progress really comes from that.
02:04:06.720 | Right, and I don't think that one should expect that.
02:04:09.920 | I think that you'd see that as reinforcing
02:04:12.360 | individual circles and the us versus them thing.
02:04:14.560 | And I think that's fairly divisive.
02:04:17.600 | - Yeah, I think there's a big role in,
02:04:21.000 | like the people that bother me most on Twitter
02:04:24.160 | when I observe things
02:04:25.760 | is not the people who get very emotional,
02:04:28.400 | angry, like over the top.
02:04:30.160 | It's the people who like prop them up.
02:04:33.920 | It's all of this.
02:04:36.200 | I think what we should teach each other
02:04:38.000 | is to be sort of empathetic.
02:04:42.360 | - The thing that it's really easy to forget,
02:04:44.760 | particularly on like Twitter or the internet or an email,
02:04:47.800 | is that sometimes people just have a bad day.
02:04:50.160 | - Yeah.
02:04:51.000 | - Right, you have a bad day or you're like,
02:04:53.200 | I've been in the situation where it's like between meetings,
02:04:55.560 | like fire off a quick response to an email
02:04:57.360 | 'cause I wanna like help get something unblocked.
02:04:59.760 | Phrased it really objectively wrong.
02:05:03.680 | I screwed up and suddenly this is now
02:05:07.160 | something that sticks with people.
02:05:08.720 | And it's not because they're bad.
02:05:10.640 | It's not because you're bad.
02:05:11.880 | Just psychology of like you said a thing,
02:05:15.240 | it sticks with you.
02:05:16.080 | You didn't mean it that way,
02:05:17.000 | but it really impacted somebody
02:05:18.520 | because the way they interpret it.
02:05:20.920 | And this is just an aspect of working together as humans.
02:05:23.400 | And I have a lot of optimism in the long-term,
02:05:26.200 | the very long-term about what we as humanity can do,
02:05:29.120 | but I think that's gonna be, it's just always a rough ride.
02:05:31.160 | And you came into this by saying like,
02:05:33.160 | what is COVID and all the social strife
02:05:36.200 | that's happening right now mean?
02:05:38.120 | And I think that it's really bad in the short-term,
02:05:40.960 | but I think it'll lead to progress.
02:05:42.580 | And for that, I'm very thankful.
02:05:44.340 | - Yeah, it's painful in the short-term though.
02:05:48.040 | - Well, yeah, I mean, people are out of jobs.
02:05:49.760 | Like some people can't eat, like it's horrible.
02:05:52.520 | And, but, but, you know, it's progress.
02:05:56.960 | So we'll see what happens.
02:05:58.560 | I mean, the real question is when you look back 10 years,
02:06:01.920 | 20 years, a hundred years from now,
02:06:03.560 | how do we evaluate the decisions that are being made
02:06:05.400 | right now?
02:06:06.860 | I think that's really the way you can frame that
02:06:09.800 | and look at it.
02:06:10.640 | And you say, you know, you integrate across all
02:06:12.840 | the short-term horribleness that's happening.
02:06:15.440 | And you look at what that means and is the, you know,
02:06:18.600 | improvement across the world or the regression
02:06:20.360 | across the world significant enough to make it a good
02:06:24.160 | or a bad thing.
02:06:25.000 | I think that's the question.
02:06:26.800 | - Yeah.
02:06:27.640 | And for that, it's good to study history.
02:06:29.480 | I mean, one of the big problems for me right now
02:06:32.060 | is I'm reading The Rise and Fall of the Third Reich.
02:06:34.760 | - Light reading.
02:06:37.400 | - So everything is just, I just see parallels,
02:06:40.880 | and it means you have to be really careful
02:06:43.500 | not to overstate them.
02:06:45.360 | But just the thing that worries me the most is the pain
02:06:49.400 | that people feel when a few things combine,
02:06:54.400 | which is like economic depression,
02:06:56.000 | which is quite possible in this country.
02:06:57.960 | And then just being disrespected in some kind of way,
02:07:02.600 | which the German people were really disrespected
02:07:05.160 | by most of the world, like in a way that's over the top,
02:07:10.160 | that something can build up and then all you need
02:07:13.460 | is a charismatic leader to go either positive or negative
02:07:18.400 | and both work as long as they're charismatic.
02:07:21.080 | And there's--
02:07:22.160 | - It's taking advantage of, again,
02:07:24.000 | that inflection point that the world's in
02:07:26.360 | and what they do with it could be good or bad.
02:07:28.720 | - And so it's a good way to think about times now,
02:07:32.680 | like on an individual level, what we decide to do
02:07:35.760 | is when history is written, 30 years from now,
02:07:39.560 | what happened in 2020, probably history's gonna remember
02:07:42.260 | 2020.
02:07:43.120 | - Yeah, I think so.
02:07:43.960 | (laughing)
02:07:45.520 | - Either for good or bad.
02:07:46.800 | And it's like up to us to write it, so it's good.
02:07:49.520 | - Well, one of the things I've observed
02:07:50.880 | that I find fascinating is most people act
02:07:54.160 | as though the world doesn't change.
02:07:56.440 | You make decisions knowingly, right?
02:08:00.000 | You make a decision where you're predicting the future
02:08:02.620 | based on what you've seen in the recent past.
02:08:04.800 | And so if something's always been,
02:08:06.120 | it's rained every single day,
02:08:07.320 | then of course you expect it to rain today too, right?
02:08:10.080 | On the other hand, the world changes all the time.
02:08:13.400 | - Yeah.
02:08:14.240 | - Incessantly, like for better and for worse.
02:08:16.800 | And so the question is, if you're interested
02:08:18.400 | in something that's not right,
02:08:20.880 | what is the inflection point that led to a change?
02:08:22.920 | And you can look to history for this.
02:08:24.360 | Like what is the catalyst that led to that explosion
02:08:27.960 | that led to that bill that led to the,
02:08:30.240 | like you can kind of work your way backwards from that.
02:08:33.240 | And maybe if you pull together the right people
02:08:35.760 | and you get the right ideas together,
02:08:36.940 | you can actually start driving that change
02:08:39.000 | and doing it in a way that's productive
02:08:40.360 | and hurts fewer people.
02:08:41.800 | - Yeah, like a single person, a single event
02:08:43.680 | can turn all of history.
02:08:44.520 | - Yeah, absolutely.
02:08:45.340 | Everything starts somewhere.
02:08:46.400 | And often it's a combination of multiple factors,
02:08:48.480 | but yeah, these things can be engineered.
02:08:52.520 | - That's actually the optimistic view that--
02:08:54.960 | - I'm a long-term optimist on pretty much everything.
02:08:57.600 | And human nature, you know,
02:08:59.360 | we can look to all the negative things that humanity has,
02:09:02.220 | all the pettiness and all the self-servingness
02:09:05.840 | and just the cruelty, right?
02:09:09.760 | The biases, just humans can be very horrible.
02:09:13.400 | But on the other hand, we're capable of amazing things.
02:09:16.120 | (laughs)
02:09:17.120 | And the progress across, you know,
02:09:20.600 | hundred year chunks is striking.
02:09:23.280 | And even across decades, we've come a long ways
02:09:26.720 | and there's still a long ways to go,
02:09:27.840 | but that doesn't mean that we've stopped.
02:09:30.000 | - Yeah, the kind of stuff we've done
02:09:31.440 | in the last hundred years is unbelievable.
02:09:34.920 | It's kind of scary to think what's gonna happen
02:09:36.760 | in the next hundred years.
02:09:37.600 | It's scary, like exciting.
02:09:39.040 | Like scary in a sense that it's kind of sad
02:09:41.680 | that the kind of technology is gonna come out
02:09:43.760 | in 10, 20, 30 years.
02:09:45.720 | We're probably too old to really appreciate
02:09:47.800 | 'cause you don't grow up with it.
02:09:49.120 | It'll be like kids these days with their virtual reality
02:09:51.720 | and their--
02:09:52.680 | - Their TikToks and stuff like this.
02:09:54.520 | Like, oh, there's this thing and like,
02:09:56.840 | come on, give me my, you know, static photo.
02:09:59.120 | (laughs)
02:09:59.960 | - You know, my Commodore 64.
02:10:02.320 | - Yeah, exactly.
02:10:03.760 | - Okay, sorry, we kind of skipped over.
02:10:05.840 | Let me ask on, you know,
02:10:09.680 | the machine learning world has been kind of inspired,
02:10:14.400 | their imagination captivated with GPT-3
02:10:17.040 | and these language models.
02:10:18.760 | I thought it'd be cool to get your opinion on it.
02:10:21.800 | What's your thoughts on this exciting world of,
02:10:26.260 | it connects to computation actually,
02:10:29.920 | is of language models that are huge
02:10:33.000 | and take many, many computers, not just to train,
02:10:37.400 | but to also do inference on.
02:10:39.400 | - Sure.
02:10:40.440 | Well, I mean, it depends on what you're speaking to there,
02:10:43.400 | but I mean, I think that there's been
02:10:45.280 | a pretty well understood maxim in deep learning
02:10:48.360 | that if you make the model bigger
02:10:49.640 | and you shove more data into it,
02:10:51.360 | assuming you train it right
02:10:52.400 | and you have a good model architecture,
02:10:54.020 | that you'll get a better model out.
02:10:55.800 | And so on one hand, GPT-3 was not that surprising.
02:10:59.740 | On the other hand, a tremendous amount of engineering
02:11:02.040 | went into making it possible.
02:11:03.500 | The implications of it are pretty huge.
02:11:07.080 | I think that when GPT-2 came out,
02:11:09.000 | there was a very provocative blog post from OpenAI
02:11:11.360 | talking about, you know, we're not gonna release it
02:11:13.640 | because of the social damage it could cause
02:11:15.440 | if it's misused.
02:11:16.520 | I think that's still a concern.
02:11:20.120 | I think that we need to look at how technology is applied
02:11:23.240 | and, you know, well-meaning tools can be applied
02:11:25.800 | in very horrible ways,
02:11:26.840 | and they can have very profound impact on that.
02:11:29.320 | I think that GPT-3 is a huge technical achievement.
02:11:33.960 | And what will GPT-4 be?
02:11:35.760 | It'll probably be bigger, more expensive to train,
02:11:38.480 | with really cool architectural tricks.
02:11:42.000 | - Do you think, is there,
02:11:43.960 | I don't know how much thought you've done
02:11:46.480 | on distributed computing.
02:11:48.720 | Is there some technical challenges that are interesting
02:11:52.960 | that you're hopeful about exploring in terms of,
02:11:55.880 | you know, a system that, like a piece of code that,
02:11:59.000 | you know, with GPT-4,
02:12:02.760 | that might have, I don't know,
02:12:07.040 | hundreds of trillions of parameters
02:12:09.360 | which have to run on thousands of computers.
02:12:11.600 | Is there some hope that we can make that happen?
02:12:15.320 | - Yeah, well, I mean, today you can write a check
02:12:18.960 | and get access to a thousand TPU cores
02:12:21.800 | and do really interesting large-scale training
02:12:23.960 | and inference and things like that in Google Cloud,
02:12:26.520 | for example, right?
02:12:27.440 | And so I don't think it's a question about scale,
02:12:31.320 | it's a question about utility.
02:12:33.200 | And when I look at the transformer series of architectures
02:12:36.200 | that the GPT series is based on,
02:12:38.760 | it's really interesting to look at that
02:12:39.880 | because they're actually very simple designs.
02:12:42.920 | They're not recurrent.
02:12:44.720 | The training regimens are pretty simple.
02:12:47.440 | And so they don't really reflect like human brains, right?
02:12:51.680 | But they're really good at learning language models
02:12:54.600 | and they're unrolled enough that you get,
02:12:56.500 | you can simulate some recurrence, right?
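To make "very simple designs" concrete, here is a minimal sketch of the transformer's core primitive, scaled dot-product self-attention, written with plain Swift arrays. It is a toy under stated assumptions: no learned projections, masking, or batching, and the helper names are invented for the sketch rather than taken from any real implementation.

```swift
import Foundation

// Toy scaled dot-product self-attention over plain arrays.
// Assumes all vectors are non-empty and share the same length.

func softmax(_ xs: [Double]) -> [Double] {
    let m = xs.max() ?? 0
    let exps = xs.map { exp($0 - m) }          // subtract max for numerical stability
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

func dot(_ a: [Double], _ b: [Double]) -> Double {
    zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
}

// One query/key/value vector per token position.
func selfAttention(queries: [[Double]], keys: [[Double]], values: [[Double]]) -> [[Double]] {
    let scale = 1.0 / sqrt(Double(keys[0].count))
    return queries.map { q in
        // Weight every position by how well its key matches this query.
        let weights = softmax(keys.map { k in dot(q, k) * scale })
        // The output is the weighted sum of the value vectors.
        var out = [Double](repeating: 0, count: values[0].count)
        for (w, v) in zip(weights, values) {
            for i in 0..<out.count { out[i] += w * v[i] }
        }
        return out
    }
}
```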
02:12:59.120 | And so the question I think about is,
02:13:02.080 | where does this take us?
02:13:03.240 | Like, so we can just keep scaling it,
02:13:05.120 | have more parameters, more data, more things,
02:13:07.640 | we'll get a better result for sure.
02:13:09.400 | But are there architectural techniques
02:13:11.800 | that can lead to progress at a faster pace?
02:13:14.220 | Right, this is when, how do you get,
02:13:17.680 | instead of just like making it a constant time bigger,
02:13:20.600 | how do you get like an algorithmic improvement
02:13:23.320 | out of this, right?
02:13:24.160 | And whether it be a new training regimen,
02:13:25.720 | if it becomes sparse networks, for example,
02:13:30.320 | the human brain is sparse, all these networks are dense,
02:13:33.600 | the connectivity patterns can be very different.
02:13:36.120 | I think this is where I get very interested
02:13:38.240 | and I'm way out of my league
02:13:39.480 | on the deep learning side of this.
02:13:41.560 | But I think that could lead to big breakthroughs.
02:13:43.680 | When you talk about large scale networks,
02:13:46.160 | one of the things that Jeff Dean likes to talk about
02:13:48.000 | and he's given a few talks on is this idea
02:13:51.680 | of having a sparsely gated mixture of experts
02:13:54.200 | kind of a model where you have, you know,
02:13:57.400 | different nets that are trained
02:13:59.480 | and are really good at certain kinds of tasks.
02:14:02.080 | And so you have this distributed across a cluster.
02:14:04.840 | And so you have a lot of different computers
02:14:06.400 | that end up being kind of locally specialized
02:14:08.520 | in different domains.
02:14:09.720 | And then when a query comes in,
02:14:11.040 | you gate it and you use learned techniques
02:14:13.720 | to route to different parts of the network.
02:14:15.440 | And then you utilize the compute resources
02:14:18.000 | of the entire cluster by having specialization within it.
02:14:20.640 | And I don't know where that goes
02:14:23.680 | or if it starts to, when it starts to work,
02:14:25.520 | but I think things like that
02:14:26.680 | could be really interesting as well.
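A minimal sketch of the sparsely gated mixture-of-experts routing described above. The dot-product gate and the `Expert` / `MixtureOfExperts` names are illustrative assumptions for the sketch, not the actual system Jeff Dean has described; the point is only the shape of the idea: score all experts, run just the top-k, blend their outputs.

```swift
import Foundation

// Toy sparsely gated mixture of experts: score every expert for the input,
// run only the top-k, and blend their outputs by softmaxed gate scores.

struct Expert {
    let name: String
    let run: ([Double]) -> [Double]   // assumed to return a vector of input.count
}

struct MixtureOfExperts {
    let experts: [Expert]
    let gateWeights: [[Double]]       // one row of (learned, in practice) scores per expert

    func forward(_ input: [Double], topK: Int = 2) -> [Double] {
        // Gate: score each expert for this input (a simple dot product here).
        let scores = gateWeights.map { row in
            zip(row, input).reduce(0) { $0 + $1.0 * $1.1 }
        }
        // Sparsity: only the top-k experts actually run for this query.
        let chosen = scores.enumerated()
            .sorted { $0.element > $1.element }
            .prefix(topK)
        // Softmax over just the chosen scores to get blend weights.
        let maxScore = chosen.map { $0.element }.max() ?? 0
        let exps = chosen.map { exp($0.element - maxScore) }
        let total = exps.reduce(0, +)

        var output = [Double](repeating: 0, count: input.count)
        for (slot, pick) in chosen.enumerated() {
            let weight = exps[slot] / total
            let expertOutput = experts[pick.offset].run(input)
            for i in 0..<output.count { output[i] += weight * expertOutput[i] }
        }
        return output
    }
}
```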
02:14:28.360 | - And then on the data side too,
02:14:30.000 | if you can think of data selection as a kind of programming.
02:14:35.000 | - Yeah.
02:14:36.680 | - I mean, essentially, if you look at like Karpathy
02:14:38.760 | talked about software 2.0,
02:14:40.640 | I mean, in a sense, data is the programming.
02:14:44.040 | - Yeah, yeah.
02:14:44.880 | So let me try to summarize Andrej's position really quick
02:14:48.320 | before I disagree with it.
02:14:50.000 | - Yeah.
02:14:51.120 | - So Andrej Karpathy is amazing.
02:14:53.400 | So this is nothing personal with him.
02:14:55.200 | He's an amazing engineer.
02:14:57.400 | - And also a good blog post writer.
02:14:59.240 | - Yeah, well, he's a great communicator.
02:15:01.080 | He's just an amazing person.
02:15:02.400 | He's also really sweet.
02:15:03.720 | So his basic premise is that software is suboptimal.
02:15:09.400 | I think we can all agree to that.
02:15:11.040 | He also points out that deep learning
02:15:14.480 | and other learning-based techniques are really great
02:15:16.360 | because you can solve problems in more structured ways
02:15:19.120 | with less like ad hoc code that people write out
02:15:23.040 | and don't write test cases for in some cases.
02:15:25.160 | And so they don't even know if it works in the first place.
02:15:27.800 | And so if you start replacing systems of imperative code
02:15:32.320 | with deep learning models, then you get a better result.
02:15:37.120 | And I think that he argues that software 2.0
02:15:40.680 | is a pervasively learned set of models
02:15:44.120 | and you get away from writing code.
02:15:45.920 | And he's given talks where he talks about
02:15:47.920 | swapping over more and more and more parts of the code
02:15:50.960 | to being learned and driven that way.
02:15:54.840 | I think that works.
02:15:56.640 | And if you're predisposed to liking machine learning,
02:15:59.240 | then I think that that's definitely a good thing.
02:16:01.760 | I think this is also good for accessibility in many ways
02:16:04.700 | because certain people are not gonna write C code
02:16:06.800 | or something.
02:16:07.720 | And so having a data-driven approach to do this kind of
02:16:10.620 | stuff, I think can be very valuable.
02:16:12.720 | On the other hand, there are huge trade-offs.
02:16:14.200 | And it's not clear to me that software 2.0 is the answer.
02:16:19.200 | And probably Andrej wouldn't argue that it's the answer
02:16:21.440 | for every problem either.
02:16:22.960 | But I look at machine learning as not a replacement
02:16:26.760 | for software 1.0.
02:16:27.920 | I look at it as a new programming paradigm.
02:16:30.120 | And so programming paradigms, when you look across domains,
02:16:35.140 | is structured programming where you go from go-tos
02:16:38.480 | to if-then-else, or functional programming from Lisp.
02:16:42.280 | And you start talking about higher order functions
02:16:44.440 | and values and things like this.
02:16:45.880 | Or you talk about object-oriented programming.
02:16:48.040 | You're talking about encapsulation, subclassing,
02:16:49.960 | inheritance.
02:16:50.800 | You start talking about generic programming
02:16:52.640 | where you start talking about code reuse
02:16:54.480 | through specialization and different type instantiations.
02:16:59.480 | When you start talking about differentiable programming,
02:17:01.720 | something that I am very excited about in the context
02:17:04.960 | of machine learning, talking about taking functions
02:17:07.200 | and generating variants, like the derivative
02:17:10.280 | of another function.
02:17:11.120 | Like that's a programming paradigm that's very useful
02:17:13.760 | for solving certain classes of problems.
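As a tiny illustration of the "derivative of another function" idea, here is a higher-order Swift function that takes a function and returns an approximate derivative of it. It uses numerical finite differences purely for the sketch; a real differentiable programming system (such as the @differentiable work explored in the Swift for TensorFlow project) derives gradients from the code itself rather than approximating them.

```swift
import Foundation

// A higher-order function: given f, hand back an approximate f'
// using central finite differences.
func derivative(of f: @escaping (Double) -> Double,
                step h: Double = 1e-6) -> (Double) -> Double {
    return { x in (f(x + h) - f(x - h)) / (2 * h) }
}

let square = { (x: Double) in x * x }
let slopeOfSquare = derivative(of: square)
print(slopeOfSquare(3.0))   // ~6.0, since d/dx x^2 = 2x
```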
02:17:16.220 | Machine learning is amazing at solving certain classes
02:17:18.680 | of problems.
02:17:19.520 | Like you're not gonna write a cat detector
02:17:21.940 | or even a language translation system by writing C code.
02:17:25.920 | That's not a very productive way to do things anymore.
02:17:28.920 | And so machine learning is absolutely the right way
02:17:31.480 | to do that.
02:17:32.320 | In fact, I would say that learned models are really
02:17:35.000 | one of the best ways to work with the human world
02:17:37.280 | in general.
02:17:38.240 | And so anytime you're talking about sensory input
02:17:40.320 | of different modalities, anytime that you're talking
02:17:42.320 | about generating things in a way that makes sense
02:17:45.120 | to a human, I think that learned models are really,
02:17:47.840 | really useful.
02:17:48.920 | And that's because humans are very difficult
02:17:50.560 | to characterize, okay?
02:17:52.660 | And so this is a very powerful paradigm for solving
02:17:55.660 | classes of problems.
02:17:57.120 | But on the other hand, imperative code is too.
02:17:59.680 | You're not gonna write a bootloader for your computer
02:18:02.600 | with a deep learning model.
02:18:04.060 | Deep learning models are very hardware intensive.
02:18:07.040 | They're very energy intensive because you have a lot
02:18:09.900 | of parameters and you can provably implement any function
02:18:14.500 | with a learned model, like this has been shown,
02:18:17.700 | but that doesn't make it efficient.
02:18:19.900 | And so if you're talking about caring about a few orders
02:18:22.300 | of magnitude worth of energy usage,
02:18:24.080 | then it's useful to have other tools in the toolbox.
02:18:26.940 | - There's also robustness too.
02:18:28.420 | I mean, as a-- - Yeah, exactly.
02:18:29.900 | All the problems of dealing with data and bias in data,
02:18:32.500 | all the problems of, you know, software 2.0.
02:18:35.100 | And one of the great things that Andrej is arguing towards,
02:18:39.320 | which I completely agree with him, is that when you start
02:18:43.100 | implementing things with deep learning, you need to learn
02:18:45.180 | from software 1.0 in terms of testing,
02:18:47.660 | continuous integration, how you deploy,
02:18:50.020 | how do you validate all these things and building systems
02:18:53.060 | around that so that you're not just saying like,
02:18:54.980 | "Oh, it seems like it's good, ship it."
02:18:57.580 | Right?
02:18:58.420 | Well, what happens when I regress something?
02:18:59.820 | What happens when I make a classification that's wrong
02:19:02.480 | and now I hurt somebody, right?
02:19:05.540 | All these things you have to reason about.
02:19:07.340 | - Yeah, but at the same time, the bootloader that works
02:19:10.140 | for us humans looks awfully a lot like a neural network.
02:19:14.900 | Right?
02:19:15.740 | So it's messy and you can cut out different parts
02:19:19.180 | of the brain.
02:19:20.020 | There's a lot of this neuroplasticity work that shows
02:19:22.900 | that it's gonna adjust.
02:19:24.140 | It's a really interesting question,
02:19:26.900 | how much of the world's programming could be replaced
02:19:30.420 | by software 2.0?
02:19:31.780 | Like with-- - Oh, well, I mean,
02:19:33.340 | it's provably true that you could replace all of it.
02:19:36.600 | - Right, so then it's a question of trade-offs.
02:19:39.260 | - Right, so anything that's a function, you can.
02:19:40.980 | So it's not a question about if.
02:19:42.980 | I think it's an economic question.
02:19:44.940 | It's a, what kind of talent can you get?
02:19:47.740 | What kind of trade-offs in terms of maintenance?
02:19:50.060 | Right, those kinds of questions, I think.
02:19:51.680 | What kind of data can you collect?
02:19:53.260 | I think one of the reasons that I'm most interested
02:19:55.100 | in machine learning as a programming paradigm is that one
02:19:59.160 | of the things that we've seen across computing in general
02:20:01.520 | is that being laser focused on one paradigm often puts you
02:20:06.100 | in a box that's not super great.
02:20:08.460 | And so you look at object-oriented programming,
02:20:10.420 | like it was all the rage in the early '80s.
02:20:12.060 | And like, everything has to be objects.
02:20:13.500 | And people forgot about functional programming,
02:20:15.620 | even though it came first.
02:20:17.380 | And then people rediscovered that, hey,
02:20:20.020 | if you mix functional and object-oriented and structure,
02:20:22.700 | like you mix these things together,
02:20:24.260 | you can provide very interesting tools
02:20:25.780 | that are good at solving different problems.
02:20:28.420 | And so the question there is how do you get the best way
02:20:31.180 | to solve the problems?
02:20:32.620 | It's not about whose tribe should win, right?
02:20:35.980 | It's not about, you know, that shouldn't be the question.
02:20:38.780 | The question is how do you make it
02:20:40.020 | so that people can solve those problems the fastest
02:20:42.180 | and they have the right tools in their box
02:20:44.300 | to build good libraries and they can solve these problems.
02:20:47.140 | And when you look at that, that's like, you know,
02:20:49.060 | you look at reinforcement learning
02:20:50.300 | as one really interesting subdomain of this.
02:20:52.620 | Reinforcement learning, often you have to have
02:20:55.060 | the integration of a learned model combined with your Atari
02:20:59.380 | or whatever the other scenario it is that you're working in.
02:21:02.860 | You have to combine that thing
02:21:04.420 | with the robot control for the arm, right?
02:21:07.620 | And so now it's not just about that one paradigm.
02:21:11.900 | It's about integrating that with all the other systems
02:21:14.540 | that you have, including often legacy systems
02:21:17.020 | and things like this, right?
02:21:18.100 | And so to me, I think that the interesting thing to say
02:21:21.460 | is like, how do you get the best out of this domain
02:21:23.820 | and how do you enable people to achieve things
02:21:25.820 | that they otherwise couldn't do
02:21:27.300 | without excluding all the good things
02:21:29.700 | we already know how to do?
02:21:31.300 | - Right, but, okay, this is a crazy question,
02:21:35.300 | but we talked a little bit about GPT-3,
02:21:38.820 | but do you think it's possible that these language models
02:21:42.340 | that in essence, in the language domain,
02:21:47.340 | software 2.0 could replace some aspect of compilation,
02:21:51.820 | for example, or do program synthesis
02:21:54.260 | replace some aspect of programming?
02:21:56.860 | - Yeah, absolutely.
02:21:57.700 | So I think that learned models in general
02:22:00.340 | are extremely powerful
02:22:01.580 | and I think that people underestimate them.
02:22:03.700 | - Maybe you can suggest what I should do.
02:22:07.140 | So I've access to the GPT-3 API.
02:22:11.380 | Would I be able to generate Swift code, for example?
02:22:14.260 | Do you think that could do something interesting
02:22:16.020 | and would it work?
02:22:17.060 | - So GPT-3 is probably not trained on the right corpus.
02:22:21.140 | So it probably has the ability to generate some Swift.
02:22:23.700 | I bet it does.
02:22:25.220 | It's probably not gonna generate a large enough body of Swift
02:22:27.620 | to be useful, but like taking it a next step further,
02:22:30.580 | like if you had the goal of training something like GPT-3
02:22:33.980 | and you wanted to train it to generate source code, right?
02:22:38.020 | It could definitely do that.
02:22:39.780 | Now the question is, how do you express the intent
02:22:42.660 | of what you want filled in?
02:22:44.300 | You can definitely like write scaffolding of code
02:22:47.060 | and say, fill in the hole
02:22:48.900 | and sort of put in some for loops
02:22:50.340 | or put in some classes or whatever.
02:22:51.540 | And the power of these models is impressive,
02:22:53.700 | but there's an unsolved question, at least unsolved to me,
02:22:56.940 | which is how do I express the intent of what to fill in?
02:22:59.740 | Right, and kind of what you'd really want to have,
02:23:03.180 | and I don't know that these models are up to the task,
02:23:06.340 | is you wanna be able to say,
02:23:08.300 | here's a scaffolding and here are the assertions at the end
02:23:11.260 | and the assertions always pass.
02:23:14.060 | And so you want a generative model on the one hand, yes.
02:23:16.620 | - That's fascinating, yeah.
02:23:17.620 | - Right, but you also want some loopback,
02:23:20.500 | some reinforcement learning system or something
02:23:23.220 | where you're actually saying like,
02:23:24.700 | I need to hill climb towards something that is more correct.
02:23:28.540 | And I don't know that we have that.
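A rough sketch of the loop being described: a model proposes candidates for the hole in the scaffolding, human-written checks act as the spec, and the system keeps trying until something passes. `generateCandidate`, `passesChecks`, and `Candidate` are hypothetical stand-ins for the sketch, not any real API.

```swift
// Toy synthesis loop: keep asking a (hypothetical) code model for candidates
// until the human-written checks pass, or give up after a budget of attempts.

struct Candidate {
    let source: String              // the generated code, as text
    let run: (Int) -> Int           // toy stand-in for the compiled result
}

func synthesize(prompt: String,
                generateCandidate: (String, Int) -> Candidate,
                passesChecks: (Candidate) -> Bool,
                maxAttempts: Int = 100) -> Candidate? {
    for attempt in 0..<maxAttempts {
        // `attempt` could seed sampling so each try differs from the last.
        let candidate = generateCandidate(prompt, attempt)
        // The assertions written by a human are the spec; a candidate only
        // counts if it satisfies them. A smarter system would hill-climb
        // toward passing rather than resample blindly.
        if passesChecks(candidate) {
            return candidate
        }
    }
    return nil   // no candidate satisfied the spec within the budget
}
```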
02:23:29.780 | - So it would generate not only a bunch of the code,
02:23:33.700 | but like the checks that do the testing,
02:23:35.980 | it would generate the test.
02:23:37.100 | - I think the humans would generate the test, right?
02:23:38.860 | - Oh, okay.
02:23:39.700 | - The test would be-- - But it would be fascinating--
02:23:41.380 | - Well, the test are the requirements.
02:23:43.060 | - Yes, but the, okay, so--
02:23:44.220 | - 'Cause you have to express to the model what you want to,
02:23:47.060 | you don't just want gibberish code.
02:23:48.820 | Look at how compelling this code looks.
02:23:51.300 | You want a story about four horned unicorns or something.
02:23:54.740 | - Well, okay, so exactly, but that's human requirements.
02:23:57.700 | But then I thought it's a compelling idea
02:24:00.180 | that the GPT-4 model could generate checks
02:24:06.260 | like that are more high fidelity that check for correctness.
02:24:11.260 | Because the code it generates,
02:24:15.500 | like say I ask it to generate a function
02:24:18.420 | that gives me the Fibonacci sequence.
02:24:21.620 | - Sure.
02:24:22.460 | - I don't like--
02:24:24.340 | - So decompose the problem, right?
02:24:25.620 | So you have two things.
02:24:26.980 | You have, you need the ability to generate
02:24:29.380 | syntactically correct Swift code that's interesting, right?
02:24:33.100 | I think GPT series of model architectures can do that.
02:24:37.580 | But then you need the ability to add the requirements.
02:24:41.340 | So generate Fibonacci.
02:24:43.060 | - Yeah.
02:24:43.900 | - The human needs to express that goal.
02:24:46.040 | We don't have that language that I know of.
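Concretely, for the Fibonacci example, the split between scaffolding and requirements might look like the sketch below: the human supplies the signature and the assertions, and the function body is the hole a model would be asked to fill (filled in by hand here so the sketch actually runs).

```swift
// Scaffolding plus requirements for "generate Fibonacci".

func fibonacci(_ n: Int) -> Int {
    // <model-generated body would go here>
    var (a, b) = (0, 1)
    for _ in 0..<n { (a, b) = (b, a + b) }
    return a
}

// The requirements, expressed as checks the generated body must satisfy.
assert(fibonacci(0) == 0)
assert(fibonacci(1) == 1)
assert(fibonacci(10) == 55)
assert((2...20).allSatisfy { fibonacci($0) == fibonacci($0 - 1) + fibonacci($0 - 2) })
```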
02:24:49.140 | - No, I mean, it can generate stuff.
02:24:50.820 | Have you seen with GPT-3, it can generate,
02:24:52.820 | you can say, I mean, there's interface stuff,
02:24:55.780 | like it can generate HTML,
02:24:58.380 | it can generate basic for loops that give you like--
02:25:02.020 | - Right, but pick HTML.
02:25:02.900 | How do I say I want google.com?
02:25:05.120 | - Well, no, you could say--
02:25:07.820 | - Or not literally google.com.
02:25:09.380 | How do I say I want a webpage that's got a shopping cart
02:25:11.740 | and this and that?
02:25:12.580 | - Yeah, it does that.
02:25:14.020 | I mean, so, okay, so just,
02:25:16.140 | I don't know if you've seen these demonstrations,
02:25:17.720 | but you type in, I want a red button
02:25:20.380 | with the text that says hello,
02:25:22.480 | and you type that in natural language,
02:25:24.220 | and it generates the correct HTML.
02:25:25.940 | - Okay.
02:25:26.780 | - I've done this demo.
02:25:27.620 | It's kind of compelling.
02:25:29.020 | So you have to prompt it with similar kinds of mappings.
02:25:33.300 | Of course, it's probably handpicked.
02:25:35.660 | I got to experiment.
02:25:36.580 | They probably, but the fact that you can do that once,
02:25:39.500 | even out of like 20, is quite impressive.
02:25:43.180 | Again, that's very basic.
02:25:45.220 | Like the HTML is kind of messy and bad.
02:25:48.420 | But yes, the intent is,
02:25:49.980 | the idea is the intent is specified in natural language.
02:25:52.660 | - Okay.
02:25:53.500 | Yeah, so I have not seen that.
02:25:54.420 | That's really cool.
02:25:55.240 | - Yeah. (laughs)
02:25:56.080 | - Yeah.
02:25:56.920 | - So the question is the correctness of that.
02:25:59.880 | Like visually you can check, oh, the button is red.
02:26:02.880 | But for more,
02:26:04.660 | for more complicated functions,
02:26:10.200 | where the intent is harder to check,
02:26:12.120 | this goes into like NP completeness kind of things.
02:26:15.480 | Like I want to know that this code is correct.
02:26:18.160 | And generates a giant thing.
02:26:20.120 | - Yeah.
02:26:20.960 | - That does some kind of calculation.
02:26:23.720 | It seems to be working.
02:26:25.440 | It's interesting to think like,
02:26:27.880 | should the system also try to generate checks
02:26:30.720 | for itself for correctness?
02:26:32.080 | - Yeah, I don't know.
02:26:33.000 | And this is way beyond my experience.
02:26:35.160 | (laughs)
02:26:36.000 | The thing that I think about is that
02:26:39.200 | there doesn't seem to be a lot of
02:26:41.120 | equational reasoning going on.
02:26:43.280 | - Right.
02:26:44.100 | - There's a lot of pattern matching and filling in.
02:26:45.280 | And kind of propagating patterns that have been seen before
02:26:48.480 | into the future and into the generated result.
02:26:50.680 | And so if you want to get correctness,
02:26:53.240 | you kind of need theorem-proving kinds of things
02:26:55.180 | and, like, higher-level logic.
02:26:57.320 | And I don't know that,
02:26:58.600 | you could talk to Yann about that.
02:26:59.920 | (laughs)
02:27:00.760 | And see what the bright minds are thinking about right now.
02:27:04.720 | But I don't think the GPT is in that vein.
02:27:08.180 | It's still really cool.
02:27:09.240 | - Yeah, and surprisingly, who knows?
02:27:11.880 | Maybe reasoning is--
02:27:13.960 | - Is overrated.
02:27:14.800 | - Yeah, is overrated.
02:27:15.640 | - Right, I mean, do we reason?
02:27:17.320 | - Yeah.
02:27:18.160 | - How do you tell, right?
02:27:18.980 | Are we just pattern matching based on what we have?
02:27:20.560 | And then reverse justify to ourselves?
02:27:22.800 | - Yeah, exactly, the reverse.
02:27:24.280 | So I think what the neural networks are missing,
02:27:29.820 | and I think GPT-4 might have,
02:27:29.820 | is to be able to tell stories to itself about what it did.
02:27:33.800 | - Well, that's what humans do, right?
02:27:34.900 | I mean, you talk about network explainability, right?
02:27:38.260 | And we give neural nets a hard time about this.
02:27:40.700 | But humans don't know why we make decisions.
02:27:42.420 | We have this thing called intuition,
02:27:43.780 | and then we try to say,
02:27:45.220 | "This feels like the right thing, but why?"
02:27:47.100 | Right, and you wrestle with that
02:27:49.140 | when you're making hard decisions.
02:27:50.300 | And is that science?
02:27:52.220 | Not really.
02:27:53.380 | (laughs)
02:27:54.440 | - Let me ask you about a few high-level questions, I guess.
02:27:57.440 | You've done a million things in your life
02:28:02.400 | and been very successful.
02:28:04.240 | A bunch of young folks listen to this,
02:28:07.000 | ask for advice from successful people like you.
02:28:10.720 | If you were to give advice to somebody,
02:28:16.000 | an undergraduate student or a high school student,
02:28:19.040 | about pursuing a career in computing
02:28:23.520 | or just advice about life in general,
02:28:25.560 | is there some words of wisdom you can give them?
02:28:28.840 | - So I think you come back to change.
02:28:30.840 | And profound leaps happen
02:28:34.120 | because people are willing to believe
02:28:35.400 | that change is possible and that the world does change
02:28:39.160 | and are willing to do the hard thing
02:28:41.000 | that it takes to make change happen.
02:28:42.680 | And whether it be implementing a new programming language
02:28:45.880 | or implementing a new system
02:28:47.080 | or implementing a new research paper,
02:28:49.200 | designing a new thing,
02:28:50.200 | moving the world forward in science and philosophy,
02:28:52.680 | whatever, it really comes down to somebody
02:28:54.520 | who's willing to put in the work.
02:28:56.760 | Right, and you have,
02:28:57.960 | the work is hard for a whole bunch of different reasons,
02:29:01.520 | one of which is you,
02:29:04.200 | it's work, right?
02:29:06.920 | And so you have to have the space in your life
02:29:08.800 | in which you can do that work,
02:29:09.840 | which is why going to grad school
02:29:10.980 | can be a beautiful thing for certain people.
02:29:14.720 | But also there's a self-doubt that happens.
02:29:16.840 | Like you're two years into a project,
02:29:18.320 | is it going anywhere, right?
02:29:20.280 | Well, what do you do?
02:29:21.120 | Do you just give up because it's hard?
02:29:23.280 | Well, no, I mean, some people like suffering.
02:29:25.620 | And so you plow through it.
02:29:29.280 | The secret to me is that you have to love what you're doing
02:29:31.960 | and follow that passion
02:29:35.000 | because when you get to the hard times,
02:29:37.080 | that's when, if you love what you're doing,
02:29:40.080 | you're willing to kind of push through.
02:29:41.680 | And this is really hard
02:29:45.440 | because it's hard to know what you will love doing
02:29:48.640 | until you start doing a lot of things.
02:29:50.200 | And so that's why I think that,
02:29:51.640 | particularly early in your career,
02:29:53.280 | it's good to experiment.
02:29:54.900 | Do a little bit of everything.
02:29:56.400 | Go take the survey class on,
02:29:59.320 | the first half of every class
02:30:01.480 | in your upper division lessons
02:30:03.720 | and just get exposure to things
02:30:05.680 | because certain things will resonate with you
02:30:07.080 | and you'll find out, wow, I'm really good at this.
02:30:08.920 | I'm really smart at this.
02:30:10.040 | Well, it's just because it works with the way your brain works.
02:30:13.000 | - And when something jumps out,
02:30:14.320 | I mean, that's one of the things
02:30:15.600 | that people often ask about is like,
02:30:19.120 | well, I think there's a bunch of cool stuff out there.
02:30:21.360 | Like, how do I pick the thing?
02:30:22.940 | - Yeah.
02:30:25.160 | - How do you hook, in your life,
02:30:27.560 | how did you just hook yourself in and stuck with it?
02:30:30.440 | - Well, I got lucky, right?
02:30:31.680 | I mean, I think that many people forget
02:30:34.800 | that a huge amount of it or most of it is luck, right?
02:30:38.760 | So let's not forget that.
02:30:40.860 | So for me, I fell in love with computers early on
02:30:44.800 | because they spoke to me, I guess.
02:30:47.700 | - What language did they speak?
02:30:50.720 | - Basic.
02:30:51.560 | - Basic, yeah.
02:30:52.380 | - But then it was just kind of following
02:30:56.960 | a set of logical progressions,
02:30:58.200 | but also deciding that something that was hard
02:31:01.400 | was worth doing and a lot of fun, right?
02:31:04.080 | And so I think that that is also something
02:31:06.240 | that's true for many other domains,
02:31:08.100 | which is if you find something that you love doing,
02:31:10.400 | that's also hard, if you invest yourself in it
02:31:13.480 | and add value to the world,
02:31:15.000 | then it will mean something, generally, right?
02:31:17.160 | And again, that can be a research paper,
02:31:19.160 | that can be a software system,
02:31:20.440 | that can be a new robot,
02:31:22.080 | that can be, there's many things that can be,
02:31:24.820 | but a lot of it is like real value
02:31:27.160 | comes from doing things that are hard.
02:31:29.360 | And that doesn't mean you have to suffer.
02:31:32.000 | But--
02:31:34.000 | - It's hard.
02:31:34.820 | I mean, you don't often hear that message.
02:31:36.400 | We talked about it last time a little bit,
02:31:38.040 | but it's one of my, not enough people talk about this.
02:31:42.860 | It's beautiful to hear a successful person.
02:31:47.440 | - Well, and self-doubt and imposter syndrome,
02:31:49.480 | and these are all things that successful people
02:31:52.400 | suffer with as well,
02:31:54.000 | particularly when they put themselves
02:31:55.160 | in a point of being uncomfortable,
02:31:56.700 | which I like to do now and then,
02:31:59.240 | just because it puts you in learning mode.
02:32:02.120 | Like if you wanna grow as a person,
02:32:04.120 | put yourself in a room with a bunch of people
02:32:07.040 | that know way more about whatever you're talking about
02:32:09.200 | than you do, and ask dumb questions.
02:32:11.560 | And guess what?
02:32:13.080 | Smart people love to teach, often, not always, but often.
02:32:16.840 | And if you listen, if you're prepared to listen,
02:32:18.360 | if you're prepared to grow,
02:32:19.200 | if you're prepared to make connections,
02:32:20.720 | you can do some really interesting things.
02:32:22.400 | And I think that a lot of progress is made by people
02:32:25.400 | who kind of hop between domains now and then,
02:32:28.040 | because they bring a perspective into a field
02:32:32.520 | that nobody else has,
02:32:34.760 | if people have only been working in that field themselves.
02:32:38.320 | - We mentioned that the universe is kind of like a compiler,
02:32:41.440 | the entirety of it, the whole evolution
02:32:44.920 | is kind of a kind of compilation.
02:32:46.740 | Maybe us human beings are kind of compilers.
02:32:50.680 | Let me ask the old sort of question
02:32:53.600 | that I didn't ask you last time,
02:32:54.960 | which is what's the meaning of it all?
02:32:57.780 | Is there a meaning?
02:32:58.760 | Like if you asked a compiler why,
02:33:00.860 | what would a compiler say?
02:33:03.400 | What's the meaning of life?
02:33:04.640 | - What's the meaning of life?
02:33:06.840 | I'm prepared for it not to mean anything.
02:33:08.840 | Here we are all biological things programmed to survive
02:33:14.200 | and propagate our DNA.
02:33:17.520 | And maybe the universe is just a computer
02:33:21.440 | and you just go until entropy takes over the world
02:33:24.160 | and it takes over the universe and then you're done.
02:33:27.440 | I don't think that's a very productive way
02:33:29.680 | to live your life, if so.
02:33:33.000 | And so I prefer to bias towards the other way,
02:33:34.760 | which is saying the universe has a lot of value.
02:33:37.960 | And I take happiness out of other people.
02:33:41.800 | And a lot of times part of that's having kids,
02:33:43.840 | but also the relationships you build with other people.
02:33:46.940 | And so the way I try to live my life is like,
02:33:49.680 | what can I do that has value?
02:33:51.240 | How can I move the world forward?
02:33:52.480 | How can I take what I'm good at
02:33:54.540 | and bring it into the world?
02:33:57.600 | And how can I, I'm one of these people
02:33:59.520 | that likes to work really hard
02:34:00.640 | and be very focused on the things that I do.
02:34:03.160 | And so if I'm gonna do that,
02:34:05.040 | how can it be in a domain that actually will matter?
02:34:08.080 | Because a lot of things that we do,
02:34:10.040 | we find ourselves in the cycle of like,
02:34:11.680 | okay, I'm doing a thing, I'm very familiar with it,
02:34:13.740 | I've done it for a long time,
02:34:15.400 | I've never done anything else,
02:34:16.680 | but I'm not really learning.
02:34:18.960 | I'm keeping things going,
02:34:21.740 | but there's a younger generation
02:34:23.440 | that can do the same thing,
02:34:24.640 | maybe even better than me.
02:34:26.480 | Maybe if I actually step out of this
02:34:28.000 | and jump into something I'm less comfortable with,
02:34:31.280 | it's scary, but on the other hand,
02:34:33.440 | it gives somebody else a new opportunity.
02:34:34.920 | It also then puts you back in learning mode,
02:34:37.480 | and that can be really interesting.
02:34:38.920 | And one of the things I've learned
02:34:40.580 | is that when you go through that,
02:34:42.360 | that first you're deep into imposter syndrome,
02:34:45.040 | but when you start working your way out,
02:34:46.940 | you start to realize,
02:34:47.780 | hey, well, there's actually a method to this.
02:34:50.000 | And now I'm able to add new things
02:34:53.280 | 'cause I bring different perspective.
02:34:54.680 | And this is one of the good things
02:34:57.240 | about bringing different kinds of people together.
02:34:59.800 | Diversity of thought is really important.
02:35:01.860 | And if you can pull together people
02:35:04.440 | that are coming at things from different directions,
02:35:06.480 | you often get innovation.
02:35:07.760 | And I love to see that, that aha moment
02:35:10.560 | where you're like, oh, we've really cracked this.
02:35:12.760 | This is something nobody's ever done before.
02:35:15.200 | And then if you can do it in the context
02:35:16.760 | where it adds value, other people can build on it,
02:35:18.960 | it helps move the world,
02:35:20.280 | then that's what really excites me.
02:35:22.720 | - So that kind of description
02:35:24.480 | of the magic of the human experience,
02:35:26.480 | do you think we'll ever create that in like an AGI system?
02:35:29.880 | Do you think we'll be able to create,
02:35:33.480 | give AI systems a sense of meaning
02:35:38.040 | where they operate in this kind of world
02:35:39.640 | exactly in the way you've described,
02:35:41.800 | which is they interact with each other,
02:35:43.240 | they interact with us humans?
02:35:44.800 | - Sure, sure.
02:35:45.640 | Well, so I mean, why are you being so speciest?
02:35:50.040 | Right?
02:35:50.880 | All right, so AGIs versus bionets,
02:35:54.600 | or versus biology, right?
02:35:56.520 | What are we but machines, right?
02:36:00.240 | We're just programmed to run our,
02:36:02.880 | we have our objective function that we were optimized for.
02:36:05.520 | Right?
02:36:06.400 | And so we're doing our thing.
02:36:07.600 | We think we have purpose, but do we really?
02:36:09.280 | - Yeah.
02:36:10.120 | - Right, I'm not prepared to say
02:36:10.960 | that those newfangled AGIs have no soul
02:36:14.560 | just because we don't understand them, right?
02:36:16.840 | And I think that would be, when they exist,
02:36:20.080 | that would be very premature to look at a new thing
02:36:24.160 | through your own lens without fully understanding it.
02:36:26.760 | - You might be just saying that
02:36:29.400 | because AI systems in the future will be listening to this.
02:36:32.720 | And then--
02:36:33.560 | - Oh yeah, yeah, exactly.
02:36:34.400 | - You don't wanna say anything.
02:36:35.220 | - Please be nice to me.
02:36:36.060 | You know, when Skynet kills everybody, please spare me.
02:36:39.160 | - So wise, wise look ahead thinking.
02:36:42.640 | - Yeah, but I mean, I think that people
02:36:44.560 | will spend a lot of time worrying about this kind of stuff.
02:36:46.360 | And I think that what we should be worrying about
02:36:48.200 | is how do we make the world better?
02:36:49.880 | And the thing that I'm most scared about with AGIs
02:36:52.880 | is not that necessarily the Skynet
02:36:57.480 | will start shooting everybody with lasers
02:36:59.000 | and stuff like that to use us for calories.
02:37:02.120 | The thing that I'm worried about is that
02:37:05.440 | humanity I think needs a challenge.
02:37:08.320 | And if we get into a mode of not having a personal challenge,
02:37:11.640 | not having a personal contribution,
02:37:13.600 | whether that be like, you know, your kids
02:37:15.920 | and seeing what they grow into and helping guide them,
02:37:18.840 | whether it be your community that you're engaged in,
02:37:21.960 | you're driving forward, whether it be your work
02:37:23.920 | and the things that you're doing
02:37:25.040 | and the people you're working with
02:37:25.960 | and the products you're building
02:37:26.800 | and the contribution there.
02:37:28.880 | If people don't have an objective,
02:37:31.960 | I'm afraid what that means.
02:37:33.360 | And I think that this would lead to a rise
02:37:37.840 | of the worst part of people, right?
02:37:39.920 | Instead of people striving together
02:37:42.240 | and trying to make the world better,
02:37:45.080 | it could degrade into a very unpleasant world.
02:37:49.720 | But I don't know.
02:37:51.140 | I mean, we hopefully have a long ways to go
02:37:53.600 | before we discover that.
02:37:54.760 | (laughing)
02:37:55.720 | Unfortunately, we have pretty on the ground problems
02:37:57.680 | with the pandemic right now.
02:37:58.680 | And so I think we should be focused on that as well.
02:38:01.480 | - Yeah, ultimately, just as you said, you're optimistic.
02:38:04.640 | I think it helps for us to be optimistic.
02:38:07.320 | So that's, fake it until you make it.
02:38:10.360 | - Yeah, well, and why not?
02:38:11.840 | What's the other side?
02:38:12.680 | Right, so I mean, I'm not personally a very religious person,
02:38:17.460 | but I've heard people say like,
02:38:19.200 | oh yeah, of course I believe in God.
02:38:20.440 | Of course I go to church, because if God's real,
02:38:23.340 | (laughing)
02:38:24.440 | you know, I wanna be on the right side of that.
02:38:25.920 | And if it's not real, it doesn't matter.
02:38:27.080 | - Yeah, it doesn't matter.
02:38:27.920 | - And so, you know, that's a fair way to do it.
02:38:30.960 | - Yeah, I mean, the same thing with nuclear deterrence,
02:38:35.600 | all of, you know, global warming, all these things,
02:39:38.400 | all these threats, natural and engineered pandemics,
02:38:41.340 | all these threats we face.
02:38:42.680 | I think it's paralyzing to be terrified
02:38:49.660 | of all the possible ways we could destroy ourselves.
02:38:52.540 | I think it's much better, or at least productive,
02:38:56.580 | to be hopeful and to engineer defenses against these things,
02:39:00.820 | to engineer a future where like, you know,
02:39:04.820 | see like a positive future and engineer that future.
02:39:07.940 | - Yeah, well, and I think that's another thing
02:39:10.220 | to think about as, you know, a human,
02:39:12.700 | particularly if you're young and trying to figure out
02:39:14.540 | what it is that you wanna be when you grow up, like I am.
02:39:18.100 | I'm always looking for that.
02:39:19.820 | The question then is, how do you wanna spend your time?
02:39:23.360 | And right now there seems to be a norm
02:39:25.980 | of being a consumption culture.
02:39:28.780 | Like I'm gonna watch the news and revel
02:39:31.500 | in how horrible everything is right now.
02:39:33.500 | I'm going to go find out about the latest atrocity
02:39:36.540 | and find out all the details of like the terrible thing
02:39:38.820 | that happened and be outraged by it.
02:39:40.620 | You can spend a lot of time watching TV
02:39:43.980 | and watching the new sitcom or whatever
02:39:46.600 | people watch these days, I don't know.
02:39:49.300 | But that's a lot of hours, right?
02:39:51.100 | And those are hours that if you're turning
02:39:53.420 | to being productive, learning, growing, experiencing,
02:39:58.420 | you know, when the pandemic's over, going exploring, right?
02:40:02.060 | It leads to more growth.
02:40:03.620 | And I think it leads to more optimism and happiness
02:40:06.420 | because you're building, right?
02:40:08.660 | You're building yourself, you're building your capabilities,
02:40:11.000 | you're building your viewpoints,
02:40:12.220 | you're building your perspective.
02:40:13.460 | And I think that a lot of the consuming
02:40:18.380 | of other people's messages leads to kind
02:40:20.780 | of a negative viewpoint, which you need to be aware
02:40:23.260 | of what's happening because that's also important,
02:40:25.660 | but there's a balance that I think focusing
02:40:28.100 | on creation is a very valuable thing to do.
02:40:31.980 | - Yeah, so what you're saying is people should focus
02:40:33.840 | on working on the sexiest field of them all,
02:40:37.300 | which is compiler design.
02:40:38.420 | - Exactly.
02:40:39.660 | Hey, you could go work on machine learning
02:40:41.160 | and be crowded out by the thousands of graduates popping
02:40:43.980 | out of school that all want to do the same thing.
02:40:45.620 | Or you could work in the place that people overpay you
02:40:48.580 | because there's not enough smart people working in it.
02:40:51.260 | And here at the end of Moore's law, according
02:40:53.780 | to some people, actually the software is the hard part too.
02:40:57.140 | - I mean, optimization is truly, truly beautiful.
02:41:02.300 | And also on the YouTube side or education side, you know,
02:41:06.500 | it'd be nice to have some material that shows the beauty
02:41:10.620 | of compilers.
02:41:12.120 | - Yeah, yeah.
02:41:13.160 | - That's something.
02:41:14.480 | So that's a call for people to create that kind
02:41:17.800 | of content as well.
02:41:18.920 | Chris, you're one of my favorite people to talk to.
02:41:22.840 | It's such a huge honor that you would waste your time
02:41:25.560 | talking to me.
02:41:26.400 | I've always appreciated it.
02:41:27.760 | Thank you so much for talking today.
02:41:30.120 | - The truth of it is you spent a lot of time talking to me
02:41:32.320 | just on walks and other things like that.
02:41:34.440 | So it's great to catch up.
02:41:35.640 | - Thanks, man.
02:41:37.200 | Thanks for listening to this conversation
02:41:39.240 | with Chris Lattner.
02:41:40.400 | A thank you to our sponsors.
02:41:42.360 | Blinkist, an app that summarizes key ideas
02:41:45.200 | from thousands of books.
02:41:46.600 | Neuro, which is a maker of functional gum and mints
02:41:49.640 | that supercharge my mind.
02:41:51.440 | Masterclass, which are online courses from world experts.
02:41:55.480 | And finally Cash App, which is an app
02:41:57.840 | for sending money to friends.
02:42:00.200 | Please check out these sponsors in the description
02:42:02.360 | to get a discount and to support this podcast.
02:42:06.120 | If you enjoy this thing, subscribe on YouTube,
02:42:08.440 | review it with five stars on Apple Podcasts,
02:42:10.600 | follow on Spotify, support on Patreon,
02:42:13.280 | connect with me on Twitter @lexfridman.
02:42:16.320 | And now let me leave you with some words from Chris Lattner.
02:42:19.080 | So much of language design is about trade-offs
02:42:21.760 | and you can't see those trade-offs
02:42:23.680 | unless you have a community of people
02:42:25.640 | that really represent those different points.
02:42:28.560 | Thank you for listening and hope to see you next time.
02:42:31.640 | (upbeat music)
02:42:34.220 | (upbeat music)