Chris Lattner: The Future of Computing and Programming Languages

00:00:00.000 | The following is a conversation with Chris Lattner,

00:00:02.640 | his second time on the podcast.

00:00:04.680 | He's one of the most brilliant engineers

00:00:06.600 | in modern computing,

00:00:07.800 | having created LLVM compiler infrastructure project,

00:00:11.480 | the Clang compiler, the Swift programming language,

00:00:14.640 | a lot of key contributions to TensorFlow and TPUs

00:00:17.640 | as part of Google.

00:00:19.080 | He served as vice president of autopilot software at Tesla,

00:00:23.520 | was a software innovator and leader at Apple,

00:00:26.200 | and now is at Sci-5

00:00:28.280 | as senior vice president of platform engineering,

00:00:30.920 | looking to revolutionize chip design

00:00:33.520 | to make it faster, better, and cheaper.

00:00:36.560 | Quick mention of each sponsor,

00:00:38.240 | followed by some thoughts related to the episode.

00:00:40.920 | First sponsor is Blinkist,

00:00:42.400 | an app that summarizes key ideas from thousands of books.

00:00:45.400 | I use it almost every day to learn new things

00:00:48.040 | or to pick which books I want to read or listen to next.

00:00:52.280 | Second is Neuro,

00:00:53.920 | the maker of functional sugar-free gum and mints

00:00:56.480 | that I use to supercharge my mind

00:00:58.520 | with caffeine, L-theanine, and B vitamins.

00:01:01.640 | Third is Masterclass,

00:01:03.240 | online courses from the best people in the world

00:01:06.680 | on each of the topics covered

00:01:08.360 | from rockets to game design, to poker,

00:01:11.140 | to writing, and to guitar.

00:01:13.920 | And finally, Cash App,

00:01:15.680 | the app I use to send money to friends for food, drinks,

00:01:19.360 | and unfortunately, lost bets.

00:01:21.800 | Please check out the sponsors in the description

00:01:23.740 | to get a discount and to support this podcast.

00:01:27.320 | As a side note, let me say that Chris

00:01:29.320 | has been an inspiration to me on a human level

00:01:32.560 | because he is so damn good as an engineer

00:01:35.240 | and leader of engineers,

00:01:36.720 | and yet he's able to stay humble,

00:01:38.600 | especially humble enough to hear the voices of disagreement

00:01:42.120 | and to learn from them.

00:01:43.800 | He was supportive of me and this podcast

00:01:46.080 | from the early days, and for that, I'm forever grateful.

00:01:49.520 | To be honest, most of my life,

00:01:51.180 | no one really believed that I would amount to much.

00:01:53.920 | So when another human being looks at me,

00:01:56.520 | it makes me feel like I might be someone special.

00:01:58.920 | It can be truly inspiring.

00:02:00.840 | That's a lesson for educators.

00:02:02.780 | The weird kid in the corner with a dream

00:02:05.640 | is someone who might need your love and support

00:02:08.160 | in order for that dream to flourish.

00:02:10.060 | If you enjoy this thing, subscribe on YouTube,

00:02:13.320 | review it with 5 Stars on Apple Podcasts,

00:02:15.480 | follow on Spotify, support on Patreon,

00:02:17.960 | or connect with me on Twitter @LexFriedman.

00:02:21.300 | And now, here's my conversation with Chris Ladner.

00:02:24.780 | - What are the strongest qualities of Steve Jobs,

00:02:28.940 | Elon Musk, and the great and powerful Jeff Dean

00:02:32.980 | since you've gotten the chance to work with each?

00:02:36.020 | - You're starting with an easy question there.

00:02:38.580 | These are three very different people.

00:02:40.700 | I guess you could do maybe a pairwise comparison

00:02:43.860 | between them instead of a group comparison.

00:02:45.740 | So if you look at Steve Jobs and Elon,

00:02:48.200 | I worked a lot more with Elon than I did with Steve.

00:02:51.040 | They have a lot of commonality.

00:02:52.400 | They're both visionary in their own way.

00:02:55.400 | They're both very demanding in their own way.

00:02:57.640 | My sense is Steve is much more human factor focused,

00:03:02.440 | where Elon is more technology focused.

00:03:04.640 | - What does human factor mean?

00:03:06.000 | - Steve's trying to build things that feel good,

00:03:08.480 | that people love, that affect people's lives, how they live.

00:03:11.600 | He's looking into the future a little bit

00:03:14.680 | in terms of what people want,

00:03:17.800 | where I think that Elon focuses more on

00:03:20.240 | learning how exponentials work

00:03:21.560 | and predicting the development of those.

00:03:24.120 | - Steve worked with a lot of engineers.

00:03:26.280 | That was one of the things that reading the biography.

00:03:29.520 | How can a designer essentially talk to engineers

00:03:33.320 | and get their respect?

00:03:35.640 | - I think, so I did not work very closely with Steve.

00:03:37.800 | I'm not an expert at all.

00:03:38.640 | My sense is that he pushed people really hard,

00:03:41.860 | but then when he got an explanation that made sense to him,

00:03:44.480 | then he would let go.

00:03:45.760 | And he did actually have a lot of respect for engineering,

00:03:49.200 | but he also knew when to push.

00:03:51.480 | And when you can read people well,

00:03:54.160 | you can know when they're holding back

00:03:56.880 | and when you can get a little bit more out of them.

00:03:58.440 | And I think he was very good at that.

00:04:00.320 | I mean, if you compare the other folks,

00:04:03.240 | so Jeff Dean, right?

00:04:05.200 | Jeff Dean's an amazing guy.

00:04:06.280 | He's super smart, as are the other guys.

00:04:09.080 | Jeff is a really, really, really nice guy.

00:04:13.200 | Well-meaning, he's a classic Googler.

00:04:15.280 | He wants people to be happy.

00:04:17.720 | He combines it with brilliance,

00:04:19.760 | so he can pull people together in a really great way.

00:04:22.600 | He's definitely not a CEO type.

00:04:24.640 | I don't think he would even want to be that.

00:04:28.040 | - Do you know if he still programs?

00:04:29.280 | - Oh yeah, he definitely programs.

00:04:30.560 | Jeff is an amazing engineer today, right?

00:04:32.840 | And that has never changed.

00:04:34.080 | So it's really hard to compare Jeff to either of those two.

00:04:40.320 | I think that Jeff leads through technology

00:04:43.640 | and building it himself

00:04:44.880 | and then pulling people in and inspiring them.

00:04:46.760 | And so I think that that's one of the amazing things

00:04:50.040 | about Jeff, but each of these people,

00:04:51.880 | with their pros and cons, all are really inspirational

00:04:55.000 | and have achieved amazing things.

00:04:56.600 | I've been very fortunate to get to work with these guys.

00:05:00.760 | - For yourself, you've led large teams,

00:05:03.880 | you've done so many incredible,

00:05:06.240 | difficult technical challenges.

00:05:08.480 | Is there something you've picked up

00:05:10.400 | from them about how to lead?

00:05:12.560 | - Yeah, I think leadership is really hard.

00:05:14.720 | It really depends on what you're looking for there.

00:05:17.200 | I think you really need to know what you're talking about.

00:05:20.220 | So being grounded on the product, on the technology,

00:05:23.000 | on the business, on the mission is really important.

00:05:26.320 | Understanding what people are looking for,

00:05:29.840 | why they're there.

00:05:30.760 | One of the most amazing things about Tesla

00:05:32.400 | is the unifying vision, right?

00:05:34.640 | People are there because they believe in clean energy

00:05:37.240 | and electrification, all these kinds of things.

00:05:39.640 | The other is to understand what really motivates people,

00:05:44.720 | how to get the best people,

00:05:45.800 | how to build a plan that actually can be executed, right?

00:05:48.920 | There's so many different aspects of leadership

00:05:50.480 | and it really depends on the time, the place, the problems.

00:05:53.680 | There's a lot of issues that don't need to be solved.

00:05:56.920 | And so if you focus on the right things and prioritize well,

00:05:59.880 | that can really help move things.

00:06:01.440 | - Two interesting things you mentioned.

00:06:03.240 | One is you really have to know what you're talking about,

00:06:06.120 | how you've worked on a lot of

00:06:10.160 | very challenging technical things.

00:06:11.960 | - Sure.

00:06:12.800 | - So I kind of assume you were born technically savvy,

00:06:17.800 | but assuming that's not the case,

00:06:20.680 | how did you develop technical expertise?

00:06:24.920 | Like even at Google, you worked on,

00:06:27.320 | I don't know how many projects,

00:06:28.920 | but really challenging, very varied.

00:06:32.200 | - Compilers, TPUs, hardware, cloud stuff,

00:06:34.600 | a bunch of different things.

00:06:36.440 | The thing that I've become comfortable with,

00:06:38.760 | more comfortable with as I've gained experience,

00:06:42.280 | is being okay with not knowing.

00:06:45.080 | And so a major part of leadership is actually,

00:06:49.120 | it's not about having the right answer,

00:06:50.840 | it's about getting the right answer.

00:06:52.840 | And so if you're working in a team of amazing people,

00:06:56.000 | right, in many of these places,

00:06:57.520 | many of these companies all have amazing people.

00:07:00.320 | It's the question of how do you get people together?

00:07:02.120 | How do you build trust?

00:07:04.160 | How do you get people to open up?

00:07:05.920 | How do you get people to be vulnerable sometimes

00:07:10.000 | with an idea that maybe isn't good enough,

00:07:11.760 | but it's the start of something beautiful?

00:07:14.000 | How do you provide an environment

00:07:17.400 | where you're not just like top-down,

00:07:18.840 | thou shalt do the thing that I tell you to do, right?

00:07:21.120 | But you're encouraging people to be part of the solution

00:07:23.720 | and providing a safe space

00:07:26.400 | where if you're not doing the right thing,

00:07:27.880 | they're willing to tell you about it.

00:07:29.640 | - So you're asking dumb questions?

00:07:31.440 | - Yeah, dumb questions are my specialty.

00:07:32.960 | Yeah.

00:07:34.000 | So I've been in the hardware realm recently

00:07:35.840 | and I don't know much at all about how chips are designed.

00:07:39.040 | I know a lot about using them.

00:07:40.040 | I know some of the principles

00:07:41.120 | and the arts technical level of this,

00:07:43.280 | but it turns out that if you ask a lot of dumb questions,

00:07:47.240 | you get smarter really quick.

00:07:48.920 | And when you're surrounded by people that wanna teach

00:07:51.040 | and learn themselves, it can be a beautiful thing.

00:07:54.080 | - So let's talk about programming languages, if it's okay.

00:07:58.840 | At the highest absurd philosophical level, 'cause I-

00:08:02.080 | - Don't get romantic on me, Lex.

00:08:03.640 | - I will forever get romantic and torture you, I apologize.

00:08:08.640 | Why do programming languages even matter?

00:08:14.160 | - Okay, well, thank you very much.

00:08:15.680 | So you're saying why should you care

00:08:17.440 | about any one programming language

00:08:18.640 | or why do we care about programming computers or?

00:08:20.920 | - No, why do we care about programming language design,

00:08:25.200 | creating effective programming languages,

00:08:27.960 | choosing A, one programming languages

00:08:32.600 | versus another programming language,

00:08:34.560 | why we keep struggling and improving

00:08:37.840 | through the evolution of these programming languages.

00:08:39.840 | - Sure, sure, sure, okay.

00:08:40.680 | So, I mean, I think you have to come back

00:08:42.080 | to what are we trying to do here, right?

00:08:43.640 | So we have these beasts called computers

00:08:47.120 | that are very good at specific kinds of things

00:08:48.840 | and we think it's useful to have them do it for us, right?

00:08:52.000 | Now you have this question of how best to express that

00:08:55.560 | because you have a human brain still

00:08:57.200 | that has an idea in its head

00:08:58.840 | and you wanna achieve something, right?

00:09:00.560 | So, well, there's lots of ways of doing this.

00:09:03.200 | You can go directly to the machine

00:09:04.720 | and speak assembly language

00:09:06.000 | and then you can express directly

00:09:07.640 | what the computer understands, that's fine.

00:09:09.800 | You can then have higher and higher and higher levels

00:09:12.800 | of abstraction up until machine learning

00:09:14.880 | and you're designing a neural net to do the work for you.

00:09:18.040 | The question is where along this way do you want to stop

00:09:21.200 | and what benefits do you get out of doing so?

00:09:23.440 | And so programming languages in general,

00:09:25.280 | you have C, you have Fortran, Java,

00:09:28.000 | and Ada, Pascal, Swift, you have lots of different things.

00:09:33.000 | They all have different trade-offs

00:09:34.360 | and they're tackling different parts of the problems.

00:09:36.520 | Now, one of the things that most programming languages do

00:09:39.960 | is they're trying to make it so that you have

00:09:41.920 | pretty basic things like portability

00:09:43.600 | across different hardware.

00:09:45.080 | So you've got, I'm gonna run on an Intel PC,

00:09:47.640 | I'm gonna run on a RISC-V PC,

00:09:49.240 | I'm gonna run on a ARM phone or something like that, fine.

00:09:53.480 | I wanna write one program and have it portable

00:09:55.520 | and this is something that assembly doesn't do.

00:09:57.760 | Now, when you start looking at the space

00:09:59.720 | of programming languages,

00:10:00.880 | this is where I think it's fun

00:10:02.400 | because programming languages all have trade-offs

00:10:06.160 | and most people will walk up to them

00:10:07.920 | and they look at the surface level of syntax

00:10:10.440 | and say, oh, I like curly braces,

00:10:12.400 | or I like tabs, or I like, you know,

00:10:15.600 | semi-colons or not or whatever, right?

00:10:17.120 | Subjective, fairly subjective, very shallow things.

00:10:21.240 | But programming languages when done right

00:10:23.140 | can actually be very powerful.

00:10:24.600 | And the benefit they bring is expression.

00:10:29.600 | Okay, and if you look at programming languages,

00:10:32.560 | there's really kind of two different levels to them.

00:10:34.400 | One is the down in the dirt, nuts and bolts

00:10:37.920 | of how do you get the computer to be efficient,

00:10:39.340 | stuff like that, how they work, type systems,

00:10:41.640 | compiler stuff, things like that.

00:10:43.480 | The other is the UI.

00:10:44.980 | And the UI for a programming language

00:10:47.160 | is really a design problem

00:10:48.560 | and a lot of people don't think about it that way.

00:10:50.600 | - And the UI, you mean all that stuff with the braces

00:10:53.400 | and the action. - Yeah, all that stuff's the UI

00:10:55.240 | and what it is, and UI means user interface.

00:10:58.200 | And so what's really going on is

00:11:00.400 | it's the interface between the guts and the human.

00:11:03.260 | And humans are hard, right?

00:11:05.880 | Humans have feelings, they have things they like,

00:11:09.520 | they have things they don't like.

00:11:10.720 | And a lot of people treat programming languages

00:11:12.720 | as though humans are just kind of abstract creatures

00:11:16.320 | that cannot be predicted.

00:11:17.520 | But it turns out that actually there is better and worse.

00:11:21.640 | Like people can tell when a programming language is good

00:11:24.960 | or when it was an accident, right?

00:11:26.880 | And one of the things with Swift in particular

00:11:29.360 | is that a tremendous amount of time

00:11:30.960 | by a tremendous number of people

00:11:33.240 | have been put into really polishing and making it feel good.

00:11:36.660 | But it also has really good nuts and bolts underneath it.

00:11:39.080 | - You said that Swift makes a lot of people feel good.

00:11:42.480 | How do you get to that point?

00:11:45.480 | So how do you predict that,

00:11:50.840 | tens of thousands, hundreds of thousands of people

00:11:52.800 | are going to enjoy using this,

00:11:55.000 | the user experience of this programming language?

00:11:57.160 | - Well, you can look at it in terms of better and worse.

00:11:59.540 | So if you have to write lots of boilerplate

00:12:01.320 | or something like that, you will feel unproductive.

00:12:03.520 | And so that's a bad thing.

00:12:05.040 | You can look at it in terms of safety.

00:12:06.680 | If like C, for example,

00:12:08.120 | is what's called a memory unsafe language.

00:12:10.040 | And so you get dangling pointers

00:12:11.560 | and you get all these kind of bugs

00:12:13.320 | that then you have spent tons of time debugging

00:12:15.000 | and it's a real pain in the butt and you feel unproductive.

00:12:17.760 | And so by subtracting these things from the experience,

00:12:19.940 | you get happier people.

00:12:22.600 | - But again, keep interrupting.

00:12:25.360 | I'm sorry.

00:12:26.200 | - It's so hard to deal with.

00:12:27.640 | (laughing)

00:12:29.200 | - If you look at the people,

00:12:30.560 | people that are most productive on Stack Overflow,

00:12:33.100 | they have a set of priorities

00:12:37.440 | that may not always correlate perfectly

00:12:39.840 | with the experience of the majority of users.

00:12:43.120 | You know, if you look at the most upvoted,

00:12:46.280 | quote unquote, correct answer on Stack Overflow,

00:12:49.120 | it usually really sort of prioritizes like safe code,

00:12:54.120 | proper code, stable code, you know, that kind of stuff.

00:13:01.860 | As opposed to like,

00:13:02.980 | if I want to use go-to statements in my basic, right?

00:13:07.060 | I want to use go-to state.

00:13:09.860 | Like what if 99% of people want to use go-to statements?

00:13:12.700 | So you use completely improper, you know, unsafe syntax.

00:13:16.620 | - I don't think that people actually,

00:13:17.900 | like if you boil it down and you get below the surface level

00:13:20.120 | people don't actually care about go-tos or if statements

00:13:23.340 | or things like this.

00:13:24.180 | They care about achieving a goal.

00:13:26.420 | - Yeah.

00:13:27.260 | - Right, so the real question is,

00:13:28.260 | I want to set up a web server and I want to do a thing,

00:13:30.580 | I want to do whatever.

00:13:32.260 | Like how quickly can I achieve that, right?

00:13:34.260 | And so from a programming language perspective,

00:13:36.420 | there's really two things that matter there.

00:13:39.020 | One is what libraries exist

00:13:41.920 | and then how quickly can you put it together

00:13:44.460 | and what are the tools around that look like, right?

00:13:47.260 | And when you want to build a library that's missing,

00:13:49.740 | what do you do?

00:13:50.580 | Okay, now this is where you see huge divergence

00:13:53.280 | in the force between worlds, okay?

00:13:55.820 | And so you look at Python, for example,

00:13:57.340 | Python is really good at assembling things,

00:13:59.220 | but it's not so great at building all the libraries.

00:14:02.500 | And so what you get because of performance reasons,

00:14:04.340 | other things like this,

00:14:05.560 | is you get Python layered on top of C, for example.

00:14:09.260 | And that means that doing certain kinds of things,

00:14:11.540 | well, it doesn't really make sense to do in Python.

00:14:13.340 | Instead you do it in C and then you wrap it

00:14:15.580 | and then you're living in two worlds

00:14:17.660 | and two worlds never is really great

00:14:19.300 | because tooling and the debugger doesn't work right

00:14:21.900 | and like all these kinds of things.

00:14:23.800 | - Can you clarify a little bit what you mean by

00:14:26.460 | Python is not good at building libraries,

00:14:28.580 | meaning it doesn't make it conducive?

00:14:30.460 | - Certain kinds of libraries.

00:14:31.540 | - No, but just the actual meaning of the sentence.

00:14:34.860 | - Yeah.

00:14:35.900 | - Meaning like it's not conducive to developers

00:14:38.380 | to come in and add libraries

00:14:40.500 | or is it the duality of the,

00:14:44.760 | it's a dance between Python and C and you can never.

00:14:48.100 | - Well, so Python's amazing.

00:14:49.460 | Python's a great language.

00:14:50.420 | I did not mean to say that Python is bad for libraries.

00:14:53.420 | What I meant to say is there are libraries

00:14:56.820 | that Python's really good at,

00:14:58.580 | that you can write in Python,

00:15:00.420 | but there are other things,

00:15:01.300 | like if you wanna build a machine learning framework,

00:15:03.620 | you're not gonna build a machine learning framework

00:15:05.020 | in Python because of performance, for example,

00:15:07.380 | or you want GPU acceleration or things like this.

00:15:10.180 | Instead what you do is you write a bunch of C

00:15:13.260 | or C++ code or something like that

00:15:15.300 | and then you talk to it from Python, right?

00:15:18.460 | And so this is because of decisions

00:15:21.100 | that were made in the Python design

00:15:23.140 | and those decisions have other counterbalancing forces,

00:15:27.140 | but the trick when you start looking at this

00:15:29.860 | from a programming language perspective

00:15:31.300 | is you start to say, okay, cool,

00:15:33.180 | how do I build this catalog of libraries

00:15:36.340 | that are really powerful?

00:15:37.820 | And how do I make it so that then they can be assembled

00:15:40.500 | into ways that feel good

00:15:42.080 | and they generally work the first time?

00:15:44.020 | Because when you're talking about building a thing,

00:15:46.900 | you have to include the debugging, the fixing,

00:15:50.220 | the turnaround cycle, the development cycle,

00:15:51.900 | all that kind of stuff into the process

00:15:55.140 | of building the thing.

00:15:56.060 | It's not just about pounding out the code.

00:15:58.300 | And so this is where things like catching bugs

00:16:01.300 | at compile time is valuable, for example.

00:16:03.300 | But if you dive into the details in this,

00:16:07.600 | Swift, for example, has certain things like value semantics,

00:16:10.560 | which is this fancy way of saying

00:16:11.960 | that when you treat a variable like a value,

00:16:16.420 | it acts like a mathematical object would.

00:16:21.480 | Okay, so you have used PyTorch a little bit.

00:16:25.200 | In PyTorch, you have tensors.

00:16:26.640 | Tensors are n-dimensional grid of numbers.

00:16:31.280 | Very simple.

00:16:32.120 | You can do plus and other operators on them.

00:16:34.640 | It's all totally fine.

00:16:35.840 | But why do you need to clone a tensor sometimes?

00:16:38.240 | Have you ever run into that?

00:16:40.840 | - Yeah.

00:16:41.660 | - Okay, and so why is that?

00:16:42.760 | Why do you need to clone a tensor?

00:16:43.920 | - It's the usual object thing that's in Python.

00:16:46.800 | - So in Python, and just like with Java

00:16:49.280 | and many other languages, this isn't unique to Python.

00:16:51.520 | In Python, it has a thing called reference semantics,

00:16:53.760 | which is the nerdy way of explaining this.

00:16:55.680 | And what that means is you actually have a pointer

00:16:58.080 | do a thing instead of the thing.

00:16:59.960 | Now, this is due to a bunch of implementation details

00:17:05.240 | that you don't wanna go into.

00:17:06.800 | But in Swift, you have this thing called value semantics.

00:17:09.560 | And so when you have a tensor in Swift, it is a value.

00:17:12.160 | If you copy it, it looks like you have a unique copy.

00:17:15.080 | And if you go change one of those copies,

00:17:16.800 | then it doesn't update the other one

00:17:19.320 | 'cause you just made a copy of this thing.

00:17:21.400 | - So that's highly error-prone in at least computer science,

00:17:26.400 | math-centric disciplines about Python.

00:17:32.120 | - The thing you would expect to behave-

00:17:34.680 | - Like math.

00:17:35.520 | - Like math, it doesn't behave like math.

00:17:38.280 | And in fact, quietly doesn't behave like math

00:17:41.680 | and then can ruin the entirety of your math thing.

00:17:43.320 | - Exactly.

00:17:44.160 | Well, and then it puts you in debugging land again.

00:17:46.040 | - Yeah.

00:17:46.880 | - Right, now you just wanna get something done

00:17:48.600 | and you're like, wait a second,

00:17:50.080 | where do I need to put clone?

00:17:51.520 | And what level of the stack, which is very complicated,

00:17:54.200 | which I thought I was reusing somebody's library

00:17:56.800 | and now I need to understand it

00:17:57.880 | to know where to clone a thing, right?

00:17:59.640 | - And hard to debug, by the way.

00:18:01.320 | - Exactly, right?

00:18:02.160 | And so this is where programming languages really matter.

00:18:04.320 | Right, so in Swift, having value semantics

00:18:06.280 | so that both you get the benefit of math working like math,

00:18:11.280 | right, but also the efficiency

00:18:13.680 | that comes with certain advantages there,

00:18:15.920 | certain implementation details there

00:18:17.320 | really benefit you as a programmer, right?

00:18:18.920 | - Can you clarify the value semantics?

00:18:20.640 | Like how do you know that a thing

00:18:22.320 | should be treated like a value?

00:18:23.720 | - Yeah, so Swift has a pretty strong culture

00:18:27.720 | and good language support for defining values.

00:18:30.400 | And so if you have an array,

00:18:31.960 | so tensors are one example

00:18:33.400 | that the machine learning folks are very used to.

00:18:36.480 | Just think about arrays, same thing,

00:18:38.280 | where you have an array, you put, you create an array,

00:18:41.640 | you put two or three or four things into it

00:18:43.920 | and then you pass it off to another function.

00:18:46.920 | What happens if that function adds some more things to it?

00:18:51.360 | Well, you'll see it on the side that you pass it in, right?

00:18:54.320 | This is called reference semantics.

00:18:56.680 | Now, what if you pass an array off to a function,

00:19:01.240 | it scrolls it away in some dictionary

00:19:02.880 | or some other data structure somewhere, right?

00:19:04.880 | Well, it thought that you just handed it that array,

00:19:07.960 | then you return back and that reference to that array

00:19:10.800 | still exists in the caller

00:19:12.800 | and they go and put more stuff in it, right?

00:19:15.760 | The person you handed it off to

00:19:17.840 | may have thought they had the only reference to that.

00:19:20.240 | And so they didn't know what they,

00:19:21.680 | that this was gonna change underneath the covers.

00:19:23.960 | And so this is where you end up having to do a clone.

00:19:26.200 | So like I was past a thing,

00:19:27.800 | I'm not sure if I have the only version of it.

00:19:30.240 | So now I have to clone it.

00:19:32.280 | So what value semantics does is it allows you to say,

00:19:34.680 | hey, I have a, so in Swift, it defaults to value semantics.

00:19:38.560 | - Oh, so it defaults to value semantics

00:19:40.240 | and then because most things should be true values,

00:19:44.120 | then it makes sense for that to be the default.

00:19:46.080 | - And one of the important things about that

00:19:47.240 | is that arrays and dictionaries

00:19:48.720 | and all these other collections

00:19:49.960 | that are aggregations of other things

00:19:51.280 | also have value semantics.

00:19:53.040 | And so when you pass this around

00:19:55.040 | to different parts of your program,

00:19:56.680 | you don't have to do these defensive copies.

00:19:59.200 | And so this is great for two sides, right?

00:20:01.280 | It's great because you define away the bug,

00:20:04.200 | which is a big deal for productivity,

00:20:05.960 | the number one thing most people care about,

00:20:08.200 | but it's also good for performance

00:20:09.720 | because when you're doing a clone,

00:20:11.600 | so you pass the array down to the thing,

00:20:13.440 | it was like, I don't know if anybody else has it,

00:20:15.400 | I have to clone it.

00:20:16.640 | Well, you just did a copy of a bunch of data.

00:20:18.480 | It could be big.

00:20:19.960 | And then it could be that the thing that called you

00:20:21.960 | is not keeping track of the old thing.

00:20:24.040 | So you just made a copy of it and you may not have had to.

00:20:27.800 | And so the way the value semantics work in Swift

00:20:30.160 | is it uses this thing called copy on write,

00:20:32.060 | which means that you get the benefit of safety

00:20:35.520 | and performance.

00:20:36.400 | And it has another special trick

00:20:38.360 | because if you think certain languages like Java,

00:20:41.200 | for example, they have immutable strings.

00:20:43.960 | And so what they're trying to do

00:20:44.920 | is they provide value semantics by having pure immutability.

00:20:49.000 | Functional languages have pure immutability

00:20:51.040 | in lots of different places.

00:20:52.280 | And this provides a much safer model

00:20:53.960 | and it provides value semantics.

00:20:56.160 | The problem with this is if you have immutability,

00:20:58.400 | everything is expensive.

00:20:59.480 | Everything requires a copy.

00:21:00.980 | For example, in Java, if you have a string X

00:21:05.440 | and a string Y, you append them together,

00:21:07.880 | we have to allocate a new string to hold XY.

00:21:11.040 | - If they're immutable.

00:21:13.720 | - Well, and strings in Java are immutable.

00:21:16.920 | And if there's optimizations for short ones,

00:21:19.320 | and it's complicated,

00:21:20.960 | but generally think about them as a separate allocation.

00:21:24.560 | And so when you append them together,

00:21:26.640 | you have to go allocate a third thing

00:21:28.560 | because somebody might have a pointer

00:21:29.680 | to either of the other ones, right?

00:21:31.080 | And you can't go change them.

00:21:32.060 | So you have to go allocate a third thing

00:21:34.720 | because of the beauty of how the Swift value semantics

00:21:36.760 | system works out.

00:21:37.760 | If you have a string on Swift and you say,

00:21:38.960 | "Hey, put in X," right?

00:21:41.000 | And they say, "Append on Y, Z, W, W."

00:21:44.880 | It knows that there's only one reference to that.

00:21:47.480 | And so it can do an in-place update.

00:21:50.240 | And so you're not allocating tons of stuff on the side.

00:21:53.440 | You don't have all those problems.

00:21:54.620 | When you pass it off,

00:21:56.040 | you can know you have the only reference.

00:21:57.520 | If you pass it off to multiple different people,

00:21:59.340 | but nobody changes it, they can all share the same thing.

00:22:02.600 | So you get a lot of the benefit of purely mutable design.

00:22:05.800 | And so you get a really nice sweet spot

00:22:07.640 | that I haven't seen in other languages.

00:22:09.280 | - Yeah, that's interesting.

00:22:10.560 | I thought there was going to be a philosophical

00:22:14.680 | like narrative here that you're gonna have to pay

00:22:17.560 | a cost for it.

00:22:19.760 | It sounds like, I think value semantics

00:22:24.480 | is beneficial for easing of debugging

00:22:27.440 | or minimizing the risk of errors,

00:22:30.980 | like bringing the errors closer to the source,

00:22:34.200 | bringing the symptom of the error closer

00:22:38.160 | to the source of the error, however you say that.

00:22:40.840 | But you're saying there's not a performance cost either

00:22:45.000 | if you implement it correctly.

00:22:46.320 | - Well, so there's trade-offs with everything.

00:22:48.280 | And so if you are doing very low level stuff,

00:22:51.880 | then sometimes you can notice the cost,

00:22:53.160 | but then what you're doing is you're saying,

00:22:54.880 | what is the right default?

00:22:56.520 | So coming back to user interface,

00:22:59.120 | when you talk about programming languages,

00:23:00.760 | one of the major things that Swift does

00:23:03.000 | that makes people love it, that is not obvious

00:23:06.900 | when it comes to designing a language

00:23:08.200 | is this UI principle of progressive disclosure of complexity.

00:23:12.280 | So Swift, like many languages, is very powerful.

00:23:16.720 | The question is, when do you have to learn the power

00:23:18.840 | as a user?

00:23:19.680 | So Swift, like Python, allows you to start

00:23:22.640 | with print hello world.

00:23:23.960 | Certain other languages start with public static void main,

00:23:28.280 | class, zzzzzzzz, like all the ceremony, right?

00:23:32.120 | And so you go to teach a new person,

00:23:34.640 | hey, welcome to this new thing.

00:23:36.760 | Let's talk about public, access control classes.

00:23:40.300 | Wait, what's that?

00:23:41.140 | String system.out.println, like packages, like, ah!

00:23:46.080 | Right, and so instead, if you take this and you say,

00:23:48.720 | hey, we need packages, you know, modules.

00:23:51.720 | We need powerful things like classes.

00:23:54.220 | We need data structures.

00:23:55.720 | We need like all these things.

00:23:57.360 | The question is, how do you factor the complexity?

00:23:59.440 | And how do you make it so that the normal case scenario

00:24:02.840 | is that you're dealing with things

00:24:04.600 | that work the right way, the right way,

00:24:06.320 | give you good performance by default.

00:24:09.360 | But then as a power user, if you want to dive down to it,

00:24:12.360 | you have full C performance,

00:24:14.600 | full control over low-level pointers.

00:24:16.000 | You can call malloc if you want to call malloc.

00:24:18.320 | This is not recommended on the first page of every tutorial,

00:24:20.800 | but it's actually really important

00:24:22.280 | when you want to get work done, right?

00:24:23.760 | And so being able to have that is really the design

00:24:27.480 | in programming language design.

00:24:28.840 | And design is really, really hard.

00:24:31.300 | It's something that I think a lot of people

00:24:33.600 | kind of outside of UI, again,

00:24:36.760 | a lot of people just think is subjective.

00:24:39.360 | Like there's nothing, you know,

00:24:41.320 | it's just like curly braces or whatever.

00:24:43.600 | It's just like somebody's preference,

00:24:45.320 | but actually good design is something that you can feel.

00:24:48.720 | - And how many people are involved with good design?

00:24:52.080 | So if we looked at Swift, but look at historically,

00:24:54.840 | I mean, this might touch like,

00:24:57.320 | it's almost like a Steve Jobs question too.

00:24:59.680 | Like how much dictatorial decision-making

00:25:03.360 | is required versus collaborative.

00:25:08.320 | And we'll talk about how all that can go wrong or right.

00:25:11.160 | But-

00:25:12.000 | - Yeah, well, Swift, so I can't speak to in general,

00:25:14.400 | all design everywhere.

00:25:15.600 | So the way it works with Swift is that there's a core team.

00:25:19.800 | And so core team is six or seven people-ish,

00:25:22.480 | something like that,

00:25:23.320 | that is people that have been working with Swift

00:25:25.440 | since very early days.

00:25:26.640 | And so-

00:25:27.480 | - And by early days is not that long ago.

00:25:30.120 | - Okay, yeah.

00:25:30.960 | So it became public in 2014.

00:25:33.640 | So it's been six years public now,

00:25:35.520 | but still that's enough time that there's a story arc there.

00:25:38.840 | (laughs)

00:25:39.680 | - Okay, yeah.

00:25:40.520 | - And there's mistakes have been made that then get fixed

00:25:42.800 | and you learn something and then you, you know,

00:25:44.680 | and so what the core team does is it provides continuity.

00:25:48.400 | And so you wanna have a,

00:25:50.400 | okay, well, there's a big hole that we wanna fill.

00:25:54.020 | We know we wanna fill it.

00:25:55.280 | So don't do other things that invade that space

00:25:58.080 | until we fill the hole, right?

00:25:59.920 | There's a boulder that's missing here.

00:26:01.120 | We wanna do, we will do that boulder,

00:26:03.040 | even though it's not today, keep out of that space.

00:26:06.080 | - And the whole team remembers the myth of the boulder

00:26:10.360 | that's there.

00:26:11.200 | - Yeah, yeah.

00:26:12.020 | There's a general sense of what the future looks like

00:26:13.520 | in broad strokes and a shared understanding of that

00:26:16.440 | combined with a shared understanding of what has happened

00:26:18.780 | in the past that worked out well and didn't work out well.

00:26:22.080 | The next level out is you have the,

00:26:24.280 | what's called the Swift Evolution Community.

00:26:25.800 | And you've got, in that case, hundreds of people

00:26:27.680 | that really care passionately about the way Swift evolves.

00:26:31.000 | And that's like an amazing thing to, again,

00:26:33.880 | the core team doesn't necessarily need to come up

00:26:35.520 | with all the good ideas.

00:26:36.760 | You got hundreds of people out there

00:26:38.000 | that care about something and they come up

00:26:39.540 | with really good ideas too.

00:26:41.040 | And that provides this like tumbling,

00:26:42.960 | rock tumbler for ideas.

00:26:45.120 | And so the evolution process is, you know,

00:26:48.720 | a lot of people in a discourse forum,

00:26:50.320 | they're like hashing it out and trying to like talk about,

00:26:52.040 | okay, well, should we go left or right?

00:26:54.080 | Or if we did this, what would be good?

00:26:55.640 | And, you know, here you're talking about hundreds of people.

00:26:57.680 | So you're not gonna get consensus necessarily.

00:27:00.320 | You're not obvious consensus.

00:27:01.920 | And so there's a proposal process that then allows

00:27:06.280 | the core team and the community to work this out.

00:27:08.360 | And what the core team does is it aims to get consensus

00:27:12.120 | out of the community and provide guardrails,

00:27:14.960 | but also provide long-term,

00:27:17.400 | make sure we're going the right direction kind of things.

00:27:20.360 | - So does that group represent like the,

00:27:23.520 | how much people will love the user interface?

00:27:27.400 | Like do you think they're able to capture that?

00:27:29.400 | - Well, I mean, it's something we talk about a lot.

00:27:31.040 | It's something we care about.

00:27:32.320 | How well we do that, it's up for debate,

00:27:34.760 | but I think that we've done pretty well so far.

00:27:36.800 | - Is the beginner in mind?

00:27:38.560 | - Yeah.

00:27:39.400 | - 'Cause you said the progressive disclosure.

00:27:40.800 | - Yeah, so we care a lot about that, a lot about power,

00:27:45.080 | a lot about efficiency, a lot about,

00:27:46.920 | there are many factors to good design

00:27:48.680 | and you have to figure out a way to kind of

00:27:51.680 | work your way through that.

00:27:53.320 | - So if you like think about like a language I love is Lisp,

00:27:57.560 | probably still because I use Emacs,

00:27:59.360 | but I haven't done anything, any serious work in Lisp,

00:28:02.160 | but it has a ridiculous amount of parentheses.

00:28:06.520 | I've also, with Java and C++, the braces,

00:28:11.520 | I like, I enjoyed the comfort of being between braces.

00:28:19.800 | - Yeah, yeah, well let's talk--

00:28:20.960 | - And then Python is, sorry to interrupt,

00:28:23.120 | just like, and last thing to me, as a designer,

00:28:25.760 | if I was a language designer, God forbid,

00:28:28.720 | is I would be very surprised that Python

00:28:32.600 | with no braces would nevertheless somehow

00:28:36.680 | be comforting also.

00:28:38.160 | So like, I could see arguments for all of these.

00:28:40.600 | - But look at this, this is evidence

00:28:41.880 | that it's not about braces versus tabs.

00:28:44.200 | - Right, exactly, you're good, it's a good point.

00:28:46.960 | - Right, so like, you know, there's evidence that--

00:28:49.960 | - But see, like, it's one of the most argued about things.

00:28:52.320 | - Oh yeah, of course, just like tabs and spaces,

00:28:54.080 | which it doesn't, I mean, there's one obvious right answer,

00:28:57.160 | but it doesn't actually matter.

00:28:59.120 | - What's that?

00:28:59.960 | - Let's not, come on, we're friends.

00:29:01.760 | Like, come on, what are you trying to do to me here?

00:29:03.480 | - People are gonna, yeah, half the people

00:29:04.840 | are gonna tune out, yeah.

00:29:06.140 | - So-- - So at least you're able

00:29:08.520 | to identify things that don't really matter

00:29:11.040 | for the experience.

00:29:12.600 | - Well, no, no, no, it's always a really hard,

00:29:14.760 | so the easy decisions are easy, right?

00:29:16.880 | I mean, fine, those are not the interesting ones.

00:29:19.520 | The hard ones are the ones that are most interesting, right?

00:29:21.760 | The hard ones are the places where,

00:29:23.560 | hey, we wanna do a thing, everybody agrees we should do it,

00:29:27.000 | there's one proposal on the table,

00:29:28.880 | but it has all these bad things associated with it.

00:29:31.560 | Well, okay, what are we gonna do about that?

00:29:33.720 | Do we just take it?

00:29:34.980 | Do we delay it?

00:29:36.260 | Do we say, hey, well, maybe there's this other feature

00:29:38.520 | that if we do that first, this will work out better?

00:29:41.520 | How does this, if we do this,

00:29:44.080 | are we paying ourselves into a corner, right?

00:29:46.160 | And so this is where, again,

00:29:47.320 | you're having that core team of people

00:29:48.600 | that has some continuity and has perspective,

00:29:51.680 | has some of the historical understanding,

00:29:53.640 | is really valuable because you get,

00:29:56.120 | it's not just like one brain,

00:29:57.200 | you get the power of multiple people coming together

00:29:59.200 | to make good decisions,

00:30:00.120 | and then you get the best out of all these people,

00:30:02.520 | and you also can harness the community around it.

00:30:06.280 | - And what about the decision of whether,

00:30:08.520 | like in Python, having one type,

00:30:10.920 | or having strict typing?

00:30:14.120 | - Yeah, okay. - Many types.

00:30:15.080 | - Yeah, let's talk about this.

00:30:16.080 | So I like how you put that, by the way.

00:30:19.600 | So many people would say that Python doesn't have types.

00:30:21.920 | - Doesn't have types, yeah.

00:30:22.920 | - But you're right. - Well, I've listened

00:30:23.880 | to you enough to where, (laughs)

00:30:26.880 | I'm a fan of yours,

00:30:27.840 | and I've listened to way too many podcasts and videos

00:30:31.120 | of you talking about this.

00:30:32.440 | - Oh yeah, so I would argue that Python has one type,

00:30:34.760 | and so when you import Python into Swift,

00:30:38.160 | which, by the way, works really well,

00:30:39.760 | you have everything comes in as a Python object.

00:30:41.800 | Now, here there are trade-offs because,

00:30:44.040 | you know, it depends on what you're optimizing for,

00:30:47.440 | and Python is a super successful language

00:30:49.240 | for a really good reason.

00:30:51.040 | Because it has one type,

00:30:52.720 | you get duck typing for free and things like this,

00:30:55.320 | but also, you're pushing,

00:30:56.920 | you're making it very easy to pound out code on one hand,

00:31:00.600 | but you're also making it very easy

00:31:01.840 | to introduce complicated bugs that you have to debug,

00:31:05.280 | and you pass a string into something

00:31:07.280 | that expects an integer,

00:31:08.200 | and it doesn't immediately die,

00:31:10.200 | it goes all the way down the stack trace,

00:31:12.080 | and you find yourself in the middle of some code

00:31:13.480 | that you really didn't wanna know anything about,

00:31:14.920 | and it blows up, and you're just saying,

00:31:16.400 | well, what did I do wrong, right?

00:31:18.200 | And so types are good and bad,

00:31:20.840 | and they have trade-offs, they're good for performance,

00:31:22.720 | and certain other things,

00:31:23.600 | depending on where you're coming from,

00:31:24.720 | but it's all about trade-offs.

00:31:26.360 | And so this is what design is, right?

00:31:28.600 | Design is about weighing trade-offs

00:31:30.240 | and trying to understand the ramifications

00:31:32.600 | of the things that you're weighing,

00:31:34.280 | like types or not, or one type or many types.

00:31:37.280 | But also, within many types,

00:31:39.820 | how powerful do you make that type system

00:31:41.720 | is another very complicated question

00:31:44.480 | with lots of trade-offs.

00:31:45.400 | It's very interesting, by the way.

00:31:47.560 | But that's like one dimension.

00:31:50.840 | And there's a bunch of other dimensions.

00:31:53.400 | JIT compiled versus static compiled,

00:31:55.240 | garbage collected versus reference counted,

00:31:57.800 | versus manual memory management,

00:32:00.000 | versus, you know, like,

00:32:01.160 | and like all these different trade-offs

00:32:03.000 | and how you balance them

00:32:03.840 | are what make a program language good.

00:32:05.600 | - Concurrency. - Yep.

00:32:07.160 | - So in all those things, I guess,

00:32:08.960 | when you're designing the language,

00:32:11.320 | you also have to think of how that's gonna get

00:32:13.040 | all compiled down to--

00:32:15.200 | - If you care about performance, yeah.

00:32:17.420 | Well, and go back to Lisp, right?

00:32:18.760 | So Lisp, also I would say JavaScript

00:32:20.920 | is another example of a very simple language, right?

00:32:24.120 | And so one of the, so I also love Lisp.

00:32:27.200 | I don't use it as much as maybe you do or you did.

00:32:29.760 | - No, I think we're both, everyone who loves Lisp,

00:32:32.480 | it's like, you love, it's like, I don't know,

00:32:35.120 | I love Frank Sinatra,

00:32:36.240 | but like how often do I seriously listen to Frank Sinatra?

00:32:39.200 | - Sure, sure.

00:32:40.040 | But you look at that or you look at JavaScript,

00:32:42.760 | which is another very different

00:32:44.080 | but relatively simple language,

00:32:45.960 | and there's certain things that don't exist in the language,

00:32:49.100 | but there is inherent complexity

00:32:51.240 | to the problems that we're trying to model.

00:32:53.120 | And so what happens to the complexity?

00:32:54.640 | In the case of both of them, for example, you say,

00:32:57.440 | well, what about large-scale software development?

00:33:00.080 | Okay, well, you need something like packages.

00:33:02.360 | Neither language has a like language affordance for packages.

00:33:05.760 | And so what you get is patterns.

00:33:07.400 | You get things like NPN, you get things like,

00:33:09.720 | you know, like these ecosystems that get built around.

00:33:12.040 | And I'm a believer that if you don't model

00:33:15.120 | at least the most important inherent complexity

00:33:17.760 | in the language, then what ends up happening

00:33:19.600 | is that complexity gets pushed elsewhere.

00:33:22.760 | And when it gets pushed elsewhere,

00:33:24.120 | sometimes that's great because often building things

00:33:26.600 | as libraries is very flexible and very powerful

00:33:28.920 | and allows you to evolve and things like that.

00:33:30.720 | But often it leads to a lot of unnecessary divergence

00:33:34.040 | in the force and fragmentation.

00:33:35.600 | And when that happens, you just get kind of a mess.

00:33:39.560 | And so the question is, how do you balance that?

00:33:42.960 | Don't put too much stuff in the language

00:33:44.280 | 'cause that's really expensive and it makes things complicated

00:33:46.760 | but how do you model enough of the inherent complexity

00:33:49.640 | of the problem that you provide the framework

00:33:52.400 | and the structure for people to think about?

00:33:54.880 | Also, so the key thing to think about

00:33:57.240 | with programming languages,

00:33:59.080 | and you think about what a programming language is there for

00:34:01.360 | is it's about making a human more productive, right?

00:34:04.240 | And so like there's an old,

00:34:05.360 | I think it's a Steve Jobs quote about,

00:34:08.160 | it's a bicycle for the mind, right?

00:34:10.680 | You can definitely walk,

00:34:13.000 | but you'll get there a lot faster

00:34:15.280 | if you can bicycle on your way.

00:34:17.540 | - And a programming language is a bicycle for the mind?

00:34:20.160 | - Yeah.

00:34:21.000 | - Crazy, wow, that's a really interesting way

00:34:23.040 | to think about it.

00:34:23.920 | - By raising the level of abstraction,

00:34:25.520 | now you can fit more things in your head.

00:34:27.400 | By being able to just directly leverage somebody's library,

00:34:30.080 | you can now get something done quickly.

00:34:33.420 | In the case of Swift, SwiftUI is this new framework

00:34:36.160 | that Apple has released recently for doing UI programming.

00:34:39.760 | And it has this declarative programming model

00:34:43.000 | which defines away entire classes of bugs.

00:34:45.160 | It builds on value semantics

00:34:47.040 | and many other nice Swift things.

00:34:48.820 | And what this does is it allows you to get way more done

00:34:51.600 | with way less code.

00:34:53.260 | And now your productivity as a developer is much higher.

00:34:56.580 | Right?

00:34:57.420 | And so that's really what programming languages

00:34:59.420 | should be about,

00:35:00.260 | is it's not about tabs versus spaces

00:35:01.780 | or curly braces or whatever.

00:35:03.300 | It's about how productive do you make the person?

00:35:05.380 | And you can only see that when you have libraries

00:35:08.980 | that were built with the right intention

00:35:11.100 | that the language was designed for.

00:35:13.760 | And with Swift, I think we're still a little bit early,

00:35:16.640 | but SwiftUI and many other things that are coming out now

00:35:19.500 | are really showing that.

00:35:20.340 | And I think that they're opening people's eyes.

00:35:22.520 | - It's kind of interesting to think about like how that,

00:35:27.020 | you know, the knowledge of something

00:35:29.640 | of how good the bicycle is,

00:35:31.640 | how people learn about that, you know?

00:35:33.740 | So I've used C++.

00:35:36.060 | Now this is not going to be a trash talking session

00:35:38.960 | about C++, but I used C++ for a really long time.

00:35:41.880 | - You can go there if you want.

00:35:42.720 | (laughing)

00:35:43.540 | I have the scars.

00:35:44.380 | (laughing)

00:35:45.220 | - I feel like I spent many years without realizing

00:35:49.620 | like there's languages that could,

00:35:51.540 | for my particular lifestyle, brain style, thinking style,

00:35:56.540 | there's languages that could make me a lot more productive

00:36:00.340 | in the debugging stage, in the, just the development stage

00:36:04.380 | and thinking like the bicycle for the mind

00:36:05.980 | that I could fit more stuff into my-

00:36:07.780 | - Python's a great example of that, right?

00:36:09.260 | I mean, a machine learning framework in Python

00:36:10.980 | is a great example of that.

00:36:12.300 | It's just very high abstraction level.

00:36:14.700 | And so you can be thinking about things

00:36:15.900 | on a like very high level algorithmic level

00:36:19.060 | instead of thinking about, okay, well,

00:36:20.460 | am I copying this tensor to a GPU or not?

00:36:22.940 | Right?

00:36:23.820 | It's not what you want to be thinking about.

00:36:25.540 | - And as I was telling you, I mean,

00:36:26.660 | I guess the question I had is, you know,

00:36:29.760 | how does a person like me or in general people

00:36:31.780 | discover more productive, you know, languages?

00:36:36.780 | Like how, as I've been telling you offline,

00:36:39.960 | I've been looking for like a project to work on in Swift

00:36:43.220 | so I can really try it out.

00:36:45.580 | I mean, my intuition was like doing a hello world

00:36:48.620 | is not going to get me there.

00:36:50.460 | To get me to experience the power of the language.

00:36:53.820 | - You need a few weeks of change in metabolism.

00:36:55.980 | - Exactly.

00:36:56.820 | I think that's beautifully put.

00:36:58.260 | That's one of the problems with people with diets.

00:37:01.500 | Like I'm actually currently, to go in parallel,

00:37:05.300 | but in a small tangent is I've been recently

00:37:07.820 | eating only meat.

00:37:09.500 | Okay?

00:37:10.340 | - Okay.

00:37:11.180 | - Okay.

00:37:12.000 | - And most people are like,

00:37:13.260 | they think that's horribly unhealthy or whatever.

00:37:16.900 | You have like a million, whatever the science is,

00:37:20.660 | it just doesn't sound right.

00:37:22.540 | - Well, so back when I was in college,

00:37:24.180 | we did the Atkins diet.

00:37:25.220 | That was a thing.

00:37:26.540 | - Similar.

00:37:27.380 | And, but if you, you have to always give these things

00:37:29.700 | a chance.

00:37:30.740 | I mean, with dieting, always not dieting,

00:37:33.340 | but just the things that you like.

00:37:35.780 | If I eat personally, if I eat meat,

00:37:38.180 | just everything, I can be super focused,

00:37:40.220 | or more focused than usual.

00:37:42.060 | I just feel great.

00:37:44.040 | I mean, I've been running a lot,

00:37:46.320 | doing pushups and pulls and so on.

00:37:48.040 | I mean, Python is similar in that sense for me.

00:37:50.720 | - Where are you going with this?

00:37:52.280 | (laughing)

00:37:53.640 | - I mean, literally, I just felt,

00:37:55.800 | I had like a stupid smile on my face

00:37:58.040 | when I first started using Python.

00:38:00.800 | I could code up really quick things.

00:38:03.000 | Like I would see the world.

00:38:05.800 | I'll be empowered to write a script to,

00:38:10.200 | you know, to do some basic data processing,

00:38:11.860 | to rename files on my computer.

00:38:13.860 | Right?

00:38:14.700 | And like Perl didn't do that for me.

00:38:16.320 | It kind of, a little bit.

00:38:19.320 | - Well, and again, none of these are about

00:38:21.220 | which is best or something like that,

00:38:23.360 | but there's definitely better and worse here.

00:38:25.080 | - But it clicks.

00:38:26.120 | Well, yeah.

00:38:26.960 | - And if you look at Perl, for example,

00:38:29.340 | you get bogged down in scalars versus arrays

00:38:32.700 | versus hashes versus type globs

00:38:34.420 | and like all that kind of stuff.

00:38:35.780 | And Python's like, yeah, let's not do this.

00:38:38.640 | Right?

00:38:39.480 | It's debugging.

00:38:40.300 | Like everyone has different priorities,

00:38:41.560 | but for me, it's, can I create systems for myself

00:38:45.020 | that empower me to debug quickly?

00:38:47.880 | Like I've always been a big fan,

00:38:50.440 | even just crude, like asserts,

00:38:52.120 | like always stating things that should be true,

00:38:57.120 | which in Python, I found myself doing more

00:38:59.840 | because of type, all these kinds of stuff.

00:39:02.400 | - Well, you could think of types in a programming language

00:39:04.600 | as being kind of assert.

00:39:05.880 | - Yeah.

00:39:06.720 | - They get checked at compile time.

00:39:07.600 | Right?

00:39:08.920 | So how do you learn a new thing?

00:39:11.040 | Well, so this, or how do people learn new things?

00:39:13.520 | Right?

00:39:14.360 | This is hard.

00:39:15.320 | People don't like to change.

00:39:17.200 | People generally don't like change around them either.

00:39:19.320 | And so we're all very slow to adapt and change.

00:39:22.880 | And usually there's a catalyst that's required

00:39:25.480 | to force yourself over this.

00:39:28.000 | So for learning a programming language

00:39:30.040 | is really comes down to finding an excuse,

00:39:32.720 | like build a thing that the language is actually good for,

00:39:36.320 | that the ecosystem's ready for.

00:39:38.840 | And so if you were to write an iOS app, for example,

00:39:43.000 | that'd be the easy case.

00:39:44.240 | Obviously you would use Swift for that.

00:39:46.080 | Right?

00:39:46.920 | There are other--

00:39:47.760 | - Android.

00:39:48.600 | - So Swift runs on Android.

00:39:50.520 | - Oh, does it?

00:39:51.360 | - Oh yeah.

00:39:52.200 | Yeah, Swift runs in lots of places.

00:39:53.040 | - How does that work?

00:39:54.720 | So--

00:39:55.560 | - Okay, so Swift is built on top of LLVM.

00:39:58.600 | LLVM runs everywhere.

00:40:00.400 | LLVM, for example, builds the Android kernel.

00:40:03.200 | - Oh, wow.

00:40:04.040 | Okay.

00:40:04.880 | - So yeah.

00:40:05.720 | - I didn't realize this.

00:40:06.800 | - Yeah, so Swift is very portable,

00:40:08.760 | runs on Windows.

00:40:09.920 | There's, it runs on lots of different things.

00:40:12.600 | - And Swift, sorry to interrupt.

00:40:14.120 | Swift UI, and then there's a thing called UIKit.

00:40:17.920 | So can I build an app with Swift?

00:40:20.200 | - Well, so that's the thing,

00:40:22.160 | is the ecosystem is what matters there.

00:40:23.880 | So Swift UI and UIKit are Apple technologies.

00:40:27.040 | - Okay, got it.

00:40:27.880 | - And so they happen to,

00:40:28.720 | like Swift UI happens to be written in Swift,

00:40:30.520 | but it's an Apple proprietary framework

00:40:32.880 | that Apple loves and wants to keep on its platform,

00:40:35.560 | which makes total sense.

00:40:36.920 | You go to Android and you don't have that library.

00:40:39.000 | - Yeah.

00:40:39.840 | - Right, and so Android has a different ecosystem of things

00:40:42.880 | that hasn't been built out

00:40:44.080 | and doesn't work as well with Swift.

00:40:45.400 | And so you can totally use Swift to do like arithmetic

00:40:48.880 | and things like this,

00:40:49.720 | but building a UI with Swift on Android

00:40:51.720 | is not a great experience right now.

00:40:54.600 | - So if I wanted to learn Swift,

00:40:57.320 | what's the, I mean,

00:40:59.080 | the one practical different version of that

00:41:01.840 | is Swift for TensorFlow, for example.

00:41:05.560 | And one of the inspiring things for me

00:41:08.400 | with both TensorFlow and PyTorch

00:41:10.440 | is how quickly the community can like switch

00:41:13.080 | from different libraries.

00:41:14.680 | - Yeah.

00:41:15.520 | - Like you could see some of the communities

00:41:17.720 | switching to PyTorch now,

00:41:19.680 | but it's very easy to see.

00:41:21.920 | And then TensorFlow is really stepping up its game.

00:41:24.480 | And then there's no reason why,

00:41:26.120 | I think the way it works is basically

00:41:27.840 | it has to be one GitHub repo,

00:41:29.560 | like one paper steps up.

00:41:31.000 | - It gets people excited.

00:41:32.360 | - It gets people excited.

00:41:33.240 | And they're like, "Oh, I have to learn this."

00:41:36.040 | Swift for, what's Swift again?

00:41:39.520 | And then they learn and they fall in love with it.

00:41:41.200 | I mean, that's what happened with PyTorch.

00:41:43.080 | - There has to be a reason, a catalyst.

00:41:44.400 | - Yeah.

00:41:45.240 | - And so, and there, I mean, people don't like change,

00:41:48.680 | but it turns out that once you've worked

00:41:50.400 | with one or two programming languages,

00:41:52.640 | the basics are pretty similar.

00:41:54.080 | And so one of the fun things

00:41:55.720 | about learning programming languages,

00:41:57.320 | even maybe Lisp, I don't know if you agree with this,

00:41:59.840 | is that when you start doing that,

00:42:01.400 | you start learning new things.

00:42:03.200 | (laughing)

00:42:04.040 | 'Cause you have a new way to do things

00:42:05.640 | and you're forced to do them.

00:42:06.800 | And that forces you to explore

00:42:09.240 | and it puts you in learning mode.

00:42:10.280 | And when you get in learning mode,

00:42:11.360 | your mind kind of opens a little bit

00:42:12.760 | and you can see things in a new way,

00:42:15.280 | even when you go back to the old place.

00:42:17.040 | - Right.

00:42:17.880 | Yeah, so with Lisp, it's functional stuff.

00:42:21.160 | But I wish there was a kind of window,

00:42:23.640 | maybe you can tell me if there is, there you go.

00:42:26.080 | This is a question to ask,

00:42:28.280 | what is the most beautiful feature

00:42:29.680 | in a programming language?

00:42:30.960 | - Before I ask it, let me say like with Python,

00:42:33.320 | I remember when I saw Lisp Comprehensions.

00:42:36.720 | - Yeah.

00:42:37.560 | - Was like, when I really took it in.

00:42:40.840 | - Yeah.

00:42:42.000 | - I don't know, I just loved it.

00:42:43.720 | It was like fun to do.

00:42:45.280 | Like it was fun to do that kind of,

00:42:47.560 | it was something about it,

00:42:50.800 | to be able to filter through a list

00:42:52.920 | and to create a new list on a single line was elegant.

00:42:56.320 | I could all get into my head

00:42:58.240 | and it just made me fall in love with the language.

00:43:01.920 | - Yeah.

00:43:02.760 | - So is there, let me ask you a question.

00:43:04.880 | Is there, what do you use the most beautiful feature

00:43:07.600 | in a programming languages that you've ever encountered?

00:43:11.760 | In Swift maybe, and then outside of Swift?

00:43:15.160 | - I think the thing that I like the most

00:43:17.440 | from a programming language,

00:43:18.840 | so I think the thing you have to think about

00:43:21.240 | with a programming language, again, what is the goal?

00:43:23.600 | You're trying to get people to get things done quickly.

00:43:27.160 | And so you need libraries, you need high quality libraries,

00:43:30.480 | and then you need a user base around them

00:43:32.600 | that can assemble them and do cool things with them.

00:43:35.040 | And so to me, the question is,

00:43:36.200 | what enables high quality libraries?

00:43:38.240 | Okay.

00:43:40.680 | - Yeah.

00:43:41.520 | - And there's a huge divide in the world

00:43:43.400 | between libraries who enable high quality libraries

00:43:48.320 | versus the ones that put special stuff in the language.

00:43:52.800 | - So programming languages that enable

00:43:55.240 | - High quality libraries.

00:43:56.080 | - High quality libraries, got it.

00:43:57.400 | - So, and what I mean by that is expressive libraries

00:44:00.840 | that then feel like a natural integrated part

00:44:03.720 | of the language itself.

00:44:05.560 | So an example of this in Swift is the int and float

00:44:09.880 | and also array and string, things like this.

00:44:12.080 | These are all part of the library.

00:44:13.720 | Like int is not hard-coded into Swift.

00:44:16.160 | And so what that means is that because int

00:44:19.880 | is just a library thing defined in the standard library,

00:44:22.600 | along with strings and arrays and all the other things

00:44:24.640 | that come with the standard library.

00:44:26.240 | Well, hopefully you do like int,

00:44:29.240 | but anything that any language features

00:44:31.960 | that you needed to define int,

00:44:33.920 | you can also use in your own types.

00:44:36.080 | So if you wanted to find a quaternion

00:44:39.560 | or something like this, right?

00:44:41.440 | Well, it doesn't come in the standard library.

00:44:43.560 | There's a very special set of people

00:44:45.640 | that care a lot about this,

00:44:47.200 | but those people are also important.

00:44:49.400 | It's not about classism, right?

00:44:51.120 | It's not about the people who care about ints and floats

00:44:53.480 | are more important than the people care about quaternions.

00:44:55.760 | And so to me, the beautiful things

00:44:56.920 | about programming languages is when you allow

00:44:58.960 | those communities to build high quality libraries

00:45:02.280 | that feel native, that feel like they're built

00:45:03.760 | into the compiler without having to be.

00:45:06.840 | - What does it mean for the int to be part

00:45:11.120 | of a not hard-coded in?

00:45:13.200 | So is it like, how, so what is an int?

00:45:18.200 | - Okay, int is just a integer.

00:45:20.800 | In this case, it's like a 64 bit integer

00:45:23.560 | or something like this.

00:45:24.400 | - But so like the 64 bit is hard-coded or no?

00:45:28.120 | - No, none of that's hard-coded.

00:45:29.400 | So int, if you go look at how it's implemented,

00:45:32.160 | it's just a struct in Swift.

00:45:34.760 | And so it's a struct.

00:45:35.880 | And then how do you add two structs?

00:45:37.440 | Well, you define plus.

00:45:38.720 | And so you can define plus on int.

00:45:41.800 | Well, you can define plus on your thing too.

00:45:43.560 | You can define int has like an,

00:45:45.800 | is odd method or something like that on it.

00:45:47.800 | And so, yeah, you can add methods on the things.

00:45:50.400 | - Yeah.

00:45:51.320 | - So you can define operators, like how it behaves.

00:45:55.360 | That to you is beautiful.

00:45:56.360 | When there's something about the language

00:45:58.240 | which enables others to create libraries,

00:46:01.920 | which are not hacky.

00:46:05.360 | - Yeah, they feel native.

00:46:07.200 | And so one of the best examples of this is Lisp, right?

00:46:10.840 | Because in Lisp, all the libraries

00:46:13.800 | are basically part of the language, right?

00:46:15.440 | You write term rewrite systems and things like this.

00:46:18.120 | - Can you, as a counter example,

00:46:20.040 | provide what makes it difficult

00:46:22.400 | to write a library that's native?

00:46:23.880 | Is it the Python C?

00:46:25.520 | - Well, so one example, I'll give you two examples,

00:46:29.000 | Java and C++, or Java and C.

00:46:31.600 | They both allow you to define your own types,

00:46:35.760 | but int is hard-coded in the language.

00:46:38.440 | Okay, well, why?

00:46:39.360 | Well, in Java, for example,

00:46:41.200 | coming back to this whole reference,

00:46:42.480 | semantic value, semantic thing,

00:46:45.160 | int gets passed around by value.

00:46:47.440 | - Yeah, that.

00:46:49.720 | - But if you make a pair or something like that,

00:46:53.760 | a complex number, right?

00:46:55.120 | It's a class in Java,

00:46:56.840 | and now it gets passed around by reference, by pointer.

00:46:59.920 | And so now you lose value semantics, right?

00:47:02.600 | You lost math.

00:47:04.200 | Okay, well, that's not great, right?

00:47:06.880 | If you can do something with int,

00:47:08.160 | why can't I do it with my type?

00:47:09.640 | - Yeah.

00:47:10.480 | - Right, so that's the negative side

00:47:13.720 | of the thing I find beautiful,

00:47:15.320 | is when you can solve that,

00:47:17.320 | when you can have full expressivity,

00:47:19.240 | where you as a user of the language

00:47:21.680 | have as much or almost as much power

00:47:24.160 | as the people who implemented

00:47:25.480 | all the standard built-in stuff,

00:47:27.240 | because what that enables

00:47:28.440 | is that enables truly beautiful libraries.

00:47:31.400 | - You know, it's kind of weird,

00:47:32.560 | 'cause I've gotten used to that.

00:47:34.840 | That's one, I guess,

00:47:37.080 | other aspect of programming language design.

00:47:39.040 | You have to think, you know,

00:47:41.080 | the old first principles thinking,

00:47:43.480 | like, why are we doing it this way?

00:47:45.520 | By the way, I mean, I remember,

00:47:47.840 | 'cause I was thinking about the Walrus operator,

00:47:50.800 | and I'll ask you about it later,

00:47:53.240 | but it hit me that like the equal sign for assignment,

00:47:57.760 | like, why are we using the equal sign for assignment?

00:48:01.600 | - It's wrong, and that's not the only solution, right?

00:48:04.480 | So if you look at Pascal,

00:48:05.440 | they use colon equals for assignment

00:48:07.760 | and equals for equality,

00:48:11.440 | and they use like less than, greater than

00:48:12.960 | instead of the not equal thing.

00:48:14.560 | - Yeah.

00:48:15.400 | - Like, there are other answers here.

00:48:16.360 | - So, but like, and yeah, like, I ask you all,

00:48:19.920 | but how do you then decide to break convention?

00:48:24.880 | To say, you know what?

00:48:26.120 | Everybody's doing it wrong.

00:48:29.720 | We're gonna do it right.

00:48:30.960 | - Yeah.

00:48:31.960 | So it's like an ROI,

00:48:33.720 | like return on investment trade-off, right?

00:48:35.480 | So if you do something weird,

00:48:37.320 | let's just say like not,

00:48:38.840 | like colon equal instead of equal for assignment,

00:48:40.960 | that would be weird with today's aesthetic, right?

00:48:44.920 | And so you'd say, cool, this is theoretically better,

00:48:47.480 | but is it better in which ways?

00:48:49.640 | Like, what do I get out of that?

00:48:50.760 | Do I define away class of bugs?

00:48:52.360 | Well, one of the class of bugs that C has

00:48:54.280 | is that you can use like, you know,

00:48:55.880 | if X equals without equals equals,

00:48:58.840 | if X equals Y, right?

00:49:01.760 | Well, it turns out you can solve that problem

00:49:04.040 | in lots of ways.

00:49:05.240 | Clang, for example, GCC,

00:49:06.960 | all these compilers will detect that as a likely bug,

00:49:09.840 | produce a warning.

00:49:10.800 | Do they?

00:49:11.640 | - Yeah.

00:49:12.480 | - I feel like they didn't, or Clang does.

00:49:13.800 | GCC didn't.

00:49:15.960 | It's like, one of the important things

00:49:17.880 | about programming language design

00:49:19.280 | is like you're literally creating suffering in the world.

00:49:22.920 | (laughing)

00:49:23.960 | - Okay.

00:49:24.920 | - Like, I feel like,

00:49:26.720 | I mean, one way to see it is the bicycle for the mind,

00:49:29.200 | but the other way is to like minimizing suffering.

00:49:32.200 | - Well, you have to decide if it's worth it, right?

00:49:33.640 | And so let's come back to that.

00:49:35.560 | - Okay.

00:49:36.400 | - But if you look at this,

00:49:38.040 | and again, this is where there's a lot of detail

00:49:40.080 | that goes into each of these things,

00:49:41.840 | equal in C returns a value.

00:49:45.120 | - Yep.

00:49:47.600 | - That's messed up.

00:49:48.920 | That allows you to say X equals Y equals Z.

00:49:51.120 | Like that works in C.

00:49:52.440 | - Yeah.

00:49:53.440 | Is it messed up?

00:49:54.600 | You know, most people think it's messed up, I think.

00:49:57.560 | - It is very, by messed up,

00:50:00.080 | what I mean is it is very rarely used for good,

00:50:03.520 | and it's often used for bugs.

00:50:05.520 | - Yeah.

00:50:06.360 | - Right, and so.

00:50:07.200 | - That's a good definition of messed up, yeah.

00:50:09.400 | - You could use, you know, it's a, in hindsight,

00:50:12.080 | this was not such a great idea, right?

00:50:13.520 | Now, one of the things with Swift that is really powerful,

00:50:16.160 | and one of the reasons it's actually good,

00:50:18.400 | versus it being full of good ideas,

00:50:20.240 | is that when we launched Swift 1,

00:50:23.400 | we announced that it was public, people could use it,

00:50:26.800 | people could build apps,

00:50:27.880 | but it was gonna change and break, okay?

00:50:30.920 | When Swift 2 came out, we said, "Hey, it's open source,

00:50:33.160 | "and there's this open process

00:50:34.360 | "which people can help evolve and direct the language."

00:50:37.880 | So the community at large, like Swift users,

00:50:40.120 | can now help shape the language as it is,

00:50:43.120 | and what happened is that, as part of that process is,

00:50:46.120 | a lot of really bad mistakes got taken out.

00:50:48.680 | So for example, Swift used to have the C style

00:50:52.480 | plus plus and minus minus operators.

00:50:55.040 | Like, what does it mean when you put it before

00:50:56.560 | versus after, right?

00:50:59.320 | Well, that got cargo-culted from C into Swift early on.

00:51:02.600 | - What's cargo-culted?

00:51:03.720 | - Cargo-culted means brought forward

00:51:05.320 | without really considering it.

00:51:07.760 | - Okay.

00:51:08.600 | - This is maybe not the most PC term, but--

00:51:11.880 | - You have to look it up in Urban Dictionary, yeah.

00:51:13.600 | - Yeah, so it got pulled into C without,

00:51:17.520 | or it got pulled into Swift without very good consideration,

00:51:20.520 | and we went through this process,

00:51:22.200 | and one of the first things got ripped out

00:51:23.720 | was plus plus and minus minus,

00:51:25.620 | because they lead to confusion,

00:51:27.760 | they have very little value over saying, you know,

00:51:29.960 | X plus equals one, and X plus equals one is way more clear,

00:51:34.240 | and so when you're optimizing for teachability

00:51:36.360 | and clarity and bugs and this multidimensional space

00:51:39.600 | that you're looking at, things like that really matter,

00:51:42.340 | and so being first principles on where you're coming from

00:51:45.560 | and what you're trying to achieve

00:51:46.520 | and being anchored on the objective is really important.

00:51:50.160 | - Well, let me ask you about the most,

00:51:53.280 | sort of this podcast isn't about information,

00:51:58.160 | it's about drama.

00:51:59.320 | - Okay.

00:52:00.160 | - Let me talk to you about some drama.

00:52:01.360 | So you mentioned Pascal and colon equals,

00:52:06.320 | there's something that's called the Walrus operator,

00:52:09.560 | and Python 3.8 added the Walrus operator,

00:52:14.560 | and the reason I think it's interesting

00:52:17.600 | is not just 'cause of the feature,

00:52:20.400 | it has the same kind of expression feature

00:52:23.480 | you can imagine to see

00:52:24.320 | that it returns the value of the assignment,

00:52:27.240 | and maybe you can comment on that in general,

00:52:29.680 | but on the other side of it,

00:52:31.240 | it's also the thing that toppled the dictator.

00:52:36.240 | - So, okay.

00:52:37.960 | - It finally drove Guido to step down from BDFL,

00:52:41.280 | the toxicity of the community.

00:52:42.880 | So maybe, what do you think about the Walrus operator

00:52:46.000 | in Python, is there an equivalent thing in Swift

00:52:50.080 | that really stress tested the community,

00:52:54.200 | and then on the flip side,

00:52:56.680 | what do you think about Guido stepping down over it?

00:52:58.720 | - Yeah, well, if I look past the details

00:53:01.160 | of the Walrus operator,

00:53:02.400 | one of the things that makes it most polarizing

00:53:04.160 | is that it's syntactic sugar.

00:53:05.720 | - Okay, what do you mean by syntactic sugar?

00:53:09.120 | - It means you can take something

00:53:10.520 | that already exists in language

00:53:11.760 | and you can express it in a more concise way.

00:53:14.400 | - So, okay, I'm gonna play devil's advocate.

00:53:15.960 | So, this is great.

00:53:17.760 | Is that a objective or subjective statement?

00:53:21.560 | Like, can you argue that basically anything

00:53:24.400 | isn't syntactic sugar or not?

00:53:26.560 | - No, not everything is syntactic sugar.

00:53:30.320 | So, for example, the type system,

00:53:32.720 | like can you have classes versus,

00:53:35.680 | like, do you have types or not, right?

00:53:39.960 | So, one type versus many types

00:53:42.160 | is not something that affects syntactic sugar.

00:53:44.760 | And so, if you say, I wanna have the ability to define types,

00:53:48.240 | I have to have all this like language mechanics

00:53:49.960 | to define classes, and oh, now I have to have inheritance,

00:53:53.320 | and I have like, I have all this stuff,

00:53:55.040 | that's just making language more complicated.

00:53:57.080 | That's not about sugaring it.

00:53:59.240 | Swift has sugar.

00:54:02.360 | So, like Swift has this thing called if let,

00:54:04.320 | and it has various operators

00:54:06.560 | that are used to concisify specific use cases.

00:54:10.480 | So, the problem with syntactic sugar,

00:54:12.840 | when you're talking about, hey, I have a thing

00:54:15.000 | that takes a lot to write, and I have a new way to write it,

00:54:17.720 | you have this like horrible trade-off,

00:54:19.920 | which becomes almost completely subjective,

00:54:22.400 | which is how often does this happen

00:54:24.680 | and does it matter?

00:54:26.320 | And one of the things that is true about human psychology,

00:54:28.520 | particularly when you're talking about

00:54:29.360 | introducing a new thing,

00:54:30.400 | is that people overestimate the burden of learning something

00:54:35.400 | and so it looks foreign when you haven't gotten used to it.

00:54:38.920 | But if it was there from the beginning,

00:54:40.440 | of course, it's just part of Python.

00:54:42.080 | Like unquestionably, like this is just the thing I know,

00:54:45.160 | and it's not a new thing

00:54:46.760 | that you're worried about learning,

00:54:47.720 | it's just part of the deal.

00:54:49.480 | Now, with Guido, I don't know Guido well.

00:54:55.480 | - Yeah, have you passed across much?

00:54:56.920 | - Yeah, I've met him a couple of times,

00:54:58.200 | but I don't know Guido well.

00:55:00.000 | But the sense that I got out of that whole dynamic

00:55:03.280 | was that he had put not just the decision-maker weight

00:55:07.760 | on his shoulders, but it was so tied to his personal identity

00:55:11.920 | that he took it personally and he felt the need

00:55:15.040 | and he kind of put himself in the situation

00:55:16.520 | of being the person,

00:55:18.160 | instead of building a base of support around him.

00:55:20.920 | I mean, this is probably not quite literally true.

00:55:23.920 | But by--

00:55:24.960 | - Too much, so there's too much--

00:55:26.640 | - Too much concentrated on him, right?

00:55:28.800 | And so--

00:55:29.640 | - And that can wear you down.

00:55:31.320 | - Well, yeah, particularly because people then say,

00:55:33.720 | Guido, you're a horrible person, I hate this thing,

00:55:36.120 | blah, blah, blah, blah, blah, blah, blah.

00:55:37.520 | And sure, it's like, you know,

00:55:38.640 | maybe 1% of the community that's doing that.

00:55:41.160 | But Python's got a big community,

00:55:43.520 | and 1% of millions of people is a lot of hate mail.

00:55:46.600 | And that just from human factor will just wear on you.

00:55:49.440 | - Well, to clarify, it looked from just what I saw

00:55:52.520 | in the messaging for the,

00:55:53.960 | let's not look at the million Python users,

00:55:55.800 | but at the Python core developers,

00:55:58.360 | it feels like the majority,

00:56:00.080 | the big majority on a vote were opposed to it.

00:56:03.680 | - Okay, I'm not that close to it, so I don't know.

00:56:06.400 | - So this, okay, so the situation is like literally,

00:56:09.240 | yeah, I mean, the majority, the core developers,

00:56:13.120 | again, it's--

00:56:13.960 | - Were opposed to it.

00:56:14.800 | - So, and they weren't,

00:56:17.480 | they weren't even like against it.

00:56:19.800 | It was, there was a few,

00:56:22.240 | well, they were against it,

00:56:23.120 | but the against it wasn't like, this is a bad idea.

00:56:27.840 | They were more like, we don't see why this is a good idea.

00:56:31.280 | And what that results in is there's a stalling feeling,

00:56:35.200 | like you just slow things down.

00:56:38.040 | Now, from my perspective,

00:56:40.040 | now you could argue this,

00:56:41.640 | and I think it's very interesting

00:56:44.640 | if we look at politics today and the way Congress works,

00:56:47.640 | it's slowed down everything.

00:56:49.600 | - It's a dampener.

00:56:50.440 | - Yeah, it's a dampener,

00:56:51.380 | but that's a dangerous thing too,

00:56:53.680 | because if it dampens things,

00:56:55.480 | if the dampening results--

00:56:58.440 | - What are you talking about?

00:56:59.400 | Like, it's a low-pass filter,

00:57:00.560 | but if you need billions of dollars

00:57:02.360 | injected into the economy, or trillions of dollars,

00:57:05.100 | then suddenly stuff happens, right?

00:57:06.880 | And so--

00:57:07.720 | - For sure.

00:57:09.400 | So you're talking about--

00:57:10.480 | - I'm not defending our political situation,

00:57:12.000 | just to be clear.

00:57:13.360 | - But you're talking about like a global pandemic.

00:57:17.200 | I was hoping we could fix the healthcare system

00:57:20.600 | and the education system.

00:57:21.900 | - I'm not a politics person, I don't know.

00:57:26.260 | When it comes to languages,

00:57:28.160 | the community's kind of right,

00:57:29.600 | in terms of it's a very high burden

00:57:31.680 | to add something to a language.

00:57:33.240 | So as soon as you add something,

00:57:34.440 | you have a community of people building on it,

00:57:35.760 | and you can't remove it, okay?

00:57:38.120 | And if there's a community of people

00:57:39.660 | that feel really uncomfortable with it,

00:57:41.680 | then taking it slow, I think, is an important thing to do,

00:57:45.640 | and there's no rush,

00:57:46.720 | particularly if it's something that's 25 years old

00:57:49.200 | and is very established,

00:57:50.360 | and it's not like coming into its own.

00:57:53.520 | - What about features?

00:57:55.840 | - Well, so I think that the issue with Guido

00:57:58.800 | is that maybe this is a case

00:58:00.360 | where he realized it had outgrown him,

00:58:03.600 | and it went from being--

00:58:04.440 | - The feature or the language?

00:58:05.360 | - The language.

00:58:06.240 | So Python, I mean, Guido's amazing,

00:58:09.660 | but Python isn't about Guido anymore.

00:58:12.260 | It's about the users,

00:58:13.520 | and to a certain extent, the users own it.

00:58:15.320 | And Guido spent years of his life,

00:58:19.720 | a significant fraction of his career on Python,

00:58:22.880 | and from his perspective, I imagine he's like,

00:58:24.640 | "Well, this is my thing.

00:58:25.720 | "I should be able to do the thing I think is right."

00:58:28.220 | But you can also understand the users,

00:58:30.320 | where they feel like, "This is my thing.

00:58:33.020 | "I use this."

00:58:34.040 | And I don't know, it's a hard thing.

00:58:38.280 | - But if we could talk about leadership in this,

00:58:41.360 | 'cause it's so interesting.

00:58:42.200 | To me, I'm gonna work.

00:58:44.400 | Hopefully somebody makes it.

00:58:45.480 | If not, I'll make it a water stopper, I'm pretty sure,

00:58:47.640 | because I think it represents to me,

00:58:50.340 | maybe it's my Russian roots or something,

00:58:52.440 | it's the burden of leadership.

00:58:56.160 | I feel like to push back,

00:58:59.200 | I feel like progress can only,

00:59:02.960 | like most difficult decisions, just like you said,

00:59:06.240 | there'll be a lot of divisiveness over,

00:59:09.080 | especially in a passionate community.

00:59:12.180 | It just feels like leaders need to take

00:59:14.540 | those risky decisions that if you listen,

00:59:19.540 | that with some non-zero probability,

00:59:23.020 | maybe even a high probability,

00:59:24.420 | would be the wrong decision.

00:59:26.100 | But they have to use their gut and make that decision.

00:59:29.260 | - Well, this is one of the things

00:59:30.940 | where you see amazing founders.

00:59:34.180 | The founders understand exactly what's happened

00:59:36.220 | and how the company got there,

00:59:37.500 | and are willing to say, "We have been doing thing X

00:59:40.860 | "the last 20 years, but today we're gonna do thing Y."

00:59:45.460 | And they make a major pivot for the whole company,

00:59:47.380 | the company lines up behind them,

00:59:48.580 | they move, and it's the right thing.

00:59:50.540 | But then when the founder dies,

00:59:52.380 | the successor doesn't always feel that agency

00:59:57.060 | to be able to make those kinds of decisions.

00:59:59.140 | Even though they're a CEO,

01:00:00.020 | they could theoretically do whatever.

01:00:02.180 | There's two reasons for that, in my opinion.

01:00:04.420 | Or in many cases, it's always different.

01:00:07.340 | But one of which is, they weren't there

01:00:09.760 | for all the decisions that were made,

01:00:11.620 | and so they don't know the principles

01:00:13.360 | in which those decisions were made.

01:00:15.340 | And once the principles change,

01:00:17.100 | you should be obligated to change what you're doing

01:00:20.740 | and change direction.

01:00:21.900 | And so, if you don't know how you got to where you are,

01:00:25.860 | it just seems like gospel.

01:00:27.420 | And you're not gonna question it.

01:00:29.860 | You may not understand that it really is

01:00:31.780 | the right thing to do, so you just may not see it.

01:00:33.460 | - That's so brilliant.

01:00:34.300 | I never thought of it that way.

01:00:36.460 | It's so much higher burden when, as a leader,

01:00:39.340 | you step into a thing that's already worked for a long time.

01:00:41.740 | - Yeah, yeah.

01:00:42.580 | Well, and if you change it and it doesn't work out,

01:00:44.100 | now you're the person who screwed it up.

01:00:46.340 | People always second-guess that.

01:00:48.420 | And the second thing is that even if you decide

01:00:49.980 | to make a change, even if you're theoretically in charge,

01:00:53.500 | you're just a person that thinks they're in charge.

01:00:57.460 | Meanwhile, you have to motivate the troops.

01:00:58.860 | You have to explain it to them in terms they'll understand.

01:01:00.540 | You have to get them to buy into it and believe in it,

01:01:02.140 | because if they don't, then they're not gonna be able

01:01:05.260 | to make the turn, even if you tell them

01:01:07.180 | their bonuses are gonna be curtailed.

01:01:08.420 | They're just not gonna buy into it.

01:01:10.700 | And so there's only so much power you have as a leader.

01:01:13.460 | You have to understand what those limitations are.

01:01:16.380 | - Are you still BDFL?

01:01:18.220 | You've been BDFL of some stuff.

01:01:20.300 | You're very heavy on the B, the benevolent,

01:01:25.940 | benevolent dictator for life.

01:01:27.900 | I guess LLVM?

01:01:29.140 | - Yeah, so I still lead the LLVM world.

01:01:32.560 | - I mean, what's the role of, so then on Swift,

01:01:36.420 | you said that there's a group of people.

01:01:38.460 | - Yeah, so if you contrast Python with Swift, right?

01:01:41.620 | One of the reasons, so everybody on the core team

01:01:44.820 | takes the role really seriously,

01:01:46.380 | and I think we all really care about where Swift goes,

01:01:49.260 | but you're almost delegating the final decision-making

01:01:52.980 | to the wisdom of the group,

01:01:54.980 | and so it doesn't become personal.

01:01:56.720 | And also, when you're talking with the community,

01:01:59.680 | so yeah, some people are very annoyed

01:02:02.100 | at certain decisions that get made.

01:02:04.380 | There's a certain faith in the process,

01:02:06.320 | because it's a very transparent process,

01:02:08.140 | and when a decision gets made,

01:02:09.980 | a full rationale is provided, things like this.

01:02:12.220 | These are almost defense mechanisms

01:02:14.460 | to help both guide future discussions

01:02:16.540 | and provide case law, kind of like Supreme Court does

01:02:18.860 | about this decision was made for this reason,

01:02:21.020 | and here's the rationale

01:02:21.940 | and what we wanna see more of or less of.

01:02:24.160 | But it's also a way to provide a defense mechanism

01:02:27.620 | so that when somebody's griping about it,

01:02:29.020 | they're not saying that person did the wrong thing.

01:02:32.020 | They're saying, well, this thing sucks,

01:02:34.020 | and (growls) and later they move on and they get over it.

01:02:38.580 | - Yeah, the analogy of the Supreme Court,

01:02:40.140 | I think, is really good, but then, okay,

01:02:43.820 | not to get personal on the Swift team,

01:02:45.680 | but it just seems like it's impossible

01:02:50.020 | for division not to emerge.

01:02:52.800 | - Well, each of the humans on the Swift core team,

01:02:55.320 | for example, are different,

01:02:56.980 | and the membership of the Swift core team

01:02:58.380 | changes slowly over time,

01:03:00.520 | which is, I think, a healthy thing.

01:03:02.540 | And so each of these different humans

01:03:04.020 | have different opinions.

01:03:05.220 | Trust me, it's not a singular consciousness

01:03:09.380 | by any stretch of the imagination.

01:03:11.000 | You've got three major organizations,

01:03:12.840 | including Apple, Google, and Sci-Fi

01:03:14.540 | all kind of working together,

01:03:16.380 | and it's a small group of people, but you need high trust.

01:03:20.140 | You need, again, it comes back to the principles

01:03:21.900 | of what you're trying to achieve

01:03:23.360 | and understanding what you're optimizing for.

01:03:27.460 | And I think that starting with strong principles

01:03:30.500 | and working towards decisions is always a good way

01:03:33.280 | to both make wise decisions in general,

01:03:36.260 | but then be able to communicate them to people

01:03:37.900 | so that they can buy into them, and that is hard.

01:03:41.380 | And so you mentioned LLVM.

01:03:42.660 | LLVM is gonna be 20 years old this December,

01:03:46.740 | so it's showing its own age.

01:03:49.460 | - Do you have like a dragon cake plan?

01:03:53.540 | - Oh, we should definitely do that.

01:03:54.700 | Yeah, if we can have a pandemic cake.

01:03:57.740 | - Pandemic cake.

01:03:58.940 | - Everybody gets a slice of cake

01:04:00.380 | and it gets sent through email.

01:04:02.260 | But LLVM has had tons of its own challenges over time too,

01:04:08.940 | right, and one of the challenges

01:04:10.340 | that the LLVM community has, in my opinion,

01:04:13.300 | is that it has a whole bunch of people

01:04:15.260 | that have been working on LLVM for 10 years, right,

01:04:19.100 | 'cause this happened somehow,

01:04:20.940 | and LLVM's always been one way,

01:04:22.780 | but it needs to be a different way, right?

01:04:25.100 | And they've worked on it for like 10 years

01:04:26.660 | is a long time to work on something,

01:04:28.580 | and you suddenly can't see the faults

01:04:32.180 | in the thing that you're working on,

01:04:33.500 | and LLVM has lots of problems,

01:04:34.900 | and we need to address them, and we need to make it better,

01:04:36.740 | and if we don't make it better,

01:04:37.740 | then somebody else will come up with a better idea, right?

01:04:40.300 | And so it's just kind of of that age

01:04:42.540 | where the community is in danger of getting too calcified,

01:04:46.620 | and so I'm happy to see new projects joining

01:04:50.460 | and new things mixing it up.

01:04:52.020 | Fortran is now a new thing in the LLVM community,

01:04:54.540 | which is hilarious and good.

01:04:56.340 | - I've been trying to find, on this little tangent,

01:04:59.020 | find people who program in Cobalt or Fortran,

01:05:02.380 | Fortran especially, to talk to, they're hard to find.

01:05:06.500 | - Yeah, look to the scientific community.

01:05:09.860 | They still use Fortran quite a bit.

01:05:11.700 | - Well, interesting thing you kind of mentioned with LLVM,

01:05:14.300 | or just in general, that if something evolved,

01:05:17.020 | you're not able to see the faults.

01:05:19.740 | So do you fall in love with the thing over time,

01:05:23.140 | or do you start hating everything about the thing over time?

01:05:26.340 | - Well, so my personal folly is that I see,

01:05:31.060 | maybe not all, but many of the faults,

01:05:33.500 | and they grate on me, and I don't have time to go fix 'em.

01:05:35.620 | - Yeah, and they get magnified over time.

01:05:37.580 | - Well, and they may not get magnified,

01:05:38.940 | but they never get fixed, and it's like sand underneath,

01:05:42.060 | it's just like grating against you,

01:05:43.620 | and it's like sand underneath your fingernails or something.

01:05:45.860 | It's just like you know it's there, you can't get rid of it.

01:05:49.700 | And so the problem is that if other people don't see it,

01:05:53.060 | nobody ever, like I can't go,

01:05:55.700 | I don't have time to go write the code and fix it anymore,

01:05:58.460 | but then people are resistant to change,

01:06:01.460 | and so you say, "Hey, we should go fix this thing."

01:06:03.060 | They're like, "Oh yeah, that sounds risky."

01:06:05.300 | It's like, well, is it the right thing or not?

01:06:07.220 | - Are the challenges the group dynamics,

01:06:10.220 | or is it also just technical?

01:06:11.660 | I mean, some of these features,

01:06:13.260 | I think as an observer, as almost like a fan in the,

01:06:19.660 | as a spectator of the whole thing,

01:06:21.420 | I don't often think about,

01:06:23.860 | some things might actually be

01:06:25.060 | technically difficult to implement.

01:06:27.580 | - An example of this is we built

01:06:29.060 | this new compiler framework called MLIR.

01:06:31.660 | MLIR is a whole new framework.

01:06:34.220 | It's not, many people think it's about machine learning.

01:06:37.340 | The ML stands for multi-level,

01:06:39.180 | because compiler people can't name things very well, I guess.

01:06:41.780 | - Can we dig into what MLIR is?

01:06:45.260 | - Yeah, so when you look at compilers,

01:06:47.740 | compilers have historically been

01:06:49.900 | solutions for a given space.

01:06:51.700 | So LLVM is a, it's really good for dealing with CPUs,

01:06:56.580 | let's just say, at a high level.

01:06:58.100 | You look at Java, Java has a JVM.

01:07:01.620 | The JVM is very good for garbage collected languages

01:07:04.300 | that need dynamic compilation,

01:07:05.540 | and it's very optimized for a specific space.

01:07:08.420 | And so Hotspot is one of the compilers

01:07:09.980 | that gets used in that space,

01:07:11.020 | and that compiler is really good at that kind of stuff.

01:07:14.080 | Usually when you build these domain-specific compilers,

01:07:16.740 | you end up building the whole thing from scratch

01:07:19.620 | for each domain.

01:07:20.540 | - What's a domain?

01:07:23.380 | So what's the scope of a domain?

01:07:26.660 | - Well, so here I would say, like, if you look at Swift,

01:07:29.180 | there's several different parts to the Swift compiler,

01:07:31.940 | one of which is covered by the LLVM part of it.

01:07:36.100 | There's also a high-level piece that's specific to Swift,

01:07:39.420 | and there's a huge amount of redundancy

01:07:41.540 | between those two different infrastructures,

01:07:44.060 | and a lot of re-implemented stuff

01:07:46.380 | that is similar but different.

01:07:48.340 | - What does LLVM define?

01:07:50.020 | - LLVM is effectively an infrastructure,

01:07:53.020 | so you can mix and match it in different ways.

01:07:55.140 | It's built out of libraries.

01:07:56.060 | You can use it for different things,

01:07:57.620 | but it's really good at CPUs and GPUs.

01:07:59.820 | CPUs and, like, the tip of the iceberg on GPUs.

01:08:02.500 | It's not really great at GPUs.

01:08:04.340 | Okay.

01:08:05.660 | But it turns out-- - And a bunch of languages

01:08:07.060 | that-- - That then use it

01:08:08.900 | to talk to CPUs. - Got it.

01:08:11.060 | - And so it turns out there's a lot of hardware out there

01:08:13.100 | that is custom accelerators,

01:08:14.820 | so machine learning, for example.

01:08:16.140 | There are a lot of matrix multiply accelerators

01:08:18.780 | and things like this.

01:08:20.580 | There's a whole world of hardware synthesis,

01:08:22.820 | so we're using MLIR to build circuits, okay?

01:08:27.180 | And so you're compiling for a domain of transistors,

01:08:30.860 | and so what MLIR does is it provides

01:08:32.460 | a tremendous amount of compiler infrastructure

01:08:34.460 | that allows you to build these domain-specific compilers

01:08:37.500 | in a much faster way and have the result be good.

01:08:41.900 | - If we're thinking about the future,

01:08:44.380 | now we're talking about, like, ASICs, so anything?

01:08:46.900 | - Yeah, yeah.

01:08:47.740 | - So if we project into the future,

01:08:50.540 | it's very possible that the number of these kinds of ASICs,

01:08:54.460 | very specific infrastructure thing,

01:08:59.460 | the architecture things, like, multiplies exponentially.

01:09:04.740 | - I hope so, yeah.

01:09:06.340 | - So that's MLIR--

01:09:08.620 | - So what MLIR does is it allows you

01:09:10.780 | to build these compilers very efficiently, right?

01:09:13.500 | Now, one of the things that, coming back to the LLVM thing,

01:09:16.740 | and then we'll go to hardware,

01:09:17.980 | is LLVM is a specific compiler for a specific domain.

01:09:22.980 | MLIR is now this very general, very flexible thing

01:09:26.900 | that can solve lots of different kinds of problems.

01:09:29.300 | So LLVM is a subset of what MLIR does.

01:09:32.420 | - So MLIR is, I mean, it's an ambitious project then.

01:09:35.380 | - Yeah, it's a very ambitious project, yeah.

01:09:37.020 | And so to make it even more confusing,

01:09:39.860 | MLIR has joined the LLVM Umbrella Project,

01:09:42.460 | so it's part of the LLVM family.

01:09:45.140 | But where this comes full circle is now folks

01:09:47.620 | that work on the LLVM part,

01:09:49.380 | the classic part that's 20 years old,

01:09:51.980 | aren't aware of all the cool new things

01:09:54.100 | that have been done in the new thing,

01:09:56.140 | that MLIR was built by me and many other people

01:09:59.620 | that knew a lot about LLVM,

01:10:01.860 | and so we fixed a lot of the mistakes that lived in LLVM.

01:10:05.140 | And so now you have this community dynamic

01:10:07.100 | where it's like, well, there's this new thing,

01:10:08.540 | but it's not familiar, nobody knows it,

01:10:10.340 | it feels like it's new, and so let's not trust it.

01:10:12.860 | And so it's just really interesting

01:10:13.980 | to see the cultural social dynamic that comes out of that.

01:10:16.900 | And I think it's super healthy

01:10:19.500 | because we're seeing the ideas percolate

01:10:21.540 | and we're seeing the technology diffusion happen

01:10:23.980 | as people get more comfortable with it,

01:10:25.260 | they start to understand things in their own terms.

01:10:27.220 | And this just gets to the,

01:10:28.820 | it takes a while for ideas to propagate,

01:10:31.220 | even though they may be very different

01:10:33.980 | than what people are used to.

01:10:35.260 | - Maybe let's talk about that a little bit,

01:10:37.220 | the world of ASICs.

01:10:38.740 | Well, actually, you have a new role at SciFive.

01:10:43.740 | What's that place about?

01:10:47.420 | What is the vision? - Sure.

01:10:49.460 | - For their vision for, I would say,

01:10:51.820 | the future of computing.

01:10:53.220 | - So I lead the engineering and product teams at SciFive.

01:10:55.940 | SciFive is a company who was founded

01:10:59.700 | with this architecture called RISC-V.

01:11:02.620 | RISC-V is a new instruction set.

01:11:04.420 | Instruction sets are the things inside of your computer

01:11:06.300 | that tell it how to run things.

01:11:08.420 | x86 from Intel and ARM from the ARM company

01:11:12.060 | and things like this are other instruction sets.

01:11:13.900 | - I've talked to, sorry to interrupt,

01:11:15.020 | I've talked to Dave Patterson,

01:11:16.020 | who's super excited about RISC-V.

01:11:17.980 | - Dave is awesome.

01:11:18.900 | - Yeah, he's brilliant.

01:11:20.540 | - The RISC-V is distinguished by not being proprietary.

01:11:23.700 | And so x86 can only be made by Intel and AMD,

01:11:28.820 | ARM can only be made by ARM,

01:11:30.380 | they sell licenses to build ARM chips to other companies,

01:11:33.340 | things like this.

01:11:34.460 | Mips is another instruction set

01:11:35.540 | that is owned by the Mips company, now Wave,

01:11:38.300 | and then it gets licensed out, things like that.

01:11:40.860 | And so RISC-V is an open standard

01:11:43.340 | that anybody can build chips for.

01:11:45.140 | And so SciFive was founded by three of the founders

01:11:48.220 | of RISC-V that designed and built it in Berkeley,

01:11:51.580 | working with Dave.

01:11:52.860 | And so that was the genesis of the company.

01:11:55.780 | SciFive today has some of the world's best RISC-V cores

01:11:59.060 | and we're selling them and that's really great.

01:12:01.420 | They're going to tons of products, it's very exciting.

01:12:04.060 | - So they're taking this thing that's open source

01:12:06.100 | and just trying to be or are the best in the world

01:12:09.620 | at building these things.

01:12:10.780 | - Yeah, so here it's the specifications open source.

01:12:13.260 | It's like saying TCP/IP is an open standard

01:12:15.940 | or C is an open standard,

01:12:18.020 | but then you have to build an implementation

01:12:19.620 | of the standard.

01:12:20.780 | And so SciFive, on the one hand, pushes forward

01:12:23.660 | and defined and pushes forward the standard.

01:12:26.260 | On the other hand, we have implementations

01:12:28.100 | that are best in class for different points in the space,

01:12:30.980 | depending on if you want a really tiny CPU

01:12:33.620 | or if you want a really big beefy one that is faster,

01:12:36.940 | but it uses more area and things like this.

01:12:38.820 | - What about the actual manufacturer?

01:12:41.220 | So like what, where does that all fit?

01:12:43.540 | I'm gonna ask a bunch of dumb questions.

01:12:45.300 | - That's okay, this is how we learn, right?

01:12:48.140 | And so the way this works is that there's generally

01:12:52.500 | a separation of the people who design the circuits

01:12:55.100 | and then the people who manufacture them.

01:12:56.820 | And so you'll hear about fabs like TSMC and Samsung

01:13:00.740 | and things like this that actually produce the chips,

01:13:03.780 | but they take a design coming in

01:13:05.820 | and that design specifies how the, you know,

01:13:09.940 | you turn code for the chip into little rectangles

01:13:14.940 | that then use photolithography to make mask sets

01:13:20.260 | and then burn transistors onto a chip

01:13:22.260 | or onto silicon rather.

01:13:24.700 | - So, and we're talking about mass manufacturing, so.

01:13:28.340 | - Yeah, they're talking about making hundreds

01:13:29.580 | of millions of parts and things like that, yeah.

01:13:31.340 | And so the fab handles the volume production,

01:13:33.540 | things like that.

01:13:34.620 | But when you look at this problem,

01:13:36.340 | the interesting thing about the space when you look at it

01:13:39.700 | is that these, the steps that you go from designing a chip

01:13:44.340 | and writing the quote unquote code for it

01:13:46.260 | and things like Verilog and languages like that,

01:13:49.180 | down to what you hand off to the fab

01:13:51.620 | is a really well-studied, really old problem, okay?

01:13:56.200 | Tons of people have worked on it.

01:13:57.540 | Lots of smart people have built systems and tools.

01:14:00.540 | These tools then have generally gone through acquisitions.

01:14:03.460 | And so they've ended up at three different major companies

01:14:06.140 | that build and sell these tools.

01:14:07.740 | They're called EDA tools,

01:14:08.940 | like for electronic design automation.

01:14:11.620 | The problem with this is you have huge amounts

01:14:13.180 | of fragmentation, you have loose standards,

01:14:17.140 | and the tools don't really work together.

01:14:20.020 | So you have tons of duct tape

01:14:21.300 | and you have tons of lost productivity.

01:14:24.220 | - Now, these are tools for designing.

01:14:26.700 | So the RISC-V is a instruction, like what is RISC-V?

01:14:31.700 | Like how deep does it go?

01:14:33.300 | How much does it touch the hardware?

01:14:35.860 | How much does it define how much of the hardware is?

01:14:38.420 | - Yeah, so RISC-V is all about, given a CPU,

01:14:41.860 | so the processor in your computer,

01:14:44.860 | how does the compiler, like the Swift compiler,

01:14:47.380 | the C compiler, things like this, how does it make it work?

01:14:50.460 | So it's what is the assembly code?

01:14:52.660 | And so you write RISC-V assembly

01:14:54.140 | instead of X86 assembly, for example.

01:14:57.060 | - But it's a set of instructions as opposed to--

01:14:59.020 | - A set of instructions, yeah.

01:15:00.060 | - What do you say, it tells you how the compiler works?

01:15:03.660 | - Sorry, it's what the compiler talks to.

01:15:05.380 | - Okay. - Yeah.

01:15:06.220 | - And then the tooling you mentioned,

01:15:08.500 | the disparate tools are for what?

01:15:10.700 | - For when you're building a specific chip.

01:15:13.340 | So RISC-V-- - In hardware.

01:15:14.900 | - In hardware, yeah.

01:15:15.740 | So RISC-V, you can buy a RISC-V core from Sci-5

01:15:19.140 | and say, "Hey, I wanna have a certain number of,

01:15:21.580 | "run a certain number of gigahertz.

01:15:23.300 | "I want it to be this big.

01:15:24.580 | "I want it to have these features.

01:15:26.740 | "I wanna have, like, I want floating point or not,"

01:15:29.860 | for example.

01:15:30.820 | And then what you get is you get a description of a CPU

01:15:34.700 | with those characteristics.

01:15:36.620 | Now, if you wanna make a chip,

01:15:38.140 | you wanna build like an iPhone chip or something like that,

01:15:41.020 | right, you have to take both the CPU,

01:15:42.740 | but then you have to talk to memory,

01:15:44.420 | you have to have timers, IOs, a GPU, other components.

01:15:49.300 | And so you need to pull all those things together

01:15:51.380 | into what's called an ASIC,

01:15:53.860 | an application-specific integrated circuit,

01:15:55.500 | so a custom chip.

01:15:56.860 | And then you take that design,

01:15:58.980 | and then you have to transform it into something

01:16:00.860 | that the fabs, like TSMC, for example,

01:16:03.940 | know how to take to production.

01:16:06.740 | - Got it.

01:16:07.580 | So, but yeah, okay.

01:16:08.580 | - And so that process, I will,

01:16:10.580 | I can't help but see it as, is a big compiler.

01:16:15.380 | (Dave laughs)

01:16:16.940 | It's a whole bunch of compilers written

01:16:18.820 | without thinking about it through that lens.

01:16:21.380 | - Isn't the universe a compiler?

01:16:23.740 | (laughs)

01:16:24.580 | - Yeah, like compilers do two things.

01:16:26.820 | They represent things and transform them.

01:16:29.100 | And so there's a lot of things that end up being compilers.

01:16:31.820 | But this is a space where we're talking about design

01:16:34.700 | and usability and the way you think about things,

01:16:37.460 | the way things compose correctly, it matters a lot.

01:16:40.900 | And so Sci-5 is investing a lot into that space.

01:16:43.460 | And we think that there's a lot of benefit

01:16:45.900 | that can be made by allowing people

01:16:47.460 | to design chips faster, get them to market quicker

01:16:50.340 | and scale out because, you know,

01:16:53.860 | the alleged end of Moore's law,

01:16:56.420 | you've got this problem of you're not getting

01:16:59.260 | free performance just by waiting another year

01:17:01.980 | for a faster CPU.

01:17:03.540 | And so you have to find performance in other ways.

01:17:06.540 | And one of the ways to do that is with custom accelerators

01:17:09.060 | and other things in hardware.

01:17:10.660 | - And so, well, we'll talk a little bit about,

01:17:16.180 | a little more about ASICs,

01:17:17.420 | but do you see that a lot of people,

01:17:21.980 | a lot of companies will try to have

01:17:25.020 | like different sets of requirements

01:17:26.980 | that this whole process to go for?

01:17:28.380 | So like almost different car companies might use different

01:17:32.620 | and like different PC manufacturers.

01:17:35.220 | Like, so is this, like is RISC-V in this whole process,

01:17:40.220 | is it potentially the future of all computing devices?

01:17:44.820 | - Yeah, I think that, so if you look at RISC-V

01:17:47.420 | and step back from the Silicon side of things,

01:17:49.620 | RISC-V is an open standard.

01:17:51.540 | And one of the things that has happened

01:17:53.860 | over the course of decades,

01:17:55.420 | if you look over the long arc of computing,

01:17:57.780 | somehow became decades old.

01:17:59.220 | - Yeah.

01:18:00.060 | - Is that you have companies that come and go

01:18:02.660 | and you have instruction sets that come and go.

01:18:04.860 | Like one example of this out of many is Sun with Spark.

01:18:09.860 | - Yeah.

01:18:10.740 | - Sun went away, Spark still lives on at Fujitsu,

01:18:12.980 | but we have HP had this instruction set called PA-RISC.

01:18:17.300 | So PA-RISC was its big server business

01:18:21.020 | and had tons of customers.

01:18:22.900 | They decided to move to this architecture

01:18:25.100 | called Itanium from Intel.

01:18:27.180 | - Yeah.

01:18:28.020 | - This didn't work out so well.

01:18:29.620 | - Yeah.

01:18:30.460 | - Right, and so you have this issue

01:18:32.180 | of you're making many billion dollar investments

01:18:35.380 | on instruction sets that are owned by a company.

01:18:38.220 | And even companies as big as Intel

01:18:39.740 | don't always execute as well as they could.

01:18:42.460 | They have their own issues.

01:18:43.860 | HP, for example, decided that it wasn't

01:18:46.700 | in their best interest to continue investing in the space

01:18:48.620 | 'cause it was very expensive.

01:18:49.700 | And so they make technology decisions

01:18:52.180 | or they make their own business decisions.

01:18:54.180 | And this means that as a customer, what do you do?

01:18:57.860 | You've sunk all this time, all this engineering,

01:18:59.680 | all this software work, all these,

01:19:01.300 | you've built other products around them

01:19:02.540 | and now you're stuck, right?

01:19:05.020 | What RISC-V does is it provides you

01:19:06.700 | more optionality in the space,

01:19:08.260 | because if you buy an implementation of RISC-V from Sci-5,

01:19:12.620 | and you should, they're the best ones.

01:19:14.380 | - Yeah.

01:19:15.200 | - But if something bad happens to Sci-5 in 20 years, right?

01:19:19.420 | Well, great, you can turn around

01:19:21.180 | and buy a RISC-V core from somebody else.

01:19:23.300 | And there's an ecosystem of people

01:19:24.980 | that are all making different RISC-V cores

01:19:26.580 | with different trade-offs,

01:19:28.100 | which means that if you have more than one requirement,

01:19:30.580 | if you have a family of products,

01:19:31.900 | you can probably find something in the RISC-V space

01:19:34.660 | that fits your needs.

01:19:35.980 | Whereas if you're talking about XA6, for example,

01:19:39.340 | Intel's only gonna bother to make

01:19:41.700 | certain classes of devices, right?

01:19:45.020 | - I see, so maybe a weird question,

01:19:47.700 | but if Sci-5 is infinitely successful

01:19:52.700 | in the next 20, 30 years, what does the world look like?

01:19:58.060 | So how does the world of computing change?

01:20:01.860 | - So too much diversity in hardware instruction sets,

01:20:05.300 | I think is bad.

01:20:06.540 | Like we have a lot of people that are using

01:20:08.660 | lots of different instruction sets,

01:20:10.980 | particularly in the embedded,

01:20:12.220 | the like very tiny microcontroller space,

01:20:14.340 | the thing in your toaster,

01:20:15.580 | that are just weird and different for historical reasons.

01:20:21.060 | And so the compilers and the tool chains

01:20:23.100 | and the languages on top of them aren't there, right?

01:20:27.220 | And so the developers for that software

01:20:29.220 | have to use really weird tools

01:20:31.060 | because the ecosystem that supports is not big enough.

01:20:34.220 | So I expect that will change, right?

01:20:35.460 | People will have better tools and better languages,

01:20:38.020 | better features everywhere that then can service

01:20:40.340 | many different points in the space.

01:20:42.100 | And I think RISC-V will progressively

01:20:46.300 | eat more of the ecosystem because it can scale up,

01:20:49.420 | it can scale down, sideways, left, right.

01:20:51.580 | It's very flexible and very well considered

01:20:53.820 | and well-designed instruction set.

01:20:56.380 | I think when you look at Sci-5 tackling silicon

01:20:58.780 | and how people build chips,

01:20:59.980 | which is a very different space,

01:21:03.940 | that's where you say,

01:21:05.140 | I think we'll see a lot more custom chips.

01:21:07.500 | And that means that you get much more battery life,

01:21:09.780 | you get better tuned solutions for your IoT thingy.

01:21:14.780 | So you get people that move faster,

01:21:18.220 | you get the ability to have faster time to market,

01:21:20.660 | for example.

01:21:21.500 | - So how many custom, so first of all,

01:21:23.660 | on the IoT side of things,

01:21:24.980 | do you see the number of smart toasters

01:21:29.020 | increasing exponentially?

01:21:30.220 | So, (laughs)

01:21:32.420 | and if you do, how much customization per toaster is there?

01:21:38.820 | Do all toasters in the world run the same silicon,

01:21:42.660 | like the same design?

01:21:44.020 | Or is it different companies have different design?

01:21:46.020 | Like how much customization is possible here?

01:21:49.700 | - Well, a lot of it comes down to cost.

01:21:52.220 | Right, and so the way that chips work

01:21:54.180 | is you end up paying by the,

01:21:56.100 | one of the factors is the size of the chip.

01:21:59.700 | And so what ends up happening

01:22:01.400 | just from an economic perspective is,

01:22:03.200 | there's only so many chips that get made in a year

01:22:05.860 | of a given design.

01:22:07.340 | And so often what customers end up having to do

01:22:10.260 | is they end up having to pick up a chip that exists

01:22:12.260 | that was built for somebody else

01:22:14.140 | so that they can then ship their product.

01:22:16.540 | And the reason for that is they don't have the volume

01:22:18.340 | of the iPhone, they can't afford to build a custom chip.

01:22:21.700 | However, what that means is they're now buying

01:22:23.820 | an off the shelf chip that isn't really good,

01:22:26.900 | isn't a perfect fit for their needs,

01:22:28.260 | and so they're paying a lot of money for it

01:22:30.060 | because they're buying silicon that they're not using.

01:22:33.500 | Well, if you now reduce the cost of designing the chip,

01:22:36.600 | now you get a lot more chips.

01:22:37.780 | And the more you reduce it, the easier it is to design chips.

01:22:41.500 | The more the world keeps evolving

01:22:44.300 | and we get more AI accelerators, we get more other things,

01:22:46.740 | we get more standards to talk to, we get 6G, right?

01:22:50.940 | You get changes in the world that you wanna be able

01:22:53.820 | to talk to these different things.

01:22:54.780 | There's more diversity in the cross product of features

01:22:57.220 | that people want, and that drives differentiated chips

01:23:01.620 | in another direction.

01:23:03.300 | And so nobody really knows what the future looks like,

01:23:05.620 | but I think that there's a lot of silicon in the future.

01:23:09.780 | - Speaking of the future,

01:23:11.180 | you said Moore's law allegedly is dead.

01:23:13.740 | So do you agree with Dave Patterson and many folks

01:23:18.740 | that Moore's law is dead?

01:23:22.080 | Or do you agree with Jim Keller,

01:23:26.180 | who's standing at the helm of the pirate ship

01:23:28.660 | saying it's-- - Still alive.

01:23:30.740 | - It's still alive.

01:23:31.660 | - Yeah, well, so I agree with what they're saying

01:23:35.740 | and different people are interpreting

01:23:37.820 | the end of Moore's law in different ways.

01:23:39.740 | So Jim would say, there's another 1000X left in physics

01:23:44.220 | and we can continue to squeeze the stone

01:23:46.940 | and make it faster and smaller and smaller geometries

01:23:50.100 | and all that kind of stuff.

01:23:51.420 | He's right.

01:23:53.540 | So Jim is absolutely right

01:23:55.260 | that there's a ton of progress left

01:23:57.860 | and we're not at the limit of physics yet.

01:23:59.960 | That's not really what Moore's law is though.

01:24:03.980 | If you look at what Moore's law is,

01:24:06.660 | is that it's a very simple evaluation of,

01:24:10.700 | okay, well, you look at the cost for,

01:24:13.620 | I think it was cost per area

01:24:15.020 | and the most economic point in that space.

01:24:17.060 | And if you go look at the now quite old paper

01:24:20.060 | that describes this,

01:24:21.900 | Moore's law has a specific economic aspect to it.

01:24:25.500 | And I think this is something that Dave

01:24:26.780 | and others often point out.

01:24:28.260 | And so on a technicality, that's right.

01:24:30.580 | I look at it from,

01:24:33.340 | so I can acknowledge both of those viewpoints.

01:24:35.020 | - They're both right.

01:24:35.860 | - They're both right.

01:24:36.700 | I'll give you a third wrong viewpoint

01:24:39.220 | that may be right in its own way,

01:24:40.340 | which is single threaded performance

01:24:43.060 | doesn't improve like it used to.

01:24:46.060 | And it used to be back when you got a,

01:24:48.500 | you know, a Pentium 66 or something.

01:24:50.640 | And the year before you had a Pentium 33

01:24:53.820 | and now it's twice as fast, right?

01:24:56.740 | Well, it was twice as fast at doing exactly the same thing.

01:25:00.380 | Okay.

01:25:01.220 | Like literally the same program ran twice as fast.

01:25:03.820 | You just wrote a check and waited a year, year and a half.

01:25:07.020 | Well, so that's what a lot of people think about Moore's law.

01:25:10.100 | And I think that is dead.

01:25:11.820 | And so what we're seeing instead is we're pushing,

01:25:15.260 | we're pushing people to write software in different ways.

01:25:17.260 | And so we're pushing people to write CUDA

01:25:19.060 | so they can get GPU compute

01:25:20.960 | and the thousands of cores on GPU.

01:25:23.400 | We're talking about C programmers having to use P threads

01:25:26.360 | because they now have, you know, a hundred threads

01:25:29.120 | or 50 cores in a machine or something like that.

01:25:31.960 | You're not talking about machine learning accelerators.

01:25:33.680 | They're now domain specific.

01:25:35.080 | And when you look at these kinds of use cases,

01:25:38.460 | you can still get performance.

01:25:40.440 | And Jim will come up with cool things

01:25:42.640 | that utilize the Silicon in new ways for sure.

01:25:45.760 | But you're also gonna change the programming model.

01:25:48.400 | - Right.

01:25:49.240 | - And now when you start talking about

01:25:50.060 | changing the programming model,

01:25:50.940 | that's when you come back to languages

01:25:53.060 | and things like this too,

01:25:54.020 | because often what you see is like,

01:25:58.020 | you take the C programming language, right?

01:25:59.820 | The C programming language is designed for CPUs.

01:26:02.220 | And so if you wanna talk to a GPU,

01:26:04.980 | now you're talking to its cousin CUDA.

01:26:08.140 | Okay, CUDA is a different thing

01:26:10.540 | with a different set of tools, a different world,

01:26:12.860 | a different way of thinking.

01:26:14.380 | And we don't have one world that scales.

01:26:16.940 | And I think that we can get there.

01:26:18.440 | We can have one world that scales in a much better way.

01:26:21.040 | - On a small tangent then,

01:26:22.480 | I think most programming languages are designed

01:26:24.720 | for CPUs for a single core, even just in their spirit,

01:26:28.920 | even if they allow for parallelization.

01:26:30.480 | So what does it look like for a programming language

01:26:34.160 | to have parallelization or massive parallelization

01:26:38.660 | as its like first principle?

01:26:41.320 | - So the canonical example of this

01:26:43.520 | is the hardware design world.

01:26:46.400 | So Verilog, VHDL, these kinds of languages,

01:26:50.020 | they're what's called a high-level synthesis language.

01:26:53.500 | This is the thing people design chips in.

01:26:56.860 | And when you're designing a chip,

01:26:58.140 | it's kind of like a brain

01:26:59.780 | where you have infinite parallelism.

01:27:01.860 | Like you've got, you're like laying down transistors.

01:27:05.580 | Transistors are always running, okay?

01:27:08.380 | And so you're not saying run this transistor,

01:27:10.260 | then this transistor, then this transistor.

01:27:12.340 | It's like your brain,

01:27:13.180 | like your neurons are always just doing something.

01:27:15.140 | They're not clocked, right?

01:27:16.820 | They're just doing their thing.

01:27:20.200 | And so when you design a chip or when you design a CPU,

01:27:23.560 | when you design a GPU,

01:27:24.540 | when you design, when you're laying down the transistors,

01:27:27.280 | similarly, you're talking about, well, okay,

01:27:29.000 | well, how do these things communicate?

01:27:31.120 | And so these languages exist.

01:27:32.760 | Verilog is a kind of mixed example of that.

01:27:36.160 | None of these languages are really great.

01:27:37.600 | - Yeah, they're very low level, yeah.

01:27:39.560 | - Yeah, they're very low level

01:27:40.680 | and abstraction is necessary here.

01:27:42.520 | And there's different approaches with that.

01:27:44.520 | And it's itself a very complicated world,

01:27:47.360 | but it's implicitly parallel.

01:27:50.640 | And so having that as the domain that you program towards

01:27:55.640 | makes it so that by default, you get parallel systems.

01:27:59.480 | If you look at CUDA,

01:28:00.320 | CUDA is a point halfway in the space where in CUDA,

01:28:03.680 | when you write a CUDA kernel for your GPU,

01:28:05.940 | it feels like you're writing a scalar program.

01:28:08.100 | So you're like, you have ifs, you have for loops,

01:28:10.000 | stuff like this, you're just writing normal code.

01:28:12.600 | But what happens outside of that in your driver

01:28:14.840 | is that it actually is running you on like

01:28:16.800 | a thousand things at once, right?

01:28:18.880 | And so it's parallel,

01:28:20.560 | but it has pulled it out of the programming model.

01:28:23.060 | And so now you as a programmer are working in a simpler

01:28:27.760 | world and it's solved that for you, right?

01:28:31.520 | - How do you take the language like Swift?

01:28:33.760 | You know, if we think about GPUs, but also ASICs,

01:28:39.040 | maybe if we can dance back and forth

01:28:40.920 | between hardware and software.

01:28:42.520 | (laughs)

01:28:43.680 | Is, you know, how do you design for these features

01:28:46.720 | to be able to program, make it a first class citizen

01:28:50.000 | to be able to do like Swift for TensorFlow,

01:28:53.080 | to be able to do machine learning on current hardware,

01:28:56.640 | but also future hardware like TPUs and all kinds of ASICs

01:29:00.600 | that I'm sure will be popping up more and more.

01:29:02.200 | - Yeah, well, so a lot of this comes down to this whole idea

01:29:05.360 | of having the nuts and bolts underneath the covers

01:29:07.360 | that work really well.

01:29:08.600 | So you need, if you're talking to TPUs,

01:29:10.400 | you need, you know, MLIR, XLA, or one of these compilers

01:29:13.760 | that talks to TPUs to build on top of, okay?

01:29:17.400 | And if you're talking to circuits,

01:29:19.320 | you need to figure out how to lay down the transistors

01:29:21.520 | and how to organize it and how to set up clocking

01:29:23.280 | and like all the domain problems that you get with circuits.

01:29:26.280 | Then you have to decide how to explain it to a human.

01:29:29.800 | What is the UI, right?

01:29:31.840 | And if you do it right, that's a library problem,

01:29:34.480 | not a language problem.

01:29:36.440 | And that works if you have a library or a language

01:29:39.080 | which allows your library to write things

01:29:42.120 | that feel native in the language by implementing libraries,

01:29:45.840 | because then you can innovate in programming models

01:29:49.240 | without having to change your syntax again.

01:29:51.200 | And like you have to invent new code formatting tools

01:29:54.880 | and like all the other things that languages come with.

01:29:57.520 | And this gets really interesting.

01:29:59.920 | And so if you look at the space, the interesting thing,

01:30:03.400 | once you separate out syntax,

01:30:05.840 | becomes what is that programming model?

01:30:07.840 | And so do you want the CUDA style,

01:30:10.240 | I write one program and it runs many places.

01:30:12.760 | Do you want the implicitly parallel model?

01:30:16.800 | How do you reason about that?

01:30:17.760 | How do you give developers, chip architects,

01:30:20.800 | the ability to express their intent?

01:30:24.080 | And that comes into this whole design question

01:30:26.280 | of how do you detect bugs quickly?

01:30:29.160 | So you don't have to tape out a chip

01:30:30.240 | to find out what's wrong, ideally, right?

01:30:32.600 | How do you, and this is a spectrum,

01:30:35.520 | how do you make it so that people feel productive?

01:30:38.520 | So their turnaround time is very quick.

01:30:40.480 | All these things are really hard problems.

01:30:42.440 | And in this world, I think that not a lot of effort

01:30:46.120 | has been put into that design problem

01:30:48.080 | and thinking about the layering and other pieces.

01:30:51.160 | - Well, you've, on the topic of concurrency,

01:30:53.520 | you've written the Swift concurrency manifest.

01:30:55.600 | I think it's kind of interesting.

01:30:57.640 | Anything that has the word manifesto in it

01:31:00.640 | is very interesting.

01:31:02.400 | Can you summarize the key ideas

01:31:04.080 | of each of the five parts you've written about?

01:31:07.400 | - So what is a manifesto?

01:31:08.880 | - Yes.

01:31:09.720 | - How about we start there?

01:31:10.920 | So in the Swift community, we have this problem,

01:31:15.160 | which is on the one hand,

01:31:16.120 | you wanna have relatively small proposals

01:31:19.320 | that you can kind of fit in your head,

01:31:21.440 | you can understand the details at a very fine-grained level

01:31:24.080 | that move the world forward.

01:31:26.000 | But then you also have these big arcs, okay?

01:31:28.880 | And often when you're working on something

01:31:30.800 | that is a big arc, but you're tackling it in small pieces,

01:31:34.080 | you have this question of,

01:31:35.160 | how do I know I'm not doing a random walk?

01:31:37.400 | Where are we going?

01:31:38.720 | Like, how does this add up?

01:31:39.760 | Furthermore, when you start that first,

01:31:42.120 | the first small step, what terminology do you use?

01:31:45.280 | How do we think about it?

01:31:46.560 | What is better and worse in the space?

01:31:47.920 | What are the principles?

01:31:48.760 | What are we trying to achieve?

01:31:50.080 | And so what a manifesto in the Swift community does

01:31:52.080 | is it starts to say, hey, well,

01:31:53.920 | let's step back from the details of everything.

01:31:56.640 | Let's paint a broad picture to talk about how,

01:31:59.720 | what we're trying to achieve.

01:32:01.280 | Let's give an example design point.

01:32:02.760 | Let's try to paint the big picture

01:32:05.280 | so that then we can zero in on the individual steps

01:32:07.400 | and make sure that we're making good progress.

01:32:09.680 | And so the Swift Concurrency Manifesto

01:32:11.240 | is something I wrote three years ago.

01:32:13.880 | It's been a while, maybe more.

01:32:16.240 | Trying to do that for Swift and concurrency.

01:32:18.660 | And it starts with some fairly simple things,

01:32:22.400 | like making the observation that

01:32:24.000 | when you have multiple different computers

01:32:26.720 | or multiple different threads that are communicating,

01:32:28.920 | it's best for them to be asynchronous.

01:32:30.800 | And so you need things to be able to run separately

01:32:34.520 | and then communicate with each other.

01:32:35.840 | And this means asynchrony.

01:32:37.440 | And this means that you need a way

01:32:39.000 | to modeling asynchronous communication.

01:32:41.720 | Many languages have features like this.

01:32:43.720 | Async/await is a popular one.

01:32:45.400 | And so that's what I think is very likely in Swift.

01:32:48.220 | But as you start building this tower of abstractions,

01:32:51.380 | it's not just about how do you write this?

01:32:53.640 | You then reach into the, how do you get memory safety?

01:32:57.460 | Because you want correctness.

01:32:58.360 | You want debuggability and sanity for developers.

01:33:01.680 | And how do you get that memory safety into the language?

01:33:06.620 | So if you take a language like Go or C

01:33:09.000 | or any of these languages,

01:33:10.420 | you get what's called a race condition

01:33:11.920 | when two different threads or Go routines or whatever

01:33:14.920 | touch the same point in memory.

01:33:16.480 | This is a huge, maddening problem to debug

01:33:21.280 | because it's not reproducible generally.

01:33:24.500 | And so there's tools, there's a whole ecosystem

01:33:26.480 | of solutions that built up around this.

01:33:28.320 | But it's a huge problem when you're writing concurrent code.

01:33:31.040 | And so with Swift, this whole value semantics thing

01:33:34.160 | is really powerful there because it turns out

01:33:36.160 | that math and copies actually work

01:33:39.080 | even in concurrent worlds.

01:33:40.680 | And so you get a lot of safety just out of the box,

01:33:43.280 | but there are also some hard problems

01:33:44.640 | and it talks about some of that.

01:33:47.040 | When you start building up to the next level up

01:33:48.840 | and you start talking beyond memory safety,

01:33:50.520 | you have to talk about what is the programmer model?

01:33:52.960 | How does a human think about this?

01:33:54.240 | So a developer that's trying to build a program

01:33:56.760 | think about this and it proposes a really old model

01:34:00.160 | with a new spin called actors.

01:34:02.040 | Actors are about saying we have islands

01:34:05.360 | of single threadedness logically.

01:34:08.120 | So you write something that feels like it's one programming,

01:34:10.660 | one program running in a unit,

01:34:13.200 | and then it communicates asynchronously with other things.

01:34:16.720 | And so making that expressive and natural feel good

01:34:20.840 | be the first thing you reach for and being safe by default

01:34:23.480 | is a big part of the design of that proposal.

01:34:26.600 | When you start going beyond that,

01:34:27.680 | now you start to say, cool,

01:34:28.680 | well, these things that communicate asynchronously,

01:34:31.080 | they don't have to share memory.

01:34:32.980 | Well, if they don't have to share memory

01:34:34.240 | and they're sending messages to each other,

01:34:36.080 | why do they have to be in the same process?

01:34:38.240 | These things should be able to be in different processes

01:34:41.480 | on your machine and why just processes?

01:34:44.040 | Well, why not different machines?

01:34:45.680 | And so now you have a very nice gradual transition

01:34:49.600 | towards distributed programming.

01:34:51.760 | And of course, when you start talking about the big future,

01:34:54.800 | the manifesto doesn't go into it,

01:34:56.980 | but accelerators are things you talk to asynchronously

01:35:01.980 | by sending messages to them.

01:35:03.560 | How do you program those?

01:35:05.820 | Well, that gets very interesting.

01:35:07.720 | That's not in the proposal.

01:35:09.400 | - So, and how much do you wanna make that explicit,

01:35:14.400 | like the control of that whole process

01:35:17.040 | explicit to the programmer?

01:35:18.120 | - Yeah, good question.

01:35:19.240 | So when you're designing any of these kinds of features

01:35:22.880 | or language features or even libraries,

01:35:25.320 | you have this really hard trade-off you have to make,

01:35:27.720 | which is how much is it magic

01:35:29.800 | or how much is it in the human's control?

01:35:32.120 | How much can they predict and control it?

01:35:34.720 | What do you do when the default case is the wrong case?

01:35:38.680 | And so when you're designing a system,

01:35:42.180 | I won't name names, but there are systems

01:35:46.520 | where it's really easy to get started

01:35:51.000 | and then you jump, so let's pick like Logo, okay?

01:35:54.360 | So something like this.

01:35:55.560 | So it's really easy to get started.

01:35:57.120 | It's really designed for teaching kids,

01:35:59.520 | but as you get into it, you hit a ceiling

01:36:02.040 | and then you can't go any higher.

01:36:03.200 | And then what do you do?

01:36:04.080 | Well, you have to go switch to a different world

01:36:05.560 | and rewrite all your code.

01:36:07.160 | And this Logo is a silly example here.

01:36:09.120 | This exists in many other languages.

01:36:11.360 | With Python, you would say like concurrency, right?

01:36:15.260 | So Python has the global interpreter lock,

01:36:17.320 | so threading is challenging in Python.

01:36:19.480 | And so if you start writing a large-scale application

01:36:22.600 | in Python and then suddenly you need concurrency,

01:36:25.140 | you're kind of stuck with a series of bad trade-offs, right?

01:36:28.420 | There's other ways to go where you say like,

01:36:32.240 | voiced all the complexity on the user all at once, right?

01:36:37.040 | And that's also bad in a different way.

01:36:38.800 | And so what I prefer is building a simple model

01:36:43.480 | that you can explain that then has an escape hatch.

01:36:46.960 | So you get in, you have guardrails,

01:36:49.440 | memory safety works like this in Swift

01:36:52.120 | where you can start with,

01:36:53.960 | like by default, if you use all the standard things,

01:36:56.400 | it's memory safe, you're not gonna shoot your foot off.

01:36:58.640 | But if you wanna get a C-level pointer to something,

01:37:02.320 | you can explicitly do that.

01:37:04.320 | - But by default, there's guardrails.

01:37:07.760 | - There's guardrails.

01:37:08.880 | - Okay, so, but like, you know,

01:37:11.120 | whose job is it to figure out

01:37:14.320 | which part of the code is parallelizable?

01:37:17.360 | - So in the case of the proposal, it is the human's job.

01:37:20.960 | So they decide how to architect their application.

01:37:24.200 | And then the runtime and the compiler is very predictable.

01:37:28.120 | And so this is in contrast to,

01:37:31.560 | like there's a long body of work,

01:37:32.920 | including on Fortran for auto-parallelizing compilers.

01:37:35.980 | And this is an example of a bad thing.

01:37:40.160 | So as a compiler person, I can rag on compiler people.

01:37:43.480 | Often compiler people will say,

01:37:45.600 | "Cool, since I can't change the code,

01:37:47.280 | I'm gonna write my compiler

01:37:48.480 | that then takes this unmodified code

01:37:50.040 | and makes it go way faster on this machine."

01:37:52.680 | Okay, application, and so it does pattern matching.

01:37:56.280 | It does like really deep analysis.

01:37:58.320 | Compiler people are really smart.

01:37:59.560 | And so they like wanna like do something

01:38:01.380 | really clever and tricky.

01:38:02.400 | And you get like 10X speed up

01:38:04.120 | by taking like an array of structures

01:38:06.100 | and turn it into a structure of arrays or something,

01:38:08.040 | because it's so much better for memory.

01:38:09.280 | Like there's bodies, like tons of tricks.

01:38:11.640 | - Yeah, they love optimization.

01:38:13.800 | - Yeah, you love optimization.

01:38:14.640 | - Everyone loves optimization.

01:38:15.720 | - Everyone loves it.

01:38:16.560 | Well, and it's this promise of build with my compiler

01:38:19.080 | and your thing goes fast.

01:38:20.720 | Right, but here's the problem.

01:38:22.520 | Lex, you write a program.

01:38:24.680 | You run it with my compiler, it goes fast.

01:38:26.560 | You're very happy.

01:38:27.400 | Wow, it's so much faster than the other compiler.

01:38:29.480 | Then you go and you add a feature to your program

01:38:31.220 | or you refactor some code.

01:38:32.680 | And suddenly you got a 10X loss in performance.

01:38:35.720 | Well, why?

01:38:36.560 | What just happened there?

01:38:37.560 | What just happened there is the heuristic,

01:38:39.840 | the pattern matching, the compiler,

01:38:41.960 | whatever analysis it was doing just got defeated

01:38:43.920 | because you didn't inline a function or something, right?

01:38:48.200 | As a user, you don't know, you don't wanna know.

01:38:50.240 | That was the whole point.

01:38:51.080 | You don't wanna know how the compiler works.

01:38:52.780 | You don't wanna know how the memory hierarchy works.

01:38:54.560 | You don't wanna know how it got parallelized

01:38:56.040 | across all these things.

01:38:57.340 | You wanted that abstract away from you.

01:38:59.840 | But then the magic is lost as soon as you did something

01:39:02.840 | and you fall off a performance cliff.

01:39:05.000 | And now you're in this funny position where,

01:39:07.520 | what do I do?

01:39:08.360 | I don't change my code.

01:39:09.180 | I don't fix that bug.

01:39:10.840 | It costs 10X performance.

01:39:12.280 | Now what do I do?

01:39:13.580 | Well, this is the problem with unpredictable performance.

01:39:15.960 | If you care about performance,

01:39:17.320 | predictability is a very important thing.

01:39:19.480 | And so what the proposal does is it provides

01:39:23.760 | architectural patterns for being able to lay out your code,

01:39:26.680 | gives you full control over that,

01:39:28.320 | makes it really simple so you can explain it.

01:39:30.120 | And then if you wanna scale out in different ways,

01:39:34.720 | you have full control over that.

01:39:36.520 | - So in your sense, the intuition is for a compiler,

01:39:39.400 | it's too hard to do automated parallelization.

01:39:42.520 | Like, you know, 'cause the compilers do stuff automatically

01:39:47.520 | that's incredibly impressive for other things.

01:39:49.860 | - Right.

01:39:50.700 | - But for parallelization, we're not even,

01:39:53.420 | we're not close to there.

01:39:54.580 | - Well, it depends on the programming model.

01:39:56.220 | So there's many different kinds of compilers.

01:39:58.420 | And so if you talk about like a C compiler,

01:40:00.340 | a Swift compiler, something like that,

01:40:01.900 | where you're writing imperative code,

01:40:04.940 | parallelizing that and reasoning about all the pointers

01:40:07.100 | and stuff like that is a very difficult problem.

01:40:10.100 | Now, if you switch domains,

01:40:12.220 | so there's this cool thing called machine learning, right?

01:40:15.540 | So the machine learning nerds,

01:40:17.620 | among other endearing things like, you know,

01:40:19.420 | solving cat detectors and other things like that,

01:40:22.080 | have done this amazing breakthrough

01:40:25.380 | of producing a programming model,

01:40:27.520 | operations that you compose together,

01:40:29.380 | that has raised the levels of abstraction high enough

01:40:33.160 | that suddenly you can have auto-parallelizing compilers.

01:40:36.740 | You can write a model using TensorFlow

01:40:39.580 | and have it run on 1,024 nodes of a TPU.

01:40:43.420 | - Yeah, that's true.

01:40:44.260 | I didn't even think about like, you know,

01:40:46.860 | 'cause there's so much flexibility

01:40:48.180 | in the design of architectures

01:40:49.580 | that ultimately boil down to a graph

01:40:51.420 | that's parallelizable for you, parallelized for you.

01:40:54.160 | - And if you think about it, that's pretty cool.

01:40:56.620 | - That's pretty cool, yeah.

01:40:57.620 | - And you think about batching, for example,

01:40:59.740 | as a way of being able to exploit more parallelism.

01:41:02.180 | - Yeah.

01:41:03.020 | - Like that's a very simple thing

01:41:03.860 | that now is very powerful.

01:41:05.380 | That didn't come out of the programming language nerds,

01:41:07.740 | right, those people.

01:41:08.880 | Like that came out of people

01:41:10.100 | that are just looking to solve a problem

01:41:11.440 | and use a few GPUs and organically developed

01:41:14.020 | by the community of people focusing on machine learning.

01:41:16.860 | And it's an incredibly powerful abstraction layer

01:41:19.860 | that enables the compiler people to go and exploit that.

01:41:22.780 | And now you can drive supercomputers from Python.

01:41:26.380 | That's pretty cool.

01:41:27.500 | - That's amazing.

01:41:28.340 | So just to pause on that,

01:41:29.420 | 'cause I'm not sufficiently low level,

01:41:32.260 | I forget to admire the beauty and power of that.

01:41:35.360 | But maybe just to linger on it,

01:41:38.500 | like what does it take to run a neural network fast?

01:41:42.620 | Like how hard is that compilation?

01:41:44.060 | - It's really hard.

01:41:45.660 | - So we just skipped,

01:41:46.900 | you said like it's amazing that that's a thing,

01:41:49.600 | but how hard is that of a thing?

01:41:51.540 | - It's hard, and I would say that

01:41:53.660 | not all of the systems are really great,

01:41:57.180 | including the ones I helped build.

01:41:58.620 | So there's a lot of work left to be done there.

01:42:00.740 | - Is it the compiler nerds working on that

01:42:02.340 | or is it a whole new group of people?

01:42:04.620 | - Well, it's a full stack problem,

01:42:05.900 | including compiler people,

01:42:07.900 | including APIs, so like Keras

01:42:10.140 | and the module API in PyTorch and Jax.

01:42:14.620 | And there's a bunch of people pushing

01:42:15.980 | on all the different parts of these things.

01:42:17.460 | Because when you look at it,

01:42:18.860 | it's both how do I express the computation?

01:42:21.300 | Do I stack up layers?

01:42:22.940 | Well, cool, like setting up a linear sequence of layers

01:42:25.660 | is great for the simple case,

01:42:26.780 | but how do I do the hard case?

01:42:28.260 | How do I do reinforcement learning?

01:42:29.540 | Well, now I need to integrate

01:42:30.380 | my application logic in this, right?

01:42:32.700 | Then it's the next level down of

01:42:34.700 | how do you represent that for the runtime?

01:42:36.700 | How do you get hardware abstraction?

01:42:39.100 | And then you get to the next level down of saying like,

01:42:40.780 | forget about abstraction,

01:42:41.860 | how do I get the peak performance out of my TPU

01:42:44.540 | or my iPhone accelerator or whatever, right?

01:42:47.620 | And all these different things.

01:42:48.980 | And so this is a layered problem

01:42:50.260 | with a lot of really interesting design

01:42:53.100 | and work going on in the space

01:42:54.540 | and a lot of really smart people working on it.

01:42:56.940 | Machine learning is a very well-funded area

01:42:59.460 | of investment right now.

01:43:00.820 | And so there's a lot of progress being made.

01:43:02.940 | - So how much innovation is there on the lower level?

01:43:05.900 | So closer to the ASIC.

01:43:08.220 | So redesigning the hardware

01:43:09.780 | or redesigning concurrently compilers with that hardware.

01:43:13.140 | Is that, if you were to predict the biggest,

01:43:16.060 | the equivalent of Moore's law improvements

01:43:20.540 | in the inference, in the training of neural networks,

01:43:24.620 | in just all of that, where is that gonna come from?

01:43:26.660 | You think?

01:43:27.500 | - Sure, you get scalability, you have different things.

01:43:28.900 | And so you get Jim Keller shrinking process technology,

01:43:33.620 | you get three nanometer instead of five or seven

01:43:36.260 | or 10 or 28 or whatever.

01:43:38.100 | And so that marches forward and that provides improvements.

01:43:41.300 | You get architectural level performance.

01:43:44.060 | And so the TPU with a matrix multiply unit

01:43:47.660 | and a systolic array is much more efficient

01:43:49.620 | than having a scalar core doing multiplies

01:43:52.780 | and adds and things like that.

01:43:54.380 | You then get system level improvements.

01:43:58.620 | So how you talk to memory,

01:43:59.860 | how you talk across a cluster of machines,

01:44:02.340 | how you scale out, how you have fast interconnects

01:44:04.820 | between machines.

01:44:06.060 | You then get system level programming models.

01:44:08.780 | So now that you have all this hardware, how do you utilize it?

01:44:11.300 | You then have algorithmic breakthroughs

01:44:12.860 | where you say, "Hey, wow, cool.

01:44:14.380 | Instead of training in a ResNet-50 in a week,

01:44:18.900 | I'm now training it in 25 seconds."

01:44:21.580 | - Yeah, and opening that--

01:44:22.420 | - And it's a combination of new optimizers

01:44:27.020 | and new just training regimens

01:44:29.700 | and different approaches to train.

01:44:32.180 | And all of these things come together

01:44:34.060 | to push the world forward.

01:44:36.100 | - That was a beautiful exposition.

01:44:39.140 | But if you were to force to bet all your money

01:44:42.820 | on one of these, would you?

01:44:45.300 | - Why do we have to?

01:44:46.300 | Unfortunately, we have people working on all this.

01:44:50.780 | It's an exciting time, right?

01:44:52.260 | - So, I mean, you know, OpenAI did this little paper

01:44:56.180 | showing the algorithmic improvement you can get

01:44:58.060 | has been improving exponentially.

01:45:00.940 | I haven't quite seen the same kind of analysis

01:45:04.260 | on other layers of the stack.

01:45:06.740 | I'm sure it's also improving significantly.

01:45:09.340 | I just, it's a nice intuition builder.

01:45:12.340 | I mean, there's a reason why Moore's law,

01:45:16.140 | that's the beauty of Moore's law,

01:45:17.340 | is somebody writes a paper

01:45:19.500 | that makes a ridiculous prediction.

01:45:21.420 | And it, you know, becomes reality in a sense.

01:45:27.180 | There's something about these narratives

01:45:28.900 | when you, when Chris Lattner on a silly little podcast

01:45:33.620 | makes, bets all his money on a particular thing,

01:45:37.260 | somehow it can have a ripple effect

01:45:39.260 | of actually becoming real.

01:45:40.780 | That's an interesting aspect of it.

01:45:43.300 | 'Cause like, it might've been, you know,

01:45:46.060 | we focus with Moore's law,

01:45:47.540 | most of the computing industry really, really focused

01:45:51.460 | on the hardware.

01:45:52.740 | I mean, software innovation,

01:45:55.100 | I don't know how much software innovation

01:45:56.500 | there was in terms of efficiency.

01:45:57.340 | - You can tell giveth, bill takes away.

01:45:59.140 | (laughing)

01:46:00.180 | - Yeah.

01:46:01.300 | I mean, compilers improved significantly also.

01:46:04.100 | - Well, not really.

01:46:04.940 | So actually, I mean, I'm joking

01:46:06.900 | about how software's gotten slower

01:46:09.020 | pretty much as fast as hardware got better,

01:46:11.620 | at least through the nineties.

01:46:13.260 | There's another joke, another law in compilers,

01:46:15.820 | which is called, I think it's called Probstein's law,

01:46:18.260 | which is compilers double the performance

01:46:21.700 | of any given code every 18 years.

01:46:23.820 | (laughing)

01:46:26.300 | - So they move slowly.

01:46:27.980 | - Yeah, well, so-

01:46:28.820 | - Well, yeah, it's exponential also.

01:46:31.060 | - Yeah, but you're making progress.

01:46:32.500 | But there, again, it's not about,

01:46:34.300 | the power of compilers is not just about

01:46:37.820 | how do you make the same thing go faster?

01:46:39.180 | It's how do you unlock the new hardware?

01:46:41.900 | A new chip came out, how do you utilize it?

01:46:43.660 | You say, oh, the programming model,

01:46:45.260 | how do we make people more productive?

01:46:47.100 | How do we, like, have better error messages?

01:46:52.060 | Even such mundane things, like,

01:46:54.220 | how do I generate a very specific error message

01:46:57.300 | about your code, actually makes people happy

01:46:59.900 | because then they know how to fix it, right?

01:47:01.940 | And it comes back to how do you help people

01:47:03.660 | get their job done?

01:47:04.660 | - Yeah, and yeah, and then in this world

01:47:06.820 | of exponentially increasing smart toasters,

01:47:10.340 | how do you expand computing to all these kinds of devices?

01:47:15.340 | Do you see this world where just everything's

01:47:18.580 | a computing surface?

01:47:20.460 | You see that possibility?

01:47:22.180 | Just everything's a computer?

01:47:24.020 | - Yeah, I don't see any reason

01:47:25.140 | that that couldn't be achieved.

01:47:27.020 | It turns out that sand goes into glass

01:47:30.500 | and glass is pretty useful too.

01:47:32.700 | And, you know, like, why not?

01:47:35.220 | - Why not?

01:47:36.060 | So, very important question then,

01:47:39.580 | if we're living in a simulation

01:47:44.580 | and the simulation is running a computer,

01:47:47.420 | like, what's the architecture of that computer,

01:47:50.020 | do you think?

01:47:51.900 | - So you're saying, is it a quantum system?

01:47:54.540 | Is it a--

01:47:55.380 | - Yeah, like this whole quantum discussion,

01:47:56.700 | is it needed or can we run it on a,

01:47:59.420 | you know, with a RISC-V architecture,

01:48:03.260 | a bunch of CPUs?

01:48:05.300 | - I think it comes down to the right tool for the job.

01:48:07.580 | Okay, and so--

01:48:08.660 | - And what's the compiler?

01:48:10.140 | - Yeah, exactly, that's my question.

01:48:12.580 | How do I get that job?

01:48:13.700 | Be the universe compiler.

01:48:14.940 | And so there, as far as we know,

01:48:19.740 | quantum systems are the bottom of the pile of turtles

01:48:23.700 | so far.

01:48:24.540 | And so we don't know efficient ways

01:48:28.300 | to implement quantum systems

01:48:29.660 | without using quantum computers.

01:48:31.260 | - Yeah, and that's totally outside

01:48:33.580 | of everything we've talked about.

01:48:35.180 | - But who runs that quantum computer?

01:48:37.060 | - Yeah.

01:48:37.900 | - Right, so if we really are living in a simulation,

01:48:41.460 | then is it bigger quantum computers?

01:48:44.420 | Is it different ones?

01:48:45.260 | Like, how does that work out?

01:48:46.580 | How does that scale?

01:48:47.700 | - Well, it's the same size.

01:48:49.940 | It's the same size.

01:48:50.780 | But then the thought of the simulation

01:48:52.660 | is that you don't have to run the whole thing,

01:48:54.220 | that, you know, we humans are cognitively very limited.

01:48:56.860 | - You do checkpoints.

01:48:57.700 | - You do checkpoints, yeah.

01:48:59.420 | And if we, the point at which we human,

01:49:02.980 | so you basically do minimal amount of,

01:49:06.820 | what is it, Swift does on write?

01:49:10.980 | Copy and run. - Copy and run, yeah.

01:49:12.180 | - So you only adjust the simulation.

01:49:15.500 | - Parallel universe theories, right?

01:49:17.020 | And so every time a decision's made,

01:49:20.540 | somebody opens the Schrodinger box,

01:49:22.060 | then there's a fork, this could happen.

01:49:24.980 | - And then, thank you for considering the possibility.

01:49:29.980 | But yeah, so it may not require, you know,

01:49:32.780 | the entirety of the universe to simulate it.

01:49:34.700 | But it's interesting to think about

01:49:38.900 | as we create these higher and higher fidelity systems.

01:49:43.340 | But I do wanna ask on the quantum computer side,

01:49:46.620 | 'cause everything we've talked about with,

01:49:49.220 | you work with Sci-5, with compilers,

01:49:52.060 | none of that includes quantum computers, right?

01:49:55.060 | - That's true.

01:49:56.060 | - So have you ever thought about what, you know,

01:50:01.060 | this whole serious engineering work

01:50:05.420 | of quantum computers looks like, of compilers,

01:50:08.100 | of architectures, all of that kind of stuff?

01:50:10.660 | - So I've looked at it a little bit.

01:50:11.820 | I know almost nothing about it,

01:50:14.300 | which means that at some point,

01:50:15.540 | I will have to find an excuse to get involved,

01:50:17.860 | 'cause that's how I work.

01:50:18.700 | - But do you think that's a thing to be,

01:50:21.140 | like, with your little tingly senses

01:50:23.420 | of the timing of one to be involved, is it not yet?

01:50:26.860 | - Well, so the thing I do really well

01:50:28.820 | is I jump into messy systems

01:50:31.660 | and figure out how to make them,

01:50:33.700 | figure out what the truth in the situation is,

01:50:35.540 | try to figure out what the unifying theory is,

01:50:39.100 | how to, like, factor the complexity,

01:50:40.980 | how to find a beautiful answer to a problem

01:50:42.860 | that has been well-studied

01:50:44.860 | and lots of people have bashed their heads against it.

01:50:47.060 | I don't know that quantum computers are mature enough

01:50:49.300 | and accessible enough to be figured out yet, right?

01:50:53.740 | And I think the open question with quantum computers is,

01:50:58.580 | is there a useful problem that gets solved

01:51:00.900 | with a quantum computer that makes it worth

01:51:04.100 | the economic cost of, like, having one of these things

01:51:06.740 | and having legions of people that set it up?

01:51:11.500 | You go back to the '50s, right,

01:51:12.780 | and there's the projections of,

01:51:13.980 | the world will only need seven computers, right?

01:51:18.220 | Well, and part of that was that people hadn't figured out

01:51:20.740 | what they're useful for.

01:51:21.980 | What are the algorithms we wanna run?

01:51:23.260 | What are the problems that get solved?

01:51:24.340 | And this comes back to, how do we make the world better,

01:51:27.620 | either economically or making somebody's life better

01:51:29.900 | or, like, solving a problem that wasn't solved before,

01:51:31.940 | things like this.

01:51:33.140 | And I think that just we're a little bit too early

01:51:36.020 | in that development cycle

01:51:36.860 | because it's still, like, literally a science project,

01:51:39.380 | not in a negative connotation, right?

01:51:41.540 | It's literally a science project

01:51:42.860 | and the progress there's amazing.

01:51:45.420 | And so I don't know if it's 10 years away,

01:51:48.900 | if it's two years away,

01:51:50.100 | exactly where that breakthrough happens,

01:51:51.660 | but you look at machine learning,

01:51:54.540 | we went through a few winners

01:51:58.420 | before the AlexNet transition,

01:52:00.180 | and then suddenly it had its breakout moment,

01:52:02.980 | and that was the catalyst that then drove

01:52:05.860 | the talent flocking into it.

01:52:07.580 | That's what drove the economic applications of it.

01:52:10.180 | That's what drove the technology to go faster

01:52:13.420 | because you now have more minds thrown at the problem.

01:52:15.940 | This is what caused a serious knee in deep learning

01:52:20.180 | and the algorithms that we're using.

01:52:22.100 | And so I think that's what quantum needs to go through.

01:52:25.540 | And so right now it's in that formidable finding itself,

01:52:28.820 | getting literally the physics figured out.

01:52:32.700 | - And then it has to figure out the application

01:52:36.100 | that makes this useful.

01:52:37.580 | - Yeah, but I'm not skeptical of that.

01:52:39.860 | I think that will happen.

01:52:40.860 | I think it's just 10 years away, something like that.

01:52:43.500 | - I forgot to ask, what programming language

01:52:46.100 | do you think the simulation is written in?

01:52:48.700 | - Ooh, probably Lisp.

01:52:50.300 | (laughing)

01:52:52.060 | - So not Swift.

01:52:53.020 | Like if you were to bet,

01:52:54.220 | I'll just leave it at that.

01:52:58.100 | So, I mean, we've mentioned that you worked

01:53:00.460 | at all these companies,

01:53:01.460 | we've talked about all these projects.

01:53:03.940 | It's kind of like if we just step back and zoom out

01:53:07.260 | about the way you did that work,

01:53:10.100 | and we look at COVID times,

01:53:12.220 | this pandemic we're living through,

01:53:13.780 | that may, if I look at the way Silicon Valley folks

01:53:17.020 | are talking about it, the way MIT's talking about it,

01:53:19.860 | this might last for a long time.

01:53:23.060 | Not just the virus, but the remote nature.

01:53:28.060 | - The economic impact.

01:53:29.660 | I mean, yeah, it's gonna be a mess.

01:53:32.140 | - Do you think, what's your prediction?

01:53:34.500 | I mean, from Sci-Fi to Google to just all the places

01:53:39.500 | you worked in just Silicon Valley,

01:53:43.380 | you're in the middle of it.

01:53:44.260 | What do you think is, how is this whole place gonna change?

01:53:46.620 | - Yeah, so, I mean, I really can only speak

01:53:49.060 | to the tech perspective.

01:53:50.460 | I am in that bubble.

01:53:52.820 | I think it's gonna be really interesting

01:53:55.700 | because the Zoom culture of being remote

01:53:58.780 | and on video chat all the time

01:54:00.260 | has really interesting effects on people.

01:54:01.980 | So on the one hand, it's a great normalizer.

01:54:05.020 | It's a normalizer that I think will help communities

01:54:09.060 | of people that have traditionally been underrepresented

01:54:12.580 | because now you're taking, in some cases, a face-off

01:54:16.340 | 'cause you don't have to have a camera going, right?

01:54:18.740 | And so you can have conversations

01:54:19.980 | without physical appearance being part of the dynamic,

01:54:22.740 | which is pretty powerful.

01:54:24.500 | You're taking remote employees that have already been remote

01:54:27.020 | and you're saying you're now on the same level

01:54:29.900 | and footing as everybody else.

01:54:31.380 | Nobody gets whiteboards.

01:54:33.460 | You're not gonna be the one person

01:54:34.580 | that doesn't get to be participating

01:54:35.980 | in the whiteboard conversation,

01:54:37.180 | and that's pretty powerful.

01:54:39.300 | You've got, you're forcing people to think asynchronously

01:54:44.100 | in some cases because it's hard

01:54:45.660 | to just get people physically together

01:54:48.140 | and the bumping into each other

01:54:49.380 | forces people to find new ways to solve those problems.

01:54:52.740 | And I think that that leads to more inclusive behavior,

01:54:55.220 | which is good.

01:54:56.740 | On the other hand, it's also, it just sucks, right?

01:55:00.740 | And so-

01:55:02.580 | - The nature, the actual communication,

01:55:05.300 | or it just sucks being not with people

01:55:08.700 | like on a daily basis and collaborating with them?

01:55:11.380 | - Yeah, all of that, right?

01:55:13.060 | I mean, everything, this whole situation is terrible.

01:55:15.620 | What I meant primarily was the,

01:55:17.580 | I think that most humans like working physically with humans.

01:55:22.940 | I think this is something that not everybody,

01:55:24.620 | but many people are programmed to do.

01:55:27.060 | And I think that we get something out of that

01:55:29.180 | that is very hard to express, at least for me.

01:55:31.420 | And so maybe this isn't true of everybody.

01:55:33.100 | But, and so the question to me is,

01:55:36.780 | when you get through that time of adaptation,

01:55:38.980 | you get out of March and April and you get into December

01:55:43.100 | and you get into next March, if it's not changed, right?

01:55:46.500 | - It's already terrifying.

01:55:47.740 | - Well, you think about that and you think about

01:55:49.540 | what is the nature of work and how do we adapt?

01:55:52.620 | And humans are very adaptable species, right?

01:55:54.980 | We can learn things.

01:55:57.100 | And when we're forced to,

01:55:58.140 | and there's a catalyst to make that happen.

01:56:00.500 | And so what is it that comes out of this

01:56:02.620 | and are we better or worse off, right?

01:56:04.660 | I think that, you look at the Bay Area,

01:56:07.100 | housing prices are insane.

01:56:08.860 | Well, why?

01:56:09.820 | Well, there's a high incentive to be physically located

01:56:12.420 | because if you don't have proximity,

01:56:14.980 | you end up paying for it in commute, right?

01:56:18.380 | And there has been huge social pressure

01:56:21.020 | in terms of like, you will be there for the meeting, right?

01:56:24.620 | Or whatever scenario it is.

01:56:26.900 | And I think that's gonna be way better.

01:56:28.220 | I think it's gonna be much more of the norm

01:56:29.980 | to have remote employees.

01:56:31.620 | And I think this is gonna be really great.

01:56:33.180 | - Do you have friends or do you hear of people moving?

01:56:36.500 | - Yeah, I know one family friend that moved.

01:56:40.740 | They moved back to Michigan and, you know,

01:56:43.620 | they were a family with three kids

01:56:45.580 | living in a small apartment and like, we're going insane.

01:56:48.900 | (laughing)

01:56:50.460 | Right, and they're in tech, husband works for Google.

01:56:54.260 | So first of all, friends of mine are in the process of,

01:56:58.100 | or have already lost the business.

01:57:00.580 | The thing that represents their passion, their dream.

01:57:03.180 | It could be small entrepreneur projects,

01:57:05.300 | but it can be large businesses like people that run gyms.

01:57:07.900 | - Oh, restaurants, like tons of things, yeah.

01:57:10.820 | - But also, people like look at themselves in the mirror

01:57:14.140 | and ask the question of like,

01:57:16.180 | what do I wanna do in life?

01:57:17.580 | For some reason, they haven't done it until COVID.

01:57:20.900 | They really ask that question

01:57:22.060 | and that results often in moving or leaving the company

01:57:26.300 | or with starting your own business

01:57:28.100 | or transitioning to different company.

01:57:30.620 | Do you think we're gonna see that a lot?

01:57:33.600 | - Well, I can't speak to that.

01:57:36.780 | I mean, we're definitely gonna see it at a higher frequency

01:57:38.500 | than we did before, just because I think what you're trying

01:57:41.900 | to say is there are decisions that you make yourself

01:57:45.820 | and big life decisions that you make yourself.

01:57:47.860 | And like, I'm gonna like quit my job and start a new thing.

01:57:50.440 | There's also decisions that get made for you.

01:57:52.880 | Like I got fired from my job.

01:57:54.560 | What am I gonna do, right?

01:57:55.860 | And that's not a decision that you think about,

01:57:58.580 | but you're forced to act, okay?

01:58:00.880 | And so I think that those you're forced to act

01:58:03.460 | kind of moments where like, you know,

01:58:05.140 | global pandemic comes and wipes out the economy

01:58:07.220 | and now your business doesn't exist.

01:58:10.400 | I think that does lead to more reflection, right?

01:58:12.340 | Because you're less anchored on what you have

01:58:14.980 | and it's not a, what do I have to lose

01:58:17.580 | versus what do I have to gain, AB comparison.

01:58:20.480 | It's more of a fresh slate.

01:58:22.440 | Cool, I could do anything now.

01:58:24.380 | Do I wanna do the same thing I was doing?

01:58:26.860 | Did that make me happy?

01:58:28.320 | Is this now time to go back to college

01:58:30.000 | and take a class and learn a new skill?

01:58:33.120 | Is this a time to spend time with family?

01:58:36.600 | If you can afford to do that, is this time to like,

01:58:39.000 | you know, literally move in with the parents, right?

01:58:41.000 | I mean, all these things that were not normative before

01:58:43.880 | suddenly become, I think, very, the value systems change.

01:58:48.880 | And I think that's actually a good thing

01:58:50.800 | in the short term at least, because it leads to, you know,

01:58:55.800 | there's kind of been an over-optimization

01:58:58.400 | along one set of priorities for the world.

01:59:01.540 | And now maybe we'll get to a more balanced

01:59:03.520 | and more interesting world

01:59:05.180 | where people are doing different things.

01:59:06.760 | I think it could be good.

01:59:07.640 | I think there could be more innovation

01:59:09.000 | that comes out of it, for example.

01:59:10.120 | - What do you think about all the social chaos

01:59:12.760 | we're in the middle of?

01:59:13.920 | - It sucks.

01:59:14.760 | (laughing)

01:59:17.520 | - Let me ask you, you think it's all gonna be okay?

01:59:21.080 | - Well, I think humanity will survive.

01:59:23.400 | - The form of nexus denture,

01:59:25.400 | like we're not all gonna kill, yeah, well.

01:59:27.280 | - Yeah, I don't think the virus

01:59:28.120 | is gonna kill all the humans.

01:59:30.360 | I don't think all the humans are gonna kill all the humans.

01:59:32.000 | I think that's unlikely.

01:59:32.880 | But I look at it as

01:59:35.560 | progress requires a catalyst, right?

01:59:42.160 | So you need a reason for people

01:59:44.760 | to be willing to do things that are uncomfortable.

01:59:47.720 | I think that the US at least,

01:59:50.720 | but I think the world in general

01:59:51.760 | is a pretty unoptimal place to live in for a lot of people.

01:59:56.760 | And I think that what we're seeing right now

01:59:58.880 | is we're seeing a lot of unhappiness.

02:00:00.440 | And because of all the pressure,

02:00:03.560 | because of all the badness in the world

02:00:05.520 | that's coming together,

02:00:06.340 | it's really kind of igniting some of that debate

02:00:07.840 | that should have happened a long time ago, right?

02:00:10.120 | I mean, I think that we'll see more progress.

02:00:11.600 | If you're asking about,

02:00:12.880 | offline you're asking about politics

02:00:14.240 | and wouldn't it be great if politics moved faster

02:00:15.760 | because there's all these problems in the world

02:00:16.600 | and we can move it.

02:00:18.160 | Well, people are inherently conservative.

02:00:22.320 | And so if you're talking about conservative people,

02:00:25.040 | particularly if they have heavy burdens on their shoulders

02:00:27.480 | 'cause they represent literally thousands of people,

02:00:30.080 | it makes sense to be conservative.

02:00:33.240 | But on the other hand, when you need change,

02:00:35.360 | how do you get it?

02:00:36.240 | The global pandemic will probably lead to some change.

02:00:40.560 | And it's not a directed plan,

02:00:44.320 | but I think that it leads to people

02:00:45.920 | asking really interesting questions.

02:00:47.400 | And some of those questions

02:00:48.240 | should have been asked a long time ago.

02:00:50.120 | - Well, let me know if you've observed this as well.

02:00:53.320 | Something that's bothering me

02:00:54.840 | in the machine learning community,

02:00:56.160 | I'm guessing it might be prevalent in other places,

02:00:59.680 | is something that feels like in 2020

02:01:02.520 | increased level of toxicity.

02:01:05.280 | Like people are just quicker to pile on

02:01:09.720 | to just be harsh on each other,

02:01:13.280 | to like mob, pick a person that screwed up

02:01:18.280 | and like make it a big thing.

02:01:22.080 | And is there something that we can like,

02:01:25.520 | have you observed that in other places?

02:01:28.240 | Is there some way out of this?

02:01:30.200 | - I think there's an inherent thing in humanity

02:01:32.200 | that's kind of an us versus them thing,

02:01:34.480 | which is that you wanna succeed.

02:01:36.240 | And how do you succeed?

02:01:37.160 | Well, it's relative to somebody else.

02:01:39.640 | And so what's happening, at least in some part,

02:01:43.160 | is that with the internet and with online communication,

02:01:47.160 | the world's getting smaller.

02:01:48.560 | Right, and so we're having some of the social ties

02:01:53.080 | of like my town versus your town's football team,

02:01:56.480 | right, turn into much larger and yet shallower problems.

02:02:02.360 | And people don't have time, the incentives,

02:02:05.680 | the clickbait and like all these things

02:02:08.080 | kind of really, really feed into this machine.

02:02:10.520 | And I don't know where that goes.

02:02:12.480 | - Yeah, I mean, the reason I think about that,

02:02:14.760 | I mentioned to you this offline a little bit,

02:02:17.520 | but I have a few difficult conversations scheduled,

02:02:22.520 | some of them political related,

02:02:25.120 | some of them within the community,

02:02:27.320 | difficult personalities that went through some stuff.

02:02:30.640 | I mean, one of them I've talked before,

02:02:32.160 | I will talk again is Yann LeCun.

02:02:34.320 | He got a little bit of crap on Twitter

02:02:37.200 | for talking about a particular paper

02:02:41.000 | and the bias within a dataset.

02:02:42.800 | And then there's been a huge, in my view,

02:02:45.960 | and I'm willing, comfortable saying it,

02:02:49.800 | irrational, over-exaggerated pile on on his comments

02:02:54.440 | because he made pretty basic comments

02:02:57.160 | about the fact that if there's bias in the data,

02:02:59.920 | there's going to be bias in the results.

02:03:02.440 | So we should not have bias in the data,

02:03:04.640 | but people piled on to him

02:03:06.600 | because he said he trivialized the problem of bias.

02:03:10.080 | Like it's a lot more than just bias in the data.

02:03:13.240 | But like, yes, that's a very good point,

02:03:16.600 | but that's- - That's not what he was saying.

02:03:19.000 | - That's not what he was saying.

02:03:19.840 | And the response, like the implied response

02:03:23.160 | that he's basically sexist and racist

02:03:26.720 | is something that completely drives away

02:03:30.480 | the possibility of nuanced discussion.

02:03:32.920 | One nice thing about like a podcast long form conversation

02:03:37.920 | is you can talk it out,

02:03:40.320 | you can lay your reasoning out.

02:03:42.880 | And even if you're wrong,

02:03:44.560 | you can still show that you're a good human being

02:03:47.200 | underneath it.

02:03:48.280 | - You know, your point about

02:03:49.160 | you can't have a productive discussion.

02:03:51.040 | Well, how do you get to that point where people can turn?

02:03:53.920 | They can learn, they can listen, they can think,

02:03:56.360 | they can engage versus just being a shallow,

02:03:59.240 | like, and then keep moving, right?

02:04:02.600 | - And I don't think that progress really comes from that.

02:04:06.720 | Right, and I don't think that one should expect that.

02:04:09.920 | I think that you'd see that as reinforcing

02:04:12.360 | individual circles and the us versus them thing.

02:04:14.560 | And I think that's fairly divisive.

02:04:17.600 | - Yeah, I think there's a big role in,

02:04:21.000 | like the people that bother me most on Twitter

02:04:24.160 | when I observe things

02:04:25.760 | is not the people who get very emotional,

02:04:28.400 | angry, like over the top.

02:04:30.160 | It's the people who like prop them up.

02:04:33.920 | It's all the, it's this.

02:04:36.200 | I think what should be the,

02:04:38.000 | we should teach each other is to be sort of empathetic.

02:04:42.360 | - The thing that it's really easy to forget,

02:04:44.760 | particularly on like Twitter or the internet or an email,

02:04:47.800 | is that sometimes people just have a bad day.

02:04:50.160 | - Yeah.

02:04:51.000 | - Right, you have a bad day or you're like,

02:04:53.200 | I've been in the situation where it's like between meetings,

02:04:55.560 | like fire off a quick response to an email

02:04:57.360 | 'cause I wanna like help get something unblocked.

02:04:59.760 | Phrase it really objectively wrong.

02:05:03.680 | I screwed up and suddenly this is now

02:05:07.160 | something that sticks with people.

02:05:08.720 | And it's not because they're bad.

02:05:10.640 | It's not because you're bad.

02:05:11.880 | Just psychology of like you said a thing,

02:05:15.240 | it sticks with you.

02:05:16.080 | You didn't mean it that way,

02:05:17.000 | but it really impacted somebody

02:05:18.520 | because the way they interpret it.

02:05:20.920 | And this is just an aspect of working together as humans.

02:05:23.400 | And I have a lot of optimism in the long-term,

02:05:26.200 | the very long-term about what we as humanity can do,

02:05:29.120 | but I think that's gonna be, it's just always a rough ride.

02:05:31.160 | And you came into this by saying like,

02:05:33.160 | what is COVID and all the social strife

02:05:36.200 | that's happening right now mean?

02:05:38.120 | And I think that it's really bad in the short-term,

02:05:40.960 | but I think it'll lead to progress.

02:05:42.580 | And for that, I'm very thankful.

02:05:44.340 | - Yeah, it's painful in the short-term though.

02:05:48.040 | - Well, yeah, I mean, people are out of jobs.

02:05:49.760 | Like some people can't eat, like it's horrible.

02:05:52.520 | And, but, but, you know, it's progress.

02:05:56.960 | So we'll see what happens.

02:05:58.560 | I mean, the real question is when you look back 10 years,

02:06:01.920 | 20 years, a hundred years from now,

02:06:03.560 | how do we evaluate the decisions that are being made

02:06:05.400 | right now?

02:06:06.860 | I think that's really the way you can frame that

02:06:09.800 | and look at it.

02:06:10.640 | And you say, you know, you integrate across all

02:06:12.840 | the short-term horribleness that's happening.

02:06:15.440 | And you look at what that means and is the, you know,

02:06:18.600 | improvement across the world or the regression

02:06:20.360 | across the world significant enough to make it a good

02:06:24.160 | or a bad thing.

02:06:25.000 | I think that's the question.

02:06:26.800 | - Yeah.

02:06:27.640 | And for that, it's good to study history.

02:06:29.480 | I mean, one of the big problems for me right now

02:06:32.060 | is I'm reading the rise and fall of the third Reich.

02:06:34.760 | - Light reading.

02:06:37.400 | - So it's everything is just, I just see parallels

02:06:40.880 | and it means it's, you have to be really careful

02:06:43.500 | not to overstep it.

02:06:45.360 | But just the thing that worries me the most is the pain

02:06:49.400 | that people feel when a few things combine,

02:06:54.400 | which is like economic depression,

02:06:56.000 | which is quite possible in this country.

02:06:57.960 | And then just being disrespected in some kind of way,

02:07:02.600 | which the German people were really disrespected

02:07:05.160 | by most of the world, like in a way that's over the top,

02:07:10.160 | that something can build up and then all you need

02:07:13.460 | is a charismatic leader to go either positive or negative

02:07:18.400 | and both work as long as they're charismatic.

02:07:21.080 | And there's--

02:07:22.160 | - It's taking advantage of, again,

02:07:24.000 | that inflection point that the world's in

02:07:26.360 | and what they do with it could be good or bad.

02:07:28.720 | - And so it's a good way to think about times now,

02:07:32.680 | like on an individual level, what we decide to do

02:07:35.760 | is when history is written, 30 years from now,

02:07:39.560 | what happened in 2020, probably history's gonna remember

02:07:42.260 | 2020.

02:07:43.120 | - Yeah, I think so.

02:07:43.960 | (laughing)

02:07:45.520 | - Either for good or bad.

02:07:46.800 | And it's like up to us to write it, so it's good.

02:07:49.520 | - Well, one of the things I've observed

02:07:50.880 | that I find fascinating is most people act

02:07:54.160 | as though the world doesn't change.

02:07:56.440 | You make decisions knowingly, right?

02:08:00.000 | You make a decision where you're predicting the future

02:08:02.620 | based on what you've seen in the recent past.

02:08:04.800 | And so if something's always been,

02:08:06.120 | it's rained every single day,

02:08:07.320 | then of course you expect it to rain today too, right?

02:08:10.080 | On the other hand, the world changes all the time.

02:08:13.400 | - Yeah.

02:08:14.240 | - Incessantly, like for better and for worse.

02:08:16.800 | And so the question is, if you're interested

02:08:18.400 | in something that's not right,

02:08:20.880 | what is the inflection point that led to a change?

02:08:22.920 | And you can look to history for this.

02:08:24.360 | Like what is the catalyst that led to that explosion

02:08:27.960 | that led to that bill that led to the,

02:08:30.240 | like you can kind of work your way backwards from that.

02:08:33.240 | And maybe if you pull together the right people

02:08:35.760 | and you get the right ideas together,

02:08:36.940 | you can actually start driving that change

02:08:39.000 | and doing it in a way that's productive

02:08:40.360 | and hurts fewer people.

02:08:41.800 | - Yeah, like a single person, a single event

02:08:43.680 | can turn all of history.

02:08:44.520 | - Yeah, absolutely.

02:08:45.340 | Everything starts somewhere.

02:08:46.400 | And often it's a combination of multiple factors,

02:08:48.480 | but yeah, these things can be engineered.

02:08:52.520 | - That's actually the optimistic view that--

02:08:54.960 | - I'm a long-term optimist on pretty much everything.

02:08:57.600 | And human nature, you know,

02:08:59.360 | we can look to all the negative things that humanity has,

02:09:02.220 | all the pettiness and all the self-servingness

02:09:05.840 | and just the cruelty, right?

02:09:09.760 | The biases, just humans can be very horrible.

02:09:13.400 | But on the other hand, we're capable of amazing things.

02:09:16.120 | (laughs)

02:09:17.120 | And the progress across, you know,

02:09:20.600 | hundred year chunks is striking.

02:09:23.280 | And even across decades, we've come a long ways

02:09:26.720 | and there's still a long ways to go,

02:09:27.840 | but that doesn't mean that we've stopped.

02:09:30.000 | - Yeah, the kind of stuff we've done

02:09:31.440 | in the last hundred years is unbelievable.

02:09:34.920 | It's kind of scary to think what's gonna happen

02:09:36.760 | in the next hundred years.

02:09:37.600 | It's scary, like exciting.

02:09:39.040 | Like scary in a sense that it's kind of sad

02:09:41.680 | that the kind of technology is gonna come out

02:09:43.760 | in 10, 20, 30 years.

02:09:45.720 | We're probably too old to really appreciate

02:09:47.800 | 'cause you don't grow up with it.

02:09:49.120 | It'll be like kids these days with their virtual reality

02:09:51.720 | and their--

02:09:52.680 | - Their TikToks and stuff like this.

02:09:54.520 | Like, oh, there's this thing and like,

02:09:56.840 | come on, give me my, you know, static photo.

02:09:59.120 | (laughs)

02:09:59.960 | - You know, my Commodore 64.

02:10:02.320 | - Yeah, exactly.

02:10:03.760 | - Okay, sorry, we kind of skipped over.

02:10:05.840 | Let me ask on, you know,

02:10:09.680 | the machine learning world has been kind of inspired,

02:10:14.400 | their imagination captivated with GPT-3

02:10:17.040 | and these language models.

02:10:18.760 | I thought it'd be cool to get your opinion on it.

02:10:21.800 | What's your thoughts on this exciting world of,

02:10:26.260 | it connects to computation actually,

02:10:29.920 | is of language models that are huge

02:10:33.000 | and take many, many computers, not just to train,

02:10:37.400 | but to also do inference on.

02:10:39.400 | - Sure.

02:10:40.440 | Well, I mean, it depends on what you're speaking to there,

02:10:43.400 | but I mean, I think that there's been

02:10:45.280 | a pretty well understood maximum deep learning

02:10:48.360 | that if you make the model bigger

02:10:49.640 | and you shove more data into it,

02:10:51.360 | assuming you train it right

02:10:52.400 | and you have a good model architecture,

02:10:54.020 | that you'll get a better model out.

02:10:55.800 | And so on one hand, GPT-3 was not that surprising.

02:10:59.740 | On the other hand, a tremendous amount of engineering

02:11:02.040 | went into making it possible.

02:11:03.500 | The implications of it are pretty huge.

02:11:07.080 | I think that when GPT-2 came out,

02:11:09.000 | there was a very provocative blog post from OpenAI

02:11:11.360 | talking about, you know, we're not gonna release it

02:11:13.640 | because of the social damage it could cause

02:11:15.440 | if it's misused.

02:11:16.520 | I think that's still a concern.

02:11:20.120 | I think that we need to look at how technology is applied

02:11:23.240 | and, you know, well-meaning tools can be applied

02:11:25.800 | in very horrible ways,

02:11:26.840 | and they can have very profound impact on that.

02:11:29.320 | I think that GPT-3 is a huge technical achievement.

02:11:33.960 | And what will GPT-4 be?

02:11:35.760 | Will it probably be bigger, more expensive to train?

02:11:38.480 | Really cool architectural tricks.

02:11:42.000 | - Do you think, is there,

02:11:43.960 | I don't know how much thought you've done

02:11:46.480 | on distributed computing.

02:11:48.720 | Is there some technical challenges that are interesting

02:11:52.960 | that you're hopeful about exploring in terms of,

02:11:55.880 | you know, a system that, like a piece of code that,

02:11:59.000 | you know, with GPT-4,

02:12:02.760 | that might have, I don't know,

02:12:07.040 | hundreds of trillions of parameters

02:12:09.360 | which have to run on thousands of computers.

02:12:11.600 | Is there some hope that we can make that happen?

02:12:15.320 | - Yeah, well, I mean, today you can write a check

02:12:18.960 | and get access to a thousand TPU cores

02:12:21.800 | and do really interesting large-scale training

02:12:23.960 | and inference and things like that in Google Cloud,

02:12:26.520 | for example, right?

02:12:27.440 | And so I don't think it's a question about scale,

02:12:31.320 | it's a question about utility.

02:12:33.200 | And when I look at the transformer series of architectures

02:12:36.200 | that the GPT series is based on,

02:12:38.760 | it's really interesting to look at that

02:12:39.880 | because they're actually very simple designs.

02:12:42.920 | They're not recurrent.

02:12:44.720 | The training regimens are pretty simple.

02:12:47.440 | And so they don't really reflect like human brains, right?

02:12:51.680 | But they're really good at learning language models

02:12:54.600 | and they're unrolled enough that you get,

02:12:56.500 | you can simulate some recurrence, right?

02:12:59.120 | And so the question I think about is,

02:13:02.080 | where does this take us?

02:13:03.240 | Like, so we can just keep scaling it,

02:13:05.120 | have more parameters, more data, more things,

02:13:07.640 | we'll get a better result for sure.

02:13:09.400 | But are there architectural techniques

02:13:11.800 | that can lead to progress at a faster pace?

02:13:14.220 | Right, this is when, how do you get,

02:13:17.680 | instead of just like making it a constant time bigger,

02:13:20.600 | how do you get like an algorithmic improvement

02:13:23.320 | out of this, right?

02:13:24.160 | And whether it be a new training regimen,

02:13:25.720 | if it becomes sparse networks, for example,

02:13:30.320 | human brain sparse, all these networks are dense,

02:13:33.600 | the connectivity patterns can be very different.

02:13:36.120 | I think this is where I get very interested

02:13:38.240 | and I'm way out of my league

02:13:39.480 | on the deep learning side of this.

02:13:41.560 | But I think that could lead to big breakthroughs.

02:13:43.680 | When you talk about large scale networks,

02:13:46.160 | one of the things that Jeff Dean likes to talk about

02:13:48.000 | and he's given a few talks on is this idea

02:13:51.680 | of having a sparsely gated mixture of experts

02:13:54.200 | kind of a model where you have, you know,

02:13:57.400 | different nets that are trained

02:13:59.480 | and are really good at certain kinds of tasks.

02:14:02.080 | And so you have this distributed across a cluster.

02:14:04.840 | And so you have a lot of different computers

02:14:06.400 | that end up being kind of locally specialized

02:14:08.520 | in different demands.

02:14:09.720 | And then when a query comes in,

02:14:11.040 | you gate it and you use learn techniques

02:14:13.720 | to route to different parts of the network.

02:14:15.440 | And then you utilize the compute resources

02:14:18.000 | of the entire cluster by having specialization within it.

02:14:20.640 | And I don't know where that goes

02:14:23.680 | or if it starts to, when it starts to work,

02:14:25.520 | but I think things like that

02:14:26.680 | could be really interesting as well.

02:14:28.360 | - And then on the data side too,

02:14:30.000 | if you can think of data selection as a kind of programming.

02:14:35.000 | - Yeah.

02:14:36.680 | - I mean, essentially, if you look at like Karpathy

02:14:38.760 | talked about software 2.0,

02:14:40.640 | I mean, in a sense, data is the programming.

02:14:44.040 | - Yeah, yeah.

02:14:44.880 | So let me try to summarize Andre's position really quick

02:14:48.320 | before I disagree with it.

02:14:50.000 | - Yeah.

02:14:51.120 | - So Andre Karpathy is amazing.

02:14:53.400 | So this is nothing personal with him.

02:14:55.200 | He's an amazing engineer.

02:14:57.400 | - And also a good blog post writer.

02:14:59.240 | - Yeah, well, he's a great communicator.

02:15:01.080 | He's just an amazing person.

02:15:02.400 | He's also really sweet.

02:15:03.720 | So his basic premise is that software is suboptimal.

02:15:09.400 | I think we can all agree to that.

02:15:11.040 | He also points out that deep learning

02:15:14.480 | and other learning-based techniques are really great

02:15:16.360 | because you can solve problems in more structured ways

02:15:19.120 | with less like ad hoc code that people write out

02:15:23.040 | and don't write test cases for in some cases.

02:15:25.160 | And so they don't even know if it works in the first place.

02:15:27.800 | And so if you start replacing systems of imperative code

02:15:32.320 | with deep learning models, then you get a better result.

02:15:37.120 | And I think that he argues that software 2.0

02:15:40.680 | is a pervasively learned set of models

02:15:44.120 | and you get away from writing code.

02:15:45.920 | And he's given talks where he talks about

02:15:47.920 | swapping over more and more and more parts of the code

02:15:50.960 | to being learned and driven that way.

02:15:54.840 | I think that works.

02:15:56.640 | And if you're predisposed to liking machine learning,

02:15:59.240 | then I think that that's definitely a good thing.

02:16:01.760 | I think this is also good for accessibility in many ways

02:16:04.700 | because certain people are not gonna write C code

02:16:06.800 | or something.

02:16:07.720 | And so having a data-driven approach to do this kind of

02:16:10.620 | stuff, I think can be very valuable.

02:16:12.720 | On the other hand, there are huge trade-offs.

02:16:14.200 | And it's not clear to me that software 2.0 is the answer.

02:16:19.200 | And probably Andre wouldn't argue that it's the answer

02:16:21.440 | for every problem either.

02:16:22.960 | But I look at machine learning as not a replacement

02:16:26.760 | for software 1.0.

02:16:27.920 | I look at it as a new programming paradigm.

02:16:30.120 | And so programming paradigms, when you look across domains,

02:16:35.140 | is structured programming where you go from go-tos

02:16:38.480 | to if-then-else, or functional programming from Lisp.

02:16:42.280 | And you start talking about higher order functions

02:16:44.440 | and values and things like this.

02:16:45.880 | Or you talk about object-oriented programming.

02:16:48.040 | You're talking about encapsulation, subclassing,

02:16:49.960 | inheritance.

02:16:50.800 | You start talking about generic programming

02:16:52.640 | where you start talking about code reuse

02:16:54.480 | through specialization and different type instantiations.

02:16:59.480 | When you start talking about differentiable programming,

02:17:01.720 | something that I am very excited about in the context

02:17:04.960 | of machine learning, talking about taking functions

02:17:07.200 | and generating variants, like the derivative

02:17:10.280 | of another function.

02:17:11.120 | Like that's a programming paradigm that's very useful

02:17:13.760 | for solving certain classes of problems.

02:17:16.220 | Machine learning is amazing at solving certain classes

02:17:18.680 | of problems.

02:17:19.520 | Like you're not gonna write a cat detector

02:17:21.940 | or even a language translation system by writing C code.

02:17:25.920 | That's not a very productive way to do things anymore.

02:17:28.920 | And so machine learning is absolutely the right way

02:17:31.480 | to do that.

02:17:32.320 | In fact, I would say that learned models are really

02:17:35.000 | one of the best ways to work with the human world

02:17:37.280 | in general.

02:17:38.240 | And so anytime you're talking about sensory input

02:17:40.320 | of different modalities, anytime that you're talking

02:17:42.320 | about generating things in a way that makes sense

02:17:45.120 | to a human, I think that learned models are really,

02:17:47.840 | really useful.

02:17:48.920 | And that's because humans are very difficult

02:17:50.560 | to characterize, okay?

02:17:52.660 | And so this is a very powerful paradigm for solving

02:17:55.660 | classes of problems.

02:17:57.120 | But on the other hand, imperative code is too.

02:17:59.680 | You're not gonna write a bootloader for your computer

02:18:02.600 | in with a deep learning model.

02:18:04.060 | Deep learning models are very hardware intensive.

02:18:07.040 | They're very energy intensive because you have a lot

02:18:09.900 | of parameters and you can provably implement any function

02:18:14.500 | with a learned model, like this has been shown,

02:18:17.700 | but that doesn't make it efficient.

02:18:19.900 | And so if you're talking about caring about a few orders

02:18:22.300 | of magnitudes worth of energy usage,

02:18:24.080 | then it's useful to have other tools in the toolbox.

02:18:26.940 | - There's also robustness too.

02:18:28.420 | I mean, as a-- - Yeah, exactly.

02:18:29.900 | All the problems of dealing with data and bias in data,

02:18:32.500 | all the problems of, you know, software 2.0.

02:18:35.100 | And one of the great things that Andre is arguing towards,

02:18:39.320 | which I completely agree with him, is that when you start

02:18:43.100 | implementing things with deep learning, you need to learn

02:18:45.180 | from software 1.0 in terms of testing,

02:18:47.660 | continuous integration, how you deploy,

02:18:50.020 | how do you validate all these things and building systems

02:18:53.060 | around that so that you're not just saying like,

02:18:54.980 | "Oh, it seems like it's good, ship it."

02:18:57.580 | Right?

02:18:58.420 | Well, what happens when I regress something?

02:18:59.820 | What happens when I make a classification that's wrong

02:19:02.480 | and now I hurt somebody, right?

02:19:05.540 | All these things you have to reason about.

02:19:07.340 | - Yeah, but at the same time, the bootloader that works

02:19:10.140 | for us humans looks awfully a lot like a neural network.

02:19:14.900 | Right?

02:19:15.740 | So it's messy and you can cut out different parts

02:19:19.180 | of the brain.

02:19:20.020 | There's a lot of this neuroplasticity work that shows

02:19:22.900 | that it's gonna adjust.

02:19:24.140 | It's a really interesting question,

02:19:26.900 | how much of the world's programming could be replaced

02:19:30.420 | by software 2.0?

02:19:31.780 | Like with-- - Oh, well, I mean,

02:19:33.340 | it's provably true that you could replace all of it.

02:19:36.600 | - Right, so then it's a question of trade-offs.

02:19:39.260 | - Right, so anything that's a function, you can.

02:19:40.980 | So it's not a question about if.

02:19:42.980 | I think it's a economic question.

02:19:44.940 | It's a, what kind of talent can you get?

02:19:47.740 | What kind of trade-offs in terms of maintenance?

02:19:50.060 | Right, those kinds of questions, I think.

02:19:51.680 | What kind of data can you collect?

02:19:53.260 | I think one of the reasons that I'm most interested

02:19:55.100 | in machine learning as a programming paradigm is that one

02:19:59.160 | of the things that we've seen across computing in general

02:20:01.520 | is that being laser focused on one paradigm often puts you

02:20:06.100 | in a box that's not super great.

02:20:08.460 | And so you look at object-oriented programming,

02:20:10.420 | like it was all the rage in the early '80s.

02:20:12.060 | And like, everything has to be objects.

02:20:13.500 | And people forgot about functional programming,

02:20:15.620 | even though it came first.

02:20:17.380 | And then people rediscovered that, hey,

02:20:20.020 | if you mix functional and object-oriented and structure,

02:20:22.700 | like you mix these things together,

02:20:24.260 | you can provide very interesting tools

02:20:25.780 | that are good at solving different problems.

02:20:28.420 | And so the question there is how do you get the best way

02:20:31.180 | to solve the problems?

02:20:32.620 | It's not about whose tribe should win, right?

02:20:35.980 | It's not about, you know, that shouldn't be the question.

02:20:38.780 | The question is how do you make it

02:20:40.020 | so that people can solve those problems the fastest

02:20:42.180 | and they have the right tools in their box

02:20:44.300 | to build good libraries and they can solve these problems.

02:20:47.140 | And when you look at that, that's like, you know,

02:20:49.060 | you look at reinforcement learning

02:20:50.300 | as one really interesting subdomain of this.

02:20:52.620 | Reinforcement learning, often you have to have

02:20:55.060 | the integration of a learned model combined with your Atari

02:20:59.380 | or whatever the other scenario it is that you're working in.

02:21:02.860 | You have to combine that thing

02:21:04.420 | with the robot control for the arm, right?

02:21:07.620 | And so now it's not just about that one paradigm.

02:21:11.900 | It's about integrating that with all the other systems

02:21:14.540 | that you have, including often legacy systems

02:21:17.020 | and things like this, right?

02:21:18.100 | And so to me, I think that the interesting thing to say

02:21:21.460 | is like, how do you get the best out of this domain

02:21:23.820 | and how do you enable people to achieve things

02:21:25.820 | that they otherwise couldn't do

02:21:27.300 | without excluding all the good things

02:21:29.700 | we already know how to do?

02:21:31.300 | - Right, but, okay, this is a crazy question,

02:21:35.300 | but we talked a little bit about GPT-3,

02:21:38.820 | but do you think it's possible that these language models

02:21:42.340 | that in essence, in the language domain,

02:21:47.340 | software 2.0 could replace some aspect of compilation,

02:21:51.820 | for example, or do program synthesis

02:21:54.260 | replace some aspect of programming?

02:21:56.860 | - Yeah, absolutely.

02:21:57.700 | So I think that learned models in general

02:22:00.340 | are extremely powerful

02:22:01.580 | and I think that people underestimate them.

02:22:03.700 | - Maybe you can suggest what I should do.

02:22:07.140 | So I've access to the GPT-3 API.

02:22:11.380 | Would I be able to generate Swift code, for example?

02:22:14.260 | Do you think that could do something interesting

02:22:16.020 | and would it work?

02:22:17.060 | - So GPT-3 is probably not trained on the right corpus.

02:22:21.140 | So it probably has the ability to generate some Swift.

02:22:23.700 | I bet it does.

02:22:25.220 | It's probably not gonna generate a large enough body of Swift

02:22:27.620 | to be useful, but like taking it a next step further,

02:22:30.580 | like if you had the goal of training something like GPT-3

02:22:33.980 | and you wanted to train it to generate source code, right?

02:22:38.020 | It could definitely do that.

02:22:39.780 | Now the question is, how do you express the intent

02:22:42.660 | of what you want filled in?

02:22:44.300 | You can definitely like write scaffolding of code

02:22:47.060 | and say, fill in the hole

02:22:48.900 | and sort of put in some for loops

02:22:50.340 | or put in some classes or whatever.

02:22:51.540 | And the power of these models is impressive,

02:22:53.700 | but there's an unsolved question, at least unsolved to me,

02:22:56.940 | which is how do I express the intent of what to fill in?

02:22:59.740 | Right, and kind of what you'd really want to have,

02:23:03.180 | and I don't know that these models are up to the task,

02:23:06.340 | is you wanna be able to say,

02:23:08.300 | here's a scaffolding and here are the assertions at the end

02:23:11.260 | and the assertions always pass.

02:23:14.060 | And so you want a generative model on the one hand, yes.

02:23:16.620 | - That's fascinating, yeah.

02:23:17.620 | - Right, but you also want some loopback,

02:23:20.500 | some reinforcement learning system or something

02:23:23.220 | where you're actually saying like,

02:23:24.700 | I need to hill climb towards something that is more correct.

02:23:28.540 | And I don't know that we have that.

02:23:29.780 | - So it would generate not only a bunch of the code,

02:23:33.700 | but like the checks that do the testing,

02:23:35.980 | it would generate the test.

02:23:37.100 | - I think the humans would generate the test, right?

02:23:38.860 | - Oh, okay.

02:23:39.700 | - The test would be-- - But it would be fascinating--

02:23:41.380 | - Well, the test are the requirements.

02:23:43.060 | - Yes, but the, okay, so--

02:23:44.220 | - 'Cause you have to express to the model what you want to,

02:23:47.060 | you don't just want gibberish code.

02:23:48.820 | Look at how compelling this code looks.

02:23:51.300 | You want a story about four horned unicorns or something.

02:23:54.740 | - Well, okay, so exactly, but that's human requirements.

02:23:57.700 | But then I thought it's a compelling idea

02:24:00.180 | that the GPT-4 model could generate checks

02:24:06.260 | like that are more high fidelity that check for correctness.

02:24:11.260 | Because the code it generates,

02:24:15.500 | like say I ask it to generate a function

02:24:18.420 | that gives me the Fibonacci sequence.

02:24:21.620 | - Sure.

02:24:22.460 | - I don't like--

02:24:24.340 | - So decompose the problem, right?

02:24:25.620 | So you have two things.

02:24:26.980 | You have, you need the ability to generate

02:24:29.380 | syntactically correct Swift code that's interesting, right?

02:24:33.100 | I think GPT series of model architectures can do that.

02:24:37.580 | But then you need the ability to add the requirements.

02:24:41.340 | So generate Fibonacci.

02:24:43.060 | - Yeah.

02:24:43.900 | - The human needs to express that goal.

02:24:46.040 | We don't have that language that I know of.

02:24:49.140 | - No, I mean, it can generate stuff.

02:24:50.820 | Have you seen with GPT-3, it can generate,

02:24:52.820 | you can say, I mean, there's interface stuff,

02:24:55.780 | like it can generate HTML,

02:24:58.380 | it can generate basic for loops that give you like--

02:25:02.020 | - Right, but pick HTML.

02:25:02.900 | How do I say I want google.com?

02:25:05.120 | - Well, no, you could say--

02:25:07.820 | - Or not literally google.com.

02:25:09.380 | How do I say I want a webpage that's got a shopping cart

02:25:11.740 | and this and that?

02:25:12.580 | - Yeah, it does that.

02:25:14.020 | I mean, so, okay, so just,

02:25:16.140 | I don't know if you've seen these demonstrations,

02:25:17.720 | but you type in, I want a red button

02:25:20.380 | with the text that says hello,

02:25:22.480 | and you type that in natural language,

02:25:24.220 | and it generates the correct HTML.

02:25:25.940 | - Okay.

02:25:26.780 | - I've done this demo.

02:25:27.620 | It's kind of compelling.

02:25:29.020 | So you have to prompt it with similar kinds of mappings.

02:25:33.300 | Of course, it's probably handpicked.

02:25:35.660 | I got to experiment.

02:25:36.580 | They probably, but the fact that you can do that once,

02:25:39.500 | even out of like 20, is quite impressive.

02:25:43.180 | Again, that's very basic.

02:25:45.220 | Like the HTML is kind of messy and bad.

02:25:48.420 | But yes, the intent is,

02:25:49.980 | the idea is the intent is specified in natural language.

02:25:52.660 | - Okay.

02:25:53.500 | Yeah, so I have not seen that.

02:25:54.420 | That's really cool.

02:25:55.240 | - Yeah. (laughs)

02:25:56.080 | - Yeah.

02:25:56.920 | - So the question is the correctness of that.

02:25:59.880 | Like visually you can check, oh, the button is red.

02:26:02.880 | But for more,

02:26:04.660 | for more complicated functions,

02:26:10.200 | where the intent is harder to check,

02:26:12.120 | this goes into like NP completeness kind of things.

02:26:15.480 | Like I want to know that this code is correct.

02:26:18.160 | And generates a giant thing.

02:26:20.120 | - Yeah.

02:26:20.960 | - That does some kind of calculation.

02:26:23.720 | It seems to be working.

02:26:25.440 | It's interesting to think like,

02:26:27.880 | should the system also try to generate checks

02:26:30.720 | for itself for correctness?

02:26:32.080 | - Yeah, I don't know.

02:26:33.000 | And this is way beyond my experience.

02:26:35.160 | (laughs)

02:26:36.000 | The thing that I think about is that

02:26:39.200 | there doesn't seem to be a lot of

02:26:41.120 | equational reasoning going on.

02:26:43.280 | - Right.

02:26:44.100 | - There's a lot of pattern matching and filling in.

02:26:45.280 | And kind of propagating patterns that have been seen before

02:26:48.480 | into the future and into the generated result.

02:26:50.680 | And so if you want to get correctness,

02:26:53.240 | you kind of need to improving kind of things.

02:26:55.180 | And like higher level logic.

02:26:57.320 | And I don't know that,

02:26:58.600 | you could talk to Jan about that.

02:26:59.920 | (laughs)

02:27:00.760 | And see what the bright minds are thinking about right now.

02:27:04.720 | But I don't think the GPT is in that vein.

02:27:08.180 | It's still really cool.

02:27:09.240 | - Yeah, and surprisingly, who knows?

02:27:11.880 | Maybe reasoning is--

02:27:13.960 | - Is overrated.

02:27:14.800 | - Yeah, is overrated.

02:27:15.640 | - Right, I mean, do we reason?

02:27:17.320 | - Yeah.

02:27:18.160 | - How do you tell, right?

02:27:18.980 | Are we just pattern matching based on what we have?

02:27:20.560 | And then reverse justify to ourselves?

02:27:22.800 | - Yeah, exactly, the reverse.

02:27:24.280 | So I think what the neural networks are missing,

02:27:26.940 | and I think GPT4 might have,

02:27:29.820 | is to be able to tell stories to itself about what it did.

02:27:33.800 | - Well, that's what humans do, right?

02:27:34.900 | I mean, you talk about network explainability, right?

02:27:38.260 | And we give neural nets a hard time about this.

02:27:40.700 | But humans don't know why we make decisions.

02:27:42.420 | We have this thing called intuition,

02:27:43.780 | and then we try to say,

02:27:45.220 | "This feels like the right thing, but why?"

02:27:47.100 | Right, and you wrestle with that

02:27:49.140 | when you're making hard decisions.

02:27:50.300 | And is that science?

02:27:52.220 | Not really.

02:27:53.380 | (laughs)

02:27:54.440 | - Let me ask you about a few high-level questions, I guess.

02:27:57.440 | You've done a million things in your life

02:28:02.400 | and been very successful.

02:28:04.240 | A bunch of young folks listen to this,

02:28:07.000 | ask for advice from successful people like you.

02:28:10.720 | If you were to give advice to somebody,

02:28:16.000 | an undergraduate student or a high school student,

02:28:19.040 | about pursuing a career in computing

02:28:23.520 | or just advice about life in general,

02:28:25.560 | is there some words of wisdom you can give them?

02:28:28.840 | - So I think you come back to change.

02:28:30.840 | And profound leaps happen

02:28:34.120 | because people are willing to believe

02:28:35.400 | that change is possible and that the world does change

02:28:39.160 | and are willing to do the hard thing

02:28:41.000 | that it takes to make change happen.

02:28:42.680 | And whether it be implementing a new programming language

02:28:45.880 | or implementing a new system

02:28:47.080 | or implementing a new research paper,

02:28:49.200 | designing a new thing,

02:28:50.200 | moving the world forward in science and philosophy,

02:28:52.680 | whatever, it really comes down to somebody

02:28:54.520 | who's willing to put in the work.

02:28:56.760 | Right, and you have,

02:28:57.960 | the work is hard for a whole bunch of different reasons,

02:29:01.520 | one of which is you,

02:29:04.200 | it's work, right?

02:29:06.920 | And so you have to have the space in your life

02:29:08.800 | in which you can do that work,

02:29:09.840 | which is why going to grad school

02:29:10.980 | can be a beautiful thing for certain people.

02:29:14.720 | But also there's a self-doubt that happens.

02:29:16.840 | Like you're two years into a project,

02:29:18.320 | is it going anywhere, right?

02:29:20.280 | Well, what do you do?

02:29:21.120 | Do you just give up because it's hard?

02:29:23.280 | Well, no, I mean, some people like suffering.

02:29:25.620 | And so you plow through it.

02:29:29.280 | The secret to me is that you have to love what you're doing

02:29:31.960 | and follow that passion

02:29:35.000 | because when you get to the hard times,

02:29:37.080 | that's when, if you love what you're doing,

02:29:40.080 | you're willing to kind of push through.

02:29:41.680 | And this is really hard

02:29:45.440 | because it's hard to know what you will love doing

02:29:48.640 | until you start doing a lot of things.

02:29:50.200 | And so that's why I think that,

02:29:51.640 | particularly early in your career,

02:29:53.280 | it's good to experiment.

02:29:54.900 | Do a little bit of everything.

02:29:56.400 | Go take the survey class on,

02:29:59.320 | the first half of every class

02:30:01.480 | in your upper division lessons

02:30:03.720 | and just get exposure to things

02:30:05.680 | because certain things will resonate with you

02:30:07.080 | and you'll find out, wow, I'm really good at this.

02:30:08.920 | I'm really smart at this.

02:30:10.040 | Well, it's just because it works with the way your brain.

02:30:13.000 | - And when something jumps out,

02:30:14.320 | I mean, that's one of the things

02:30:15.600 | that people often ask about is like,

02:30:19.120 | well, I think there's a bunch of cool stuff out there.

02:30:21.360 | Like, how do I pick the thing?

02:30:22.940 | - Yeah.

02:30:25.160 | - How do you hook, in your life,

02:30:27.560 | how did you just hook yourself in and stuck with it?

02:30:30.440 | - Well, I got lucky, right?

02:30:31.680 | I mean, I think that many people forget

02:30:34.800 | that a huge amount of it or most of it is luck, right?

02:30:38.760 | So let's not forget that.

02:30:40.860 | So for me, I fell in love with computers early on

02:30:44.800 | because they spoke to me, I guess.

02:30:47.700 | - What language did they speak?

02:30:50.720 | - Basic.

02:30:51.560 | - Basic, yeah.

02:30:52.380 | - But then it was just kind of following

02:30:56.960 | a set of logical progressions,

02:30:58.200 | but also deciding that something that was hard

02:31:01.400 | was worth doing and a lot of fun, right?

02:31:04.080 | And so I think that that is also something

02:31:06.240 | that's true for many other domains,

02:31:08.100 | which is if you find something that you love doing,

02:31:10.400 | that's also hard, if you invest yourself in it

02:31:13.480 | and add value to the world,

02:31:15.000 | then it will mean something, generally, right?

02:31:17.160 | And again, that can be a research paper,

02:31:19.160 | that can be a software system,

02:31:20.440 | that can be a new robot,

02:31:22.080 | that can be, there's many things that can be,

02:31:24.820 | but a lot of it is like real value

02:31:27.160 | comes from doing things that are hard.

02:31:29.360 | And that doesn't mean you have to suffer.

02:31:32.000 | But--

02:31:34.000 | - It's hard.

02:31:34.820 | I mean, you don't often hear that message.

02:31:36.400 | We talked about it last time a little bit,

02:31:38.040 | but it's one of my, not enough people talk about this.

02:31:42.860 | It's beautiful to hear a successful person.

02:31:47.440 | - Well, and self-doubt and imposter syndrome,

02:31:49.480 | and these are all things that successful people

02:31:52.400 | suffer with as well,

02:31:54.000 | particularly when they put themselves

02:31:55.160 | in a point of being uncomfortable,

02:31:56.700 | which I like to do now and then,

02:31:59.240 | just because it puts you in learning mode.

02:32:02.120 | Like if you wanna grow as a person,

02:32:04.120 | put yourself in a room with a bunch of people

02:32:07.040 | that know way more about whatever you're talking about

02:32:09.200 | than you do, and ask dumb questions.

02:32:11.560 | And guess what?

02:32:13.080 | Smart people love to teach, often, not always, but often.

02:32:16.840 | And if you listen, if you're prepared to listen,

02:32:18.360 | if you're prepared to grow,

02:32:19.200 | if you're prepared to make connections,

02:32:20.720 | you can do some really interesting things.

02:32:22.400 | And I think that a lot of progress is made by people

02:32:25.400 | who kind of hop between domains now and then,

02:32:28.040 | because they bring a perspective into a field

02:32:32.520 | that nobody else has,

02:32:34.760 | if people have only been working in that field themselves.

02:32:38.320 | - We mentioned that the universe is kind of like a compiler,

02:32:41.440 | the entirety of it, the whole evolution

02:32:44.920 | is kind of a kind of compilation.

02:32:46.740 | Maybe us human beings are kind of compilers.

02:32:50.680 | Let me ask the old sort of question

02:32:53.600 | that I didn't ask you last time,

02:32:54.960 | which is what's the meaning of it all?

02:32:57.780 | Is there a meaning?

02:32:58.760 | Like if you asked a compiler why,

02:33:00.860 | what would a compiler say?

02:33:03.400 | What's the meaning of life?

02:33:04.640 | - What's the meaning of life?

02:33:06.840 | I'm prepared for it not to mean anything.

02:33:08.840 | Here we are all biological things programmed to survive

02:33:14.200 | and propagate our DNA.

02:33:17.520 | And maybe the universe is just a computer

02:33:21.440 | and you just go until entropy takes over the world

02:33:24.160 | and it takes over the universe and then you're done.

02:33:27.440 | I don't think that's a very productive way

02:33:29.680 | to live your life, if so.

02:33:33.000 | And so I prefer to bias towards the other way,

02:33:34.760 | which is saying the universe has a lot of value.

02:33:37.960 | And I take happiness out of other people.

02:33:41.800 | And a lot of times part of that's having kids,

02:33:43.840 | but also the relationships you build with other people.

02:33:46.940 | And so the way I try to live my life is like,

02:33:49.680 | what can I do that has value?

02:33:51.240 | How can I move the world forward?

02:33:52.480 | How can I take what I'm good at

02:33:54.540 | and bring it into the world?

02:33:57.600 | And how can I, I'm one of these people

02:33:59.520 | that likes to work really hard

02:34:00.640 | and be very focused on the things that I do.

02:34:03.160 | And so if I'm gonna do that,

02:34:05.040 | how can it be in a domain that actually will matter?

02:34:08.080 | Because a lot of things that we do,

02:34:10.040 | we find ourselves in the cycle of like,

02:34:11.680 | okay, I'm doing a thing, I'm very familiar with it,

02:34:13.740 | I've done it for a long time,

02:34:15.400 | I've never done anything else,

02:34:16.680 | but I'm not really learning.

02:34:18.960 | I'm keeping things going,

02:34:21.740 | but there's a younger generation

02:34:23.440 | that can do the same thing,

02:34:24.640 | maybe even better than me.

02:34:26.480 | Maybe if I actually step out of this

02:34:28.000 | and jump into something I'm less comfortable with,

02:34:31.280 | it's scary, but on the other hand,

02:34:33.440 | it gives somebody else a new opportunity.

02:34:34.920 | It also then puts you back in learning mode,

02:34:37.480 | and that can be really interesting.

02:34:38.920 | And one of the things I've learned

02:34:40.580 | is that when you go through that,

02:34:42.360 | that first you're deep into imposter syndrome,

02:34:45.040 | but when you start working your way out,

02:34:46.940 | you start to realize,

02:34:47.780 | hey, well, there's actually a method to this.

02:34:50.000 | And now I'm able to add new things

02:34:53.280 | 'cause I bring different perspective.

02:34:54.680 | And this is one of the good things

02:34:57.240 | about bringing different kinds of people together.

02:34:59.800 | Diversity of thought is really important.

02:35:01.860 | And if you can pull together people

02:35:04.440 | that are coming at things from different directions,

02:35:06.480 | you often get innovation.

02:35:07.760 | And I love to see that, that aha moment

02:35:10.560 | where you're like, oh, we've really cracked this.

02:35:12.760 | This is something nobody's ever done before.

02:35:15.200 | And then if you can do it in the context

02:35:16.760 | where it adds value, other people can build on it,

02:35:18.960 | it helps move the world,

02:35:20.280 | then that's what really excites me.

02:35:22.720 | - So that kind of description

02:35:24.480 | of the magic of the human experience,

02:35:26.480 | do you think we'll ever create that in like an AGI system?

02:35:29.880 | Do you think we'll be able to create,

02:35:33.480 | give AI systems a sense of meaning

02:35:38.040 | where they operate in this kind of world

02:35:39.640 | exactly in the way you've described,

02:35:41.800 | which is they interact with each other,

02:35:43.240 | they interact with us humans?

02:35:44.800 | - Sure, sure.

02:35:45.640 | Well, so I mean, why are you being so speciest?

02:35:50.040 | Right?

02:35:50.880 | All right, so AGIs versus bionets,

02:35:54.600 | or versus biology, right?

02:35:56.520 | What are we but machines, right?

02:36:00.240 | We're just programmed to run our,

02:36:02.880 | we have our objective function that we were optimized for.

02:36:05.520 | Right?

02:36:06.400 | And so we're doing our thing.

02:36:07.600 | We think we have purpose, but do we really?

02:36:09.280 | - Yeah.

02:36:10.120 | - Right, I'm not prepared to say

02:36:10.960 | that those newfangled AGIs have no soul

02:36:14.560 | just because we don't understand them, right?

02:36:16.840 | And I think that would be, when they exist,

02:36:20.080 | that would be very premature to look at a new thing

02:36:24.160 | through your own lens without fully understanding it.

02:36:26.760 | - You might be just saying that

02:36:29.400 | because AI systems in the future will be listening to this.

02:36:32.720 | And then--

02:36:33.560 | - Oh yeah, yeah, exactly.

02:36:34.400 | - You don't wanna say anything.

02:36:35.220 | - Please be nice to me.

02:36:36.060 | You know, when Skynet kills everybody, please spare me.

02:36:39.160 | - So wise, wise look ahead thinking.

02:36:42.640 | - Yeah, but I mean, I think that people

02:36:44.560 | will spend a lot of time worrying about this kind of stuff.

02:36:46.360 | And I think that what we should be worrying about

02:36:48.200 | is how do we make the world better?

02:36:49.880 | And the thing that I'm most scared about with AGIs

02:36:52.880 | is not that necessarily the Skynet

02:36:57.480 | will start shooting everybody with lasers

02:36:59.000 | and stuff like that to use us for calories.

02:37:02.120 | The thing that I'm worried about is that

02:37:05.440 | humanity I think needs a challenge.

02:37:08.320 | And if we get into a mode of not having a personal challenge,

02:37:11.640 | not having a personal contribution,

02:37:13.600 | whether that be like, you know, your kids

02:37:15.920 | and seeing what they grow into and helping guide them,

02:37:18.840 | whether it be your community that you're engaged in,

02:37:21.960 | you're driving forward, whether it be your work

02:37:23.920 | and the things that you're doing

02:37:25.040 | and the people you're working with

02:37:25.960 | and the products you're building

02:37:26.800 | and the contribution there.

02:37:28.880 | If people don't have a objective,

02:37:31.960 | I'm afraid what that means.

02:37:33.360 | And I think that this would lead to a rise

02:37:37.840 | of the worst part of people, right?

02:37:39.920 | Instead of people striving together

02:37:42.240 | and trying to make the world better,

02:37:45.080 | it could degrade into a very unpleasant world.

02:37:49.720 | But I don't know.

02:37:51.140 | I mean, we hopefully have a long ways to go

02:37:53.600 | before we discover that.

02:37:54.760 | (laughing)

02:37:55.720 | Unfortunately, we have pretty on the ground problems

02:37:57.680 | with the pandemic right now.

02:37:58.680 | And so I think we should be focused on that as well.

02:38:01.480 | - Yeah, ultimately, just as you said, you're optimistic.

02:38:04.640 | I think it helps for us to be optimistic.

02:38:07.320 | So that's, take it until you make it.

02:38:10.360 | - Yeah, well, and why not?

02:38:11.840 | What's the other side?

02:38:12.680 | Right, so I mean, I'm not personally a very religious person,

02:38:17.460 | but I've heard people say like,

02:38:19.200 | oh yeah, of course I believe in God.

02:38:20.440 | Of course I go to church, because if God's real,

02:38:23.340 | (laughing)

02:38:24.440 | you know, I wanna be on the right side of that.

02:38:25.920 | And if it's not real, it doesn't matter.

02:38:27.080 | - Yeah, it doesn't matter.

02:38:27.920 | - And so, you know, that's a fair way to do it.

02:38:30.960 | - Yeah, I mean, the same thing with nuclear deterrence,

02:38:35.600 | all of, you know, global warming, all these things,

02:38:38.400 | all these threats, natural, engineer, pandemics,

02:38:41.340 | all these threats we face.

02:38:42.680 | I think it's paralyzing to be terrified

02:38:49.660 | of all the possible ways we could destroy ourselves.

02:38:52.540 | I think it's much better, or at least productive,

02:38:56.580 | to be hopeful and to engineer defenses against these things,

02:39:00.820 | to engineer a future where like, you know,

02:39:04.820 | see like a positive future and engineer that future.

02:39:07.940 | - Yeah, well, and I think that's another thing

02:39:10.220 | to think about as, you know, a human,

02:39:12.700 | particularly if you're young and trying to figure out

02:39:14.540 | what it is that you wanna be when you grow up, like I am.

02:39:18.100 | I'm always looking for that.

02:39:19.820 | The question then is, how do you wanna spend your time?

02:39:23.360 | And right now there seems to be a norm

02:39:25.980 | of being a consumption culture.

02:39:28.780 | Like I'm gonna watch the news and revel

02:39:31.500 | in how horrible everything is right now.

02:39:33.500 | I'm going to go find out about the latest atrocity

02:39:36.540 | and find out all the details of like the terrible thing

02:39:38.820 | that happened and be outraged by it.

02:39:40.620 | You can spend a lot of time watching TV

02:39:43.980 | and watching the new sitcom or whatever

02:39:46.600 | people watch these days, I don't know.

02:39:49.300 | But that's a lot of hours, right?

02:39:51.100 | And those are hours that if you're turning

02:39:53.420 | to being productive, learning, growing, experiencing,

02:39:58.420 | you know, when the pandemic's over, going exploring, right?

02:40:02.060 | It leads to more growth.

02:40:03.620 | And I think it leads to more optimism and happiness

02:40:06.420 | because you're building, right?

02:40:08.660 | You're building yourself, you're building your capabilities,

02:40:11.000 | you're building your viewpoints,

02:40:12.220 | you're building your perspective.

02:40:13.460 | And I think that a lot of the consuming

02:40:18.380 | of other people's messages leads to kind

02:40:20.780 | of a negative viewpoint, which you need to be aware

02:40:23.260 | of what's happening because that's also important,

02:40:25.660 | but there's a balance that I think focusing

02:40:28.100 | on creation is a very valuable thing to do.

02:40:31.980 | - Yeah, so what you're saying is people should focus

02:40:33.840 | on working on the sexiest field of them all,

02:40:37.300 | which is compiler design.

02:40:38.420 | - Exactly.

02:40:39.660 | Hey, you could go work on machine learning

02:40:41.160 | and be crowded out by the thousands of graduates popping

02:40:43.980 | out of school that all want to do the same thing.

02:40:45.620 | Or you could work in the place that people overpay you

02:40:48.580 | because there's not enough smart people working in it.

02:40:51.260 | And here at the end of Moore's law, according

02:40:53.780 | to some people, actually the software is the hard part too.

02:40:57.140 | - I mean, optimization is truly, truly beautiful.

02:41:02.300 | And also on the YouTube side or education side, you know,

02:41:06.500 | it'd be nice to have some material that shows the beauty

02:41:10.620 | of compilers.

02:41:12.120 | - Yeah, yeah.

02:41:13.160 | - That's something.

02:41:14.480 | So that's a call for people to create that kind

02:41:17.800 | of content as well.

02:41:18.920 | Chris, you're one of my favorite people to talk to.

02:41:22.840 | It's such a huge honor that you would waste your time

02:41:25.560 | talking to me.

02:41:26.400 | I've always appreciated it.

02:41:27.760 | Thank you so much for talking today.

02:41:30.120 | - The truth of it is you spent a lot of time talking to me

02:41:32.320 | just on walks and other things like that.

02:41:34.440 | So it's great to catch up.

02:41:35.640 | - Thanks, man.

02:41:37.200 | Thanks for listening to this conversation

02:41:39.240 | with Chris Latner.

02:41:40.400 | A thank you to our sponsors.

02:41:42.360 | Blinkist, an app that summarizes key ideas

02:41:45.200 | from thousands of books.

02:41:46.600 | Neuro, which is a maker of functional gum and mints

02:41:49.640 | that supercharge my mind.

02:41:51.440 | Masterclass, which are online courses from world experts.

02:41:55.480 | And finally Cash App, which is an app

02:41:57.840 | for sending money to friends.

02:42:00.200 | Please check out these sponsors in the description

02:42:02.360 | to get a discount and to support this podcast.

02:42:06.120 | If you enjoy this thing, subscribe on YouTube,

02:42:08.440 | review it with Five Stars on Apple Podcast,

02:42:10.600 | follow on Spotify, support on Patreon,

02:42:13.280 | connect with me on Twitter @LexFriedman.

02:42:16.320 | And now let me leave you with some words from Chris Latner.

02:42:19.080 | So much of language design is about trade-offs

02:42:21.760 | and you can't see those trade-offs

02:42:23.680 | unless you have a community of people

02:42:25.640 | that really represent those different points.

02:42:28.560 | Thank you for listening and hope to see you next time.

02:42:31.640 | (upbeat music)

02:42:34.220 | (upbeat music)

02:42:36.800 | [BLANK_AUDIO]

Chris Lattner: The Future of Computing and Programming Languages | Lex Fridman Podcast #131

Chapters