back to indexChris Lattner: The Future of Computing and Programming Languages | Lex Fridman Podcast #131
Chapters
0:0 Introduction
2:25 Working with Elon Musk, Steve Jobs, Jeff Dean
7:55 Why do programming languages matter?
13:55 Python vs Swift
24:48 Design decisions
30:6 Types
33:54 Programming languages are a bicycle for the mind
36:26 Picking what language to learn
42:25 Most beautiful feature of a programming language
51:50 Walrus operator
61:16 LLVM
66:28 MLIR compiler framework
70:35 SiFive semiconductor design
83:9 Moore's Law
86:22 Parallelization
90:50 Swift concurrency manifesto
101:39 Running a neural network fast
107:16 Is the universe a quantum computer?
112:57 Effects of the pandemic on society
130:9 GPT-3
134:28 Software 2.0
147:54 Advice for young people
152:37 Meaning of life
00:00:00.000 |
The following is a conversation with Chris Lattner, 00:00:07.800 |
having created LLVM compiler infrastructure project, 00:00:11.480 |
the Clang compiler, the Swift programming language, 00:00:14.640 |
a lot of key contributions to TensorFlow and TPUs 00:00:19.080 |
He served as vice president of autopilot software at Tesla, 00:00:23.520 |
was a software innovator and leader at Apple, 00:00:28.280 |
as senior vice president of platform engineering, 00:00:38.240 |
followed by some thoughts related to the episode. 00:00:42.400 |
an app that summarizes key ideas from thousands of books. 00:00:45.400 |
I use it almost every day to learn new things 00:00:48.040 |
or to pick which books I want to read or listen to next. 00:00:53.920 |
the maker of functional sugar-free gum and mints 00:01:03.240 |
online courses from the best people in the world 00:01:15.680 |
the app I use to send money to friends for food, drinks, 00:01:21.800 |
Please check out the sponsors in the description 00:01:23.740 |
to get a discount and to support this podcast. 00:01:29.320 |
has been an inspiration to me on a human level 00:01:38.600 |
especially humble enough to hear the voices of disagreement 00:01:46.080 |
from the early days, and for that, I'm forever grateful. 00:01:51.180 |
no one really believed that I would amount to much. 00:01:56.520 |
it makes me feel like I might be someone special. 00:02:05.640 |
is someone who might need your love and support 00:02:10.060 |
If you enjoy this thing, subscribe on YouTube, 00:02:21.300 |
And now, here's my conversation with Chris Ladner. 00:02:24.780 |
- What are the strongest qualities of Steve Jobs, 00:02:28.940 |
Elon Musk, and the great and powerful Jeff Dean 00:02:32.980 |
since you've gotten the chance to work with each? 00:02:36.020 |
- You're starting with an easy question there. 00:02:40.700 |
I guess you could do maybe a pairwise comparison 00:02:48.200 |
I worked a lot more with Elon than I did with Steve. 00:02:55.400 |
They're both very demanding in their own way. 00:02:57.640 |
My sense is Steve is much more human factor focused, 00:03:06.000 |
- Steve's trying to build things that feel good, 00:03:08.480 |
that people love, that affect people's lives, how they live. 00:03:26.280 |
That was one of the things that reading the biography. 00:03:29.520 |
How can a designer essentially talk to engineers 00:03:35.640 |
- I think, so I did not work very closely with Steve. 00:03:38.640 |
My sense is that he pushed people really hard, 00:03:41.860 |
but then when he got an explanation that made sense to him, 00:03:45.760 |
And he did actually have a lot of respect for engineering, 00:03:56.880 |
and when you can get a little bit more out of them. 00:04:19.760 |
so he can pull people together in a really great way. 00:04:34.080 |
So it's really hard to compare Jeff to either of those two. 00:04:44.880 |
and then pulling people in and inspiring them. 00:04:46.760 |
And so I think that that's one of the amazing things 00:04:51.880 |
with their pros and cons, all are really inspirational 00:04:56.600 |
I've been very fortunate to get to work with these guys. 00:05:14.720 |
It really depends on what you're looking for there. 00:05:17.200 |
I think you really need to know what you're talking about. 00:05:20.220 |
So being grounded on the product, on the technology, 00:05:23.000 |
on the business, on the mission is really important. 00:05:34.640 |
People are there because they believe in clean energy 00:05:37.240 |
and electrification, all these kinds of things. 00:05:39.640 |
The other is to understand what really motivates people, 00:05:45.800 |
how to build a plan that actually can be executed, right? 00:05:48.920 |
There's so many different aspects of leadership 00:05:50.480 |
and it really depends on the time, the place, the problems. 00:05:53.680 |
There's a lot of issues that don't need to be solved. 00:05:56.920 |
And so if you focus on the right things and prioritize well, 00:06:03.240 |
One is you really have to know what you're talking about, 00:06:12.800 |
- So I kind of assume you were born technically savvy, 00:06:38.760 |
more comfortable with as I've gained experience, 00:06:45.080 |
And so a major part of leadership is actually, 00:06:52.840 |
And so if you're working in a team of amazing people, 00:06:57.520 |
many of these companies all have amazing people. 00:07:00.320 |
It's the question of how do you get people together? 00:07:05.920 |
How do you get people to be vulnerable sometimes 00:07:18.840 |
thou shalt do the thing that I tell you to do, right? 00:07:21.120 |
But you're encouraging people to be part of the solution 00:07:35.840 |
and I don't know much at all about how chips are designed. 00:07:43.280 |
but it turns out that if you ask a lot of dumb questions, 00:07:48.920 |
And when you're surrounded by people that wanna teach 00:07:51.040 |
and learn themselves, it can be a beautiful thing. 00:07:54.080 |
- So let's talk about programming languages, if it's okay. 00:07:58.840 |
At the highest absurd philosophical level, 'cause I- 00:08:03.640 |
- I will forever get romantic and torture you, I apologize. 00:08:18.640 |
or why do we care about programming computers or? 00:08:20.920 |
- No, why do we care about programming language design, 00:08:37.840 |
through the evolution of these programming languages. 00:08:47.120 |
that are very good at specific kinds of things 00:08:48.840 |
and we think it's useful to have them do it for us, right? 00:08:52.000 |
Now you have this question of how best to express that 00:09:00.560 |
So, well, there's lots of ways of doing this. 00:09:09.800 |
You can then have higher and higher and higher levels 00:09:14.880 |
and you're designing a neural net to do the work for you. 00:09:18.040 |
The question is where along this way do you want to stop 00:09:21.200 |
and what benefits do you get out of doing so? 00:09:28.000 |
and Ada, Pascal, Swift, you have lots of different things. 00:09:34.360 |
and they're tackling different parts of the problems. 00:09:36.520 |
Now, one of the things that most programming languages do 00:09:39.960 |
is they're trying to make it so that you have 00:09:49.240 |
I'm gonna run on a ARM phone or something like that, fine. 00:09:53.480 |
I wanna write one program and have it portable 00:09:55.520 |
and this is something that assembly doesn't do. 00:10:02.400 |
because programming languages all have trade-offs 00:10:17.120 |
Subjective, fairly subjective, very shallow things. 00:10:29.600 |
Okay, and if you look at programming languages, 00:10:32.560 |
there's really kind of two different levels to them. 00:10:37.920 |
of how do you get the computer to be efficient, 00:10:39.340 |
stuff like that, how they work, type systems, 00:10:48.560 |
and a lot of people don't think about it that way. 00:10:50.600 |
- And the UI, you mean all that stuff with the braces 00:10:53.400 |
and the action. - Yeah, all that stuff's the UI 00:11:00.400 |
it's the interface between the guts and the human. 00:11:05.880 |
Humans have feelings, they have things they like, 00:11:10.720 |
And a lot of people treat programming languages 00:11:12.720 |
as though humans are just kind of abstract creatures 00:11:17.520 |
But it turns out that actually there is better and worse. 00:11:21.640 |
Like people can tell when a programming language is good 00:11:26.880 |
And one of the things with Swift in particular 00:11:33.240 |
have been put into really polishing and making it feel good. 00:11:36.660 |
But it also has really good nuts and bolts underneath it. 00:11:39.080 |
- You said that Swift makes a lot of people feel good. 00:11:50.840 |
tens of thousands, hundreds of thousands of people 00:11:55.000 |
the user experience of this programming language? 00:11:57.160 |
- Well, you can look at it in terms of better and worse. 00:12:01.320 |
or something like that, you will feel unproductive. 00:12:13.320 |
that then you have spent tons of time debugging 00:12:15.000 |
and it's a real pain in the butt and you feel unproductive. 00:12:17.760 |
And so by subtracting these things from the experience, 00:12:30.560 |
people that are most productive on Stack Overflow, 00:12:39.840 |
with the experience of the majority of users. 00:12:46.280 |
quote unquote, correct answer on Stack Overflow, 00:12:49.120 |
it usually really sort of prioritizes like safe code, 00:12:54.120 |
proper code, stable code, you know, that kind of stuff. 00:13:02.980 |
if I want to use go-to statements in my basic, right? 00:13:09.860 |
Like what if 99% of people want to use go-to statements? 00:13:12.700 |
So you use completely improper, you know, unsafe syntax. 00:13:17.900 |
like if you boil it down and you get below the surface level 00:13:20.120 |
people don't actually care about go-tos or if statements 00:13:28.260 |
I want to set up a web server and I want to do a thing, 00:13:34.260 |
And so from a programming language perspective, 00:13:44.460 |
and what are the tools around that look like, right? 00:13:47.260 |
And when you want to build a library that's missing, 00:13:50.580 |
Okay, now this is where you see huge divergence 00:13:59.220 |
but it's not so great at building all the libraries. 00:14:02.500 |
And so what you get because of performance reasons, 00:14:05.560 |
is you get Python layered on top of C, for example. 00:14:09.260 |
And that means that doing certain kinds of things, 00:14:11.540 |
well, it doesn't really make sense to do in Python. 00:14:19.300 |
because tooling and the debugger doesn't work right 00:14:23.800 |
- Can you clarify a little bit what you mean by 00:14:31.540 |
- No, but just the actual meaning of the sentence. 00:14:35.900 |
- Meaning like it's not conducive to developers 00:14:44.760 |
it's a dance between Python and C and you can never. 00:14:50.420 |
I did not mean to say that Python is bad for libraries. 00:15:01.300 |
like if you wanna build a machine learning framework, 00:15:03.620 |
you're not gonna build a machine learning framework 00:15:05.020 |
in Python because of performance, for example, 00:15:07.380 |
or you want GPU acceleration or things like this. 00:15:10.180 |
Instead what you do is you write a bunch of C 00:15:23.140 |
and those decisions have other counterbalancing forces, 00:15:37.820 |
And how do I make it so that then they can be assembled 00:15:44.020 |
Because when you're talking about building a thing, 00:15:46.900 |
you have to include the debugging, the fixing, 00:15:58.300 |
And so this is where things like catching bugs 00:16:07.600 |
Swift, for example, has certain things like value semantics, 00:16:35.840 |
But why do you need to clone a tensor sometimes? 00:16:43.920 |
- It's the usual object thing that's in Python. 00:16:49.280 |
and many other languages, this isn't unique to Python. 00:16:51.520 |
In Python, it has a thing called reference semantics, 00:16:55.680 |
And what that means is you actually have a pointer 00:16:59.960 |
Now, this is due to a bunch of implementation details 00:17:06.800 |
But in Swift, you have this thing called value semantics. 00:17:09.560 |
And so when you have a tensor in Swift, it is a value. 00:17:12.160 |
If you copy it, it looks like you have a unique copy. 00:17:21.400 |
- So that's highly error-prone in at least computer science, 00:17:38.280 |
And in fact, quietly doesn't behave like math 00:17:41.680 |
and then can ruin the entirety of your math thing. 00:17:44.160 |
Well, and then it puts you in debugging land again. 00:17:46.880 |
- Right, now you just wanna get something done 00:17:51.520 |
And what level of the stack, which is very complicated, 00:17:54.200 |
which I thought I was reusing somebody's library 00:18:02.160 |
And so this is where programming languages really matter. 00:18:06.280 |
so that both you get the benefit of math working like math, 00:18:27.720 |
and good language support for defining values. 00:18:33.400 |
that the machine learning folks are very used to. 00:18:38.280 |
where you have an array, you put, you create an array, 00:18:43.920 |
and then you pass it off to another function. 00:18:46.920 |
What happens if that function adds some more things to it? 00:18:51.360 |
Well, you'll see it on the side that you pass it in, right? 00:18:56.680 |
Now, what if you pass an array off to a function, 00:19:02.880 |
or some other data structure somewhere, right? 00:19:04.880 |
Well, it thought that you just handed it that array, 00:19:07.960 |
then you return back and that reference to that array 00:19:17.840 |
may have thought they had the only reference to that. 00:19:21.680 |
that this was gonna change underneath the covers. 00:19:23.960 |
And so this is where you end up having to do a clone. 00:19:27.800 |
I'm not sure if I have the only version of it. 00:19:32.280 |
So what value semantics does is it allows you to say, 00:19:34.680 |
hey, I have a, so in Swift, it defaults to value semantics. 00:19:40.240 |
and then because most things should be true values, 00:19:44.120 |
then it makes sense for that to be the default. 00:20:13.440 |
it was like, I don't know if anybody else has it, 00:20:16.640 |
Well, you just did a copy of a bunch of data. 00:20:19.960 |
And then it could be that the thing that called you 00:20:24.040 |
So you just made a copy of it and you may not have had to. 00:20:27.800 |
And so the way the value semantics work in Swift 00:20:32.060 |
which means that you get the benefit of safety 00:20:38.360 |
because if you think certain languages like Java, 00:20:44.920 |
is they provide value semantics by having pure immutability. 00:20:56.160 |
The problem with this is if you have immutability, 00:21:20.960 |
but generally think about them as a separate allocation. 00:21:34.720 |
because of the beauty of how the Swift value semantics 00:21:44.880 |
It knows that there's only one reference to that. 00:21:50.240 |
And so you're not allocating tons of stuff on the side. 00:21:57.520 |
If you pass it off to multiple different people, 00:21:59.340 |
but nobody changes it, they can all share the same thing. 00:22:02.600 |
So you get a lot of the benefit of purely mutable design. 00:22:10.560 |
I thought there was going to be a philosophical 00:22:14.680 |
like narrative here that you're gonna have to pay 00:22:30.980 |
like bringing the errors closer to the source, 00:22:38.160 |
to the source of the error, however you say that. 00:22:40.840 |
But you're saying there's not a performance cost either 00:22:46.320 |
- Well, so there's trade-offs with everything. 00:22:48.280 |
And so if you are doing very low level stuff, 00:23:03.000 |
that makes people love it, that is not obvious 00:23:08.200 |
is this UI principle of progressive disclosure of complexity. 00:23:12.280 |
So Swift, like many languages, is very powerful. 00:23:16.720 |
The question is, when do you have to learn the power 00:23:23.960 |
Certain other languages start with public static void main, 00:23:28.280 |
class, zzzzzzzz, like all the ceremony, right? 00:23:36.760 |
Let's talk about public, access control classes. 00:23:41.140 |
String system.out.println, like packages, like, ah! 00:23:46.080 |
Right, and so instead, if you take this and you say, 00:23:57.360 |
The question is, how do you factor the complexity? 00:23:59.440 |
And how do you make it so that the normal case scenario 00:24:09.360 |
But then as a power user, if you want to dive down to it, 00:24:16.000 |
You can call malloc if you want to call malloc. 00:24:18.320 |
This is not recommended on the first page of every tutorial, 00:24:23.760 |
And so being able to have that is really the design 00:24:45.320 |
but actually good design is something that you can feel. 00:24:48.720 |
- And how many people are involved with good design? 00:24:52.080 |
So if we looked at Swift, but look at historically, 00:25:08.320 |
And we'll talk about how all that can go wrong or right. 00:25:12.000 |
- Yeah, well, Swift, so I can't speak to in general, 00:25:15.600 |
So the way it works with Swift is that there's a core team. 00:25:23.320 |
that is people that have been working with Swift 00:25:35.520 |
but still that's enough time that there's a story arc there. 00:25:40.520 |
- And there's mistakes have been made that then get fixed 00:25:42.800 |
and you learn something and then you, you know, 00:25:44.680 |
and so what the core team does is it provides continuity. 00:25:50.400 |
okay, well, there's a big hole that we wanna fill. 00:25:55.280 |
So don't do other things that invade that space 00:26:03.040 |
even though it's not today, keep out of that space. 00:26:06.080 |
- And the whole team remembers the myth of the boulder 00:26:12.020 |
There's a general sense of what the future looks like 00:26:13.520 |
in broad strokes and a shared understanding of that 00:26:16.440 |
combined with a shared understanding of what has happened 00:26:18.780 |
in the past that worked out well and didn't work out well. 00:26:25.800 |
And you've got, in that case, hundreds of people 00:26:27.680 |
that really care passionately about the way Swift evolves. 00:26:33.880 |
the core team doesn't necessarily need to come up 00:26:50.320 |
they're like hashing it out and trying to like talk about, 00:26:55.640 |
And, you know, here you're talking about hundreds of people. 00:26:57.680 |
So you're not gonna get consensus necessarily. 00:27:01.920 |
And so there's a proposal process that then allows 00:27:06.280 |
the core team and the community to work this out. 00:27:08.360 |
And what the core team does is it aims to get consensus 00:27:17.400 |
make sure we're going the right direction kind of things. 00:27:23.520 |
how much people will love the user interface? 00:27:27.400 |
Like do you think they're able to capture that? 00:27:29.400 |
- Well, I mean, it's something we talk about a lot. 00:27:34.760 |
but I think that we've done pretty well so far. 00:27:39.400 |
- 'Cause you said the progressive disclosure. 00:27:40.800 |
- Yeah, so we care a lot about that, a lot about power, 00:27:53.320 |
- So if you like think about like a language I love is Lisp, 00:27:59.360 |
but I haven't done anything, any serious work in Lisp, 00:28:02.160 |
but it has a ridiculous amount of parentheses. 00:28:11.520 |
I like, I enjoyed the comfort of being between braces. 00:28:23.120 |
just like, and last thing to me, as a designer, 00:28:38.160 |
So like, I could see arguments for all of these. 00:28:44.200 |
- Right, exactly, you're good, it's a good point. 00:28:46.960 |
- Right, so like, you know, there's evidence that-- 00:28:49.960 |
- But see, like, it's one of the most argued about things. 00:28:52.320 |
- Oh yeah, of course, just like tabs and spaces, 00:28:54.080 |
which it doesn't, I mean, there's one obvious right answer, 00:29:01.760 |
Like, come on, what are you trying to do to me here? 00:29:12.600 |
- Well, no, no, no, it's always a really hard, 00:29:16.880 |
I mean, fine, those are not the interesting ones. 00:29:19.520 |
The hard ones are the ones that are most interesting, right? 00:29:23.560 |
hey, we wanna do a thing, everybody agrees we should do it, 00:29:28.880 |
but it has all these bad things associated with it. 00:29:36.260 |
Do we say, hey, well, maybe there's this other feature 00:29:38.520 |
that if we do that first, this will work out better? 00:29:44.080 |
are we paying ourselves into a corner, right? 00:29:48.600 |
that has some continuity and has perspective, 00:29:57.200 |
you get the power of multiple people coming together 00:30:00.120 |
and then you get the best out of all these people, 00:30:02.520 |
and you also can harness the community around it. 00:30:19.600 |
So many people would say that Python doesn't have types. 00:30:27.840 |
and I've listened to way too many podcasts and videos 00:30:32.440 |
- Oh yeah, so I would argue that Python has one type, 00:30:39.760 |
you have everything comes in as a Python object. 00:30:44.040 |
you know, it depends on what you're optimizing for, 00:30:52.720 |
you get duck typing for free and things like this, 00:30:56.920 |
you're making it very easy to pound out code on one hand, 00:31:01.840 |
to introduce complicated bugs that you have to debug, 00:31:12.080 |
and you find yourself in the middle of some code 00:31:13.480 |
that you really didn't wanna know anything about, 00:31:20.840 |
and they have trade-offs, they're good for performance, 00:31:34.280 |
like types or not, or one type or many types. 00:32:11.320 |
you also have to think of how that's gonna get 00:32:20.920 |
is another example of a very simple language, right? 00:32:27.200 |
I don't use it as much as maybe you do or you did. 00:32:29.760 |
- No, I think we're both, everyone who loves Lisp, 00:32:32.480 |
it's like, you love, it's like, I don't know, 00:32:36.240 |
but like how often do I seriously listen to Frank Sinatra? 00:32:40.040 |
But you look at that or you look at JavaScript, 00:32:45.960 |
and there's certain things that don't exist in the language, 00:32:54.640 |
In the case of both of them, for example, you say, 00:32:57.440 |
well, what about large-scale software development? 00:33:00.080 |
Okay, well, you need something like packages. 00:33:02.360 |
Neither language has a like language affordance for packages. 00:33:07.400 |
You get things like NPN, you get things like, 00:33:09.720 |
you know, like these ecosystems that get built around. 00:33:15.120 |
at least the most important inherent complexity 00:33:24.120 |
sometimes that's great because often building things 00:33:26.600 |
as libraries is very flexible and very powerful 00:33:28.920 |
and allows you to evolve and things like that. 00:33:30.720 |
But often it leads to a lot of unnecessary divergence 00:33:35.600 |
And when that happens, you just get kind of a mess. 00:33:39.560 |
And so the question is, how do you balance that? 00:33:44.280 |
'cause that's really expensive and it makes things complicated 00:33:46.760 |
but how do you model enough of the inherent complexity 00:33:49.640 |
of the problem that you provide the framework 00:33:59.080 |
and you think about what a programming language is there for 00:34:01.360 |
is it's about making a human more productive, right? 00:34:17.540 |
- And a programming language is a bicycle for the mind? 00:34:21.000 |
- Crazy, wow, that's a really interesting way 00:34:27.400 |
By being able to just directly leverage somebody's library, 00:34:33.420 |
In the case of Swift, SwiftUI is this new framework 00:34:36.160 |
that Apple has released recently for doing UI programming. 00:34:39.760 |
And it has this declarative programming model 00:34:48.820 |
And what this does is it allows you to get way more done 00:34:53.260 |
And now your productivity as a developer is much higher. 00:34:57.420 |
And so that's really what programming languages 00:35:03.300 |
It's about how productive do you make the person? 00:35:05.380 |
And you can only see that when you have libraries 00:35:13.760 |
And with Swift, I think we're still a little bit early, 00:35:16.640 |
but SwiftUI and many other things that are coming out now 00:35:20.340 |
And I think that they're opening people's eyes. 00:35:22.520 |
- It's kind of interesting to think about like how that, 00:35:36.060 |
Now this is not going to be a trash talking session 00:35:38.960 |
about C++, but I used C++ for a really long time. 00:35:45.220 |
- I feel like I spent many years without realizing 00:35:51.540 |
for my particular lifestyle, brain style, thinking style, 00:35:56.540 |
there's languages that could make me a lot more productive 00:36:00.340 |
in the debugging stage, in the, just the development stage 00:36:09.260 |
I mean, a machine learning framework in Python 00:36:29.760 |
how does a person like me or in general people 00:36:31.780 |
discover more productive, you know, languages? 00:36:39.960 |
I've been looking for like a project to work on in Swift 00:36:45.580 |
I mean, my intuition was like doing a hello world 00:36:50.460 |
To get me to experience the power of the language. 00:36:53.820 |
- You need a few weeks of change in metabolism. 00:36:58.260 |
That's one of the problems with people with diets. 00:37:01.500 |
Like I'm actually currently, to go in parallel, 00:37:13.260 |
they think that's horribly unhealthy or whatever. 00:37:16.900 |
You have like a million, whatever the science is, 00:37:27.380 |
And, but if you, you have to always give these things 00:37:48.040 |
I mean, Python is similar in that sense for me. 00:38:23.360 |
but there's definitely better and worse here. 00:38:41.560 |
but for me, it's, can I create systems for myself 00:38:52.120 |
like always stating things that should be true, 00:39:02.400 |
- Well, you could think of types in a programming language 00:39:11.040 |
Well, so this, or how do people learn new things? 00:39:17.200 |
People generally don't like change around them either. 00:39:19.320 |
And so we're all very slow to adapt and change. 00:39:22.880 |
And usually there's a catalyst that's required 00:39:32.720 |
like build a thing that the language is actually good for, 00:39:38.840 |
And so if you were to write an iOS app, for example, 00:40:00.400 |
LLVM, for example, builds the Android kernel. 00:40:09.920 |
There's, it runs on lots of different things. 00:40:14.120 |
Swift UI, and then there's a thing called UIKit. 00:40:23.880 |
So Swift UI and UIKit are Apple technologies. 00:40:28.720 |
like Swift UI happens to be written in Swift, 00:40:32.880 |
that Apple loves and wants to keep on its platform, 00:40:36.920 |
You go to Android and you don't have that library. 00:40:39.840 |
- Right, and so Android has a different ecosystem of things 00:40:45.400 |
And so you can totally use Swift to do like arithmetic 00:41:21.920 |
And then TensorFlow is really stepping up its game. 00:41:33.240 |
And they're like, "Oh, I have to learn this." 00:41:39.520 |
And then they learn and they fall in love with it. 00:41:45.240 |
- And so, and there, I mean, people don't like change, 00:41:57.320 |
even maybe Lisp, I don't know if you agree with this, 00:42:23.640 |
maybe you can tell me if there is, there you go. 00:42:30.960 |
- Before I ask it, let me say like with Python, 00:42:52.920 |
and to create a new list on a single line was elegant. 00:42:58.240 |
and it just made me fall in love with the language. 00:43:04.880 |
Is there, what do you use the most beautiful feature 00:43:07.600 |
in a programming languages that you've ever encountered? 00:43:21.240 |
with a programming language, again, what is the goal? 00:43:23.600 |
You're trying to get people to get things done quickly. 00:43:27.160 |
And so you need libraries, you need high quality libraries, 00:43:32.600 |
that can assemble them and do cool things with them. 00:43:43.400 |
between libraries who enable high quality libraries 00:43:48.320 |
versus the ones that put special stuff in the language. 00:43:57.400 |
- So, and what I mean by that is expressive libraries 00:44:00.840 |
that then feel like a natural integrated part 00:44:05.560 |
So an example of this in Swift is the int and float 00:44:19.880 |
is just a library thing defined in the standard library, 00:44:22.600 |
along with strings and arrays and all the other things 00:44:41.440 |
Well, it doesn't come in the standard library. 00:44:51.120 |
It's not about the people who care about ints and floats 00:44:53.480 |
are more important than the people care about quaternions. 00:44:56.920 |
about programming languages is when you allow 00:44:58.960 |
those communities to build high quality libraries 00:45:02.280 |
that feel native, that feel like they're built 00:45:24.400 |
- But so like the 64 bit is hard-coded or no? 00:45:29.400 |
So int, if you go look at how it's implemented, 00:45:47.800 |
And so, yeah, you can add methods on the things. 00:45:51.320 |
- So you can define operators, like how it behaves. 00:46:07.200 |
And so one of the best examples of this is Lisp, right? 00:46:15.440 |
You write term rewrite systems and things like this. 00:46:25.520 |
- Well, so one example, I'll give you two examples, 00:46:31.600 |
They both allow you to define your own types, 00:46:49.720 |
- But if you make a pair or something like that, 00:46:56.840 |
and now it gets passed around by reference, by pointer. 00:47:47.840 |
'cause I was thinking about the Walrus operator, 00:47:53.240 |
but it hit me that like the equal sign for assignment, 00:47:57.760 |
like, why are we using the equal sign for assignment? 00:48:01.600 |
- It's wrong, and that's not the only solution, right? 00:48:16.360 |
- So, but like, and yeah, like, I ask you all, 00:48:19.920 |
but how do you then decide to break convention? 00:48:38.840 |
like colon equal instead of equal for assignment, 00:48:40.960 |
that would be weird with today's aesthetic, right? 00:48:44.920 |
And so you'd say, cool, this is theoretically better, 00:49:01.760 |
Well, it turns out you can solve that problem 00:49:06.960 |
all these compilers will detect that as a likely bug, 00:49:19.280 |
is like you're literally creating suffering in the world. 00:49:26.720 |
I mean, one way to see it is the bicycle for the mind, 00:49:29.200 |
but the other way is to like minimizing suffering. 00:49:32.200 |
- Well, you have to decide if it's worth it, right? 00:49:38.040 |
and again, this is where there's a lot of detail 00:49:54.600 |
You know, most people think it's messed up, I think. 00:50:00.080 |
what I mean is it is very rarely used for good, 00:50:07.200 |
- That's a good definition of messed up, yeah. 00:50:09.400 |
- You could use, you know, it's a, in hindsight, 00:50:13.520 |
Now, one of the things with Swift that is really powerful, 00:50:23.400 |
we announced that it was public, people could use it, 00:50:30.920 |
When Swift 2 came out, we said, "Hey, it's open source, 00:50:34.360 |
"which people can help evolve and direct the language." 00:50:43.120 |
and what happened is that, as part of that process is, 00:50:48.680 |
So for example, Swift used to have the C style 00:50:55.040 |
Like, what does it mean when you put it before 00:50:59.320 |
Well, that got cargo-culted from C into Swift early on. 00:51:11.880 |
- You have to look it up in Urban Dictionary, yeah. 00:51:17.520 |
or it got pulled into Swift without very good consideration, 00:51:27.760 |
they have very little value over saying, you know, 00:51:29.960 |
X plus equals one, and X plus equals one is way more clear, 00:51:34.240 |
and so when you're optimizing for teachability 00:51:36.360 |
and clarity and bugs and this multidimensional space 00:51:39.600 |
that you're looking at, things like that really matter, 00:51:42.340 |
and so being first principles on where you're coming from 00:51:46.520 |
and being anchored on the objective is really important. 00:51:53.280 |
sort of this podcast isn't about information, 00:52:06.320 |
there's something that's called the Walrus operator, 00:52:27.240 |
and maybe you can comment on that in general, 00:52:31.240 |
it's also the thing that toppled the dictator. 00:52:37.960 |
- It finally drove Guido to step down from BDFL, 00:52:42.880 |
So maybe, what do you think about the Walrus operator 00:52:46.000 |
in Python, is there an equivalent thing in Swift 00:52:56.680 |
what do you think about Guido stepping down over it? 00:53:02.400 |
one of the things that makes it most polarizing 00:53:11.760 |
and you can express it in a more concise way. 00:53:42.160 |
is not something that affects syntactic sugar. 00:53:44.760 |
And so, if you say, I wanna have the ability to define types, 00:53:48.240 |
I have to have all this like language mechanics 00:53:49.960 |
to define classes, and oh, now I have to have inheritance, 00:53:55.040 |
that's just making language more complicated. 00:54:06.560 |
that are used to concisify specific use cases. 00:54:12.840 |
when you're talking about, hey, I have a thing 00:54:15.000 |
that takes a lot to write, and I have a new way to write it, 00:54:26.320 |
And one of the things that is true about human psychology, 00:54:30.400 |
is that people overestimate the burden of learning something 00:54:35.400 |
and so it looks foreign when you haven't gotten used to it. 00:54:42.080 |
Like unquestionably, like this is just the thing I know, 00:55:00.000 |
But the sense that I got out of that whole dynamic 00:55:03.280 |
was that he had put not just the decision-maker weight 00:55:07.760 |
on his shoulders, but it was so tied to his personal identity 00:55:11.920 |
that he took it personally and he felt the need 00:55:18.160 |
instead of building a base of support around him. 00:55:20.920 |
I mean, this is probably not quite literally true. 00:55:31.320 |
- Well, yeah, particularly because people then say, 00:55:33.720 |
Guido, you're a horrible person, I hate this thing, 00:55:43.520 |
and 1% of millions of people is a lot of hate mail. 00:55:46.600 |
And that just from human factor will just wear on you. 00:55:49.440 |
- Well, to clarify, it looked from just what I saw 00:56:00.080 |
the big majority on a vote were opposed to it. 00:56:03.680 |
- Okay, I'm not that close to it, so I don't know. 00:56:06.400 |
- So this, okay, so the situation is like literally, 00:56:09.240 |
yeah, I mean, the majority, the core developers, 00:56:23.120 |
but the against it wasn't like, this is a bad idea. 00:56:27.840 |
They were more like, we don't see why this is a good idea. 00:56:31.280 |
And what that results in is there's a stalling feeling, 00:56:44.640 |
if we look at politics today and the way Congress works, 00:57:02.360 |
injected into the economy, or trillions of dollars, 00:57:13.360 |
- But you're talking about like a global pandemic. 00:57:17.200 |
I was hoping we could fix the healthcare system 00:57:34.440 |
you have a community of people building on it, 00:57:41.680 |
then taking it slow, I think, is an important thing to do, 00:57:46.720 |
particularly if it's something that's 25 years old 00:58:19.720 |
a significant fraction of his career on Python, 00:58:22.880 |
and from his perspective, I imagine he's like, 00:58:25.720 |
"I should be able to do the thing I think is right." 00:58:38.280 |
- But if we could talk about leadership in this, 00:58:45.480 |
If not, I'll make it a water stopper, I'm pretty sure, 00:59:02.960 |
like most difficult decisions, just like you said, 00:59:26.100 |
But they have to use their gut and make that decision. 00:59:34.180 |
The founders understand exactly what's happened 00:59:37.500 |
and are willing to say, "We have been doing thing X 00:59:40.860 |
"the last 20 years, but today we're gonna do thing Y." 00:59:45.460 |
And they make a major pivot for the whole company, 00:59:52.380 |
the successor doesn't always feel that agency 01:00:17.100 |
you should be obligated to change what you're doing 01:00:21.900 |
And so, if you don't know how you got to where you are, 01:00:31.780 |
the right thing to do, so you just may not see it. 01:00:36.460 |
It's so much higher burden when, as a leader, 01:00:39.340 |
you step into a thing that's already worked for a long time. 01:00:42.580 |
Well, and if you change it and it doesn't work out, 01:00:48.420 |
And the second thing is that even if you decide 01:00:49.980 |
to make a change, even if you're theoretically in charge, 01:00:53.500 |
you're just a person that thinks they're in charge. 01:00:58.860 |
You have to explain it to them in terms they'll understand. 01:01:00.540 |
You have to get them to buy into it and believe in it, 01:01:02.140 |
because if they don't, then they're not gonna be able 01:01:10.700 |
And so there's only so much power you have as a leader. 01:01:13.460 |
You have to understand what those limitations are. 01:01:32.560 |
- I mean, what's the role of, so then on Swift, 01:01:38.460 |
- Yeah, so if you contrast Python with Swift, right? 01:01:41.620 |
One of the reasons, so everybody on the core team 01:01:46.380 |
and I think we all really care about where Swift goes, 01:01:49.260 |
but you're almost delegating the final decision-making 01:01:56.720 |
And also, when you're talking with the community, 01:02:09.980 |
a full rationale is provided, things like this. 01:02:16.540 |
and provide case law, kind of like Supreme Court does 01:02:18.860 |
about this decision was made for this reason, 01:02:24.160 |
But it's also a way to provide a defense mechanism 01:02:29.020 |
they're not saying that person did the wrong thing. 01:02:34.020 |
and (growls) and later they move on and they get over it. 01:02:52.800 |
- Well, each of the humans on the Swift core team, 01:03:16.380 |
and it's a small group of people, but you need high trust. 01:03:20.140 |
You need, again, it comes back to the principles 01:03:23.360 |
and understanding what you're optimizing for. 01:03:27.460 |
And I think that starting with strong principles 01:03:30.500 |
and working towards decisions is always a good way 01:03:36.260 |
but then be able to communicate them to people 01:03:37.900 |
so that they can buy into them, and that is hard. 01:04:02.260 |
But LLVM has had tons of its own challenges over time too, 01:04:15.260 |
that have been working on LLVM for 10 years, right, 01:04:34.900 |
and we need to address them, and we need to make it better, 01:04:37.740 |
then somebody else will come up with a better idea, right? 01:04:42.540 |
where the community is in danger of getting too calcified, 01:04:52.020 |
Fortran is now a new thing in the LLVM community, 01:04:56.340 |
- I've been trying to find, on this little tangent, 01:04:59.020 |
find people who program in Cobalt or Fortran, 01:05:02.380 |
Fortran especially, to talk to, they're hard to find. 01:05:11.700 |
- Well, interesting thing you kind of mentioned with LLVM, 01:05:14.300 |
or just in general, that if something evolved, 01:05:19.740 |
So do you fall in love with the thing over time, 01:05:23.140 |
or do you start hating everything about the thing over time? 01:05:33.500 |
and they grate on me, and I don't have time to go fix 'em. 01:05:38.940 |
but they never get fixed, and it's like sand underneath, 01:05:43.620 |
and it's like sand underneath your fingernails or something. 01:05:45.860 |
It's just like you know it's there, you can't get rid of it. 01:05:49.700 |
And so the problem is that if other people don't see it, 01:05:55.700 |
I don't have time to go write the code and fix it anymore, 01:06:01.460 |
and so you say, "Hey, we should go fix this thing." 01:06:05.300 |
It's like, well, is it the right thing or not? 01:06:13.260 |
I think as an observer, as almost like a fan in the, 01:06:34.220 |
It's not, many people think it's about machine learning. 01:06:39.180 |
because compiler people can't name things very well, I guess. 01:06:51.700 |
So LLVM is a, it's really good for dealing with CPUs, 01:07:01.620 |
The JVM is very good for garbage collected languages 01:07:05.540 |
and it's very optimized for a specific space. 01:07:11.020 |
and that compiler is really good at that kind of stuff. 01:07:14.080 |
Usually when you build these domain-specific compilers, 01:07:16.740 |
you end up building the whole thing from scratch 01:07:26.660 |
- Well, so here I would say, like, if you look at Swift, 01:07:29.180 |
there's several different parts to the Swift compiler, 01:07:31.940 |
one of which is covered by the LLVM part of it. 01:07:36.100 |
There's also a high-level piece that's specific to Swift, 01:07:53.020 |
so you can mix and match it in different ways. 01:07:59.820 |
CPUs and, like, the tip of the iceberg on GPUs. 01:08:05.660 |
But it turns out-- - And a bunch of languages 01:08:11.060 |
- And so it turns out there's a lot of hardware out there 01:08:16.140 |
There are a lot of matrix multiply accelerators 01:08:27.180 |
And so you're compiling for a domain of transistors, 01:08:32.460 |
a tremendous amount of compiler infrastructure 01:08:34.460 |
that allows you to build these domain-specific compilers 01:08:37.500 |
in a much faster way and have the result be good. 01:08:44.380 |
now we're talking about, like, ASICs, so anything? 01:08:50.540 |
it's very possible that the number of these kinds of ASICs, 01:08:59.460 |
the architecture things, like, multiplies exponentially. 01:09:10.780 |
to build these compilers very efficiently, right? 01:09:13.500 |
Now, one of the things that, coming back to the LLVM thing, 01:09:17.980 |
is LLVM is a specific compiler for a specific domain. 01:09:22.980 |
MLIR is now this very general, very flexible thing 01:09:26.900 |
that can solve lots of different kinds of problems. 01:09:32.420 |
- So MLIR is, I mean, it's an ambitious project then. 01:09:45.140 |
But where this comes full circle is now folks 01:09:56.140 |
that MLIR was built by me and many other people 01:10:01.860 |
and so we fixed a lot of the mistakes that lived in LLVM. 01:10:07.100 |
where it's like, well, there's this new thing, 01:10:10.340 |
it feels like it's new, and so let's not trust it. 01:10:13.980 |
to see the cultural social dynamic that comes out of that. 01:10:21.540 |
and we're seeing the technology diffusion happen 01:10:25.260 |
they start to understand things in their own terms. 01:10:38.740 |
Well, actually, you have a new role at SciFive. 01:10:53.220 |
- So I lead the engineering and product teams at SciFive. 01:11:04.420 |
Instruction sets are the things inside of your computer 01:11:12.060 |
and things like this are other instruction sets. 01:11:20.540 |
- The RISC-V is distinguished by not being proprietary. 01:11:23.700 |
And so x86 can only be made by Intel and AMD, 01:11:30.380 |
they sell licenses to build ARM chips to other companies, 01:11:38.300 |
and then it gets licensed out, things like that. 01:11:45.140 |
And so SciFive was founded by three of the founders 01:11:48.220 |
of RISC-V that designed and built it in Berkeley, 01:11:55.780 |
SciFive today has some of the world's best RISC-V cores 01:11:59.060 |
and we're selling them and that's really great. 01:12:01.420 |
They're going to tons of products, it's very exciting. 01:12:04.060 |
- So they're taking this thing that's open source 01:12:06.100 |
and just trying to be or are the best in the world 01:12:10.780 |
- Yeah, so here it's the specifications open source. 01:12:20.780 |
And so SciFive, on the one hand, pushes forward 01:12:28.100 |
that are best in class for different points in the space, 01:12:33.620 |
or if you want a really big beefy one that is faster, 01:12:48.140 |
And so the way this works is that there's generally 01:12:52.500 |
a separation of the people who design the circuits 01:12:56.820 |
And so you'll hear about fabs like TSMC and Samsung 01:13:00.740 |
and things like this that actually produce the chips, 01:13:09.940 |
you turn code for the chip into little rectangles 01:13:14.940 |
that then use photolithography to make mask sets 01:13:24.700 |
- So, and we're talking about mass manufacturing, so. 01:13:28.340 |
- Yeah, they're talking about making hundreds 01:13:29.580 |
of millions of parts and things like that, yeah. 01:13:31.340 |
And so the fab handles the volume production, 01:13:36.340 |
the interesting thing about the space when you look at it 01:13:39.700 |
is that these, the steps that you go from designing a chip 01:13:46.260 |
and things like Verilog and languages like that, 01:13:51.620 |
is a really well-studied, really old problem, okay? 01:13:57.540 |
Lots of smart people have built systems and tools. 01:14:00.540 |
These tools then have generally gone through acquisitions. 01:14:03.460 |
And so they've ended up at three different major companies 01:14:11.620 |
The problem with this is you have huge amounts 01:14:26.700 |
So the RISC-V is a instruction, like what is RISC-V? 01:14:35.860 |
How much does it define how much of the hardware is? 01:14:44.860 |
how does the compiler, like the Swift compiler, 01:14:47.380 |
the C compiler, things like this, how does it make it work? 01:14:57.060 |
- But it's a set of instructions as opposed to-- 01:15:00.060 |
- What do you say, it tells you how the compiler works? 01:15:15.740 |
So RISC-V, you can buy a RISC-V core from Sci-5 01:15:19.140 |
and say, "Hey, I wanna have a certain number of, 01:15:26.740 |
"I wanna have, like, I want floating point or not," 01:15:30.820 |
And then what you get is you get a description of a CPU 01:15:38.140 |
you wanna build like an iPhone chip or something like that, 01:15:44.420 |
you have to have timers, IOs, a GPU, other components. 01:15:49.300 |
And so you need to pull all those things together 01:15:58.980 |
and then you have to transform it into something 01:16:10.580 |
I can't help but see it as, is a big compiler. 01:16:29.100 |
And so there's a lot of things that end up being compilers. 01:16:31.820 |
But this is a space where we're talking about design 01:16:34.700 |
and usability and the way you think about things, 01:16:37.460 |
the way things compose correctly, it matters a lot. 01:16:40.900 |
And so Sci-5 is investing a lot into that space. 01:16:47.460 |
to design chips faster, get them to market quicker 01:16:56.420 |
you've got this problem of you're not getting 01:16:59.260 |
free performance just by waiting another year 01:17:03.540 |
And so you have to find performance in other ways. 01:17:06.540 |
And one of the ways to do that is with custom accelerators 01:17:10.660 |
- And so, well, we'll talk a little bit about, 01:17:28.380 |
So like almost different car companies might use different 01:17:35.220 |
Like, so is this, like is RISC-V in this whole process, 01:17:40.220 |
is it potentially the future of all computing devices? 01:17:44.820 |
- Yeah, I think that, so if you look at RISC-V 01:17:47.420 |
and step back from the Silicon side of things, 01:18:00.060 |
- Is that you have companies that come and go 01:18:02.660 |
and you have instruction sets that come and go. 01:18:04.860 |
Like one example of this out of many is Sun with Spark. 01:18:10.740 |
- Sun went away, Spark still lives on at Fujitsu, 01:18:12.980 |
but we have HP had this instruction set called PA-RISC. 01:18:32.180 |
of you're making many billion dollar investments 01:18:35.380 |
on instruction sets that are owned by a company. 01:18:46.700 |
in their best interest to continue investing in the space 01:18:54.180 |
And this means that as a customer, what do you do? 01:18:57.860 |
You've sunk all this time, all this engineering, 01:19:08.260 |
because if you buy an implementation of RISC-V from Sci-5, 01:19:15.200 |
- But if something bad happens to Sci-5 in 20 years, right? 01:19:28.100 |
which means that if you have more than one requirement, 01:19:31.900 |
you can probably find something in the RISC-V space 01:19:35.980 |
Whereas if you're talking about XA6, for example, 01:19:52.700 |
in the next 20, 30 years, what does the world look like? 01:20:01.860 |
- So too much diversity in hardware instruction sets, 01:20:15.580 |
that are just weird and different for historical reasons. 01:20:23.100 |
and the languages on top of them aren't there, right? 01:20:31.060 |
because the ecosystem that supports is not big enough. 01:20:35.460 |
People will have better tools and better languages, 01:20:38.020 |
better features everywhere that then can service 01:20:46.300 |
eat more of the ecosystem because it can scale up, 01:20:56.380 |
I think when you look at Sci-5 tackling silicon 01:21:07.500 |
And that means that you get much more battery life, 01:21:09.780 |
you get better tuned solutions for your IoT thingy. 01:21:18.220 |
you get the ability to have faster time to market, 01:21:32.420 |
and if you do, how much customization per toaster is there? 01:21:38.820 |
Do all toasters in the world run the same silicon, 01:21:44.020 |
Or is it different companies have different design? 01:21:46.020 |
Like how much customization is possible here? 01:22:03.200 |
there's only so many chips that get made in a year 01:22:07.340 |
And so often what customers end up having to do 01:22:10.260 |
is they end up having to pick up a chip that exists 01:22:16.540 |
And the reason for that is they don't have the volume 01:22:18.340 |
of the iPhone, they can't afford to build a custom chip. 01:22:21.700 |
However, what that means is they're now buying 01:22:23.820 |
an off the shelf chip that isn't really good, 01:22:30.060 |
because they're buying silicon that they're not using. 01:22:33.500 |
Well, if you now reduce the cost of designing the chip, 01:22:37.780 |
And the more you reduce it, the easier it is to design chips. 01:22:44.300 |
and we get more AI accelerators, we get more other things, 01:22:46.740 |
we get more standards to talk to, we get 6G, right? 01:22:50.940 |
You get changes in the world that you wanna be able 01:22:54.780 |
There's more diversity in the cross product of features 01:22:57.220 |
that people want, and that drives differentiated chips 01:23:03.300 |
And so nobody really knows what the future looks like, 01:23:05.620 |
but I think that there's a lot of silicon in the future. 01:23:13.740 |
So do you agree with Dave Patterson and many folks 01:23:26.180 |
who's standing at the helm of the pirate ship 01:23:31.660 |
- Yeah, well, so I agree with what they're saying 01:23:39.740 |
So Jim would say, there's another 1000X left in physics 01:23:46.940 |
and make it faster and smaller and smaller geometries 01:23:59.960 |
That's not really what Moore's law is though. 01:24:17.060 |
And if you go look at the now quite old paper 01:24:21.900 |
Moore's law has a specific economic aspect to it. 01:24:33.340 |
so I can acknowledge both of those viewpoints. 01:24:56.740 |
Well, it was twice as fast at doing exactly the same thing. 01:25:01.220 |
Like literally the same program ran twice as fast. 01:25:03.820 |
You just wrote a check and waited a year, year and a half. 01:25:07.020 |
Well, so that's what a lot of people think about Moore's law. 01:25:11.820 |
And so what we're seeing instead is we're pushing, 01:25:15.260 |
we're pushing people to write software in different ways. 01:25:23.400 |
We're talking about C programmers having to use P threads 01:25:26.360 |
because they now have, you know, a hundred threads 01:25:29.120 |
or 50 cores in a machine or something like that. 01:25:31.960 |
You're not talking about machine learning accelerators. 01:25:35.080 |
And when you look at these kinds of use cases, 01:25:42.640 |
that utilize the Silicon in new ways for sure. 01:25:45.760 |
But you're also gonna change the programming model. 01:25:59.820 |
The C programming language is designed for CPUs. 01:26:10.540 |
with a different set of tools, a different world, 01:26:18.440 |
We can have one world that scales in a much better way. 01:26:22.480 |
I think most programming languages are designed 01:26:24.720 |
for CPUs for a single core, even just in their spirit, 01:26:30.480 |
So what does it look like for a programming language 01:26:34.160 |
to have parallelization or massive parallelization 01:26:50.020 |
they're what's called a high-level synthesis language. 01:27:01.860 |
Like you've got, you're like laying down transistors. 01:27:08.380 |
And so you're not saying run this transistor, 01:27:13.180 |
like your neurons are always just doing something. 01:27:20.200 |
And so when you design a chip or when you design a CPU, 01:27:24.540 |
when you design, when you're laying down the transistors, 01:27:50.640 |
And so having that as the domain that you program towards 01:27:55.640 |
makes it so that by default, you get parallel systems. 01:28:00.320 |
CUDA is a point halfway in the space where in CUDA, 01:28:05.940 |
it feels like you're writing a scalar program. 01:28:08.100 |
So you're like, you have ifs, you have for loops, 01:28:10.000 |
stuff like this, you're just writing normal code. 01:28:12.600 |
But what happens outside of that in your driver 01:28:20.560 |
but it has pulled it out of the programming model. 01:28:23.060 |
And so now you as a programmer are working in a simpler 01:28:33.760 |
You know, if we think about GPUs, but also ASICs, 01:28:43.680 |
Is, you know, how do you design for these features 01:28:46.720 |
to be able to program, make it a first class citizen 01:28:53.080 |
to be able to do machine learning on current hardware, 01:28:56.640 |
but also future hardware like TPUs and all kinds of ASICs 01:29:00.600 |
that I'm sure will be popping up more and more. 01:29:02.200 |
- Yeah, well, so a lot of this comes down to this whole idea 01:29:05.360 |
of having the nuts and bolts underneath the covers 01:29:10.400 |
you need, you know, MLIR, XLA, or one of these compilers 01:29:19.320 |
you need to figure out how to lay down the transistors 01:29:21.520 |
and how to organize it and how to set up clocking 01:29:23.280 |
and like all the domain problems that you get with circuits. 01:29:26.280 |
Then you have to decide how to explain it to a human. 01:29:31.840 |
And if you do it right, that's a library problem, 01:29:36.440 |
And that works if you have a library or a language 01:29:42.120 |
that feel native in the language by implementing libraries, 01:29:45.840 |
because then you can innovate in programming models 01:29:51.200 |
And like you have to invent new code formatting tools 01:29:54.880 |
and like all the other things that languages come with. 01:29:59.920 |
And so if you look at the space, the interesting thing, 01:30:24.080 |
And that comes into this whole design question 01:30:35.520 |
how do you make it so that people feel productive? 01:30:42.440 |
And in this world, I think that not a lot of effort 01:30:48.080 |
and thinking about the layering and other pieces. 01:30:53.520 |
you've written the Swift concurrency manifest. 01:31:04.080 |
of each of the five parts you've written about? 01:31:10.920 |
So in the Swift community, we have this problem, 01:31:21.440 |
you can understand the details at a very fine-grained level 01:31:30.800 |
that is a big arc, but you're tackling it in small pieces, 01:31:42.120 |
the first small step, what terminology do you use? 01:31:50.080 |
And so what a manifesto in the Swift community does 01:31:53.920 |
let's step back from the details of everything. 01:31:56.640 |
Let's paint a broad picture to talk about how, 01:32:05.280 |
so that then we can zero in on the individual steps 01:32:07.400 |
and make sure that we're making good progress. 01:32:18.660 |
And it starts with some fairly simple things, 01:32:26.720 |
or multiple different threads that are communicating, 01:32:30.800 |
And so you need things to be able to run separately 01:32:45.400 |
And so that's what I think is very likely in Swift. 01:32:48.220 |
But as you start building this tower of abstractions, 01:32:53.640 |
You then reach into the, how do you get memory safety? 01:32:58.360 |
You want debuggability and sanity for developers. 01:33:01.680 |
And how do you get that memory safety into the language? 01:33:11.920 |
when two different threads or Go routines or whatever 01:33:24.500 |
And so there's tools, there's a whole ecosystem 01:33:28.320 |
But it's a huge problem when you're writing concurrent code. 01:33:31.040 |
And so with Swift, this whole value semantics thing 01:33:34.160 |
is really powerful there because it turns out 01:33:40.680 |
And so you get a lot of safety just out of the box, 01:33:47.040 |
When you start building up to the next level up 01:33:50.520 |
you have to talk about what is the programmer model? 01:33:54.240 |
So a developer that's trying to build a program 01:33:56.760 |
think about this and it proposes a really old model 01:34:08.120 |
So you write something that feels like it's one programming, 01:34:13.200 |
and then it communicates asynchronously with other things. 01:34:16.720 |
And so making that expressive and natural feel good 01:34:20.840 |
be the first thing you reach for and being safe by default 01:34:23.480 |
is a big part of the design of that proposal. 01:34:28.680 |
well, these things that communicate asynchronously, 01:34:38.240 |
These things should be able to be in different processes 01:34:45.680 |
And so now you have a very nice gradual transition 01:34:51.760 |
And of course, when you start talking about the big future, 01:34:56.980 |
but accelerators are things you talk to asynchronously 01:35:09.400 |
- So, and how much do you wanna make that explicit, 01:35:19.240 |
So when you're designing any of these kinds of features 01:35:25.320 |
you have this really hard trade-off you have to make, 01:35:34.720 |
What do you do when the default case is the wrong case? 01:35:51.000 |
and then you jump, so let's pick like Logo, okay? 01:36:04.080 |
Well, you have to go switch to a different world 01:36:11.360 |
With Python, you would say like concurrency, right? 01:36:19.480 |
And so if you start writing a large-scale application 01:36:22.600 |
in Python and then suddenly you need concurrency, 01:36:25.140 |
you're kind of stuck with a series of bad trade-offs, right? 01:36:32.240 |
voiced all the complexity on the user all at once, right? 01:36:38.800 |
And so what I prefer is building a simple model 01:36:43.480 |
that you can explain that then has an escape hatch. 01:36:53.960 |
like by default, if you use all the standard things, 01:36:56.400 |
it's memory safe, you're not gonna shoot your foot off. 01:36:58.640 |
But if you wanna get a C-level pointer to something, 01:37:17.360 |
- So in the case of the proposal, it is the human's job. 01:37:20.960 |
So they decide how to architect their application. 01:37:24.200 |
And then the runtime and the compiler is very predictable. 01:37:32.920 |
including on Fortran for auto-parallelizing compilers. 01:37:40.160 |
So as a compiler person, I can rag on compiler people. 01:37:52.680 |
Okay, application, and so it does pattern matching. 01:38:06.100 |
and turn it into a structure of arrays or something, 01:38:16.560 |
Well, and it's this promise of build with my compiler 01:38:27.400 |
Wow, it's so much faster than the other compiler. 01:38:29.480 |
Then you go and you add a feature to your program 01:38:32.680 |
And suddenly you got a 10X loss in performance. 01:38:41.960 |
whatever analysis it was doing just got defeated 01:38:43.920 |
because you didn't inline a function or something, right? 01:38:48.200 |
As a user, you don't know, you don't wanna know. 01:38:52.780 |
You don't wanna know how the memory hierarchy works. 01:38:59.840 |
But then the magic is lost as soon as you did something 01:39:13.580 |
Well, this is the problem with unpredictable performance. 01:39:23.760 |
architectural patterns for being able to lay out your code, 01:39:28.320 |
makes it really simple so you can explain it. 01:39:30.120 |
And then if you wanna scale out in different ways, 01:39:36.520 |
- So in your sense, the intuition is for a compiler, 01:39:39.400 |
it's too hard to do automated parallelization. 01:39:42.520 |
Like, you know, 'cause the compilers do stuff automatically 01:39:47.520 |
that's incredibly impressive for other things. 01:39:56.220 |
So there's many different kinds of compilers. 01:40:04.940 |
parallelizing that and reasoning about all the pointers 01:40:07.100 |
and stuff like that is a very difficult problem. 01:40:12.220 |
so there's this cool thing called machine learning, right? 01:40:19.420 |
solving cat detectors and other things like that, 01:40:29.380 |
that has raised the levels of abstraction high enough 01:40:33.160 |
that suddenly you can have auto-parallelizing compilers. 01:40:51.420 |
that's parallelizable for you, parallelized for you. 01:40:54.160 |
- And if you think about it, that's pretty cool. 01:40:59.740 |
as a way of being able to exploit more parallelism. 01:41:05.380 |
That didn't come out of the programming language nerds, 01:41:14.020 |
by the community of people focusing on machine learning. 01:41:16.860 |
And it's an incredibly powerful abstraction layer 01:41:19.860 |
that enables the compiler people to go and exploit that. 01:41:22.780 |
And now you can drive supercomputers from Python. 01:41:32.260 |
I forget to admire the beauty and power of that. 01:41:38.500 |
like what does it take to run a neural network fast? 01:41:46.900 |
you said like it's amazing that that's a thing, 01:41:58.620 |
So there's a lot of work left to be done there. 01:42:22.940 |
Well, cool, like setting up a linear sequence of layers 01:42:39.100 |
And then you get to the next level down of saying like, 01:42:41.860 |
how do I get the peak performance out of my TPU 01:42:54.540 |
and a lot of really smart people working on it. 01:43:02.940 |
- So how much innovation is there on the lower level? 01:43:09.780 |
or redesigning concurrently compilers with that hardware. 01:43:20.540 |
in the inference, in the training of neural networks, 01:43:24.620 |
in just all of that, where is that gonna come from? 01:43:27.500 |
- Sure, you get scalability, you have different things. 01:43:28.900 |
And so you get Jim Keller shrinking process technology, 01:43:33.620 |
you get three nanometer instead of five or seven 01:43:38.100 |
And so that marches forward and that provides improvements. 01:44:02.340 |
how you scale out, how you have fast interconnects 01:44:06.060 |
You then get system level programming models. 01:44:08.780 |
So now that you have all this hardware, how do you utilize it? 01:44:14.380 |
Instead of training in a ResNet-50 in a week, 01:44:39.140 |
But if you were to force to bet all your money 01:44:46.300 |
Unfortunately, we have people working on all this. 01:44:52.260 |
- So, I mean, you know, OpenAI did this little paper 01:44:56.180 |
showing the algorithmic improvement you can get 01:45:00.940 |
I haven't quite seen the same kind of analysis 01:45:21.420 |
And it, you know, becomes reality in a sense. 01:45:28.900 |
when you, when Chris Lattner on a silly little podcast 01:45:33.620 |
makes, bets all his money on a particular thing, 01:45:47.540 |
most of the computing industry really, really focused 01:46:01.300 |
I mean, compilers improved significantly also. 01:46:13.260 |
There's another joke, another law in compilers, 01:46:15.820 |
which is called, I think it's called Probstein's law, 01:46:54.220 |
how do I generate a very specific error message 01:47:10.340 |
how do you expand computing to all these kinds of devices? 01:47:15.340 |
Do you see this world where just everything's 01:47:47.420 |
like, what's the architecture of that computer, 01:48:05.300 |
- I think it comes down to the right tool for the job. 01:48:19.740 |
quantum systems are the bottom of the pile of turtles 01:48:37.900 |
- Right, so if we really are living in a simulation, 01:48:52.660 |
is that you don't have to run the whole thing, 01:48:54.220 |
that, you know, we humans are cognitively very limited. 01:49:24.980 |
- And then, thank you for considering the possibility. 01:49:38.900 |
as we create these higher and higher fidelity systems. 01:49:43.340 |
But I do wanna ask on the quantum computer side, 01:49:52.060 |
none of that includes quantum computers, right? 01:49:56.060 |
- So have you ever thought about what, you know, 01:50:05.420 |
of quantum computers looks like, of compilers, 01:50:15.540 |
I will have to find an excuse to get involved, 01:50:23.420 |
of the timing of one to be involved, is it not yet? 01:50:33.700 |
figure out what the truth in the situation is, 01:50:35.540 |
try to figure out what the unifying theory is, 01:50:44.860 |
and lots of people have bashed their heads against it. 01:50:47.060 |
I don't know that quantum computers are mature enough 01:50:49.300 |
and accessible enough to be figured out yet, right? 01:50:53.740 |
And I think the open question with quantum computers is, 01:51:04.100 |
the economic cost of, like, having one of these things 01:51:13.980 |
the world will only need seven computers, right? 01:51:18.220 |
Well, and part of that was that people hadn't figured out 01:51:24.340 |
And this comes back to, how do we make the world better, 01:51:27.620 |
either economically or making somebody's life better 01:51:29.900 |
or, like, solving a problem that wasn't solved before, 01:51:33.140 |
And I think that just we're a little bit too early 01:51:36.860 |
because it's still, like, literally a science project, 01:52:00.180 |
and then suddenly it had its breakout moment, 01:52:07.580 |
That's what drove the economic applications of it. 01:52:10.180 |
That's what drove the technology to go faster 01:52:13.420 |
because you now have more minds thrown at the problem. 01:52:15.940 |
This is what caused a serious knee in deep learning 01:52:22.100 |
And so I think that's what quantum needs to go through. 01:52:25.540 |
And so right now it's in that formidable finding itself, 01:52:32.700 |
- And then it has to figure out the application 01:52:40.860 |
I think it's just 10 years away, something like that. 01:53:03.940 |
It's kind of like if we just step back and zoom out 01:53:13.780 |
that may, if I look at the way Silicon Valley folks 01:53:17.020 |
are talking about it, the way MIT's talking about it, 01:53:34.500 |
I mean, from Sci-Fi to Google to just all the places 01:53:44.260 |
What do you think is, how is this whole place gonna change? 01:54:05.020 |
It's a normalizer that I think will help communities 01:54:09.060 |
of people that have traditionally been underrepresented 01:54:12.580 |
because now you're taking, in some cases, a face-off 01:54:16.340 |
'cause you don't have to have a camera going, right? 01:54:19.980 |
without physical appearance being part of the dynamic, 01:54:24.500 |
You're taking remote employees that have already been remote 01:54:27.020 |
and you're saying you're now on the same level 01:54:39.300 |
You've got, you're forcing people to think asynchronously 01:54:49.380 |
forces people to find new ways to solve those problems. 01:54:52.740 |
And I think that that leads to more inclusive behavior, 01:54:56.740 |
On the other hand, it's also, it just sucks, right? 01:55:08.700 |
like on a daily basis and collaborating with them? 01:55:13.060 |
I mean, everything, this whole situation is terrible. 01:55:17.580 |
I think that most humans like working physically with humans. 01:55:22.940 |
I think this is something that not everybody, 01:55:27.060 |
And I think that we get something out of that 01:55:29.180 |
that is very hard to express, at least for me. 01:55:36.780 |
when you get through that time of adaptation, 01:55:38.980 |
you get out of March and April and you get into December 01:55:43.100 |
and you get into next March, if it's not changed, right? 01:55:47.740 |
- Well, you think about that and you think about 01:55:49.540 |
what is the nature of work and how do we adapt? 01:55:52.620 |
And humans are very adaptable species, right? 01:56:09.820 |
Well, there's a high incentive to be physically located 01:56:21.020 |
in terms of like, you will be there for the meeting, right? 01:56:33.180 |
- Do you have friends or do you hear of people moving? 01:56:45.580 |
living in a small apartment and like, we're going insane. 01:56:50.460 |
Right, and they're in tech, husband works for Google. 01:56:54.260 |
So first of all, friends of mine are in the process of, 01:57:00.580 |
The thing that represents their passion, their dream. 01:57:05.300 |
but it can be large businesses like people that run gyms. 01:57:07.900 |
- Oh, restaurants, like tons of things, yeah. 01:57:10.820 |
- But also, people like look at themselves in the mirror 01:57:17.580 |
For some reason, they haven't done it until COVID. 01:57:22.060 |
and that results often in moving or leaving the company 01:57:36.780 |
I mean, we're definitely gonna see it at a higher frequency 01:57:38.500 |
than we did before, just because I think what you're trying 01:57:41.900 |
to say is there are decisions that you make yourself 01:57:45.820 |
and big life decisions that you make yourself. 01:57:47.860 |
And like, I'm gonna like quit my job and start a new thing. 01:57:50.440 |
There's also decisions that get made for you. 01:57:55.860 |
And that's not a decision that you think about, 01:58:00.880 |
And so I think that those you're forced to act 01:58:05.140 |
global pandemic comes and wipes out the economy 01:58:10.400 |
I think that does lead to more reflection, right? 01:58:12.340 |
Because you're less anchored on what you have 01:58:17.580 |
versus what do I have to gain, AB comparison. 01:58:36.600 |
If you can afford to do that, is this time to like, 01:58:39.000 |
you know, literally move in with the parents, right? 01:58:41.000 |
I mean, all these things that were not normative before 01:58:43.880 |
suddenly become, I think, very, the value systems change. 01:58:50.800 |
in the short term at least, because it leads to, you know, 01:59:10.120 |
- What do you think about all the social chaos 01:59:17.520 |
- Let me ask you, you think it's all gonna be okay? 01:59:30.360 |
I don't think all the humans are gonna kill all the humans. 01:59:44.760 |
to be willing to do things that are uncomfortable. 01:59:51.760 |
is a pretty unoptimal place to live in for a lot of people. 02:00:06.340 |
it's really kind of igniting some of that debate 02:00:07.840 |
that should have happened a long time ago, right? 02:00:10.120 |
I mean, I think that we'll see more progress. 02:00:14.240 |
and wouldn't it be great if politics moved faster 02:00:15.760 |
because there's all these problems in the world 02:00:22.320 |
And so if you're talking about conservative people, 02:00:25.040 |
particularly if they have heavy burdens on their shoulders 02:00:27.480 |
'cause they represent literally thousands of people, 02:00:36.240 |
The global pandemic will probably lead to some change. 02:00:50.120 |
- Well, let me know if you've observed this as well. 02:00:56.160 |
I'm guessing it might be prevalent in other places, 02:01:30.200 |
- I think there's an inherent thing in humanity 02:01:39.640 |
And so what's happening, at least in some part, 02:01:43.160 |
is that with the internet and with online communication, 02:01:48.560 |
Right, and so we're having some of the social ties 02:01:53.080 |
of like my town versus your town's football team, 02:01:56.480 |
right, turn into much larger and yet shallower problems. 02:02:08.080 |
kind of really, really feed into this machine. 02:02:12.480 |
- Yeah, I mean, the reason I think about that, 02:02:14.760 |
I mentioned to you this offline a little bit, 02:02:17.520 |
but I have a few difficult conversations scheduled, 02:02:27.320 |
difficult personalities that went through some stuff. 02:02:49.800 |
irrational, over-exaggerated pile on on his comments 02:02:57.160 |
about the fact that if there's bias in the data, 02:03:06.600 |
because he said he trivialized the problem of bias. 02:03:10.080 |
Like it's a lot more than just bias in the data. 02:03:32.920 |
One nice thing about like a podcast long form conversation 02:03:44.560 |
you can still show that you're a good human being 02:03:51.040 |
Well, how do you get to that point where people can turn? 02:03:53.920 |
They can learn, they can listen, they can think, 02:04:02.600 |
- And I don't think that progress really comes from that. 02:04:06.720 |
Right, and I don't think that one should expect that. 02:04:12.360 |
individual circles and the us versus them thing. 02:04:21.000 |
like the people that bother me most on Twitter 02:04:38.000 |
we should teach each other is to be sort of empathetic. 02:04:44.760 |
particularly on like Twitter or the internet or an email, 02:04:47.800 |
is that sometimes people just have a bad day. 02:04:53.200 |
I've been in the situation where it's like between meetings, 02:04:57.360 |
'cause I wanna like help get something unblocked. 02:05:20.920 |
And this is just an aspect of working together as humans. 02:05:23.400 |
And I have a lot of optimism in the long-term, 02:05:26.200 |
the very long-term about what we as humanity can do, 02:05:29.120 |
but I think that's gonna be, it's just always a rough ride. 02:05:38.120 |
And I think that it's really bad in the short-term, 02:05:44.340 |
- Yeah, it's painful in the short-term though. 02:05:48.040 |
- Well, yeah, I mean, people are out of jobs. 02:05:49.760 |
Like some people can't eat, like it's horrible. 02:05:58.560 |
I mean, the real question is when you look back 10 years, 02:06:03.560 |
how do we evaluate the decisions that are being made 02:06:06.860 |
I think that's really the way you can frame that 02:06:10.640 |
And you say, you know, you integrate across all 02:06:12.840 |
the short-term horribleness that's happening. 02:06:15.440 |
And you look at what that means and is the, you know, 02:06:18.600 |
improvement across the world or the regression 02:06:20.360 |
across the world significant enough to make it a good 02:06:29.480 |
I mean, one of the big problems for me right now 02:06:32.060 |
is I'm reading the rise and fall of the third Reich. 02:06:37.400 |
- So it's everything is just, I just see parallels 02:06:40.880 |
and it means it's, you have to be really careful 02:06:45.360 |
But just the thing that worries me the most is the pain 02:06:57.960 |
And then just being disrespected in some kind of way, 02:07:02.600 |
which the German people were really disrespected 02:07:05.160 |
by most of the world, like in a way that's over the top, 02:07:10.160 |
that something can build up and then all you need 02:07:13.460 |
is a charismatic leader to go either positive or negative 02:07:18.400 |
and both work as long as they're charismatic. 02:07:26.360 |
and what they do with it could be good or bad. 02:07:28.720 |
- And so it's a good way to think about times now, 02:07:32.680 |
like on an individual level, what we decide to do 02:07:35.760 |
is when history is written, 30 years from now, 02:07:39.560 |
what happened in 2020, probably history's gonna remember 02:07:46.800 |
And it's like up to us to write it, so it's good. 02:08:00.000 |
You make a decision where you're predicting the future 02:08:02.620 |
based on what you've seen in the recent past. 02:08:07.320 |
then of course you expect it to rain today too, right? 02:08:10.080 |
On the other hand, the world changes all the time. 02:08:14.240 |
- Incessantly, like for better and for worse. 02:08:20.880 |
what is the inflection point that led to a change? 02:08:24.360 |
Like what is the catalyst that led to that explosion 02:08:30.240 |
like you can kind of work your way backwards from that. 02:08:33.240 |
And maybe if you pull together the right people 02:08:46.400 |
And often it's a combination of multiple factors, 02:08:54.960 |
- I'm a long-term optimist on pretty much everything. 02:08:59.360 |
we can look to all the negative things that humanity has, 02:09:02.220 |
all the pettiness and all the self-servingness 02:09:09.760 |
The biases, just humans can be very horrible. 02:09:13.400 |
But on the other hand, we're capable of amazing things. 02:09:23.280 |
And even across decades, we've come a long ways 02:09:34.920 |
It's kind of scary to think what's gonna happen 02:09:41.680 |
that the kind of technology is gonna come out 02:09:49.120 |
It'll be like kids these days with their virtual reality 02:10:09.680 |
the machine learning world has been kind of inspired, 02:10:18.760 |
I thought it'd be cool to get your opinion on it. 02:10:21.800 |
What's your thoughts on this exciting world of, 02:10:33.000 |
and take many, many computers, not just to train, 02:10:40.440 |
Well, I mean, it depends on what you're speaking to there, 02:10:45.280 |
a pretty well understood maximum deep learning 02:10:55.800 |
And so on one hand, GPT-3 was not that surprising. 02:10:59.740 |
On the other hand, a tremendous amount of engineering 02:11:09.000 |
there was a very provocative blog post from OpenAI 02:11:11.360 |
talking about, you know, we're not gonna release it 02:11:20.120 |
I think that we need to look at how technology is applied 02:11:23.240 |
and, you know, well-meaning tools can be applied 02:11:26.840 |
and they can have very profound impact on that. 02:11:29.320 |
I think that GPT-3 is a huge technical achievement. 02:11:35.760 |
Will it probably be bigger, more expensive to train? 02:11:48.720 |
Is there some technical challenges that are interesting 02:11:52.960 |
that you're hopeful about exploring in terms of, 02:11:55.880 |
you know, a system that, like a piece of code that, 02:12:11.600 |
Is there some hope that we can make that happen? 02:12:15.320 |
- Yeah, well, I mean, today you can write a check 02:12:21.800 |
and do really interesting large-scale training 02:12:23.960 |
and inference and things like that in Google Cloud, 02:12:27.440 |
And so I don't think it's a question about scale, 02:12:33.200 |
And when I look at the transformer series of architectures 02:12:39.880 |
because they're actually very simple designs. 02:12:47.440 |
And so they don't really reflect like human brains, right? 02:12:51.680 |
But they're really good at learning language models 02:13:05.120 |
have more parameters, more data, more things, 02:13:17.680 |
instead of just like making it a constant time bigger, 02:13:20.600 |
how do you get like an algorithmic improvement 02:13:30.320 |
human brain sparse, all these networks are dense, 02:13:33.600 |
the connectivity patterns can be very different. 02:13:41.560 |
But I think that could lead to big breakthroughs. 02:13:46.160 |
one of the things that Jeff Dean likes to talk about 02:13:51.680 |
of having a sparsely gated mixture of experts 02:13:59.480 |
and are really good at certain kinds of tasks. 02:14:02.080 |
And so you have this distributed across a cluster. 02:14:06.400 |
that end up being kind of locally specialized 02:14:18.000 |
of the entire cluster by having specialization within it. 02:14:30.000 |
if you can think of data selection as a kind of programming. 02:14:36.680 |
- I mean, essentially, if you look at like Karpathy 02:14:44.880 |
So let me try to summarize Andre's position really quick 02:15:03.720 |
So his basic premise is that software is suboptimal. 02:15:14.480 |
and other learning-based techniques are really great 02:15:16.360 |
because you can solve problems in more structured ways 02:15:19.120 |
with less like ad hoc code that people write out 02:15:23.040 |
and don't write test cases for in some cases. 02:15:25.160 |
And so they don't even know if it works in the first place. 02:15:27.800 |
And so if you start replacing systems of imperative code 02:15:32.320 |
with deep learning models, then you get a better result. 02:15:47.920 |
swapping over more and more and more parts of the code 02:15:56.640 |
And if you're predisposed to liking machine learning, 02:15:59.240 |
then I think that that's definitely a good thing. 02:16:01.760 |
I think this is also good for accessibility in many ways 02:16:04.700 |
because certain people are not gonna write C code 02:16:07.720 |
And so having a data-driven approach to do this kind of 02:16:12.720 |
On the other hand, there are huge trade-offs. 02:16:14.200 |
And it's not clear to me that software 2.0 is the answer. 02:16:19.200 |
And probably Andre wouldn't argue that it's the answer 02:16:22.960 |
But I look at machine learning as not a replacement 02:16:30.120 |
And so programming paradigms, when you look across domains, 02:16:35.140 |
is structured programming where you go from go-tos 02:16:38.480 |
to if-then-else, or functional programming from Lisp. 02:16:42.280 |
And you start talking about higher order functions 02:16:45.880 |
Or you talk about object-oriented programming. 02:16:48.040 |
You're talking about encapsulation, subclassing, 02:16:54.480 |
through specialization and different type instantiations. 02:16:59.480 |
When you start talking about differentiable programming, 02:17:01.720 |
something that I am very excited about in the context 02:17:04.960 |
of machine learning, talking about taking functions 02:17:11.120 |
Like that's a programming paradigm that's very useful 02:17:16.220 |
Machine learning is amazing at solving certain classes 02:17:21.940 |
or even a language translation system by writing C code. 02:17:25.920 |
That's not a very productive way to do things anymore. 02:17:28.920 |
And so machine learning is absolutely the right way 02:17:32.320 |
In fact, I would say that learned models are really 02:17:35.000 |
one of the best ways to work with the human world 02:17:38.240 |
And so anytime you're talking about sensory input 02:17:40.320 |
of different modalities, anytime that you're talking 02:17:42.320 |
about generating things in a way that makes sense 02:17:45.120 |
to a human, I think that learned models are really, 02:17:52.660 |
And so this is a very powerful paradigm for solving 02:17:57.120 |
But on the other hand, imperative code is too. 02:17:59.680 |
You're not gonna write a bootloader for your computer 02:18:04.060 |
Deep learning models are very hardware intensive. 02:18:07.040 |
They're very energy intensive because you have a lot 02:18:09.900 |
of parameters and you can provably implement any function 02:18:14.500 |
with a learned model, like this has been shown, 02:18:19.900 |
And so if you're talking about caring about a few orders 02:18:24.080 |
then it's useful to have other tools in the toolbox. 02:18:29.900 |
All the problems of dealing with data and bias in data, 02:18:35.100 |
And one of the great things that Andre is arguing towards, 02:18:39.320 |
which I completely agree with him, is that when you start 02:18:43.100 |
implementing things with deep learning, you need to learn 02:18:50.020 |
how do you validate all these things and building systems 02:18:53.060 |
around that so that you're not just saying like, 02:18:59.820 |
What happens when I make a classification that's wrong 02:19:07.340 |
- Yeah, but at the same time, the bootloader that works 02:19:10.140 |
for us humans looks awfully a lot like a neural network. 02:19:15.740 |
So it's messy and you can cut out different parts 02:19:20.020 |
There's a lot of this neuroplasticity work that shows 02:19:26.900 |
how much of the world's programming could be replaced 02:19:33.340 |
it's provably true that you could replace all of it. 02:19:36.600 |
- Right, so then it's a question of trade-offs. 02:19:39.260 |
- Right, so anything that's a function, you can. 02:19:47.740 |
What kind of trade-offs in terms of maintenance? 02:19:53.260 |
I think one of the reasons that I'm most interested 02:19:55.100 |
in machine learning as a programming paradigm is that one 02:19:59.160 |
of the things that we've seen across computing in general 02:20:01.520 |
is that being laser focused on one paradigm often puts you 02:20:08.460 |
And so you look at object-oriented programming, 02:20:13.500 |
And people forgot about functional programming, 02:20:20.020 |
if you mix functional and object-oriented and structure, 02:20:28.420 |
And so the question there is how do you get the best way 02:20:32.620 |
It's not about whose tribe should win, right? 02:20:35.980 |
It's not about, you know, that shouldn't be the question. 02:20:40.020 |
so that people can solve those problems the fastest 02:20:44.300 |
to build good libraries and they can solve these problems. 02:20:47.140 |
And when you look at that, that's like, you know, 02:20:52.620 |
Reinforcement learning, often you have to have 02:20:55.060 |
the integration of a learned model combined with your Atari 02:20:59.380 |
or whatever the other scenario it is that you're working in. 02:21:07.620 |
And so now it's not just about that one paradigm. 02:21:11.900 |
It's about integrating that with all the other systems 02:21:14.540 |
that you have, including often legacy systems 02:21:18.100 |
And so to me, I think that the interesting thing to say 02:21:21.460 |
is like, how do you get the best out of this domain 02:21:23.820 |
and how do you enable people to achieve things 02:21:31.300 |
- Right, but, okay, this is a crazy question, 02:21:38.820 |
but do you think it's possible that these language models 02:21:47.340 |
software 2.0 could replace some aspect of compilation, 02:22:11.380 |
Would I be able to generate Swift code, for example? 02:22:14.260 |
Do you think that could do something interesting 02:22:17.060 |
- So GPT-3 is probably not trained on the right corpus. 02:22:21.140 |
So it probably has the ability to generate some Swift. 02:22:25.220 |
It's probably not gonna generate a large enough body of Swift 02:22:27.620 |
to be useful, but like taking it a next step further, 02:22:30.580 |
like if you had the goal of training something like GPT-3 02:22:33.980 |
and you wanted to train it to generate source code, right? 02:22:39.780 |
Now the question is, how do you express the intent 02:22:44.300 |
You can definitely like write scaffolding of code 02:22:53.700 |
but there's an unsolved question, at least unsolved to me, 02:22:56.940 |
which is how do I express the intent of what to fill in? 02:22:59.740 |
Right, and kind of what you'd really want to have, 02:23:03.180 |
and I don't know that these models are up to the task, 02:23:08.300 |
here's a scaffolding and here are the assertions at the end 02:23:14.060 |
And so you want a generative model on the one hand, yes. 02:23:20.500 |
some reinforcement learning system or something 02:23:24.700 |
I need to hill climb towards something that is more correct. 02:23:29.780 |
- So it would generate not only a bunch of the code, 02:23:37.100 |
- I think the humans would generate the test, right? 02:23:39.700 |
- The test would be-- - But it would be fascinating-- 02:23:44.220 |
- 'Cause you have to express to the model what you want to, 02:23:51.300 |
You want a story about four horned unicorns or something. 02:23:54.740 |
- Well, okay, so exactly, but that's human requirements. 02:24:06.260 |
like that are more high fidelity that check for correctness. 02:24:29.380 |
syntactically correct Swift code that's interesting, right? 02:24:33.100 |
I think GPT series of model architectures can do that. 02:24:37.580 |
But then you need the ability to add the requirements. 02:24:52.820 |
you can say, I mean, there's interface stuff, 02:24:58.380 |
it can generate basic for loops that give you like-- 02:25:09.380 |
How do I say I want a webpage that's got a shopping cart 02:25:16.140 |
I don't know if you've seen these demonstrations, 02:25:29.020 |
So you have to prompt it with similar kinds of mappings. 02:25:36.580 |
They probably, but the fact that you can do that once, 02:25:49.980 |
the idea is the intent is specified in natural language. 02:25:56.920 |
- So the question is the correctness of that. 02:25:59.880 |
Like visually you can check, oh, the button is red. 02:26:12.120 |
this goes into like NP completeness kind of things. 02:26:15.480 |
Like I want to know that this code is correct. 02:26:27.880 |
should the system also try to generate checks 02:26:44.100 |
- There's a lot of pattern matching and filling in. 02:26:45.280 |
And kind of propagating patterns that have been seen before 02:26:48.480 |
into the future and into the generated result. 02:26:53.240 |
you kind of need to improving kind of things. 02:27:00.760 |
And see what the bright minds are thinking about right now. 02:27:18.980 |
Are we just pattern matching based on what we have? 02:27:24.280 |
So I think what the neural networks are missing, 02:27:29.820 |
is to be able to tell stories to itself about what it did. 02:27:34.900 |
I mean, you talk about network explainability, right? 02:27:38.260 |
And we give neural nets a hard time about this. 02:27:54.440 |
- Let me ask you about a few high-level questions, I guess. 02:28:07.000 |
ask for advice from successful people like you. 02:28:16.000 |
an undergraduate student or a high school student, 02:28:25.560 |
is there some words of wisdom you can give them? 02:28:35.400 |
that change is possible and that the world does change 02:28:42.680 |
And whether it be implementing a new programming language 02:28:50.200 |
moving the world forward in science and philosophy, 02:28:57.960 |
the work is hard for a whole bunch of different reasons, 02:29:06.920 |
And so you have to have the space in your life 02:29:23.280 |
Well, no, I mean, some people like suffering. 02:29:29.280 |
The secret to me is that you have to love what you're doing 02:29:45.440 |
because it's hard to know what you will love doing 02:30:05.680 |
because certain things will resonate with you 02:30:07.080 |
and you'll find out, wow, I'm really good at this. 02:30:10.040 |
Well, it's just because it works with the way your brain. 02:30:19.120 |
well, I think there's a bunch of cool stuff out there. 02:30:27.560 |
how did you just hook yourself in and stuck with it? 02:30:34.800 |
that a huge amount of it or most of it is luck, right? 02:30:40.860 |
So for me, I fell in love with computers early on 02:30:58.200 |
but also deciding that something that was hard 02:31:08.100 |
which is if you find something that you love doing, 02:31:10.400 |
that's also hard, if you invest yourself in it 02:31:15.000 |
then it will mean something, generally, right? 02:31:22.080 |
that can be, there's many things that can be, 02:31:38.040 |
but it's one of my, not enough people talk about this. 02:31:47.440 |
- Well, and self-doubt and imposter syndrome, 02:31:49.480 |
and these are all things that successful people 02:32:04.120 |
put yourself in a room with a bunch of people 02:32:07.040 |
that know way more about whatever you're talking about 02:32:13.080 |
Smart people love to teach, often, not always, but often. 02:32:16.840 |
And if you listen, if you're prepared to listen, 02:32:22.400 |
And I think that a lot of progress is made by people 02:32:25.400 |
who kind of hop between domains now and then, 02:32:28.040 |
because they bring a perspective into a field 02:32:34.760 |
if people have only been working in that field themselves. 02:32:38.320 |
- We mentioned that the universe is kind of like a compiler, 02:33:08.840 |
Here we are all biological things programmed to survive 02:33:21.440 |
and you just go until entropy takes over the world 02:33:24.160 |
and it takes over the universe and then you're done. 02:33:33.000 |
And so I prefer to bias towards the other way, 02:33:34.760 |
which is saying the universe has a lot of value. 02:33:41.800 |
And a lot of times part of that's having kids, 02:33:43.840 |
but also the relationships you build with other people. 02:33:46.940 |
And so the way I try to live my life is like, 02:34:05.040 |
how can it be in a domain that actually will matter? 02:34:11.680 |
okay, I'm doing a thing, I'm very familiar with it, 02:34:28.000 |
and jump into something I'm less comfortable with, 02:34:42.360 |
that first you're deep into imposter syndrome, 02:34:47.780 |
hey, well, there's actually a method to this. 02:34:57.240 |
about bringing different kinds of people together. 02:35:04.440 |
that are coming at things from different directions, 02:35:10.560 |
where you're like, oh, we've really cracked this. 02:35:16.760 |
where it adds value, other people can build on it, 02:35:26.480 |
do you think we'll ever create that in like an AGI system? 02:35:45.640 |
Well, so I mean, why are you being so speciest? 02:36:02.880 |
we have our objective function that we were optimized for. 02:36:14.560 |
just because we don't understand them, right? 02:36:20.080 |
that would be very premature to look at a new thing 02:36:24.160 |
through your own lens without fully understanding it. 02:36:29.400 |
because AI systems in the future will be listening to this. 02:36:36.060 |
You know, when Skynet kills everybody, please spare me. 02:36:44.560 |
will spend a lot of time worrying about this kind of stuff. 02:36:46.360 |
And I think that what we should be worrying about 02:36:49.880 |
And the thing that I'm most scared about with AGIs 02:37:08.320 |
And if we get into a mode of not having a personal challenge, 02:37:15.920 |
and seeing what they grow into and helping guide them, 02:37:18.840 |
whether it be your community that you're engaged in, 02:37:21.960 |
you're driving forward, whether it be your work 02:37:45.080 |
it could degrade into a very unpleasant world. 02:37:55.720 |
Unfortunately, we have pretty on the ground problems 02:37:58.680 |
And so I think we should be focused on that as well. 02:38:01.480 |
- Yeah, ultimately, just as you said, you're optimistic. 02:38:12.680 |
Right, so I mean, I'm not personally a very religious person, 02:38:20.440 |
Of course I go to church, because if God's real, 02:38:24.440 |
you know, I wanna be on the right side of that. 02:38:27.920 |
- And so, you know, that's a fair way to do it. 02:38:30.960 |
- Yeah, I mean, the same thing with nuclear deterrence, 02:38:35.600 |
all of, you know, global warming, all these things, 02:38:38.400 |
all these threats, natural, engineer, pandemics, 02:38:49.660 |
of all the possible ways we could destroy ourselves. 02:38:52.540 |
I think it's much better, or at least productive, 02:38:56.580 |
to be hopeful and to engineer defenses against these things, 02:39:04.820 |
see like a positive future and engineer that future. 02:39:07.940 |
- Yeah, well, and I think that's another thing 02:39:12.700 |
particularly if you're young and trying to figure out 02:39:14.540 |
what it is that you wanna be when you grow up, like I am. 02:39:19.820 |
The question then is, how do you wanna spend your time? 02:39:33.500 |
I'm going to go find out about the latest atrocity 02:39:36.540 |
and find out all the details of like the terrible thing 02:39:53.420 |
to being productive, learning, growing, experiencing, 02:39:58.420 |
you know, when the pandemic's over, going exploring, right? 02:40:03.620 |
And I think it leads to more optimism and happiness 02:40:08.660 |
You're building yourself, you're building your capabilities, 02:40:20.780 |
of a negative viewpoint, which you need to be aware 02:40:23.260 |
of what's happening because that's also important, 02:40:31.980 |
- Yeah, so what you're saying is people should focus 02:40:41.160 |
and be crowded out by the thousands of graduates popping 02:40:43.980 |
out of school that all want to do the same thing. 02:40:45.620 |
Or you could work in the place that people overpay you 02:40:48.580 |
because there's not enough smart people working in it. 02:40:51.260 |
And here at the end of Moore's law, according 02:40:53.780 |
to some people, actually the software is the hard part too. 02:40:57.140 |
- I mean, optimization is truly, truly beautiful. 02:41:02.300 |
And also on the YouTube side or education side, you know, 02:41:06.500 |
it'd be nice to have some material that shows the beauty 02:41:14.480 |
So that's a call for people to create that kind 02:41:18.920 |
Chris, you're one of my favorite people to talk to. 02:41:22.840 |
It's such a huge honor that you would waste your time 02:41:30.120 |
- The truth of it is you spent a lot of time talking to me 02:41:46.600 |
Neuro, which is a maker of functional gum and mints 02:41:51.440 |
Masterclass, which are online courses from world experts. 02:42:00.200 |
Please check out these sponsors in the description 02:42:02.360 |
to get a discount and to support this podcast. 02:42:06.120 |
If you enjoy this thing, subscribe on YouTube, 02:42:16.320 |
And now let me leave you with some words from Chris Latner. 02:42:19.080 |
So much of language design is about trade-offs 02:42:25.640 |
that really represent those different points. 02:42:28.560 |
Thank you for listening and hope to see you next time.