Guido van Rossum: Python and the Future of Programming | Lex Fridman Podcast #341
Chapters
0:00 Introduction
0:48 CPython
6:01 Code readability
10:22 Indentation
26:58 Bugs
38:26 Programming fads
53:37 Speed of Python 3.11
78:31 Type hinting
83:49 mypy
89:05 TypeScript vs JavaScript
105:05 Best IDE for Python
115:05 Parallelism
132:58 Global Interpreter Lock (GIL)
142:36 Python 4.0
154:53 Machine learning
164:35 Benevolent Dictator for Life (BDFL)
176:11 Advice for beginners
182:43 GitHub Copilot
186:10 Future of Python
00:00:07.080 |
that would necessitate the creation of the new 4.0? 00:00:12.080 |
Given the amount of pain and joy, suffering and triumph 00:00:23.020 |
The following is a conversation with Guido van Rossum, 00:00:29.880 |
He is the creator of the Python programming language 00:00:33.000 |
and is Python's emeritus BDFL, Benevolent Dictator for Life. 00:00:41.920 |
please check out our sponsors in the description. 00:00:44.320 |
And now, dear friends, here's Guido van Rossum. 00:00:52.480 |
In it, CPython claimed to be 10 to 60% faster. 00:01:00.560 |
CPython is the last Python implementation standing, 00:01:13.200 |
is implemented in another programming language called C? 00:01:16.920 |
What kind of audience do you have in mind here? 00:01:21.320 |
No, there's somebody on a boat that's into fishing 00:01:38.360 |
Silicon Valley programmer that's programmed in everything. 00:01:45.080 |
and knows the entire history of programming languages. 00:01:49.160 |
- I imagine that boat in the middle of the ocean. 00:01:52.240 |
- I'm gonna please the guy who knows how to fish first. 00:01:57.240 |
- He seems like the most useful in the middle of the ocean. 00:02:07.880 |
but he must have heard that inside his cell phone 00:02:20.400 |
It's zeros and ones, and then there's assembly, and then-- 00:02:24.120 |
- Oh yeah, we don't talk about these really low levels 00:02:30.120 |
I mean, when we're talking about human language, 00:02:39.640 |
when you have a Chinese person and they speak English, 00:02:44.000 |
this is a bit of a stereotype, they often don't know, 00:02:48.520 |
or they can't seem to make the difference well 00:03:02.440 |
that in Chinese there is not really a difference. 00:03:05.680 |
And it could be that there are regional variations 00:03:08.680 |
in how native Chinese speakers pronounce that one sound 00:03:13.680 |
that sounds like L to some of them, like R to others. 00:03:19.280 |
- So it's both the sounds you produce with your mouth 00:03:31.960 |
the letter zh, like Americans or English speakers 00:03:43.640 |
Yeah, so I'm, oh yes, okay, so we're not going 00:03:52.360 |
we're not going into the ones and zeros or machine language. 00:04:02.240 |
that sort of tells you how to do a certain thing, 00:04:08.040 |
Well, acquire a loaf of bread, cut it in slices, 00:04:22.480 |
I've heard that science teachers can actually 00:04:29.680 |
and trying to interpret their students' instructions 00:04:39.120 |
between natural languages and programming languages. 00:04:48.880 |
That's the dance of communication between humans. 00:04:54.280 |
- Well, for lawyers, ambiguity certainly is a feature. 00:05:03.880 |
is not much of a feature, but we work around it, of course. 00:05:11.200 |
So with context, the precision of the statement 00:05:22.440 |
the person doesn't try to compile that statement 00:05:24.720 |
and return an error saying, please define love. 00:05:30.000 |
that my wife and my son interpret it very differently. 00:05:36.560 |
the same three words. - But imprecisely still. 00:05:44.560 |
- Nevertheless, the context is already different 00:06:01.400 |
You go through in PEP 8, the style guide for Python code, 00:06:05.080 |
some ideas of what this language should look like, 00:06:11.400 |
And the big idea there is that code readability counts. 00:06:21.360 |
Because on the one hand, we always explain the concept 00:06:26.240 |
of programming language as computers need instructions 00:06:42.280 |
But what we've seen emerge during the development 00:06:46.680 |
of software starting in the, probably in the late 40s, 00:06:59.520 |
who sits alone in his lab writing brilliant code. 00:07:08.880 |
Even the mad scientist sitting alone in his lab 00:07:14.800 |
so that by the time he's done with his coding, 00:07:17.680 |
he still remembers what the first few lines he wrote mean. 00:07:22.080 |
So even the mad scientist coding alone in his lab 00:07:40.040 |
between a cookbook recipe and a computer program. 00:07:44.360 |
The cookbook recipe, the author of the cookbook 00:07:47.760 |
writes it once and then it's printed in 100,000 copies. 00:08:03.840 |
And so there, the goal of the cookbook author 00:08:08.840 |
is to make it clear to the human reader of the recipe, 00:08:45.800 |
is so complex that you don't get all of it right at once. 00:09:06.560 |
- That means broadly, it could be stupid little errors 00:09:19.680 |
to building something that does what you tell it to do, 00:09:26.480 |
- Yeah, it seems to work really well 99% of the time, 00:09:30.720 |
but does weird things 1% of the time on some edge cases. 00:09:44.120 |
But it's not just about the complexity of the program. 00:09:55.320 |
But you're in a group of people improving that recipe. 00:10:04.200 |
that he created a year ago and making it better. 00:10:13.880 |
he wants some decoration on his pie or icing. 00:10:22.680 |
So first of all, the thing that people first experience 00:10:30.120 |
but there's also like a spatial structure to it. 00:10:34.200 |
Can you explain the indentation style of Python 00:10:39.400 |
- Spaces are important for readability of any kind of text. 00:10:57.880 |
maybe you leave the spaces between the words, 00:11:01.320 |
When you're in the kitchen trying to figure out, 00:11:05.600 |
oh, what are the ingredients and what are the steps? 00:11:09.400 |
And where does this step end and the next step begin? 00:11:16.640 |
On the other hand, what a typical cookbook does 00:11:38.720 |
sort of write two sentences on how you have to cut the onion 00:11:46.960 |
small, medium, and in slices, or something like that. 00:11:58.040 |
We're talking to programmers with a metaphor of cooking. 00:12:01.680 |
But there is a strictness to the spacing that Python defines. 00:12:06.040 |
So there's some looser things, some stricter things, 00:12:14.840 |
It really defines what the language looks and feels like. 00:12:19.840 |
- Because indentation sort of taking a block of text 00:12:27.400 |
a smaller block of text that is indented further 00:12:31.000 |
as sort of a group, it's like you have a bulleted list 00:12:39.760 |
and inside some of the bullets are other bulleted lists. 00:12:45.440 |
If each bulleted list is indented several inches, 00:12:50.280 |
then at two levels deep, there's no space left on the page 00:12:58.160 |
On the other hand, if you don't indent at all, 00:13:01.440 |
you can't tell whether something is a top-level bullet 00:13:04.440 |
or a second-level bullet or a third-level bullet. 00:13:15.160 |
and the sort of the typical width of a computer screen 00:13:23.120 |
we came up with sort of four spaces as a compromise. 00:13:50.240 |
it's harder to, at a glance, understand the code 00:14:01.200 |
On the other hand, there are other programming languages 00:14:04.200 |
where the indentation is eight spaces or a whole tab stop 00:14:12.600 |
because you sort of, after three indent levels, 00:14:23.800 |
The code compiles even without any indentation. 00:14:29.000 |
indentation is a fundamental part of the language, right? 00:14:34.840 |
So you can code Python with two spaces per block 00:14:39.000 |
or six spaces or 12 if you really want to go wild. 00:14:44.000 |
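To make the point about indentation width concrete, here is a minimal sketch (the function names are invented for illustration and do not come from the conversation). Both functions are valid Python because each block is indented consistently, even though the widths differ; the last part shows the compiler rejecting a block whose lines do not line up.

```python
def two_space_block(n):
  # Two spaces per level: unusual, but accepted.
  if n > 0:
    return "positive"
  return "not positive"


def six_space_block(n):
      # Six spaces per level: also accepted.
      if n > 0:
            return "positive"
      return "not positive"


# A line that matches no enclosing indentation level is rejected.
bad_source = "if True:\n    x = 1\n  y = 2\n"
try:
    compile(bad_source, "<example>", "exec")
except IndentationError as exc:
    print("rejected:", exc.msg)
```

Four spaces remains the recommended width; the other widths are shown only to illustrate what the language itself permits.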
But sort of everything that belongs to the same block 00:15:28.640 |
- They'd suggest the proper indentation for you, 00:15:38.240 |
which is their notion of sort of begin an indented block. 00:15:46.240 |
and then it automatically indents four or eight spaces 00:15:55.880 |
in which you considered having braces in Python? 00:16:34.440 |
most programmers are familiar with multiple languages, 00:16:46.120 |
I don't know how that's written these days anymore, 00:16:48.160 |
but all the other languages, Java, Rust, C, C++, 00:16:55.440 |
are all using curly braces to sort of indicate blocks. 00:17:06.600 |
Do you still, as a radical renegade revolutionary, 00:17:15.960 |
Like, what, can you dig into it a little bit more, 00:17:29.440 |
So for Python, there's no chance that we can switch. 00:17:34.440 |
Python is using curly braces for something else, 00:17:41.160 |
We would get in trouble if we wanted to switch. 00:17:44.640 |
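As a brief illustration of what curly braces already mean in Python (a sketch with made-up variable names), they build dictionaries and sets and mark replacement fields in formatted strings:

```python
# Curly braces build dictionaries and sets.
ages = {"ada": 36, "alan": 41}   # dict literal
primes = {2, 3, 5, 7}            # set literal
empty = {}                       # an empty dict, not an empty set

# They also mark replacement fields in formatted strings.
greeting = f"{len(ages)} people, {len(primes)} primes"
print(greeting)
```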
Just like you couldn't redefine C to use indentation, 00:17:53.640 |
sort of in a greenfield environment would be better, 00:17:57.960 |
you can't change that kind of thing in a language. 00:18:10.320 |
we did have a big debate about tabs versus spaces 00:18:17.160 |
And we sort of came up with a recommended standard 00:18:21.520 |
and sort of options for people who want to be different. 00:18:30.080 |
I'd like you to consider is if you could travel back 00:18:33.080 |
through time when the compatibility is not an issue 00:18:46.160 |
- Well, it frees up a pair of matched brackets 00:18:57.960 |
sort of easier to grasp for people who don't already know 00:19:22.440 |
for the total newbie who has not coded before, 00:19:32.000 |
a whole bunch of concepts in programming are very alien 00:19:46.680 |
And there are many different things you have to learn. 00:19:55.520 |
you have to, if it's like really learning to program 00:20:05.240 |
you have to cover syntax, you have to cover variables, 00:20:30.240 |
oh, the compiler complains every time I put a semicolon 00:20:36.920 |
in the wrong place, or I forget to put a semicolon. 00:20:42.200 |
Python doesn't have semicolons in that sense. 00:20:53.120 |
because you don't learn about them in the first place. 00:20:56.960 |
- The flip side of that is forcing the strictness 00:21:03.480 |
that programming values attention to details. 00:21:08.040 |
You don't get to just write the way you write 00:21:24.000 |
and I'm sure there's other languages like this, 00:21:36.000 |
of why this is good for a programming language. 00:21:38.480 |
I'm not sure if you ever thought about that one. 00:21:44.400 |
There is a whole lineage of programming languages. 00:22:03.360 |
because the very earliest shells had a notion of scripting, 00:22:19.640 |
that is read by a very primitive command processor 00:22:23.480 |
that then sort of takes the first word on the line 00:22:27.040 |
as the name of a program and passes all the rest 00:22:35.000 |
for the program to figure out what to do with as arguments. 00:22:39.640 |
And so by the time scripting was slightly more mature 00:22:46.600 |
there was a convention that just like the first word 00:23:15.480 |
Parameters are usually what starts this process. 00:23:21.440 |
because you can't just say the parameters are X, Y, and Z. 00:23:27.480 |
And so now we call, say, let's say X is the input file 00:23:31.920 |
and Y is the output file, and let's forget about Z for now. 00:23:41.120 |
because that presumably means X itself is the file. 00:23:50.120 |
And so the inventors of things like the Unix shell 00:23:57.440 |
and I'm sure job command language at IBM before that 00:24:09.760 |
here is an X that is not actually the name of a file 00:24:14.560 |
which you just pass through to the program you're running. 00:24:41.000 |
because it had to fit in a very small part of memory. 00:24:44.440 |
And so saying, oh, just look at each character 00:24:59.580 |
And so it was sort of invented as a clever way 00:25:05.480 |
to make parsing of things that contain both variable 00:25:12.200 |
and fixed parts very easy in a very simple script processor. 00:25:18.080 |
- It also helps, even then, it also helps the human author 00:25:23.080 |
and the human reader of the script to quickly see, 00:25:28.160 |
oh, 20 lines down in the script, I see a reference to XYZ. 00:25:41.000 |
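The idea of a primitive command processor that treats the first word as the program name and uses a dollar sign to mark variable parts can be sketched in a few lines of Python. This is an illustration only, not how any real shell is implemented; the `expand` function, the `copy` command, and the file names are all hypothetical.

```python
def expand(line, variables):
    """Split a command line into words and substitute $name words."""
    words = []
    for word in line.split():
        if word.startswith("$"):
            # The dollar sign marks a variable: look up its value.
            words.append(variables[word[1:]])
        else:
            # Anything else is passed through literally.
            words.append(word)
    # First word is the program name, the rest are its arguments.
    program, *arguments = words
    return program, arguments


program, arguments = expand("copy $x $y", {"x": "in.txt", "y": "out.txt"})
print(program, arguments)
```

Because only words starting with `$` need a lookup, the processor can decide what to do by inspecting a single character, which is the kind of simplicity described above. The sketch assumes a non-empty line and does no quoting or error handling.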
Several things to say, which is the leftovers 00:26:00.400 |
- It's funny that those decisions, or not funny, 00:26:02.640 |
it's fascinating that those decisions permeate through time. 00:26:10.000 |
I mean, the sort of, the inner workings of DNA 00:26:13.880 |
have been stable for, well, I don't know how long it was, 00:26:17.720 |
like 300 million years, half a billion years. 00:26:22.400 |
- And there are all sorts of weird quirks there 00:26:26.040 |
that don't make a lot of sense if you were to design 00:26:29.440 |
a system like self-replicating molecules from scratch. 00:26:33.240 |
- Well, that system has a lot of interesting resilience. 00:26:52.640 |
- You'd be surprised how much resilience modern code has. 00:26:57.640 |
I mean, if you look at the number of bugs per line of code, 00:27:10.760 |
there are actually lots of things that don't work fine. 00:27:18.720 |
or self-correcting mechanisms at many levels. 00:27:25.000 |
- Well, in the end, the user who sort of is told, 00:27:28.600 |
well, you got to reboot your PC, is part of that system. 00:27:33.600 |
And a slightly less drastic thing is reload the page, 00:27:38.640 |
which we all know how to do without thinking about it 00:27:45.560 |
You try to reload a few times before you say, 00:27:54.560 |
- Well, yeah, we should all have learned not to do that 00:27:57.880 |
because that's probably just gonna turn the light back off. 00:28:05.040 |
And I wonder how many people actually like the dollar sign. 00:28:14.120 |
So to me, it's whatever the opposite of syntactic sugar is, 00:28:31.960 |
It is a kind of documentation, that's the pro, 00:28:34.080 |
and the con is it's a source of a lot of bugs. 00:28:39.640 |
this is a really interesting idea of bugs per line of code. 00:28:43.640 |
If you look at all the computer systems out there, 00:28:49.600 |
to the code that runs all the amazing companies 00:28:55.440 |
the code that runs Twitter and Facebook and Dropbox 00:28:58.120 |
and Google and Microsoft, Windows, and so on, 00:29:03.680 |
wouldn't that be a cool table, bugs per line of code? 00:29:13.520 |
Do you think we'd be surprised by the number we see there 00:29:17.520 |
- That depends on whether you've ever read about research 00:29:26.400 |
And I don't know, the last time I saw some research like that 00:29:35.120 |
and the research might have been done in the '80s, 00:29:38.120 |
but the conclusion was across a wide variety of companies 00:29:43.120 |
a wide range of different software, different languages, 00:29:48.120 |
different companies, different development styles. 00:29:56.840 |
I think it's in the order of about one bug per thousand lines 00:30:00.640 |
in sort of mature software that is considered 00:30:13.960 |
So here's a report from a programming analytics company. 00:30:41.240 |
- Oh, I was wrong by an order of magnitude there. 00:30:51.240 |
- 75% of a developer's time is spent on debugging. 00:31:13.080 |
for someone who claims to have a golden bullet 00:31:15.800 |
or a silver bullet that makes all that investment 00:31:31.200 |
that is, you know, there's a contact us button 00:31:36.200 |
Presumably, if you just spend a little bit less 00:31:42.520 |
Right, and there's also a report on Stack Exchange, 00:31:50.480 |
the page says Stack Overflow is currently offline 00:31:58.360 |
Anyway, I mean, can you believe that number of bugs? 00:32:04.080 |
- Isn't that scary that 70 bugs per 1,000 lines of code, 00:32:18.240 |
how many bugs are gonna be found if you're typing it in? 00:32:22.200 |
- Well, the development process is extremely iterative. 00:32:26.280 |
Typically, you don't make a plan for what software 00:32:35.880 |
because actually all the details themselves consist, 00:32:58.080 |
and I'm actually really, I'm a really bad typist, 00:33:08.880 |
- Well, I use all 10 of them, but not very well, 00:33:20.400 |
I had to learn the layout of a QWERTY keyboard, 00:33:24.960 |
was actually in college, in my first programming classes, 00:33:37.920 |
Watch anyone give you a little coding demonstration, 00:33:43.840 |
they'll have to produce like four lines of code, 00:33:47.440 |
and now see how many times they use the backspace key, 00:33:54.400 |
and some people, especially when someone else is looking, 00:34:15.080 |
or your mouse to, but the mouse is usually slower 00:34:34.280 |
and sometimes it takes three, four times to get it right, 00:34:37.080 |
so I don't know what your definition of bug is, 00:34:44.000 |
and then correcting it immediately is not a bug, 00:34:47.520 |
on the other hand, you already do sort of lose time, 00:34:54.520 |
there's sort of a typo that you don't get in that process, 00:35:10.920 |
that you had to initialize a variable or something. 00:35:16.880 |
you have to actually run the code to discover that typo, 00:35:30.840 |
And sort of modern compilers are usually pretty good 00:35:40.040 |
there might be another variable that is initialized, 00:35:51.080 |
- It's like named the same, but it's a different thing, 00:36:08.000 |
and one of the biggest reasons I use that keyboard 00:36:12.120 |
is because you realize in order to use the backspace 00:36:15.600 |
on a usual keyboard, you have to stretch your pinky out. 00:36:31.960 |
because of the backspace key being so far away. 00:36:35.000 |
So with the Kinesis, it's right under the thumb, 00:36:37.480 |
so you don't have to actually move your hands, 00:36:40.560 |
- What do you do if you're ever not with your own keyboard 00:36:45.200 |
and you have to use someone else's PC keyboard 00:37:02.880 |
- Yeah, so it's very inefficient note-taking, 00:37:13.040 |
I just don't anticipate, you have to calculate 00:37:19.440 |
I have a keyboard with me. - You pull it out. 00:37:31.880 |
and I anticipate to do programming or a lot of typing, 00:37:35.440 |
I will have a laptop that will pull out a Kinesis keyboard 00:37:39.920 |
in addition to the laptop, and it's just who I am. 00:37:57.160 |
And it's like some people have a warm blanket 00:38:12.000 |
I use the state-of-the-art IDEs for everything, 00:38:14.760 |
but my comfort place, just like the Kinesis keyboard, 00:38:25.920 |
that's one of some of the debates I have with myself 00:38:28.920 |
about everything from a technology perspective 00:38:31.600 |
is how much to hold on to the tools you're comfortable with 00:38:36.120 |
versus how much to invest in using modern tools. 00:38:40.280 |
And the signal that the communities provide you with 00:38:43.320 |
is the noisy one, because a lot of people year to year 00:38:53.200 |
or something that will transform programming? 00:39:04.160 |
there's a lot of different styles that came and went. 00:39:06.820 |
I remember learning, what was it called, ActionScript? 00:39:16.960 |
learning how to design, do graphic animation, 00:39:22.600 |
I remember creating quite a lot of Java applets, 00:39:25.360 |
thinking that this potentially defines the future of the web. 00:39:32.520 |
the particular technology eventually gets replaced, 00:39:37.520 |
but many of the concepts that the technology introduced 00:39:44.440 |
or made accessible first are preserved, of course, 00:39:49.440 |
because yeah, we're not using Java applets anymore, 00:40:05.880 |
like pressing a button or a link or hovering even, 00:40:12.600 |
And that those animations that were made painfully 00:40:20.920 |
I mean, Flash was an innovation when it first came up. 00:40:25.360 |
And when it was replaced by JavaScript-equivalent stuff, 00:40:30.360 |
it was a somewhat better way to do animations, 00:40:57.920 |
There wouldn't be jet planes without propeller planes. 00:41:11.200 |
it feels like all the time I've spent with ActionScript, 00:41:16.200 |
all the time I've spent with Java on the Applet side 00:41:34.160 |
the skill you picked up learning ActionScript, 00:41:38.560 |
was sort of, it was perhaps a super valuable skill 00:41:57.140 |
- Well, that's the calculation you have to make 00:42:00.460 |
Like today people start learning programming. 00:42:02.640 |
Today I'm trying to see what are the new languages to try? 00:42:10.200 |
What are the new IDEs to try to keep improving? 00:42:14.080 |
- That's why we start when we're young, right? 00:42:33.500 |
Try not to get yourself killed or seriously maimed, 00:42:47.200 |
you'll just learn why everybody else is not doing that, 00:42:50.680 |
or why everybody else is doing it some other way. 00:42:55.520 |
you discover something that's better or that somehow works. 00:43:16.300 |
you're probably going to be a little more risk averse 00:43:24.480 |
where you were experimenting with crazy shit. 00:43:29.320 |
solidifies your choice of programming language. 00:43:36.200 |
which I think is misinterpreted by most people. 00:43:38.300 |
But anyway, I feel like the choices you make early on, 00:43:44.640 |
they're going to define the rest of your life's trajectory 00:43:47.400 |
in a way that, like you basically are picking a camp. 00:43:51.200 |
So, you know, there's, if you invest a lot in PHP, 00:44:14.760 |
- I mean, if at age 16, you learn coding in C 00:44:19.760 |
and by the time you're 26, C is like a dead language, 00:44:34.560 |
or whatever it's called in sort of your observation 00:44:49.800 |
because that technology is now so ubiquitous, of course, 00:44:59.040 |
- Well, for me personally, I had a very difficult 00:45:04.040 |
and in my own head, brave leap that I had to take 00:45:09.240 |
which is most of my life I programmed in C and C++. 00:45:12.960 |
And so having that hammer, everything looked like a nail. 00:45:17.960 |
So I would literally even do scripting in C++. 00:45:21.880 |
Like I would create programs that do script like things. 00:45:25.000 |
And when I first came to Google and before then, 00:45:29.120 |
it became already, before TensorFlow, before all of that, 00:45:41.240 |
A lot of things have to do with community and culture 00:45:46.920 |
But for me to decide to take the leap to Python, 00:45:50.600 |
like all out, basically switch completely from C++ 00:45:54.080 |
except for highly performant robotics applications. 00:45:58.640 |
There was still a culture of C++ in the space of robotics. 00:46:05.880 |
Like I had to, you know, like people have like 00:46:09.600 |
existential crises or midlife crises or whatever. 00:46:29.440 |
'Cause C++ is still one of the most popular languages 00:46:38.920 |
I mean, that is not a sort of a fossilizing community. 00:46:47.120 |
- They are doing great innovative work actually. 00:46:50.520 |
- But that sort of their innovations are hard to follow 00:47:00.800 |
The old meta programming, template programming. 00:47:03.000 |
Like I would start using the modern C++ as it developed. 00:47:12.400 |
That makes it easier for you to work with some of the flaws. 00:47:20.760 |
But then you have to just empirically look and step back 00:47:24.600 |
and say, what language am I more productive in? 00:47:28.640 |
Sorry to say, what language do I enjoy my life with more? 00:47:52.520 |
Am I just infatuated with a new fad, new cool thing? 00:47:56.800 |
Or is this actually going to make my life better? 00:47:59.080 |
And I think a lot of people face that kind of decision. 00:48:01.840 |
It was a difficult decision for me when I made it. 00:48:09.560 |
But at that time, it wasn't quite yet so obvious. 00:48:16.200 |
with I still, because of my connection to WordPress, 00:48:20.920 |
I still do a lot of backend programming in PHP. 00:48:24.240 |
And the question is, you know, Node.js, Python, 00:48:29.440 |
do you switch backend to any of those programmings? 00:48:36.240 |
Well, more and more and more of the front end, 00:48:40.560 |
And fascinating cool stuff is done in JavaScript. 00:49:12.000 |
but I think they reflect the struggles of a lot of people 00:49:16.120 |
with different problems they're trying to solve. 00:49:31.320 |
who you want to work with, what communities you like, 00:49:37.720 |
Maybe if you sort of, if you can look back 20 years, 00:49:42.520 |
you can say, well, that whole detour through ActionScript 00:49:45.840 |
was a waste of time, but nobody could know that. 00:49:54.880 |
You just need to accept that not every choice you make 00:50:02.200 |
Maybe sort of keep a plan B in the back of your mind, 00:50:22.720 |
I expect to make X million dollars in a lifetime. 00:50:28.080 |
I expect to make Y million dollars in a lifetime. 00:50:45.600 |
you can do, diversifying your investment is good. 00:50:54.280 |
boy, that spreadsheet is possible to construct. 00:51:02.800 |
where you think you can maximally impact the world, 00:51:14.000 |
about where you predict the community's headed, 00:51:28.920 |
that are very hard to measure and even harder. 00:51:32.360 |
I mean, they're hard to measure retroactively 00:51:41.160 |
Well, better is one of those incredibly difficult words. 00:51:46.160 |
What's better for you is not better for someone else. 00:52:12.520 |
you wanna say, okay, I want to build a large company 00:52:28.400 |
Then you look at performant, more newer languages like Rust, 00:52:51.760 |
in terms of the development of the language itself. 00:53:02.520 |
Like, don't you believe in that gut feeling about-- 00:53:07.160 |
and yes, you most certainly can have a gut feeling, 00:53:25.080 |
than there's room in the world for Google-sized companies. 00:53:28.560 |
And they're gonna have to duke it out in the market space. 00:53:47.360 |
that we tried to define as the reference implementation. 00:53:50.440 |
And one of the big things that's coming out in 3.11, 00:53:54.720 |
- We tend to say 3.11, because it really was like, 00:54:29.840 |
or working with a great team, make it faster? 00:54:36.200 |
- It has to do with simplicity of software versus performance. 00:54:42.240 |
And so, even though C is known to be a low-level language, 00:54:50.400 |
sort of a high-performance language interpreter, 00:55:28.660 |
when it comes to language design, as well as implementation. 00:55:32.260 |
I also wrote much of the code as simple as it could be. 00:55:49.960 |
It's a bit of a sort of a time-space trade-off 00:56:01.200 |
And every time you get presented with a new input, 00:56:19.000 |
at least in the sort of mathematical sense of correct. 00:56:42.120 |
you might be able to rewrite that same algorithm 00:56:46.120 |
using more memory, maybe remember previous results 00:56:51.120 |
so you don't have to recompute everything from scratch. 00:56:54.800 |
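One common way to trade memory for time is to cache previous results. The sketch below uses the Fibonacci numbers purely as a stand-in problem (it is not an example from the conversation) and the standard library's `functools.lru_cache` to remember earlier answers.

```python
from functools import lru_cache


def fib_simple(n):
    """Recompute everything from scratch on every call."""
    return n if n < 2 else fib_simple(n - 1) + fib_simple(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same definition, but previous results are kept in memory."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


print(fib_simple(20), fib_cached(20))
```

Both functions return the same values; the cached version spends memory on stored results so that each value is computed only once.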
Like the classic example is computing prime numbers. 00:57:09.600 |
And we go all the way to, is it divisible by nine? 00:57:28.840 |
that two, three, five, and seven are prime numbers, 00:57:32.280 |
and you know a little bit about the mathematics 00:57:42.200 |
you don't actually have to check is it divisible by four 00:58:04.760 |
So if you know basically nothing about prime numbers 00:58:10.420 |
maybe you go for x from two through n minus one 00:58:20.800 |
if you got all nos for every single one of those questions, 00:58:29.620 |
Well, the first thing is you can stop iterating 00:58:35.040 |
And the second is you can also stop iterating 00:58:47.440 |
it must also have a divisor smaller than the square root. 00:58:54.120 |
we don't need to bother with checking for even numbers 00:58:56.880 |
because all even numbers are divisible by two. 00:59:02.240 |
we would already have come across the question, 00:59:09.980 |
And then you just check three, five, seven, 11. 00:59:12.840 |
And so now you've sort of reduced your search space 00:59:17.320 |
by 50% again, by skipping all the even numbers 00:59:24.920 |
or you just read in your book about the history of math, 00:59:29.280 |
one of the first algorithms ever written down, 00:59:34.900 |
is it divisible by any of the previous prime numbers 00:59:41.100 |
And before you get to a better algorithm than that, 00:59:45.540 |
you have to have several PhDs in discrete math. 01:00:00.420 |
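The progression described here, from checking every candidate divisor to stopping at the square root, skipping even numbers, and finally dividing only by previously found primes, can be sketched as follows. The function names are chosen for this illustration.

```python
import math


def is_prime_naive(n):
    """Try every candidate divisor from 2 through n - 1."""
    if n < 2:
        return False
    for x in range(2, n):
        if n % x == 0:
            return False
    return True


def is_prime_faster(n):
    """Stop at the square root and skip even divisors after 2."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for x in range(3, math.isqrt(n) + 1, 2):
        if n % x == 0:
            return False
    return True


def primes_up_to(limit):
    """Test each candidate only against previously found primes."""
    primes = []
    for n in range(2, limit + 1):
        # Only primes up to the square root of n can be its smallest divisor.
        if all(n % p != 0 for p in primes if p * p <= n):
            primes.append(n)
    return primes


print(primes_up_to(30))
```

Each version gives the same answers as the one before it; the later ones simply ask fewer "is it divisible by" questions.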
of how to come up with an efficient algorithm. 01:00:05.780 |
is not so much more complex than the inefficient one. 01:00:08.820 |
But that's an art, and it's not always the case. 01:00:12.580 |
In the general cases, the more performant the algorithm, 01:00:29.820 |
you look at the simplest way to get there first. 01:00:39.220 |
not the fastest or the most memory efficient or whatever, 01:00:43.340 |
a simple solution, and simple is fairly subjective, 01:00:58.300 |
But the simpler solutions tend to be easier to follow 01:01:03.300 |
for other programmers who haven't made a study 01:01:34.340 |
the simplest way I could solve a particular sub-problem. 01:01:37.540 |
Because when you're designing and implementing a language, 01:01:42.460 |
you have many hundreds of little problems to solve. 01:01:45.940 |
And you have to have solutions for every one of them 01:02:02.500 |
It takes in this readable language that we talked about, 01:02:09.820 |
it's sort of a recipe for understanding recipes. 01:02:14.820 |
So instead of a recipe that says, bake me a cake, 01:02:30.820 |
And that is sort of the recipe for building a computer. 01:02:47.620 |
but also now, how is it possible that 3.11 in year 2022, 01:02:52.780 |
it's possible to get such a big performance improvement? 01:03:02.100 |
where we still felt there was low-hanging fruit. 01:03:06.780 |
The biggest one is actually the interpreter itself. 01:03:11.740 |
And this has to do with details of how Python is defined. 01:03:28.340 |
even though it's always called an interpreted language, 01:03:38.740 |
which is sort of code for an imaginary computer 01:03:45.540 |
- So it's compiling code that is more easily digestible 01:03:51.220 |
- It is the code that is digested by the interpreter. 01:03:57.940 |
Almost all the work was done in the interpreter 01:04:07.060 |
and then you run the code a whole bunch of times. 01:04:27.900 |
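The compile-once, run-many-times split can be observed with the standard library's `dis` module. This is a small sketch; the exact instruction names printed depend on the Python version, so no particular output is assumed.

```python
import dis


def add(a, b):
    return a + b


# The compiled bytecode is stored on the function's code object.
print(type(add.__code__.co_code))

# Disassemble it into human-readable instructions for the
# "imaginary computer" that the interpreter implements.
dis.dis(add)
```

Calling `add` repeatedly does not recompile anything: the interpreter simply executes the stored bytecode again.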
We just made the interpreter a little more efficient. 01:04:37.740 |
although it's now applied to almost all programming languages, 01:04:56.660 |
was there something interesting to say about the compiler? 01:04:58.660 |
It's interesting that you haven't changed that, 01:05:08.100 |
And so we only had to change the parts of the compiler 01:05:11.420 |
where we decided that the breakdown of a Python program 01:05:15.260 |
in bytecode instructions had to be slightly different. 01:05:19.220 |
But that didn't gain us the performance improvements. 01:05:38.260 |
from some internal data structures used by the interpreter. 01:05:41.660 |
But the key idea is an adaptive specializing interpreter 01:05:53.580 |
- Well, let me first talk about the specializing part 01:05:59.340 |
the second order effect, but they're both important. 01:06:03.540 |
So bytecode is a bunch of machine instructions, 01:06:10.740 |
But the machine can do things like call a function, 01:06:18.060 |
Those are sort of typical instructions in Python. 01:06:21.660 |
And if we take the example of adding two numbers, 01:06:39.860 |
You might as well be adding two strings or two lists 01:06:47.900 |
that happened to implement this operator called add. 01:06:59.860 |
because it means that a certain category of functions 01:07:04.860 |
can be written using a single symbol, the plus sign, 01:07:10.460 |
and sort of a bunch of other functions can be written 01:07:13.620 |
using another single symbol, the multiply sign. 01:07:16.460 |
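As an illustration of one symbol serving many types, the sketch below adds integers, strings, and lists, and then defines a hypothetical `Vector` class (invented for this example) that implements the same operators through `__add__` and `__mul__`.

```python
class Vector:
    """A hypothetical 2D vector that supports + and *."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        return Vector(self.x * factor, self.y * factor)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"


print(1 + 2)                        # integers
print("ab" + "cd")                  # strings
print([1, 2] + [3])                 # lists
print(Vector(1, 2) + Vector(3, 4))  # a user-defined type
print(Vector(1, 2) * 3)
```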
So if we take addition, the way traditionally in Python, 01:07:34.540 |
An object is basically a pointer to a bunch of memory 01:07:41.100 |
- Well, not quite, but there are a lot of them. 01:07:43.660 |
So to simplify a bit, we look up in one of the objects, 01:07:53.260 |
And does that object type define an add operation? 01:07:58.260 |
And so you can imagine that there is a sort of 01:08:18.300 |
are sort of important, I think, mostly historically, 01:08:37.300 |
- If you take the basics of int and float and add, 01:08:39.940 |
who carries the knowledge of how to add two integers? 01:08:51.460 |
Does the operator just exist as a platonic form 01:09:21.980 |
and there are like 30 other functions for other operations. 01:09:32.900 |
there is a distinct slot for the add operations. 01:09:37.100 |
Let's say the add operation is the first operation of a type 01:09:40.860 |
and the multiply is the second operation of a type. 01:09:47.980 |
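The slot idea can be sketched in plain Python. This is only a toy model under stated assumptions: the `SLOTS` table, the slot positions, and `generic_add` are invented for illustration, and the real interpreter keeps these tables in C and also consults the second operand.

```python
# Each type gets a small table of functions. In this toy layout,
# position 0 is "add" and position 1 is "multiply".
ADD_SLOT, MUL_SLOT = 0, 1

SLOTS = {
    int: (int.__add__, int.__mul__),
    float: (float.__add__, float.__mul__),
    str: (str.__add__, str.__mul__),
}


def generic_add(a, b):
    """Fully generic add: look at the type of `a`, then call its add slot."""
    add_function = SLOTS[type(a)][ADD_SLOT]
    return add_function(a, b)


int_sum = generic_add(2, 3)
float_sum = generic_add(1.5, 2.0)
str_sum = generic_add("a", "b")
```

Every call pays for the type lookup and the indirect call, which is the overhead the specialization discussed next tries to avoid.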
In both cases, the add operation is the first slot 01:10:39.660 |
and the fabrication going on here at the table. 01:11:02.580 |
and that function sort of takes a bunch of inputs 01:11:05.540 |
and at some point it adds two of the inputs together. 01:11:09.260 |
Now I bet you, even if you call your function a thousand times 01:11:21.140 |
because maybe your program is all about integers 01:11:35.340 |
the variables A and B that are being added together 01:11:39.980 |
And so what we do is instead of having this single bytecode 01:11:48.260 |
and the implementation of add is fully generic. 01:12:00.740 |
Now the function has to look at the other argument 01:12:13.700 |
and add the two bit patterns in the right way. 01:12:28.420 |
in the end, after we hit the code that did the addition 01:12:38.780 |
And then after a few times through that code, 01:12:57.460 |
it might as well be the add integer operation. 01:13:01.140 |
And add integer operation is much more efficient 01:13:05.260 |
because it just says, assume that A and B are integers, 01:13:10.260 |
do the addition operation, do it right there in line 01:13:21.180 |
even if you have great evidence that in the past 01:13:25.220 |
it was always two integers that you were adding, 01:13:28.420 |
at some point in the future, that same line of code 01:13:31.020 |
could still be hit with two floating points or two strings, 01:13:35.980 |
- It's not a great lie, that's just the fact of life. 01:13:39.300 |
- I didn't account for what should happen in that case 01:14:01.780 |
We applied some tricks to make those checks efficient. 01:14:06.220 |
And we know statistically that the outcome is almost always, 01:14:17.500 |
and then we proceed with the sort of add integer operation. 01:14:21.420 |
And then there is a fallback mechanism where we say, 01:14:47.060 |
Basically we're sort of hoping that most of the time 01:14:52.340 |
because if it turns out that we guessed wrong too often 01:15:00.260 |
things might actually end up running a little slower. 01:15:10.740 |
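The guess, check, and fall back cycle described here can be modeled in a few lines. This is a simulation under loud assumptions, not CPython's actual mechanism: the `AddSite` class and the `WARMUP` threshold are hypothetical, and the real interpreter rewrites bytecode in place in C.

```python
import operator

WARMUP = 3  # hypothetical number of int+int hits before specializing


class AddSite:
    """One '+' location in a program, with an optional int-only fast path."""

    def __init__(self) -> None:
        self.int_hits = 0
        self.specialized = False
        self.misses = 0

    def run(self, a, b):
        if self.specialized:
            # Guard: a cheap check that the earlier guess still holds.
            if type(a) is int and type(b) is int:
                return int.__add__(a, b)
            # Guessed wrong: record the miss and go back to the generic path.
            self.misses += 1
            self.specialized = False
            self.int_hits = 0
            return operator.add(a, b)

        # Generic path: works for any types, and gathers evidence.
        result = operator.add(a, b)
        if type(a) is int and type(b) is int:
            self.int_hits += 1
            if self.int_hits >= WARMUP:
                self.specialized = True
        else:
            self.int_hits = 0
        return result


site = AddSite()
results = [site.run(i, 1) for i in range(5)]
specialized_after_warmup = site.specialized
fallback_result = site.run("a", "b")  # same site, now hit with two strings
```

After the string call the site is generic again, which is why a workload that keeps alternating types could end up slower than one that never specialized.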
someone could easily construct a counter example 01:15:16.220 |
and then now it runs five times as slow in Python 3.11 01:15:28.340 |
- It's a fun reverse engineering task though. 01:15:44.620 |
of saying, you seem to be working adding two integers, 01:16:05.100 |
That is already so much better than guessing randomly. 01:16:13.860 |
Hey, I wonder if instead of adding two generic types, 01:16:49.940 |
which is Facebook's efficient compiler for PHP. 01:17:03.980 |
- And so the trick here is that the type itself doesn't, 01:17:12.300 |
where you can afford to have a shortcut to saying it's ints. 01:17:17.300 |
- This is a trick that is especially important 01:17:20.420 |
for interpreted languages with dynamic typing, 01:17:24.660 |
because if the compiler could read in the source 01:17:29.660 |
these X and Y that we're adding are integers, 01:17:34.100 |
the compiler can just insert a single add machine code 01:17:38.020 |
that hardware machine instruction that exists 01:17:48.940 |
you don't generally declare the types of your variables. 01:17:53.620 |
You don't even declare the existence of your variables. 01:17:57.140 |
They just spring into existence when you first assign them, 01:18:01.180 |
which is really cool and sort of helps those beginners 01:18:08.980 |
before they can start playing around with code, 01:18:12.380 |
but it makes the interpretation of the code less efficient. 01:18:17.380 |
And so we're sort of trying to make the interpretation 01:18:36.660 |
What is type hinting and is it used by the interpreter, 01:18:50.460 |
And it's especially popular with sort of larger companies 01:18:55.180 |
that have very large code bases written in Python. 01:18:58.620 |
- Do you think of it as almost like documentation 01:19:09.380 |
where you can express the types of variables. 01:19:16.180 |
And here's an argument to this function and it's a string. 01:19:18.940 |
And here is a function that returns a list of strings. 01:19:22.580 |
- But that's not checked when you run the code. 01:19:28.940 |
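A short sketch of what that looks like in practice. The function `shout` is a made-up example: it is annotated as taking a string and returning a list of strings, and the interpreter runs it without checking either claim.

```python
def shout(word: str) -> list[str]:
    """Annotated: takes a string, returns a list of strings."""
    return [word, word]


ok = shout("fish")

# Nothing stops this call at runtime, even though 42 is not a str.
not_checked = shout(42)
```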
called a static type checker that reads all your source code 01:19:32.700 |
without executing it and thinks long and hard 01:19:36.660 |
about what it looks from just reading the code 01:19:51.620 |
- So this is something you're supposed to run 01:19:58.420 |
but the type annotations currently are not used 01:20:11.780 |
Even when they do use them, they sometimes contain lies 01:20:16.780 |
where the static type checker says, everything's fine. 01:20:22.220 |
I cannot prove that this integer is ever not an integer, 01:20:36.700 |
If we started enforcing type annotations in Python, 01:20:45.180 |
And some Python programs wouldn't even be possible 01:20:50.140 |
And so we made a choice of not using the annotations. 01:21:05.740 |
to sort of provide hints because we can still say, 01:21:18.340 |
And so we can generate an add integer instruction, 01:21:26.860 |
oh, if somehow the code at runtime provided something else, 01:21:35.980 |
we can still use that generic add operation as a fallback, 01:22:00.460 |
- There are third-party libraries that are in that business. 01:22:06.860 |
Is it possible for a third-party library to take a hint 01:22:16.500 |
I think this is a fairly unique feature in Python. 01:22:20.020 |
The type hints can be introspected at runtime. 01:22:27.420 |
they mean Python is a very introspectable language. 01:22:37.620 |
And if that variable happens to refer to a function, 01:22:41.900 |
you can ask, what are the arguments to the function? 01:22:48.220 |
what are the type annotations for the function? 01:22:50.820 |
- So the type annotations are there inside the variable 01:22:55.660 |
- They're mostly associated with the function object, 01:23:00.460 |
but you can sort of map from the arguments to the variables. 01:23:05.460 |
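The introspection described here is available through the standard library. The sketch below uses `inspect.signature` and `typing.get_type_hints` on a made-up function `greet`.

```python
import inspect
import typing


def greet(name: str, times: int = 1) -> list[str]:
    return [f"hello {name}"] * times


# Ask the function object what its arguments are.
parameter_names = list(inspect.signature(greet).parameters)

# Ask the function object what its type annotations are.
hints = typing.get_type_hints(greet)
```

Here `hints` maps each argument name, plus the key `"return"`, to the annotated type, which is the mapping from arguments to annotations a library can build on.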
- And that's what a third-party library can help with. 01:23:12.940 |
is going to slow your code down instead of speed it up. 01:23:17.660 |
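To show where the runtime cost comes from, here is a minimal sketch of runtime enforcement. The decorator `enforce_types` is hypothetical and does not reproduce any particular library's API; it only handles plain classes such as `str` and `int`.

```python
import functools
import inspect
import typing


def enforce_types(func):
    """Hypothetical decorator: check simple class annotations on each call."""
    hints = typing.get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only annotations whose type is exactly `type` (plain classes)
            # are checked; anything else, such as list[str], is skipped.
            if type(expected) is type and not isinstance(value, expected):
                raise TypeError(f"{name} should be {expected.__name__}")
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def repeat(word: str, times: int) -> str:
    return word * times


good = repeat("ab", 2)
```

Every call now binds the arguments and loops over them before doing any real work, so checking this way adds overhead on each call rather than removing any.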
- I think to reference this sales pitchy blog post 01:23:22.660 |
that says 75% of developers' time is spent on debugging, 01:23:27.140 |
I would say that in some cases that might be okay. 01:23:29.900 |
It might be okay to pay the cost of performance 01:23:38.420 |
doing it statically before you ship your code to production 01:23:45.060 |
is more efficient than doing it at runtime piecemeal. 01:24:00.100 |
what is the future of static typing in Python? 01:24:04.040 |
- Well, so mypy was started by a Finnish developer, 01:24:11.700 |
- So many cool things out of Finland, I gotta say. 01:24:25.380 |
But mypy is the original static type checker for Python. 01:24:30.380 |
And the type annotations that were introduced 01:24:43.540 |
And in fact, Jukka had first invented a different syntax 01:24:50.600 |
And Jukka and I sort of met at a Python conference 01:24:58.140 |
And we sort of came up with a compromise syntax 01:25:04.140 |
that would not require any changes to Python. 01:25:15.860 |
- Just out of curiosity, was it like double colon 01:25:17.940 |
or something, what was he proposing that would break Python? 01:25:21.340 |
- I think he was using angle brackets for types 01:25:29.020 |
- Yeah, you can't use angle brackets in Python. 01:25:31.900 |
It would be too tricky for template type stuff. 01:25:41.780 |
We just didn't know what to use them for yet. 01:25:45.260 |
So type annotations were just the sort of most logical thing 01:26:10.020 |
He had a parser that translated mypy into Python 01:26:24.040 |
and all the angle brackets from the positions 01:26:29.340 |
But a pre-processor model doesn't work very well 01:26:33.420 |
with the typical workflow of Python development projects. 01:26:38.820 |
I mean, that could have been another major split 01:26:42.980 |
Like if you watch TypeScript versus JavaScript, 01:26:46.900 |
it's like a split in the community over types, right? 01:26:59.740 |
but just use the original JavaScript notation, 01:27:04.580 |
just like there are many people in the Python world 01:27:12.940 |
between TypeScript and old school JavaScript, 01:27:19.420 |
transpilers are sort of the standard way of working anyway, 01:27:23.700 |
which is why TypeScript being a transpiler itself 01:27:33.300 |
it's the code, I guess you call it pre-processing code 01:27:36.860 |
that translates from one language to the other. 01:27:40.020 |
part of the workflow of the JavaScript community. 01:27:47.940 |
in the JavaScript/TypeScript world at the moment 01:27:51.380 |
is that there is a proposal under consideration, 01:28:12.100 |
And what it ignores is more or less a superset 01:28:21.660 |
- So that would mean that eventually, if you wanted to, 01:28:31.180 |
into a JavaScript interpreter without transpilation. 01:28:36.020 |
The interesting thing in the JavaScript world, 01:28:40.620 |
the web browsers have changed how they deploy 01:28:43.660 |
and they sort of update their JavaScript engines 01:28:48.660 |
much more quickly than they used to in the early days. 01:29:07.380 |
do you see, if you were to recommend somebody use a thing, 01:29:11.180 |
would you recommend TypeScript or JavaScript? 01:29:16.940 |
- Just because of the strictness of the typing? 01:29:23.260 |
that helps you sort of keep your head straight 01:29:36.500 |
It helps with ensuring that your code is not too incorrect. 01:29:41.500 |
And it's actually quite compatible with JavaScript, 01:29:52.980 |
But any library that is written in pure JavaScript 01:30:10.580 |
That sort of compatibility is sort of the key 01:30:19.140 |
it's almost like a biological system that's evolving. 01:30:21.540 |
It's fascinating to see JavaScript evolve the way it does. 01:30:24.540 |
- Well, maybe we should consider that biological systems 01:30:39.900 |
because there's just so much code written in JavaScript 01:30:48.300 |
If you're talking about bugs per line of code, 01:30:55.340 |
It beats Python by a lot in terms of number of bugs, 01:31:00.940 |
And then obviously the browsers are developed. 01:31:05.500 |
I mean, just there's so much active development. 01:31:15.060 |
versus Python is more, all that stuff is happening, 01:31:21.700 |
of stable working giant software systems written in Python 01:31:26.100 |
versus JavaScript is just a giant, beautiful, 01:31:33.140 |
And to some extent, differences in culture are random, 01:31:39.100 |
the differences have to do with the environment. 01:31:48.620 |
the language for developing web applications, 01:31:55.700 |
and the fact that it's basically the only language 01:32:02.260 |
makes that community sort of just have a different nature 01:32:16.260 |
on all kinds of shapes of screens and devices 01:32:31.620 |
What's the future of static typing in Python? 01:32:43.380 |
- What's the connection between PEP 484 type hints and mypy? 01:32:53.140 |
So mypy quickly evolved from Jukka's own variant of Python 01:33:06.380 |
a very productive year where like many hundreds of messages 01:33:18.380 |
And so mypy is a static type checker for Python. 01:33:53.380 |
is actually worth an investment for our company. 01:34:02.140 |
making mypy faster say, or adding new features to mypy, 01:34:09.540 |
but both Google and Facebook and later Microsoft 01:34:20.300 |
they decided that they wanted to use the same technology 01:34:28.900 |
because they sort of, they had a bunch of compiler writers 01:34:44.980 |
And they had done it in a certain way, sort of. 01:34:47.580 |
They wrote a big, highly parallel application 01:35:10.340 |
and they worked on it in secret for about a year 01:35:13.740 |
and then they came clean and went open source. 01:35:20.380 |
something called pytype, which was mostly interesting 01:35:31.140 |
So all the code is checked into a single repository. 01:35:37.340 |
So Facebook developed Pyre, which was written in OCaml, 01:35:42.220 |
which worked well with Facebook's development workflow. 01:35:46.380 |
Google developed something they called pytype, 01:36:17.260 |
And it's just a workflow, like a debugger for the programmers. 01:36:24.540 |
But it's a thing that runs through the code continuously, 01:36:28.380 |
pre-processing to find issues based on style, documentation. 01:36:36.140 |
It can check that, what usual things does a linter do? 01:36:39.660 |
Maybe check that you haven't too many characters 01:36:48.660 |
where they try to point out things that are likely mistakes, 01:36:52.740 |
but not incorrect according to the language specification. 01:36:57.260 |
Like maybe you have a variable that you never use. 01:37:16.460 |
A linter will tell you that variable is not used. 01:37:26.460 |
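One such heuristic can be sketched with the standard-library `ast` module. This is a deliberately naive illustration, not how any real linter is implemented: `unused_names` and the sample `SOURCE` are made up, and the check ignores scopes entirely.

```python
import ast

SOURCE = '''
def total(prices):
    tax = 0.2
    result = sum(prices)
    return result
'''


def unused_names(source: str) -> set[str]:
    """Names that are assigned somewhere in the source but never read."""
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    return stored - loaded


findings = unused_names(SOURCE)
```

Here `tax` is assigned but never read, so it is the only finding. The code is still valid Python, which is exactly the kind of likely mistake a linter points out.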
or there are a number of sort of common scenarios. 01:37:29.900 |
And a linter is often a big collection of little heuristics 01:37:39.980 |
of how your code is laid out, maybe how it's indented, 01:37:44.980 |
but also just things like definition of names, use of names, 01:37:56.460 |
And in some cases, linters are really style checkers. 01:38:06.020 |
do you use the PEP 8 recommended naming scheme 01:38:11.020 |
for your functions and classes and variables? 01:38:25.020 |
"whose first letter is not an uppercase letter." 01:38:38.940 |
that if the linter is no longer complaining about their code 01:38:48.060 |
joining a team to learn the style rules, right? 01:38:55.580 |
not so much to sort of enforce team uniformity, 01:39:05.780 |
that the compilers for whatever reason don't catch. 01:39:15.340 |
focuses on a particular aspect of the linting, 01:39:34.740 |
it will tell you, "Hey, that string is not an integer." 01:39:38.540 |
Either you were incorrect when you said it was an integer 01:39:42.980 |
or you're incorrect when you're passing it a string. 01:39:49.820 |
As you said, it's interesting that the companies 01:39:51.660 |
didn't choose to invest in this centralized development 01:40:10.980 |
- Well, Microsoft is hoping that Microsoft's horse 01:40:22.940 |
Yeah, all my word processors tend to typo correct 01:40:41.140 |
but, okay, so let me ask the question a different way. 01:40:46.540 |
where the static type checker gets integrated 01:40:59.180 |
That doesn't mean that five or 10 years from now, 01:41:18.260 |
than Python and its annotation syntax evolve. 01:41:26.580 |
Those are the only times that you can introduce 01:41:32.220 |
And there are always people who invent new annotation syntax 01:41:48.700 |
At least the sort of deprecating an existing feature 01:41:53.180 |
because you have to assume that people started using it 01:41:58.580 |
And then you can't take it away from them right away. 01:42:05.140 |
"but we're not gonna tell you that it's an error yet." 01:42:11.300 |
and then eventually three releases in the future, 01:42:15.180 |
On the other hand, the typical static type checker 01:42:41.060 |
The static type checkers also just get better 01:42:45.860 |
at discovering things that sort of are unspecified 01:42:50.860 |
by the language, but that sort of could make sense. 01:43:00.100 |
- So it's cool, it's like a laboratory of experiments. 01:43:03.180 |
- Microsoft, Google, and all, and you get to see. 01:43:06.620 |
Because there's not one single JavaScript engine either. 01:43:11.620 |
There is one in Chrome, there is one in Safari, 01:43:15.940 |
- But that said, you said there's not interest, 01:43:19.220 |
I think there is a lot of interest in type hinting, right? 01:43:29.260 |
How many people use, 'cause it's optional, it's a sugar. 01:43:38.740 |
that do interesting things with it at runtime, 01:43:41.740 |
and the fact that there are like now three or four 01:43:54.540 |
which has a sort of more heuristic-based type checker 01:44:15.100 |
especially anybody who has a continuous integration cycle 01:44:20.100 |
probably has one of the steps in their testing routine 01:44:44.580 |
20 to 30% of Python 3 codebases are using type hints. 01:44:57.540 |
They did a quick, not all of, but like a random sampling. 01:45:13.220 |
And you're extremely biased now that you're with Microsoft. 01:45:21.340 |
- Historically, I actually started out with using Vim, 01:45:28.540 |
For a very long time, I think from the early 80s to, 01:45:56.660 |
I mean, PyCharm is like driving an 18-wheeler truck, 01:46:15.540 |
and you know what every little rattle of the car means. 01:46:21.820 |
but there were certain things it couldn't do. 01:46:44.460 |
just grepping all that code for where is there a class, 01:47:02.260 |
and once it's indexed, your repository was very helpful. 01:47:10.540 |
I would jump back to Emacs and do all my editing there 01:47:14.020 |
because I could type much faster and switch between files 01:47:18.180 |
when I knew which file I wanted much quicker. 01:47:33.540 |
And I feel like I'm just being an old grumpy man 01:47:37.700 |
for not learning how to quickly switch between files 01:47:42.580 |
that has to do with, I mean, you just have to get accustomed 01:47:53.340 |
You can type with two fingers just fine in the short term, 01:47:56.180 |
but in the long term, your life will become better 01:48:09.260 |
Like you look at the next 20, 30 years of your life, 01:48:13.380 |
you have to anticipate where technology is going. 01:48:26.460 |
So there's no reason to actually practice handwriting. 01:48:30.660 |
You can actually estimate, back to the spreadsheet, 01:48:33.980 |
the number of paragraphs, sentences, or words you write 01:48:43.420 |
- You go again with the spreadsheet of my life, huh? 01:48:47.140 |
All of that is not actual, like converted to a spreadsheet, 01:48:51.700 |
Like I have the same kind of gut feeling about books. 01:48:54.580 |
I've almost exclusively switched to Kindle now, 01:49:15.180 |
in terms of consuming books and content of that nature. 01:49:23.700 |
In that same way, it feels like PyCharm or VS Code. 01:49:27.180 |
I think PyCharm is the most sort of sophisticated, 01:49:35.060 |
It feels like I should probably at some point very soon, 01:49:38.940 |
switch entire, like I'm not allowed to use anything else 01:49:49.140 |
So I think I'm limiting myself in the same way 01:49:51.980 |
that using two fingers for typing is limiting myself. 01:50:00.100 |
But I'm sure a lot of people are thinking this way, right? 01:50:04.740 |
- I think that sort of everybody has to decide 01:50:07.940 |
for themselves which one they want to invest more time in. 01:50:12.420 |
I actually ended up giving VS Code a very tentative try 01:50:18.660 |
when I started out at Microsoft and really liking it. 01:50:33.820 |
of VS Code may not necessarily agree with me on this. 01:50:45.620 |
Because as you probably know, as an old Emacs hack, 01:50:51.700 |
the key part of Emacs is that it's mostly written in Lisp. 01:51:08.700 |
And oh yeah, there's also some very obscure thing 01:51:21.260 |
There's a core implementation that sort of can read a file 01:51:29.740 |
and it can sort of manage memory and buffers. 01:51:33.700 |
And then what makes it an editor full of features 01:51:39.780 |
And of course the design of how the Lisp packages 01:51:42.860 |
interact with each other and with that sort of 01:51:46.420 |
that base layer of the core immutable engine. 01:51:51.420 |
But almost everything in that core engine in Emacs case 01:52:14.220 |
I mean, it's open source, but nobody except the people 01:52:28.220 |
and a whole series of interfaces for packages 01:52:35.660 |
for how packages should interact with the lower layers 01:52:47.460 |
or select pieces of text or delete pieces of text 01:53:02.940 |
and the package ecosystem that you see in VS Code 01:53:08.220 |
is a mirror of very similar architectural features in Emacs. 01:53:16.580 |
'cause as far as sort of the hype and the excitement 01:53:24.340 |
The interesting thing about PyCharm and what is it, 01:53:29.260 |
PhpStorm, which are these JetBrains-specific IDEs 01:53:33.980 |
that are designed for one programming language. 01:53:36.340 |
It's interesting when an IDE is specialized, right? 01:53:41.060 |
- They're usually actually just specializations of IntelliJ 01:53:45.980 |
because underneath it's all the same editing engine 01:54:05.780 |
In PyCharm, it is possible to have third-party extensions 01:54:17.180 |
- Yeah, I remember that it might've been five years ago 01:54:21.580 |
or so we were trying to get some better mypy integration 01:54:26.180 |
into PyCharm 'cause mypy is sort of Python tooling 01:54:30.260 |
and PyCharm had its own type checking heuristic thing 01:54:35.260 |
that we wanted to replace with something based on mypy 01:54:42.300 |
because that was what we were using in the company. 01:54:44.860 |
And for the guy who was writing that PyCharm extension, 01:54:49.860 |
it was really a struggle to sort of find documentation 01:55:08.740 |
In your post titled "Reasoning about AsyncIO Semaphore," 01:55:13.460 |
you talk about a fast food restaurant in Silicon Valley 01:55:21.860 |
or is that an actual restaurant in Silicon Valley? 01:55:29.380 |
So for people who don't then read the thing, you should. 01:55:33.620 |
But it was an idea of a restaurant where there's only 01:55:43.220 |
And I actually looked it up and there are restaurants 01:55:50.500 |
You stand in line, you show up, there's one table. 01:55:58.860 |
- It sounds like you'd find places like that in Tokyo. 01:56:10.460 |
- The fascinating thing is you propose it's a fast food. 01:56:14.340 |
- It was one of my rare sort of more literary 01:56:19.220 |
or poetic moments where I thought I'll just open 01:56:23.380 |
with a crazy example to catch your attention. 01:56:26.980 |
And the rest is very dry stuff about locks and semaphores 01:56:31.500 |
and how a semaphore is a generalization of a lock. 01:56:35.060 |
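The "generalization of a lock" point can be sketched with `asyncio.Semaphore`, loosely borrowing the restaurant picture. The helper `peak_concurrency` and the customer counts are assumptions made for illustration and are not taken from the blog post.

```python
import asyncio


async def peak_concurrency(capacity: int, customers: int = 6) -> int:
    """Return the most customers seated at once with `capacity` tables."""
    tables = asyncio.Semaphore(capacity)
    seated = 0
    peak = 0

    async def customer() -> None:
        nonlocal seated, peak
        async with tables:          # wait here until a table is free
            seated += 1
            peak = max(peak, seated)
            await asyncio.sleep(0)  # "eat": let the other tasks run
            seated -= 1

    await asyncio.gather(*(customer() for _ in range(customers)))
    return peak


# A semaphore with one slot behaves like a lock: one customer at a time.
one_table = asyncio.run(peak_concurrency(1))

# With three slots, up to three customers are seated at once.
three_tables = asyncio.run(peak_concurrency(3))
```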
- Well, it was very poetic and well delivered. 01:56:36.980 |
And it actually made me wonder if it's real or not 01:56:43.660 |
And in fact, I wouldn't be surprised if somebody 01:56:45.460 |
like listens to this and knows exactly a restaurant 01:56:49.780 |
Anyway, can we step back and can you just talk 01:56:52.860 |
about parallelism, concurrency, threading, asynchronous, 01:56:59.500 |
What is it, sort of a high philosophical level? 01:57:04.720 |
- Well, the idea is if the fisherman has two fishing rods, 01:57:32.900 |
And so as long as you can afford the equipment, 01:57:52.820 |
- And that's actually, I think, how deep sea fishing is done. 01:57:55.340 |
You could just have a rod and you put in a hole 01:58:01.860 |
between parallelism and concurrency and asynchronous? 01:58:10.820 |
- In the computer world, there is a big difference. 01:58:29.620 |
and share something like memory or an IO bus. 01:58:35.620 |
Concurrency can be a much more abstract concept 01:58:50.660 |
is it spends a little time running this program for a while, 01:58:55.660 |
and then it spends some time running that program 01:59:05.620 |
and concurrency is part reality, part illusion. 01:59:11.820 |
that there is multiple copies of the hardware. 01:59:15.700 |
- You write that implementing synchronization primitives 01:59:23.580 |
Why is it hard to implement synchronization primitives? 01:59:29.980 |
our brains are not trained to sort of keep track 01:59:39.380 |
Like, obviously you can walk and chew gum at the same time, 01:59:45.980 |
that require only a little bit of your conscious activity, 01:59:59.540 |
or you'll miss an essential clue in the TV show. 02:00:12.700 |
is responsible for writing the code correctly, 02:00:17.500 |
and it's hard enough to keep track of a recipe 02:00:27.660 |
Chop the carrots, then peel the potatoes, mix the icing. 02:00:40.780 |
Okay, we're loading the number of mermaids in variable A, 02:01:12.540 |
that are sort of being executed simultaneously, 02:01:17.060 |
whether it's using the parallel or the concurrent approach, 02:01:41.540 |
if first you do your mermaid merpeople computation, 02:01:45.060 |
and then you do your people in the boat computation, 02:01:48.380 |
it doesn't matter that the variables are called A and B, 02:01:53.540 |
because you're done with one use of that variable. 02:02:04.340 |
and your computation goes dramatically wrong. 02:02:08.100 |
- And there's all kinds of ordering of operations 02:02:11.940 |
that could result in the assignment of those variables, 02:02:14.380 |
and so you have to anticipate all possible orderings. 02:02:30.860 |
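To see why every ordering matters, the sketch below enumerates all interleavings of two tasks that each increment a shared variable in two steps, a load followed by a store. The task names `A` and `B` and the helper `run_schedule` are invented for the example; it simulates the interleavings deterministically instead of using real threads.

```python
from itertools import permutations


def run_schedule(schedule):
    """Run two tasks that each do `shared += 1` as a load step, then a store step."""
    shared = 0
    local = {"A": None, "B": None}
    next_step = {"A": "load", "B": "load"}
    for task in schedule:
        if next_step[task] == "load":
            local[task] = shared      # read the shared variable
            next_step[task] = "store"
        else:
            shared = local[task] + 1  # write back a possibly stale value
    return shared


# Every distinct order in which the four steps (two per task) can run.
schedules = set(permutations("AABB"))
outcomes = {schedule: run_schedule(schedule) for schedule in schedules}
possible_results = set(outcomes.values())
```

Only the orderings in which one task finishes before the other starts give 2. Any ordering where both tasks load before either stores loses an update and gives 1, which is the kind of outcome a lock is meant to rule out.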
- So a lock is a mechanism by which you forbid 02:02:42.420 |
- And then semaphores allow you to do what, multiple ovens? 02:02:52.220 |
and you have multiple people all baking cakes, 02:02:56.940 |
then maybe you can tell that the oven is in use, 02:03:01.940 |
And so maybe you make a sign that says, "Oven in use," 02:03:25.380 |
someone who comes in wants to see at a glance, 02:03:29.060 |
and maybe there's an electronic sign that says, 02:03:34.700 |
Or maybe there are already three people waiting for an oven, 02:03:40.860 |
so you can, if you see an oven that's not in use, 02:03:49.380 |
And that's sort of what the restaurant metaphor 02:04:03.020 |
to what degree can any of these ideas be integrated and not. 02:04:15.780 |
- Wow, yeah, so we had this really old library 02:04:27.700 |
and networking IO was especially sort of a popular topic. 02:04:38.860 |
we had a brief period where there was lots of development, 02:04:45.100 |
and I think it was late '90s, maybe early 2000s, 02:04:53.260 |
that were the state of the art of doing asynchronous IO, 02:05:13.820 |
or reading and writing to a hard drive, to storage. 02:05:17.700 |
- And you can do the ideas you could do to multiple 02:05:24.940 |
So running some code that does some fancy stuff. 02:05:28.100 |
- Yeah, like when you're writing a web server, 02:05:32.740 |
a user sort of needs to see a particular web page, 02:05:37.100 |
you have to find that page maybe in the database 02:05:40.580 |
and format it properly and send it back to the client, 02:05:46.540 |
waiting for the database, waiting for the network, 02:05:51.500 |
or millions of requests concurrently on one machine. 02:05:55.700 |
Anyway, ways of doing that in Python were kind of stagnated, 02:06:00.460 |
and I forget, it might've been around 2012, 2014, 02:06:05.460 |
when someone for the umpteenth time actually said, 02:06:18.500 |
are not quite enough to solve my particular problem, 02:06:35.940 |
about what the right third-party library was. 02:06:39.020 |
And somehow I felt that there was actually a cue for, 02:06:44.020 |
well, maybe we need a better state-of-the-art module 02:06:50.620 |
in the standard library for multiplexing input/output 02:06:57.540 |
You could say that it spiraled out of control a little bit, 02:07:03.380 |
Python enhancement proposal that was ever proposed. 02:07:09.060 |
- At the time, I was very much involved with that, 02:07:18.780 |
who had already developed serious third-party libraries 02:07:30.860 |
and eventually we put it in the standard library, 02:07:51.300 |
- So initially, what are some of the design challenges there 02:07:58.460 |
what are some things that got accepted that stand out to you? 02:08:06.980 |
and this happens sort of at an architectural level 02:08:23.180 |
say a connection with a web browser that's your client, 02:08:26.660 |
and say you're waiting for an incoming request. 02:08:45.100 |
like a packet came in on that network connection. 02:08:58.660 |
and we can only manage one web connection at a time, 02:09:09.940 |
and I'm just blocked until something comes in, 02:09:23.620 |
no, sort of, I'm waiting for the next packet. 02:09:29.180 |
One is a paradigm where there is sort of notionally 02:09:35.420 |
whether it's an actual operating system thread 02:09:37.900 |
or more an abstraction in asyncio, we call them tasks. 02:09:41.340 |
But a task in asyncio or a thread in other contexts 02:09:54.620 |
like first wait for the first line of the web request, 02:09:58.900 |
parse it, because then you know if it's a GET or a POST 02:10:14.540 |
and then wait for the rest of the data to come in 02:10:33.140 |
where I just have a whole bunch of stacks in front of me, 02:10:45.540 |
and I say, "Oh, that packet goes on this pile," 02:10:51.260 |
and then sort of that pile provides my context, 02:10:57.820 |
I sort of, I can forget everything about what's going on, 02:11:13.500 |
I can toss it away or use it for a new space. 02:11:16.580 |
But several traditional third-party libraries 02:11:29.500 |
of different stacks of paper in front of you, 02:11:38.660 |
And that leads to a certain style of spaghetti code 02:11:44.580 |
that I find sort of aesthetically not pleasing, 02:12:01.840 |
It was very prevalent in JavaScript at the time at least, 02:12:06.340 |
because it was like how the JavaScript event loop 02:12:25.940 |
And I thought, I want to build a whole library 02:12:41.900 |
and tried to see how far I could get with that. 02:12:45.820 |
And it turns out that it's a pretty good paradigm. 02:12:48.980 |
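The task style, where one piece of code waits for the first line, parses it, and then waits for the rest, can be sketched with `asyncio`. The handler `handle_request` and the sample request bytes are made up, and an in-memory `asyncio.StreamReader` stands in for a real network connection.

```python
import asyncio


async def handle_request(reader: asyncio.StreamReader) -> tuple[str, str, bytes]:
    # First wait for the first line of the web request, then parse it.
    request_line = await reader.readline()
    method, path, _version = request_line.decode().split()

    # Skip the header lines up to the blank line that ends them.
    while (await reader.readline()) not in (b"\r\n", b"\n", b""):
        pass

    # Then wait for the rest of the data to come in.
    body = await reader.read()
    return method, path, body


async def demo() -> tuple[str, str, bytes]:
    # Feed an in-memory reader instead of opening a socket.
    reader = asyncio.StreamReader()
    reader.feed_data(b"POST /fish HTTP/1.1\r\nHost: boat\r\n\r\nbait=worm")
    reader.feed_eof()
    return await handle_request(reader)


result = asyncio.run(demo())
```

The context for one connection lives in ordinary local variables, and each `await` marks a point where the event loop is free to work on other connections.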
- So people enjoy that kind of paradigm programming 02:12:58.620 |
So how does that all interplay with the infamous GIL, 02:13:08.620 |
and how does it dance beautifully with asyncio? 02:13:12.060 |
- The Global Interpreter Lock solves the problem 02:13:19.740 |
with either asynchronous or parallelism in mind at all. 02:13:49.820 |
that lets you do multiple things in parallel. 02:13:57.260 |
which is the operating system handles the threads for you. 02:14:01.720 |
And the program can pretend that there are as many CPUs 02:14:13.220 |
And those CPUs work completely independently. 02:14:20.380 |
the operating system sort of simulates those extra CPUs. 02:14:40.860 |
And so as libraries for multithreading were added to C, 02:15:04.300 |
Because they seemed at the time in the early '90s, 02:15:09.980 |
they seemed a cool, interesting programming paradigm. 02:15:16.020 |
at least at the time, felt was nice about the language 02:15:23.900 |
of all kinds of cool new operating system toys 02:15:30.220 |
Like I remember one or two years before threading, 02:15:36.140 |
I had spent some time adding networking sockets to Python. 02:15:46.940 |
that were in the BSD operating system, so Unix BSD. 02:15:50.480 |
But the nice thing was if you were using sockets from Python, 02:15:55.460 |
then all the things you can do wrong with sockets in C 02:15:59.100 |
would automatically give you a clear error message 02:16:07.140 |
"Well, we'll do the same thing with threading." 02:16:10.100 |
But we didn't really want to rewrite the interpreter 02:16:17.220 |
because that would be a very complex refactoring 02:16:22.220 |
of all the interpreter code and all the runtime code, 02:16:27.500 |
because all the objects were written with the assumption 02:16:32.300 |
And so we said, "Okay, well, we'll take our losses. 02:16:35.940 |
We'll provide something that looks like threads. 02:16:39.860 |
And as long as you only have a single CPU on your computer," 02:16:48.540 |
Because the whole idea of multiple threads in the OS 02:16:53.540 |
was that even if your computer only had one CPU, 02:16:57.420 |
you could still fire up as many threads as you wanted. 02:17:01.020 |
Well, within reason, maybe 10 or 12, not 5,000. 02:17:22.860 |
And then, of course, a couple of more iterations 02:17:26.500 |
of Moore's law, and computers getting faster. 02:17:29.600 |
And at some point, the chip designers decided 02:17:40.060 |
And so they could put multiple CPUs on one chip. 02:17:49.520 |
And that's where the solution we had in Python didn't work. 02:17:55.340 |
And that's sort of the moment that the GIL became infamous. 02:18:09.020 |
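A small sketch of the situation: CPU-bound pure-Python work run first sequentially and then on four threads. The function `count_primes` and the workload sizes are arbitrary choices for illustration. On a standard CPython build with the GIL, the threaded version generally is not expected to finish faster, because only one thread executes Python bytecode at a time; the sketch makes no timing claim and only shows that both approaches compute the same results.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit: int) -> int:
    """Deliberately CPU-bound pure-Python work: count the primes below limit."""
    return sum(
        all(n % d for d in range(2, int(n ** 0.5) + 1))
        for n in range(2, limit)
    )


LIMITS = [2_000, 2_000, 2_000, 2_000]

# One after another on a single thread.
sequential = [count_primes(limit) for limit in LIMITS]

# The same work handed to four operating system threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(count_primes, LIMITS))
```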
and share it between all the different operating system 02:18:14.920 |
And so as long as the hardware physically only had one CPU, 02:18:21.600 |
And then as hardware vendors were suddenly telling us all, 02:18:30.120 |
People started saying, "Oh, but we can use multiple threads 02:18:35.680 |
And then they discovered, "Oh, but actually all threads 02:18:42.640 |
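[Editorial illustration, not part of the conversation.] A minimal sketch of what is being described, assuming a standard CPython build with the GIL enabled: several threads can be started for CPU-bound, pure-Python work and they all return correct results, but only one of them executes Python bytecode at any moment, so adding threads typically does not make this kind of work finish sooner. The names `count_primes` and `run_in_threads` are hypothetical.

```python
import threading


def count_primes(limit):
    """CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_in_threads(limits):
    """Run count_primes once per limit, each call in its own OS thread."""
    results = [None] * len(limits)

    def worker(index, limit):
        results[index] = count_primes(limit)

    threads = [
        threading.Thread(target=worker, args=(i, lim))
        for i, lim in enumerate(limits)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return results


if __name__ == "__main__":
    # Correct answers, but under the GIL the threads take turns on the interpreter.
    print(run_in_threads([10_000, 10_000]))
```

Timing this against a plain sequential loop on a multi-core machine is one way to observe the effect; no timings are claimed here because they vary by machine and Python version.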
I mean, is there a way, are there ideas in the future 02:18:52.240 |
some tricky interpreters on top of interpreters 02:18:57.680 |
- Yeah, there are a couple of possible futures there. 02:19:02.520 |
The most likely future is that we'll get multiple 02:19:07.280 |
sub-interpreters, which each run a completely 02:19:15.220 |
- But there's still some benefit of sort of faster 02:19:25.060 |
- But it's also managing for you this running 02:19:33.540 |
- It's hidden from you, but you have to spend more time 02:19:39.180 |
Because the sort of, the attractive thing about the 02:19:43.960 |
multi-threaded model is that the threads can share objects. 02:19:48.860 |
At the same time, that's also the downfall of the 02:19:53.900 |
Because when you do share objects, you weren't, 02:19:58.260 |
and you didn't necessarily intend to share them, 02:20:01.460 |
or there were aspects of those objects that were not 02:20:06.460 |
reusable, you get all kinds of concurrency bugs. 02:20:11.420 |
And so the reason I wrote that little blog post 02:20:15.880 |
about semaphores was that concurrency bugs are just harder. 02:20:20.360 |
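[Editorial illustration, not part of the conversation.] A minimal sketch of the shared-object problem, using hypothetical names `Counter` and `hammer`. Several threads update one object; the `threading.Lock` makes each read-modify-write step exclusive. Without the lock, an update such as `self.value += 1` is not guaranteed to be atomic, so increments from different threads could in principle be lost, which is the kind of concurrency bug that is hard to reproduce and debug.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may perform the read-modify-write.
        with self._lock:
            self.value += 1


def hammer(counter, n_threads=4, n_increments=10_000):
    """Increment the same counter from several threads and return the total."""

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(hammer(Counter()))
```

The design choice here is to keep the lock inside the object, so callers cannot forget to take it.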
It would be nice if Python had no global interpreter lock, 02:20:28.540 |
but it would also cause a lot more software bugs. 02:20:34.380 |
The interesting thing is that there is still a possible 02:20:39.080 |
future where we are actually going to, or where we could 02:20:43.220 |
experiment at least with that, because there is a guy 02:20:48.220 |
working for Facebook who has developed a fork of CPython 02:20:54.240 |
that he called the no-GIL interpreter, where he removed 02:21:00.220 |
the GIL and made a whole bunch of optimizations 02:21:08.300 |
too much slower, and multi-threaded case will actually 02:21:15.740 |
And so that would be an interesting possibility 02:21:22.680 |
if we would be willing as Python core developers 02:21:34.920 |
And if we're willing to put up with the additional 02:21:38.540 |
complexity of the interpreter and the additional 02:21:42.300 |
sort of overhead for the single-threaded case. 02:21:49.080 |
there are enough people needing the speed of multiple 02:21:56.720 |
threads with their Python programs that it's worth 02:22:03.640 |
to sort of take that performance hit and that complexity hit. 02:22:08.640 |
And I feel that the GIL actually is a pretty nice 02:22:13.720 |
Goldilocks point between no threads and all threads 02:22:18.720 |
all the time, but not everybody agrees on that. 02:22:24.760 |
The sub-interpreters look like a fairly safe bet for 3.12. 02:22:45.640 |
Now, before you say it's currently a joke and probably not, 02:22:55.040 |
can you imagine possible features that Python 4.0 02:23:02.360 |
might have that would necessitate the creation 02:23:07.200 |
of the new 4.0 given the amount of pain and joy, 02:23:12.200 |
suffering and triumph that was involved in the move 02:23:36.720 |
which is one reason that sort of everybody is happy 02:23:43.280 |
that we've decided there's not going to be a 4.0 02:23:52.800 |
we'll sort of plan the transition very differently. 02:24:00.080 |
that transition caused for our users in the Python 3 case. 02:24:05.080 |
And had we known we could have sort of designed 02:24:10.720 |
Python 3 somewhat differently without making it any worse, 02:24:25.600 |
were capable of when it comes to that kind of transition. 02:24:32.080 |
like a year and a half before the Python 2 officially-- 02:24:49.360 |
- Everyone on the core team had basically moved 02:24:54.840 |
- It was purely, it was a little symbolic moment 02:25:03.520 |
that there was no longer going to be any new releases 02:25:45.960 |
- So that was a very difficult decision to cancel it, 02:25:51.680 |
So anyway, if we're going to have a Python 4, 02:25:54.600 |
we're going to have to have both a different reason 02:26:07.400 |
so I think you're implying that if there is a 4.0, 02:26:11.280 |
in some ways it would break back compatibility? 02:26:14.920 |
- Well, so here is a concrete thought I've had, 02:26:20.640 |
and I'm not unique, but not everyone agrees with this, 02:26:26.360 |
If we were to try something like that no-GIL Python, 02:26:32.400 |
my expectation is that it would feel just different enough, 02:26:40.600 |
at least for the part of the Python ecosystem 02:26:54.480 |
and that is like the entire machine learning, 02:27:06.400 |
And so those people would likely feel the pain the most 02:27:25.720 |
we could even say, suppose that after Python say 3.19, 02:27:33.720 |
Suppose that's the time when we flip the switch to 4.0, 02:27:43.520 |
So I would probably say that particular year, 02:27:48.520 |
the release that we name 4.0 will be syntactically, 02:27:54.840 |
it will not have any new syntactical features, 02:28:12.520 |
However, extension modules will have to make a change. 02:28:21.720 |
They will not have the same binary interface. 02:28:40.200 |
And so for a pure Python user, 4.0 would be a breeze, 02:28:45.200 |
except that there are very few pure Python users left 02:28:52.560 |
for something significant is using third-party extensions. 02:28:58.480 |
several hundreds of thousands of third-party extensions 02:29:23.680 |
- So there you can give a huge heads up to them 02:29:26.520 |
if you go to 4.0 to really keep developing it. 02:29:30.400 |
- Yeah, we'd probably have to do something like 02:29:32.800 |
several years before, who knows, maybe five years earlier, 02:30:00.760 |
you have to recompile Python from source for your platform 02:30:06.240 |
All you have to do is change one configuration variable 02:30:09.760 |
and then you just run make or configure and make 02:30:35.720 |
that's not a very practical thing for Python users, 02:31:15.960 |
where the Python 4 is more and more imminent. 02:31:28.280 |
that works for no-GIL Python for that new API. 02:31:33.880 |
And then sort of Python 4.0 is like the official moment 02:31:38.880 |
that the mayor comes out and cuts the ribbon. 02:31:47.640 |
is the default and maybe the only mode there is. 02:32:09.360 |
In your opinion, are there must-have PyPI libraries 02:32:15.240 |
- Oh my, I should really have a standard answer 02:32:19.800 |
for that question, but like a positive standard answer. 02:32:30.080 |
When I write Python code, I'm usually developing 02:32:43.440 |
So I tend to just use the standard library and-- 02:32:46.920 |
- That's where your focus is, that's where your mind is. 02:32:58.360 |
It's a good kind of landscape of what's missing 02:33:17.800 |
or maybe possibly multiple third-party implementations, 02:33:25.760 |
than they could when they're in the standard library. 02:33:33.040 |
to incorporate things like that in the standard library. 02:33:38.200 |
So I like that there is a lively package ecosystem 02:33:41.880 |
and that sort of recent trends in the standard library 02:33:56.840 |
that have not had a lot of change in a long time 02:34:02.040 |
and that maybe would be better off not existing 02:34:24.080 |
If you look through the commit history, it's very sad. 02:34:29.240 |
All cosmetic changes, like changes in the indentation style 02:34:34.760 |
or the name of this other standard library module 02:34:42.800 |
The API is identical to what it was 20 years ago. 02:34:47.320 |
- So speaking of packages, they have a lot of impact 02:34:54.000 |
Does it make sense to you why Python has become 02:35:00.920 |
So packages like PyTorch, TensorFlow, scikit-learn, 02:35:05.160 |
and even like the lower level stuff like NumPy, SciPy, 02:35:11.080 |
Can you like, does it make sense to you why it, 02:35:21.080 |
- Well, part of it is an effect that's as simple 02:35:25.320 |
as we're all driving on the right side of the road, right? 02:35:34.200 |
- It's, and part of it is not quite as fundamental 02:35:54.680 |
that it really looked like Perl was going to dominate 02:35:58.280 |
like biosciences, because DNA search was all based 02:36:02.600 |
on regular expressions and Perl has the fastest 02:36:05.120 |
and most comprehensive regular expression engine, still does. 02:36:14.040 |
Letting go of this kind of data processing system. 02:36:19.040 |
- The reasons why Python became the lingua franca 02:36:24.520 |
of scientific code and machine learning in particular 02:36:47.520 |
in the sort of computing division wrote me his memoirs 02:36:52.520 |
and he had his own view of how he helped something 02:37:00.160 |
he called computational steering into existence. 02:37:04.880 |
And this was the idea that you take libraries 02:37:30.440 |
specific applications and answer different questions. 02:37:39.960 |
to use say Fortran because Fortran was the language 02:37:47.120 |
And then the scientists would have to write an application 02:37:51.360 |
that sort of uses the library to solve a particular equation 02:37:59.560 |
And the same for C++ because there's interoperability. 02:38:06.080 |
So the dusty decks are written either in C++ or Fortran. 02:38:10.720 |
And so Paul Dubois was one of the people who, 02:38:31.400 |
mathematical algorithms of linear algebra and other stuff. 02:38:36.000 |
And so gradually some libraries started appearing 02:38:55.840 |
I thought that was like an outdated data type 02:39:02.800 |
and like Python was good and fast at string manipulation 02:39:09.920 |
were not very efficient and the multidimensional arrays 02:39:19.880 |
that Python had extensibility that was flexible enough 02:39:39.640 |
through sort of different parts of the scientific community. 02:39:44.640 |
I remembered that the Hubble Space Telescope people 02:39:47.920 |
in Baltimore were somehow big Python fans in the late '90s. 02:39:52.800 |
And at various points, small improvements were made 02:39:57.800 |
and more people got in touch with using Python 02:40:02.680 |
to drive these libraries of interesting algorithms. 02:40:14.880 |
say they're all working on stuff that comes in 02:40:28.240 |
but the underlying libraries are still the same. 02:40:39.480 |
or I wrote a Python library to solve this class of problems. 02:40:43.960 |
And the other guys either say, oh, I can use that library too 02:40:48.440 |
or if you make a few changes, I can use that library too. 02:40:59.400 |
for arrays of numbers yet, whereas in Python you have it. 02:41:04.720 |
And so more and more scientists at different places 02:41:16.360 |
for an important new fundamental library decided, 02:41:20.040 |
oh, Python is actually already known to our users. 02:41:28.880 |
I think that's how Tensor, I imagine at least 02:41:37.920 |
there's a deeper history of what the community, 02:41:42.840 |
so it's not just like what packages it needs. 02:41:50.360 |
had a prior library that was internal to Google 02:41:55.160 |
but there was also competing machine learning frameworks 02:42:08.820 |
And it's interesting because there's other languages 02:42:16.560 |
that a lot of people used but different design choices 02:42:26.000 |
And one of the choices of MATLAB by MathWorks 02:43:11.800 |
With Python it feels like you can build a package 02:43:16.120 |
and get excited about sharing that package with others. 02:43:19.040 |
And that creates an excitement about a language. 02:43:22.300 |
- I tend to like Python's approach to open source 02:43:34.920 |
There's obviously some because like you all need to decide 02:43:38.960 |
whether you drive on the left or the right side 02:43:42.680 |
But there is a lot of access for people with little power. 02:43:47.340 |
You don't have to work for a big tech company 02:43:52.500 |
We have affordable events that really care about community 02:44:21.580 |
They do some, but most of the money that the PSF forks out 02:44:39.940 |
it was just after you stepped down from your role 02:44:47.500 |
Looking back, what are your insights and lessons 02:44:52.440 |
about Python developer community, about human nature, 02:45:07.320 |
I remember being just extremely stressed for a long time 02:45:13.800 |
and it wasn't very clear to me what was leading, 02:45:30.920 |
I should have sort of relinquished my central role 02:45:39.080 |
- What were the pros and cons of the BDFL role? 02:45:42.880 |
Like what were the, you not relinquishing it, 02:45:45.320 |
what are the benefits of that for the community? 02:45:50.560 |
- Well, the benefits for the community would be things like 02:45:58.920 |
clarity of vision and sort of a clear direction 02:46:03.920 |
because I had certain ideas in mind when I created Python. 02:46:19.980 |
and became more successful and more complex and more used, 02:46:47.780 |
It modeled to the community how to think about 02:46:58.560 |
- It was a source of stress for me personally, 02:47:08.520 |
had learned how I was thinking and could predict 02:47:13.080 |
but how I would decide about a particular issue 02:47:33.280 |
and they roll all that back and do those kinds of things. 02:47:36.640 |
There is a clear, fairly straight path ahead. 02:47:45.080 |
with the steering council has sort of found a similar way 02:47:50.680 |
of leading the community in a fairly steady direction 02:48:03.640 |
Yeah, oh yeah, there's a bug in multiprocessing. 02:48:07.800 |
Let someone else decide whether that's important 02:48:18.640 |
- Yeah, it allows you to focus a little bit more. 02:48:21.640 |
- What are interesting differences in culture 02:48:25.120 |
if you can comment on between Google, Dropbox 02:48:27.480 |
and Microsoft from a Python programming perspective, 02:48:32.920 |
Is there a difference or is it just about people 02:48:41.760 |
- So Dropbox is much smaller than the other two 02:48:52.600 |
- The set of products they provide is narrower 02:49:03.720 |
had the tendency of sort of making a big plan, 02:49:08.720 |
putting the whole company behind that plan for a year 02:49:12.400 |
and then evaluate and then suddenly find that 02:49:19.960 |
and then they had to do something completely different. 02:49:22.800 |
So there was like the annual engineering reorg 02:49:28.480 |
was sort of an unpleasant tradition at Dropbox 02:49:31.800 |
because like, oh, there's a new VP of engineering 02:49:34.520 |
and so now all the directors are being reshuffled 02:49:37.280 |
and this guy was in charge of infrastructure one year 02:49:49.600 |
you don't think about these companies internally 02:49:57.440 |
There's certain like programs and online services 02:50:04.360 |
but one of the powers of those kinds of services, 02:50:08.680 |
You're not supposed to think about how it all works 02:50:19.120 |
and like don't have to worry about conflicts. 02:50:23.440 |
you know, as a person that comes from a version 02:50:30.120 |
and just keeping different versions of different files 02:50:34.120 |
The fact that they could take care of that is just, 02:50:36.920 |
The engineering behind the scenes must be super difficult 02:50:40.440 |
both on the compute infrastructure and the software. 02:50:49.100 |
but the product itself always worked very smoothly. 02:50:54.560 |
Well, there's probably a lot of lessons to that. 02:50:59.920 |
but if the product is good, the product is good 02:51:06.840 |
it's like with Google, focus on the search and the ads. 02:51:19.640 |
in what ways do you provide value and happiness 02:51:25.700 |
Is there something else to say about Google and Microsoft? 02:51:29.680 |
Microsoft has had a very fascinating shift recently 02:51:50.400 |
that I would stay retired for the rest of my life, 02:52:01.880 |
that work can also provide a source of fulfillment, 02:52:30.000 |
I mean, I've been talking to a bunch of Excel people lately 02:52:49.840 |
there've been so many incredible tools through the years. 02:52:57.640 |
is that I've never learned how to use Excel well. 02:53:02.080 |
I mean, it just always felt like so many features are there. 02:53:11.200 |
to the dumbest way to use a thing to get the job done 02:53:14.120 |
when clearly there's so much more power at your fingertips. 02:53:18.600 |
- But I do think there's probably expert users of Excel. 02:53:39.720 |
- Okay, now I need to definitely learn Excel a little better. 02:53:47.760 |
I mean, Microsoft sometimes, it's changed over the years, 02:53:51.420 |
but sometimes they kind of want to make things easier 02:54:00.600 |
that like to have shortcuts and all that kind of stuff 02:54:05.120 |
Now, Excel's probably, people are probably yelling at me. 02:54:07.760 |
It's like, no, Excel probably has a lot of ways 02:54:21.040 |
And now, like I'm embarrassed that it's just-- 02:54:32.680 |
because they go back even longer than 35 years. 02:54:39.880 |
and how hard it is for a CEO to sort of pivot a company 02:54:43.480 |
towards open source, towards developer culture? 02:54:48.200 |
what's the role of leadership in such a pivot 02:54:54.120 |
- I've never met him, but I hear he's just a really sharp, 02:55:03.800 |
but he also has an incredible business sense. 02:55:09.000 |
He took the organization that had very solid pieces, 02:55:23.640 |
I imagine in part through his personal charm and thinking, 02:55:36.160 |
and sort of change it from openly hostile to open source 02:55:51.440 |
but that means that there's room for a product like VS Code, 02:56:07.720 |
'cause it gets harder and harder as the company gets large. 02:56:10.720 |
You wrote a blog post in response to a person 02:56:13.960 |
looking for advice about whether with a CS degree 02:56:16.640 |
to choose a nine to five job or to become an entrepreneur. 02:56:23.000 |
If you just think from first principles right now, 02:56:26.200 |
somebody has taken a few years in programming, 02:56:31.040 |
in some sense creating Python is an entrepreneurial endeavor. 02:56:40.640 |
Do I work for a big company or do I create something new? 02:56:54.520 |
- Yeah, I mean, big companies have individuals 02:56:58.600 |
who create new stuff that eventually grows big all the time. 02:57:03.600 |
- And if you're the person that creates a new thing 02:57:08.360 |
to move up quickly in the company to run that thing. 02:57:11.280 |
- If that's your aspiration, what can also happen 02:57:19.320 |
and sort of builds a great first version of a product 02:57:25.320 |
and has no aspirations to then become a manager 02:57:30.320 |
and grow the team from five people to 20 people 02:57:40.400 |
And they move on to inventing another crazy thing 02:57:51.200 |
or they move to a different great large or small company. 02:57:58.560 |
And sometimes people sort of do have this whole trajectory 02:58:07.520 |
not nine to five, but more like noon till midnight, 02:58:13.360 |
seven days a week, and coming up with a product 02:58:22.840 |
I mean, if you take Drew Houston, Dropbox's founder, 02:58:47.200 |
if he always aspired that, I think when he was 16, 02:59:00.600 |
sort of skillset needed to grow and stay on top. 02:59:05.600 |
And other people sort of are brilliant engineers 02:59:12.200 |
I count myself at least in the second category. 02:59:19.400 |
is to be the quote unquote individual contributor. 02:59:24.240 |
- Do you have advice for a programming beginner 02:59:32.520 |
- Find something you actually want to do with it. 02:59:50.720 |
and it can be a crazy problem you want to solve. 03:00:03.680 |
into actually learning coding in some language. 03:00:12.760 |
you can look for, like that doesn't have to be 03:00:30.560 |
- Nowadays, you can take machine learning components 03:00:49.400 |
can get you to start using pre-trained models 03:00:56.200 |
'cause you learn just enough to run this model, 03:01:04.200 |
how to write basic I/O, how to run functions. 03:01:13.280 |
but it could be nice to just fall in love first 03:01:31.440 |
where he said, "I see all these ads for things 03:01:40.320 |
And he said, "The goal should be learn Python in 10 years." 03:01:45.240 |
- That's hilarious, but I completely disagree with that. 03:01:51.480 |
the places just like the blog post from earlier, 03:01:58.880 |
they're actually usually really bad tutorials. 03:02:01.040 |
So the thing is, I do believe that you can learn a thing 03:02:05.360 |
in an hour to get some interesting, quick, it hooks you. 03:02:11.680 |
But it just takes a tremendous amount of skill 03:02:16.080 |
Richard Feynman was able to condense a lot of ideas 03:02:25.440 |
I think the 10 years is about the experience, 03:02:29.800 |
the pain along the way, and there's something fundamental. 03:02:33.200 |
You can memorize the syntax, but, well, I couldn't, 03:02:42.240 |
- Yeah, actually, coding has changed in fascinating ways 03:02:53.560 |
And I don't wanna talk down to that kind of style of coding 03:03:20.000 |
in a line of text that otherwise it generated perfectly. 03:03:34.120 |
And so begin is blah, blah, blah, search for begin. 03:03:46.520 |
and it completes the whole line with end instead of begin. 03:03:52.720 |
Sometimes it sort of, if I name my function right, 03:04:13.080 |
I'm very much appreciative of all the typing it does for me. 03:04:18.080 |
Much better actually than the previous generation 03:04:23.360 |
of suggestions that are also still built in VS Code 03:04:29.960 |
it tries to guess what the type is of the variable 03:04:37.320 |
a pop down menu of what the attributes of that object are. 03:04:42.120 |
But Copilot is much, much smoother than that. 03:04:49.320 |
Do you think, do you worry about the future of that? 03:04:56.400 |
the increasing amount of that kind of capability, 03:05:03.600 |
or is there still a significant role for humans? 03:05:14.640 |
and you shouldn't try to use it to do something 03:05:18.920 |
that you have no way of understanding what you're doing yet. 03:05:32.920 |
which I could do, I could look up how to do it, 03:05:47.320 |
Does it use a builder object or a constructor 03:06:05.040 |
what you want the code to do is totally yours. 03:06:16.400 |
You ever imagine a future of human civilization 03:06:29.440 |
- It'll eventually become sort of a legacy language 03:06:41.840 |
just like all kinds of basic structures in biology, 03:06:51.120 |
- So it permeates all of life, all of digital life, 03:06:57.680 |
and they only know the stuff that's on top of it. 03:07:12.120 |
- Yeah, or even think about it or even learn about it, 03:07:38.240 |
I learned some of the basic, at least concepts 03:07:57.440 |
And I can forget about all that most of the time, 03:08:00.080 |
but I sort of, I enjoy knowing, oh, if you go deeper, 03:08:11.760 |
And when it comes to the point of how do you actually 03:08:20.280 |
- But you enjoy knowing that you can walk a while 03:08:23.640 |
towards the lower and lower layers, but you don't need to. 03:08:28.360 |
- The other day as a sort of a mental exercise, 03:08:49.280 |
Yeah, there's like this electromagnetic force 03:08:55.120 |
And you can have like, it can open one switch 03:09:00.160 |
and shut another, and you can have multiple contacts 03:09:07.200 |
And how many relays do I really need to sort of represent 03:09:14.560 |
And it was, I don't think I got to the final solution, 03:09:18.480 |
but it was fun that I could still do a little bit 03:09:23.400 |
of problem solving and thinking at that level. 03:09:26.960 |
- And it's cool how we build on top of each other. 03:09:33.520 |
and there's others who'll stand on your shoulders, 03:09:38.360 |
- Yeah, I feel I sort of covered this middle layer 03:09:41.920 |
of the technology stack where it sort of peters out 03:10:00.800 |
that will help us understand the lowest layer 03:10:03.120 |
of the physics, and thereby the universe figures out 03:10:29.400 |
incredible parallel operations like image recognition. 03:10:35.840 |
Does huge amount of processing that goes on in parallel. 03:10:40.120 |
There's lots of nerves between my eyes and my brain, 03:10:43.560 |
and the brain does a whole bunch of stuff all at once, 03:10:48.960 |
but there are many of them that all work together. 03:10:57.220 |
I have to sort of string words together one at a time, 03:11:09.100 |
I'm also thinking of everything like one step at a time. 03:11:13.880 |
And so we've sort of, we've got all this incredible 03:11:26.680 |
a single threaded, much, much higher level interpreter. 03:11:31.680 |
- That's exactly, I mean, that's the illusion of it. 03:11:39.280 |
that it's a single sequential set of thoughts, 03:11:53.240 |
The information and how to use that information 03:12:03.040 |
And so you don't buy a computer, you buy like a-- 03:12:21.560 |
It gets stale, but gives birth to young computers 03:12:32.560 |
And those computers, when they go to college, 03:12:40.680 |
increasingly higher and higher levels of abstractions. 03:12:46.700 |
you see the same thing appearing at different levels, though, 03:13:02.400 |
but then the animal, or the plant, or the human, 03:13:11.040 |
that is sort of connected in a very complicated way 03:13:16.040 |
to the mechanism of replication of the cells. 03:13:22.600 |
if you see how DNA and proteins are connected, 03:13:26.640 |
then there is yet another completely different mechanism 03:13:33.520 |
using enzymes and a little bit of code from DNA. 03:13:39.640 |
And of course, viruses break into it at that level. 03:13:44.160 |
- And while the mechanisms might be different, 03:13:46.940 |
it seems like the nature of the mechanism is the same, 03:14:03.560 |
and then all the way down to the single-cell organisms. 03:14:07.720 |
- It is fascinating to see what abstraction levels 03:14:18.600 |
that sort of have a similar self-preservation, 03:14:24.500 |
I don't know what it is, instinct, nature, abstraction, 03:14:33.400 |
- And they self-replicate and breed in different ways. 03:14:48.480 |
the higher-level organism of human civilization 03:14:51.720 |
as part of this bigger organism of life on Earth itself. 03:14:55.480 |
In fact, that could be an organism just alone, 03:15:03.960 |
both philosophical and technical conversation. 03:15:13.240 |
and one of the earliest first people I've talked to, 03:15:18.520 |
It's just a huge honor that you did it at that time, 03:15:28.600 |
please check out our sponsors in the description. 03:15:31.240 |
And now, let me leave you with some words from Oscar Wilde. 03:15:39.080 |
Thank you for listening, and hope to see you next time.