Bjarne Stroustrup: C++ | Lex Fridman Podcast #48
Chapters
0:00 Introduction
1:40 First program
2:18 First programming language
4:21 Type system
6:18 Programming languages
10:14 Object-oriented programming
13:20 Lisp
16:45 Languages
22:27 Larger code bases
25:07 Efficiency and reliability
27:32 Safety and reliability
29:27 Simplification
31:52 Code review
35:52 Static analysis
39:27 What is static analysis
41:16 How do you design
47:12 The magic of C
50:01 Different implementations of C
54:00 Key features of C
58:04 Inheritance
00:00:00.000 |
The following is a conversation with Bjarne Stroustrup. 00:00:03.120 |
He's the creator of C++, a programming language that after 40 years 00:00:08.240 |
is still one of the most popular and powerful languages in the world. 00:00:12.480 |
Its focus on fast, stable, robust code underlies many of the biggest systems in 00:00:17.440 |
the world that we have come to rely on as a society. 00:00:20.720 |
If you're watching this on YouTube, for example, many of the critical back-end 00:00:24.480 |
components of YouTube are written in C++. Same goes for 00:00:28.560 |
Google, Facebook, Amazon, Twitter, most Microsoft 00:00:32.240 |
applications, Adobe applications, most database systems, and most physical 00:00:37.440 |
systems that operate in the real world, like cars, robots, rockets that launch us 00:00:42.560 |
into space, and one day will land us on Mars. 00:00:46.480 |
C++ also happens to be the language that I use more than any other in my life. 00:00:52.640 |
I've written several hundred thousand lines of C++ source code. 00:00:56.560 |
Of course, lines of source code don't mean much, but they do give hints of my 00:01:01.520 |
personal journey through the world of software. I've enjoyed watching the 00:01:05.440 |
development of C++ as a programming language, 00:01:08.320 |
leading up to the big update to the standard in 2011, 00:01:12.480 |
and those that followed in '14, '17, and toward the new C++20 00:01:18.160 |
standard, hopefully coming out next year. This is 00:01:21.840 |
the Artificial Intelligence Podcast. If you enjoy it, 00:01:24.880 |
subscribe on YouTube, give it five stars on iTunes, 00:01:28.000 |
support it on Patreon, or simply connect with me on Twitter 00:01:31.440 |
at Lex Fridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with Bjarne 00:01:38.240 |
Stroustrup. What was the first program you've ever 00:01:42.400 |
written? Do you remember? It was my second year in 00:01:46.880 |
university, first year of computer science, and it 00:01:56.720 |
superellipse and then connected points on the perimeter, creating star 00:02:12.800 |
Yeah, I learned to program the second year in university. 00:02:21.360 |
if I may ask it this way, that you fell in love with? 00:02:34.000 |
Snobol, I remember Fortran, didn't fall in love with that, I remember 00:02:40.320 |
Pascal, didn't fall in love with that, it all 00:02:43.440 |
got in my way. And then I discovered Assembler, and 00:02:48.160 |
that was much more fun. And from there I went to 00:02:53.200 |
Microcode. So you were drawn to the, you found the low-level stuff 00:03:00.240 |
beautiful. I went through a lot of languages and 00:03:03.840 |
then I spent significant time in Assembler and 00:03:11.040 |
profitable things, and I paid for my master's actually. 00:03:15.360 |
And then I discovered Simula, which was absolutely great. 00:03:19.520 |
Simula? Simula was an extension of Algol 60, 00:03:25.920 |
done primarily for simulation, but basically they invented object-oriented 00:03:30.400 |
programming with inheritance and runtime polymorphism 00:03:37.520 |
And that was the language that taught me that you could have 00:03:43.760 |
the sort of the problems of a program grow with size of the program rather 00:03:52.160 |
program. That is, you can actually modularize 00:04:00.640 |
It was also a surprise to me that a stricter type system than Pascal's 00:04:07.440 |
was helpful, whereas Pascal's type system got in my way all the time. 00:04:13.040 |
So you need a strong type system to organize your code well, but it has to 00:04:19.360 |
be extensible and flexible. Let's get into the details a little bit. 00:04:23.200 |
If you remember, what kind of type system did Pascal have? 00:04:27.120 |
What type system did Algol 60 have? 00:04:31.040 |
Basically, Pascal was sort of the simplest language that 00:04:36.320 |
Niklaus Wirth could define that served the needs of Niklaus 00:04:41.040 |
Wirth at the time. And it has a sort of a highly 00:04:50.560 |
in Pascal, it's good. And if you can't, it's not so good. 00:04:54.560 |
Whereas Simula allowed you basically to build your own type system. 00:05:07.200 |
Niklaus Wirth's world, Kristen Nygaard's language and 00:05:12.160 |
Ole-Johan Dahl's language allowed you to build your own. 00:05:15.760 |
So it's sort of close to the original idea of you build a domain-specific 00:05:23.200 |
language. As a matter of fact, what you build is a 00:05:27.440 |
set of types and relations among types that allows you 00:05:34.800 |
application. So when you say types, stuff you're saying has echoes of 00:05:40.240 |
object-oriented programming. Yes, they invented it. Every language 00:05:44.800 |
that uses the word "class" for type is a descendant of 00:05:55.040 |
Kristen Nygaard and Ole-Johan Dahl were mathematicians, 00:06:03.680 |
But they understood sets and classes of elements, and so they called their types 00:06:14.640 |
Simula, classes are user-defined types. So can you try the impossible task and 00:06:20.960 |
give a brief history of programming languages from your 00:06:25.360 |
perspective? So we started with Algol 60, Simula, 00:06:30.080 |
Pascal, but that's just the 60s and 70s. I can try. 00:06:36.560 |
The most sort of interesting and major improvement of programming languages was 00:06:43.600 |
Fortran, the first Fortran. Because before that, 00:06:47.120 |
all code was written for a specific machine, and each 00:06:50.720 |
specific machine had a language, an assembly language or 00:06:56.240 |
macro assembler or some extension of that idea. 00:07:09.280 |
Backus and his team at IBM built a language that 00:07:13.680 |
would allow you to write what you really wanted. 00:07:17.840 |
That is, you could write it in a language that was natural for people. 00:07:23.280 |
Now these people happened to be engineers and physicists, so 00:07:26.800 |
the language that came out was somewhat unusual for the rest of the world. 00:07:30.560 |
But basically they said "formula translation" because they wanted to have 00:07:34.240 |
the mathematical formulas translated into the machine. And as a 00:07:39.120 |
side effect, they got portability. Because now they are 00:07:44.240 |
writing in the terms that the humans used, and 00:07:48.640 |
the way humans thought. And then they had a program that 00:07:52.800 |
translated it into the machine's needs. And that was new, and that was 00:07:58.320 |
great. And it's something to remember. We want to 00:08:02.240 |
raise the language to the human level, but we don't want to lose the efficiency. 00:08:09.200 |
And that was the first step towards the human. 00:08:12.320 |
That was the first step. And of course, they were very particular kind of 00:08:16.960 |
humans. Business people were different, so they 00:08:20.160 |
got Cobol instead, and etc. And Simula came out. 00:08:25.840 |
No, let's not go to Simula yet. Let's go to Algol. 00:08:30.720 |
Fortran didn't have, at the time, the notions of... 00:08:36.640 |
not a precise notion of type, not a precise notion of scope, 00:08:41.920 |
not a set of translation phases that was what we have today. 00:08:49.520 |
Lexical, syntax, semantics. It was sort of a bit of a muddle in the early days, but 00:08:56.560 |
hey, they'd just done the biggest breakthrough in the history of 00:09:00.240 |
programming, right? So you can't criticize them for not having gotten all the 00:09:04.560 |
technical details right. So we got Algol. That was very pretty. 00:09:14.400 |
considered it useless, because it was not flexible enough, and it wasn't 00:09:18.800 |
efficient enough, and etc. But that was a breakthrough from a 00:09:24.480 |
technical point of view. Then Simula came along to make that idea 00:09:30.640 |
more flexible, and you could define your own types. 00:09:39.760 |
Kristen Nygaard, who's the main idea man behind 00:09:43.040 |
Simula. That was late 60s. This was late 60s. 00:09:46.720 |
Well, I was a visiting professor in Aarhus, and so I learned object-oriented 00:09:52.480 |
programming by sitting around and, well, in theory, 00:10:02.720 |
But Kristen, once he got started and in full flow, it's very hard to get a 00:10:08.880 |
word in edgeways. You're just listening. So it was great. I learned it from there. 00:10:14.160 |
Not to romanticize the notion, but it seems like a big leap 00:10:21.840 |
It's really a leap of abstraction. Yes. And was that as 00:10:36.560 |
at the time? It was not obvious, and many people have tried to do 00:10:43.200 |
something like that, and most people didn't come up with 00:10:46.240 |
something as wonderful as Simula. Lots of people got their 00:10:51.520 |
PhDs and made their careers out of forgetting about Simula or never 00:10:56.800 |
knowing it. For me, the key idea was basically I 00:11:00.640 |
could get my own types. And that's the idea that goes further 00:11:10.640 |
more flexible types and more efficient types. But it's still the 00:11:14.160 |
fundamental idea. When I want to write a program, I want to write it with my types 00:11:19.200 |
that is appropriate to my problem and under the constraints that I'm under 00:11:29.840 |
And that's the key idea. People picked up on the class hierarchies 00:11:36.400 |
and the virtual functions and the inheritance, and 00:11:41.760 |
that was only part of it. It was an interesting and major part and still a 00:11:47.440 |
major part in a lot of graphic stuff, but it was not the most fundamental. 00:11:53.600 |
It was when you wanted to relate one type to another, you don't want 00:11:58.320 |
them all to be independent. The classical example is that you 00:12:03.920 |
don't actually want to write a city simulation with vehicles, 00:12:10.400 |
where you say, well, if it's a bicycle, write the code for turning a bicycle to 00:12:15.360 |
the left. If it's a normal car, turn right the 00:12:18.480 |
normal car way. If it's a fire engine, turn right the fire engine way. 00:12:23.520 |
You get these big case statements and bunches of if statements and such. 00:12:34.880 |
that's the vehicle and say, turn left the way you want to. 00:12:39.600 |
And this is actually a real example. They used it to simulate 00:12:51.200 |
somewhere in Norway back in the 60s. So this was one of the early examples 00:12:58.480 |
for why you needed inheritance and you needed a runtime polymorphism. 00:13:09.520 |
vehicles in a manageable way. You can't just rewrite your code each 00:13:19.920 |
Yeah, that's a beautiful, powerful idea. And of course, it stretches through your 00:13:24.080 |
work with C++, as we'll talk about. But I think you've structured it nicely. 00:13:31.440 |
What other breakthroughs came along in the history of programming languages? 00:13:39.440 |
Obviously, I'm better at telling the part of the history that 00:13:42.800 |
that is the path I'm on, as opposed to all the paths. 00:13:46.560 |
Yeah, you skipped the hippie John McCarthy in Lisp, 00:13:50.160 |
one of my favorite languages. But Lisp is not one of my favorite 00:13:58.400 |
obviously interesting. Lots of people write code in it and then 00:14:02.640 |
they rewrite it into C or C++ when they want to go to production. 00:14:06.800 |
It's in the world I'm at, which are constrained by performance, 00:14:19.680 |
cost of hardware. I don't like things to be too dynamic. 00:14:26.480 |
It is really hard to write a piece of code that's perfectly flexible, 00:14:32.480 |
that you can also deploy on a small computer, 00:14:36.240 |
and that you can also put in, say, a telephone switch 00:14:39.280 |
in Bogota. What's the chance, if you get an error and you find yourself in the 00:14:44.480 |
debugger, that the telephone switch in Bogota on 00:14:51.360 |
The chance is zero. And so a lot of things I think most about 00:14:58.240 |
can't afford that flexibility. I'm quite aware that maybe 00:15:05.680 |
70, 80 percent of all code are not under the kind of constraints 00:15:11.120 |
I'm interested in. But somebody has to do the job I'm 00:15:19.280 |
high-level flexible languages to the hardware. 00:15:23.040 |
The stuff that lasts for 10, 20, 30 years is robust, 00:15:27.280 |
operates under very constrained conditions, yes, absolutely. 00:15:30.880 |
That's right. And it's fascinating and beautiful in its own way. 00:15:34.080 |
C++ is one of my favorite languages, and so is Lisp. So I can 00:15:40.880 |
embody it, too, for different reasons as a programmer. 00:15:47.120 |
I understand why Lisp is popular, and I can see 00:15:55.200 |
Smalltalk. It's just not as relevant in my world. 00:16:05.120 |
And by the way, I distinguish between those and the functional languages 00:16:13.600 |
Different kind of languages, they have a different kind of beauty, 00:16:18.160 |
and they're very interesting. And I actually try to learn from 00:16:23.680 |
all the languages I encounter to see what is there that would make 00:16:29.920 |
working on the kind of problems I'm interested in 00:16:33.680 |
with the kind of constraints that I'm interested in, 00:16:38.800 |
what can actually be done better, because we can surely do better than we do today. 00:16:45.760 |
You've said that it's good for any professional programmer to know at 00:16:49.920 |
least five languages, speaking about a variety of languages 00:16:54.240 |
that you've taken inspiration from. And you've listed 00:16:58.800 |
yours as being, at least at the time, C++, obviously, Java, Python, 00:17:05.440 |
Ruby, and JavaScript. Can you, first of all, update that list, modify it? 00:17:12.240 |
You don't have to be constrained to just five, but can you describe what 00:17:17.520 |
you picked up also from each of these languages, how you 00:17:22.080 |
see them as inspirations for even you working with C++? 00:17:25.920 |
This is a very hard question to answer. So, about languages, you should know 00:17:33.680 |
languages. I reckon I knew about 25 or thereabouts when I did C++. 00:17:41.040 |
It was easier in those days because the languages were smaller 00:17:44.800 |
and you didn't have to learn a whole programming environment and such to do 00:17:50.000 |
it. You could learn the language quite easily. 00:17:55.920 |
I imagine, just like with natural language for communication, 00:18:02.320 |
there's different paradigms that emerge in all of them. 00:18:08.800 |
So, I picked five out of a hat. You picked five out of a hat. 00:18:12.640 |
Obviously. The important thing is that the number is not one. 00:18:17.680 |
That's right. It's like, I don't like, I mean, if you're a monoglot, you are 00:18:22.880 |
likely to think that your own culture is the only one 00:18:26.000 |
superior to everybody else's. A good learning of a foreign language and a 00:18:30.160 |
foreign culture is important. It helps you think and be a 00:18:34.320 |
better person. With programming languages, you become a 00:18:37.440 |
better programmer, better designer with the second language. 00:18:41.680 |
Now, once you've got two, the way to five is not 00:18:45.600 |
that long. It's the second one that's most important. And then when I had to 00:18:56.400 |
thinking what kinds of languages are there. Well, there's a 00:19:00.080 |
really low level stuff. It's good. It's actually good to know machine code. 00:19:08.320 |
The C++ optimizers write better machine code than I do. 00:19:13.760 |
Yes. But I don't think I could appreciate them if I actually didn't 00:19:18.560 |
understand machine code and machine architecture. 00:19:22.480 |
At least in my position, I have to understand a bit of it. 00:19:30.320 |
in performance by a factor of a hundred. Right? It shouldn't be that if you are 00:19:35.760 |
interested in either performance or the size of the computer you have to 00:19:46.000 |
I used to mention C, but these days going low level 00:19:50.240 |
is not actually what gives you the performance. 00:19:53.280 |
It is to express your ideas so cleanly that you can think about it and the 00:20:01.200 |
My favorite way of optimizing these days is to throw 00:20:04.720 |
out the clever bits and see if it still runs fast. 00:20:09.120 |
And sometimes it runs faster. So I need the abstraction mechanisms or 00:20:14.800 |
something like C++ to write compact high performance code. 00:20:20.480 |
There was a beautiful keynote by Jason Turner at CppCon a couple of years 00:20:32.880 |
Motorola 6800 I think it was. And he says, "Well this is relevant 00:20:39.840 |
because it looks like a microcontroller. It has specialized hardware. It has not 00:20:48.320 |
And so he shows in real time how he writes Pong 00:20:52.480 |
starting with fairly straightforward low level stuff, 00:20:57.200 |
improving his abstractions. And what he's doing 00:21:06.400 |
into x86 assembler, which you can do with Clang, and you can see it in real 00:21:12.720 |
time. It's the Compiler Explorer which you can 00:21:16.800 |
use on the web. And then he wrote a little program that 00:21:23.760 |
Motorola Assembler. And so he types and you can see this thing 00:21:28.400 |
in real time. Wow. You can see it in real time and even if you can't read the 00:21:32.320 |
assembly code you can just see it. His code gets 00:21:35.840 |
better. The code, the assembler gets smaller. He 00:21:40.480 |
increases the abstraction level, uses C++11, as it were, better. 00:21:46.560 |
This code gets cleaner. It gets more maintainable. The code shrinks 00:21:50.720 |
and it keeps shrinking. And I could not in any reasonable amount 00:22:01.680 |
compiler generated from really quite nice modern C++. 00:22:06.560 |
And I'll go as far as to say that the thing that looked like C 00:22:10.640 |
was significantly uglier, and larger 00:22:18.960 |
when it became machine code. So the abstractions that can be 00:22:24.720 |
optimized are important. I would love to see that kind of 00:22:28.560 |
visualization in larger code bases. Yeah. That might be beautiful. But you can't 00:22:32.960 |
show a larger code base in a one hour talk and have it fit on 00:22:37.360 |
screen. Right. So that's C and C++. So my two languages would be machine code 00:22:42.800 |
and C++. And then I think you can learn a lot 00:22:47.440 |
from the functional languages. So pick Haskell or ML, I don't care which. 00:22:57.200 |
of expressing especially mathematical notions really clearly 00:23:03.280 |
and having a type system that's really strict. 00:23:08.000 |
And then you should probably have a language for 00:23:11.360 |
sort of quickly churning out something. You could pick JavaScript. You could 00:23:17.600 |
pick Python. You could pick Ruby. What do you make of JavaScript in general? 00:23:22.560 |
So you're talking in the platonic sense about languages, about 00:23:27.600 |
what they're good at, what their philosophy of design is. But 00:23:31.920 |
there's also a large user base behind each of these languages and they use it 00:23:36.080 |
in the way sometimes maybe it wasn't really designed 00:23:39.120 |
for. That's right. JavaScript is used way beyond 00:23:42.080 |
probably what it was designed for. Let me say it this way. When you build a 00:23:46.160 |
tool, you do not know how it's going to be used. 00:23:49.520 |
You try to improve the tool by looking at how it's being used and when people 00:23:54.320 |
cut their fingers, and try to stop that from happening. 00:23:58.320 |
But really you have no control over how something is used. 00:24:02.480 |
So I'm very happy and proud of some of the things C++ is being used at 00:24:07.040 |
and some of the things I wish people wouldn't do. 00:24:10.640 |
Bitcoin mining being my favorite example. It uses as much energy as Switzerland 00:24:16.560 |
and mostly serves criminals. Yeah. But back to the languages. 00:24:23.280 |
I actually think that having JavaScript run in the browser 00:24:28.080 |
was an enabling thing for a lot of things. Yes, you could have 00:24:33.440 |
done it better, but people were trying to do it better 00:24:39.520 |
with sort of more principled language designs, but they just couldn't do it right. 00:24:44.800 |
And the non-professional programmers that write 00:24:49.760 |
lots of that code just couldn't understand them. So 00:24:53.840 |
it did an amazing job for what it was. It's not the prettiest 00:25:02.720 |
the prettiest language, but let's not be bigots here. 00:25:07.680 |
So what was the origin story of C++? You basically gave a few 00:25:14.800 |
perspectives of your inspiration of object-oriented 00:25:23.280 |
performance efficiency was an important thing you 00:25:26.640 |
were drawn to. Efficiency and reliability. Reliability. 00:25:35.600 |
really want my telephone calls to get through 00:25:39.200 |
and I want the quality of what I am talking coming out at the other end. 00:25:44.720 |
The other end might be in London or wherever. 00:25:49.840 |
And you don't want the system to be crashing. 00:25:57.040 |
It might be your bank account that is in trouble. There's different 00:26:02.640 |
constraints like in games, it doesn't matter too much if there's a crash. 00:26:06.240 |
Nobody dies and nobody gets ruined, but I'm interested in the combination of 00:26:16.320 |
speed of things being done, part of being able to do things that is 00:26:26.480 |
of larger systems. If you spend all your time 00:26:31.520 |
interpreting a simple function call, you are not going to have enough time to 00:26:37.280 |
do proper signal processing to get the telephone calls to sound right. 00:26:42.560 |
Either that or you have to have 10 times as many computers and you can't afford 00:26:46.880 |
your phone anymore. It's a ridiculous idea in the modern 00:26:51.120 |
world because we have solved all of those problems. 00:26:55.280 |
I mean they keep popping up in different ways because we 00:26:58.640 |
tackle bigger and bigger problems, so efficiency remains always an important 00:27:02.480 |
aspect. But you have to think about efficiency not just 00:27:06.240 |
as speed but as an enabler to important things and one of the things 00:27:18.080 |
When I press the pedal, the brake pedal of a car, 00:27:26.640 |
anything but a computer. That computer better work. 00:27:31.680 |
Let's talk about reliability just a little bit. So 00:27:34.960 |
modern cars have ECUs with millions of lines of code today. 00:27:42.560 |
So this is certainly especially true of autonomous vehicles where some of the 00:27:46.640 |
aspects of the control or driver assistance systems that steer 00:27:49.840 |
the car, keeping it in its lane. So how do you think, you know, I talk to 00:27:54.720 |
regulators, people in government who are very nervous about testing the 00:27:59.440 |
safety of these systems of software. Ultimately software that makes 00:28:13.600 |
First of all, safety like performance and like security 00:28:21.200 |
is the system's property. People tend to look at one part of a system at a time 00:28:26.800 |
and saying something like, "This is secure." That's 00:28:30.880 |
all right. I don't need to do that. Yeah, that piece of code is secure. I'll 00:28:39.760 |
reliability, if you want to have performance, if you want to have 00:28:43.600 |
security, you have to look at the whole system. 00:28:46.960 |
I did not expect you to say that, but that's very true. Yes. 00:28:50.160 |
I'm dealing with one part of the system and I want my part to be 00:28:53.920 |
really good, but I know it's not the whole system. 00:28:57.280 |
Furthermore, making an individual part perfect 00:29:04.000 |
may actually not be the best way of getting the highest degree of 00:29:07.680 |
reliability and performance and such. There's people who say C++ is not type 00:29:13.200 |
safe. You can break it. Sure. I can break anything that runs on a 00:29:18.880 |
computer. I may not go through your type system. 00:29:23.360 |
If I wanted to break into your computer, I'll probably try SQL injection. 00:29:28.640 |
It's very true. If you think about safety or even reliability at a system level, 00:29:38.160 |
it starts becoming hopeless pretty quickly in terms of 00:29:45.040 |
proving that something is safe to a certain level. 00:29:49.280 |
Because there's so many variables, it's so complex. Well, let's get back to 00:29:53.200 |
something we can talk about and actually make some progress on. 00:30:01.680 |
try and make sure they crash less often. The way you do that 00:30:09.040 |
is largely by simplification. It is not... the first step 00:30:15.760 |
is to simplify the code, have less code, have code that is less likely to go 00:30:21.040 |
wrong. It's not by runtime testing everything. 00:30:24.480 |
It is not by big test frameworks that you're using. 00:30:29.040 |
Yes, we do that also. But the first step is actually to make sure that when you 00:30:35.200 |
want to express something, you can express it directly in code 00:30:40.560 |
rather than going through endless loops and convolutions in your head 00:30:45.440 |
before it gets down to the code. If the way you are thinking about a 00:30:51.520 |
problem is not in the code, there is a missing 00:30:56.000 |
piece that's just in your head. And the code, 00:30:59.360 |
you can see what it does, but it cannot see what you 00:31:03.600 |
thought about it unless you have expressed things 00:31:06.240 |
directly. When you express things directly, you can 00:31:10.480 |
maintain it. It's easier to find errors. It's easier 00:31:13.680 |
to make modifications. It's actually easier to test it. And 00:31:18.240 |
lo and behold, it runs faster. And therefore you can use a 00:31:23.760 |
smaller number of computers, which means there's less 00:31:26.800 |
hardware that could possibly break. So I think the key here is 00:31:32.480 |
simplification. But it has to be, to use the Einstein 00:31:40.080 |
There are other areas with under constraint where you can be 00:31:44.880 |
simpler than you can be in C++, but in the domain I'm dealing with, 00:31:50.240 |
that's the simplification I'm after. So how do you inspire or 00:31:57.200 |
ensure that the Einstein level of simplification is reached? 00:32:03.360 |
So can you do code review? Can you look at code? 00:32:08.000 |
Is there, if I gave you the code for the Ford F-150 and said, 00:32:13.760 |
here, is this a mess or is this okay? Is it possible to tell? 00:32:20.080 |
Is it possible to regulate? An experienced developer can look 00:32:26.160 |
at code and see if it smells. I mix metaphors deliberately. 00:32:31.760 |
The point is that it is hard to generate something that is 00:32:43.360 |
really obviously clean and can be appreciated, but you can 00:32:50.000 |
usually recognize when you haven't reached that point. 00:32:53.840 |
And so if I, I've never looked at the F-150 code, so I wouldn't know. 00:33:03.360 |
But I know what I would be looking for. I'll be looking for some 00:33:07.520 |
tricks that correlate with bugs elsewhere. 00:33:10.720 |
And I have tried to formulate rules for what good code looks like. 00:33:19.200 |
And the current version of that is called the C++ core guidelines. 00:33:26.880 |
One thing people should remember is there's what you can do 00:33:31.760 |
in a language and what you should do. In a language, you have lots of things that 00:33:38.720 |
is necessary in some contexts, but not in others. 00:33:42.080 |
There's things that exist just because there's 00:33:45.040 |
30-year-old code out there and you can't get rid of it. 00:33:48.640 |
But you can have rules that say: when you create code, try and follow these rules. 00:33:54.640 |
This does not create good programs by themselves, 00:34:03.600 |
it limits the possibilities of mistakes. And basically, we are trying to 00:34:12.720 |
at the fairly simple level of where you use the language and how you use it. 00:34:17.760 |
Now, I can put all the rules for chiseling in marble. It doesn't mean 00:34:27.760 |
can do a masterpiece by Michelangelo. That is, there's something else to write 00:34:35.440 |
a good program. Just as there is something else to create 00:34:38.720 |
an important work of art. That is, there's some kind of 00:34:51.120 |
approach the sort of technical, the craftsmanship level of it. 00:35:02.960 |
was among other things, superb craftsman. They could express their ideas 00:35:10.640 |
using their tools very well. And so these days, I think what I'm doing, 00:35:18.000 |
what a lot of people are doing, we are still trying to figure out how 00:35:21.600 |
to use our tools very well. For a really good piece of code, 00:35:28.960 |
you need a spark of inspiration and you can't, I think, regulate that. 00:35:33.280 |
You cannot say, I'll buy your picture 00:35:38.640 |
only if you're at least Van Gogh. 00:35:44.640 |
There are things you can regulate, but not the inspiration. 00:35:50.400 |
I think that's quite beautifully put. It is true 00:35:54.160 |
that there is, as an experienced programmer, when you see code that's 00:36:05.920 |
you know it when you see it. And the opposite of that is code that 00:36:11.680 |
is messy, code that smells, you know when you see it. 00:36:14.960 |
And I'm not sure you can describe it in words except 00:36:18.400 |
vaguely through guidelines and so on. Yes, it's 00:36:21.520 |
easier to recognize ugly than to recognize beauty 00:36:26.800 |
in code. And the reason is that sometimes 00:36:30.080 |
beauty comes from something that's innovative and unusual. 00:36:34.000 |
And you have to sometimes think reasonably hard to appreciate that. 00:36:42.720 |
in common. And you can have static checkers and dynamic 00:36:55.200 |
most common mistakes. You can catch a lot of sloppiness mechanically. I'm a 00:37:07.840 |
because you can check for not just the language rules but for the usage of 00:37:11.840 |
language rules. And I think we will see much more 00:37:15.600 |
static analysis in the coming decade. Can you describe 00:37:19.520 |
what static analysis is? You represent a piece of code 00:37:25.840 |
so that you can write a program that goes over 00:37:30.320 |
that representation and look for things that are 00:37:45.600 |
resources are leaked. That's one of my favorite 00:37:50.640 |
problems. It's not actually all that hard in modern C++ but you can do it. 00:37:56.960 |
If you are writing in the C level you have to have a malloc and a free 00:38:01.360 |
and they have to match. If you have them in a single function you can 00:38:07.760 |
usually do it very easily. If there's a malloc here there should be a free there. 00:38:14.320 |
On the other hand, in between can be Turing-complete code and then it becomes 00:38:22.640 |
memory out of a function and then want to make sure that the free 00:38:29.200 |
is done somewhere else. Now it gets really difficult. 00:38:33.360 |
And so for static analysis you can run through a program 00:38:37.600 |
and you can try and figure out if there's any leaks. 00:38:42.880 |
And what you will probably find is that you will find some leaks 00:38:48.160 |
and you'll find quite a few places where your analysis can't be complete. 00:38:56.960 |
on the cleverness of your analyzer. And it might take a long time. Some of 00:39:03.840 |
these programs run for a long time. But if you combine 00:39:10.080 |
such analysis with a set of rules that says how 00:39:14.480 |
people could use it, you can actually see why the rules are violated. 00:39:19.760 |
And that stops you from getting into the impossible complexities. You don't want 00:39:28.720 |
So static analysis is looking at the code without running the code. 00:39:32.240 |
Yes. And thereby it's almost, not in production code, but it's almost 00:39:38.800 |
like an educational tool of how the language should be used. 00:39:43.760 |
It guides you. At its best, it would guide you in how you write future code. 00:39:52.400 |
Yes. So basically you need a set of rules for how you use the language. 00:39:56.960 |
Then you need a static analysis that catches your mistakes when you violate 00:40:04.560 |
the rules or when your code ends up doing things that it shouldn't, despite 00:40:09.760 |
the rules. Beyond that, there are the language rules. We can go further. 00:40:13.600 |
And again, it's back to my idea that I would much rather find errors before I 00:40:18.960 |
start running the code. If nothing else, once the code runs, if 00:40:24.160 |
it catches an error at run time, I have to have an error handler. 00:40:28.240 |
And one of the hardest things to write in code is error handling code, because 00:40:33.840 |
you know something went wrong. Do you know really exactly what went 00:40:38.400 |
wrong? Usually not. How can you recover when you 00:40:41.760 |
don't know what the problem was? You can't be 100% sure what the problem 00:40:46.400 |
was in many, many cases. And this is part of it. So yes, we need 00:40:54.240 |
good languages with good type systems. We need rules for how to use them. 00:40:58.880 |
We need static analysis. And the ultimate for static analysis is, 00:41:03.120 |
of course, program proof, but that still doesn't scale to the kind 00:41:07.600 |
of systems we deploy. Then we start needing testing and debugging. 00:41:15.280 |
So C++ is an object-oriented programming language that creates, 00:41:21.520 |
especially with its newer versions, as we'll talk about, higher and higher 00:41:24.560 |
levels of abstraction. So how do you design... 00:41:30.480 |
Let's even go back to the origin of C++. How do you design something with so much power that 00:41:39.520 |
is still something that you can manage, do static analysis on, you can 00:41:47.200 |
have constraints on, that can be reliable, all those things we've talked about, while balancing 00:41:58.000 |
high-level abstraction and efficiency? That's a good question. I could probably 00:42:03.760 |
have a year's course just trying to answer it. 00:42:08.400 |
Yes, there's a tension between efficiency and abstraction, 00:42:12.240 |
but you also get the interesting situation that you get the best 00:42:17.040 |
efficiency out of the best abstraction. And my main tool 00:42:23.680 |
for efficiency, for performance, actually is abstraction. 00:42:32.000 |
You said it was an object-oriented programming language. I actually never 00:42:35.520 |
said that. It's always quoted, but I never did. I 00:42:39.600 |
said C++ supports object-oriented programming and 00:42:44.080 |
other techniques. And that's important, because I think 00:42:52.800 |
complex, interesting problems require ideas and techniques from 00:42:59.520 |
things that have been called object-oriented, 00:43:04.560 |
data abstraction, functional, traditional C-style code, 00:43:11.840 |
all of the above. And so when I was designing C++, 00:43:18.720 |
I soon realized I couldn't just add features. 00:43:23.520 |
If you just add what looks pretty, or what people ask for, 00:43:27.200 |
or what you think is good, one by one, you're not going to get a 00:43:31.840 |
coherent whole. What you need is a set of guidelines 00:43:36.320 |
that guides your decisions. Should this feature be in, or 00:43:41.760 |
should this feature be out? How should a feature be modified before 00:43:46.640 |
it can go in, and such? And in the book I wrote 00:43:50.960 |
about that, "The Design and Evolution of C++," there's a 00:43:54.560 |
whole bunch of rules like that. Most of them are not 00:44:02.880 |
"Don't violate the static type system," because I like the static type system 00:44:07.200 |
for the obvious reason that I like things to be reliable on 00:44:13.760 |
reasonable amounts of hardware. But one of these rules is 00:44:19.360 |
the zero-overhead principle. The what kind of principle? The zero-overhead principle: if you 00:44:28.320 |
have an abstraction, it should not cost anything compared to writing the equivalent code at a lower level. 00:44:39.360 |
So if I have, say, a matrix multiply, it should be written in such a way 00:44:48.720 |
that you could not drop to the C level of abstraction and use arrays and 00:44:53.920 |
pointers and such and run faster. And so people have written such 00:45:00.800 |
matrix multiplications, and they've actually gotten 00:45:04.800 |
code that ran faster than Fortran, because once you had the right 00:45:08.560 |
abstraction, you can eliminate temporaries and do optimizations like 00:45:17.680 |
that. That's quite hard to do by hand in a 00:45:20.400 |
lower-level language. And there's some really nice examples of this, where in matrix 00:45:29.520 |
multiplication, the matrix abstraction allows you to write code that's simple and clear. 00:45:37.520 |
But with C++, it has the features so that you can also 00:45:41.040 |
have this thing run faster than if you hand-coded it. 00:45:45.360 |
Now, people have given that lecture many times, I and others, 00:45:50.320 |
and a very common question after the talk, where you have demonstrated that 00:45:54.640 |
you can outperform Fortran for dense matrix multiplication, people 00:46:01.680 |
would ask, "If I rewrote your code in C, how much faster would it run?" 00:46:06.000 |
The answer is, much slower. This happened the first time, actually, 00:46:12.400 |
back in the '80s, with a friend of mine called Doug McIlroy. 00:46:20.080 |
And so, the principle is, you should give programmers the tools so that their 00:46:26.720 |
abstractions can follow the zero-overhead principle. 00:46:30.320 |
Furthermore, when you put in a language feature in C++, 00:46:34.000 |
or a standard library feature, you try to meet this. 00:46:38.000 |
It doesn't mean it's absolutely optimal, but it means if you hand-code it 00:46:46.880 |
in C++, in C, you should not be able to better it. 00:46:51.040 |
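A tiny sketch of the zero-overhead idea under discussion, assuming nothing beyond the standard library. The Matrix type here is hypothetical and far simpler than a real linear-algebra library, but because everything is visible to the compiler at the point of use, the abstraction can compile down to the same loops you would write by hand over raw arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A minimal fixed-size matrix abstraction (illustrative only).
template <std::size_t N>
struct Matrix {
    std::array<double, N * N> a{};  // zero-initialized storage
    double& operator()(std::size_t i, std::size_t j) { return a[i * N + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * N + j]; }
};

// The compiler sees both the abstraction and the element access,
// so this inlines to plain loops over contiguous doubles.
template <std::size_t N>
Matrix<N> operator*(const Matrix<N>& x, const Matrix<N>& y) {
    Matrix<N> r;
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t k = 0; k < N; ++k)        // k in the middle loop helps locality
            for (std::size_t j = 0; j < N; ++j)
                r(i, j) += x(i, k) * y(k, j);
    return r;
}
```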
Usually, you can do better if you use embedded assembler for machine code, 00:46:57.920 |
for some of the details to utilize part of a computer that the compiler doesn't 00:47:03.440 |
know about. But you should get to that point before 00:47:19.040 |
There's some of it is the compilation process, so the implementation of C++. 00:47:23.520 |
Some of it is the design of the feature itself, 00:47:27.520 |
the guidelines. So, I've recently, and often, talked to Chris Lattner, 00:47:33.440 |
so, Clang. What, just out of curiosity, is your 00:47:40.320 |
relationship in general with the different implementations of C++ 00:47:44.800 |
as you think about you and committee and other people in C++, think about the 00:47:49.280 |
design of new features or design of previous features. 00:47:54.080 |
In trying to reach the ideal of zero overhead, 00:47:59.840 |
does the magic come from the design, the guidelines, or from the 00:48:13.840 |
programming language features, and implementation techniques. You need all three. 00:48:18.960 |
And how can you think about all three at the same time? 00:48:23.680 |
It takes some experience, takes some practice, and sometimes you get it wrong. 00:48:36.560 |
Brian Kernighan pointed out that one of the reasons C++ 00:48:42.400 |
succeeded was some of the craftsmanship I put into the 00:48:48.720 |
early compilers. And, of course, I did the language 00:48:52.160 |
design, and of course, I wrote a fair amount of code using this kind of 00:49:00.560 |
involve progress in all three areas together. 00:49:05.600 |
A small group of people can do that. Two, three people 00:49:09.600 |
can work together to do something like that. It's ideal if it's one person 00:49:13.600 |
that has all the skills necessary, but nobody has all the skills necessary 00:49:18.240 |
in all the fields where C++ is used. So if you want to 00:49:22.560 |
approach my ideal in, say, concurrent programming, you need to 00:49:27.200 |
know about algorithms from concurrent programming. 00:49:30.400 |
You need to know the tricks of lock-free programming. 00:49:34.240 |
You need to know something about compiler techniques. 00:49:37.920 |
And then you have to know some of the program areas, 00:49:51.040 |
like what we call web serving, that kind of stuff. And that's very hard to get 00:49:57.600 |
into a single head, but small groups can do it too. 00:50:01.440 |
So is there differences in your view, not saying which is better or so on, 00:50:08.080 |
but differences in the different implementations 00:50:10.400 |
of C++? Why are there several? Sort of a naive question from me, maybe. 00:50:18.720 |
GCC, Clang, so on. This is a very reasonable question. 00:50:32.720 |
Because if you run on an IBM, if you run on a Sun, if you run on a Motorola, 00:50:39.200 |
there were just many, many companies and they each had their own compilation 00:50:42.960 |
structure and their own compilers. It was just fairly common that there 00:50:47.360 |
were many of them. And I wrote Cfront assuming that other 00:50:52.640 |
people would write compilers for C++ if I was successful. 00:50:57.920 |
And furthermore, I wanted to utilize all the back-end infrastructures that 00:51:04.400 |
were available. I soon realized that my users were using 00:51:07.920 |
25 different linkers. I couldn't write my own linker. 00:51:12.800 |
Yes, I could, but I couldn't write 25 linkers and also get any work done on 00:51:18.560 |
the language. And so it came from a world where there 00:51:27.120 |
compiler front-ends, not to speak of many operating systems. 00:51:34.240 |
The whole world was not an x86 and a Linux box or something, 00:51:39.440 |
whatever is the standard today. In the old days, there was a set of 00:51:43.680 |
VAXes. So basically, I assumed there would be 00:51:47.840 |
lots of compilers. It was not a decision that there should 00:51:51.520 |
be many compilers. It was just a fact. That's the way the 00:51:59.440 |
world was, and so many compilers emerged. And today, there's at least four front-ends: 00:52:08.480 |
Clang, GCC, Microsoft, and EDG, the Edison Design Group. 00:52:15.200 |
They supply a lot of the independent organizations and the embedded systems 00:52:22.000 |
industry. And there's lots and lots of back-ends. 00:52:26.240 |
We have to think about how many dozen back-ends there are. 00:52:31.680 |
Because different machines have different things. Especially in the 00:52:35.280 |
embedded world, the machines are very different. The architectures are very 00:52:43.680 |
having a single implementation was never an option. Now, I also happen to think a 00:52:58.560 |
monoculture can go stale, and there's no competition, 00:53:03.120 |
and there's no incentive to innovate. There's a lot of incentive to put 00:53:08.320 |
barriers in the way of change. Because, hey, we own the world, and it's 00:53:13.680 |
a very comfortable world for us. And who are you to 00:53:16.960 |
mess with that? So, I really am very happy that there are four 00:53:23.360 |
front-ends for C++. Clang's great, and GCC was great. 00:53:30.800 |
But then it got somewhat stale. Clang came along, 00:53:34.400 |
and GCC is much better now. Competition is good. 00:53:38.480 |
Microsoft is much better now. So, a low number of front-ends keeps everybody honest on 00:53:49.920 |
standards compliance, and also on performance, and error messages, 00:53:55.920 |
and compile time, speed, all this good stuff that we want. 00:54:01.120 |
Do you think, crazy question, there might come along, 00:54:05.760 |
do you hope there might come along, implementation of C++ 00:54:10.640 |
written, given all its history, written from scratch? So, written today 00:54:18.240 |
from scratch? Well, Clang and LLVM is more or less written from scratch. 00:54:24.880 |
But there's been C++ 11, 14, 17, 20, you know, there's been a lot of... 00:54:30.880 |
Sooner or later, somebody's going to try again. 00:54:33.920 |
There have been attempts to write new C++ compilers, and 00:54:39.040 |
some of them have been used, and some of them have been absorbed into others, and 00:54:42.960 |
such. Yeah, it'll happen. So, what are the key features of C++? 00:54:49.600 |
And let's use that as a way to sort of talk about 00:54:57.600 |
at the highest level, what are the features that were there in the 00:55:01.360 |
beginning? What features got added? Let's first get a principle, 00:55:07.760 |
an aim in place. C++ is for people who want to use 00:55:14.160 |
hardware really well, and then manage the complexity of doing that through abstraction. So what 00:55:24.720 |
you have is a way of manipulating the machines at a fairly low level. That 00:55:31.280 |
looks very much like C. It has loops, it has variables, it 00:55:38.000 |
has pointers, like machine addresses, it can access memory directly, it can 00:55:44.160 |
allocate stuff in the absolute minimum of space 00:55:49.360 |
needed on the machine. There's a machine-facing part of C++, 00:55:54.320 |
which is roughly equivalent to C. I said C++ could beat C, and it can. 00:56:00.080 |
It doesn't mean I dislike C. If I disliked C, 00:56:03.280 |
I wouldn't have built on it. Furthermore, after Dennis Ritchie, I'm 00:56:13.760 |
And, well, I had lunch with Dennis most days for 16 years, and we never had a fight about it. Most of 00:56:25.360 |
these C versus C++ fights are for people who don't quite understand 00:56:30.400 |
what's going on. Then the other part is the abstraction. 00:56:35.440 |
And there, the key is the class, which is a user-defined type. 00:56:40.160 |
And my idea for the class is that you should be able to build a type 00:56:44.560 |
that's just like the built-in types, in the way you use them, in the way you 00:56:53.840 |
and you can do just as well. So, in C++, there's an int, 00:56:59.680 |
as in C. You should be able to build an abstraction, a class, which we can call 00:57:05.760 |
capital int, that you can use exactly like an integer 00:57:10.800 |
and run just as fast as an integer. There's the idea right there. And, of 00:57:16.560 |
course, you probably don't want to use the int 00:57:19.520 |
itself, but it has happened. People have wanted integers that were 00:57:25.200 |
range-checked so that you couldn't overflow and such, especially for very 00:57:28.960 |
safety-critical applications, like the fuel injection for a marine diesel engine. 00:57:37.040 |
This is a real example, by the way. This has been done. 00:57:40.640 |
They built themselves an integer that was just like integer, 00:57:45.120 |
except that it couldn't overflow. If there was an overflow, you went into the error 00:57:49.840 |
handling. And then you built more interesting 00:57:54.080 |
types. You can build a matrix, which you need to do graphics, 00:58:04.960 |
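The range-checked "capital Int" idea described here can be sketched in a few lines. This is an illustrative toy, not the code from the fuel-injection anecdote: a user-defined type used like a built-in int, except that overflow goes into error handling instead of silently wrapping.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// A sketch of a checked integer: behaves like int, but detects overflow.
class Int {
    int v;
public:
    Int(int x = 0) : v(x) {}
    operator int() const { return v; }  // usable wherever an int is expected
    Int& operator+=(Int b) {
        // Check for overflow *before* it happens (signed overflow is UB).
        if ((b.v > 0 && v > std::numeric_limits<int>::max() - b.v) ||
            (b.v < 0 && v < std::numeric_limits<int>::min() - b.v))
            throw std::overflow_error("Int overflow");  // "the error handling"
        v += b.v;
        return *this;
    }
    friend Int operator+(Int a, Int b) { return a += b; }
};
```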
And all these are classes and they appear just like the built-in types? 00:58:08.400 |
Exactly. In terms of efficiency and so on. So, what else is there? 00:58:11.760 |
And flexibility. So, I don't know. For people who are not 00:58:18.080 |
familiar with object-oriented programming, there's inheritance. 00:58:24.960 |
just like you said, create a generic vehicle that can turn left. 00:58:29.120 |
So, what people found was that you don't actually... 00:58:36.880 |
No, how do I say this? A lot of types are related. 00:58:43.760 |
That is, the vehicles, all vehicles are related. 00:58:48.720 |
Bicycles, cars, fire engines, tanks. They have some things in common and 00:58:55.440 |
some things that differ. And you would like to have the common 00:58:58.800 |
things common and having the differences specific. 00:59:03.600 |
And when you didn't want to know about the differences, like, 00:59:06.560 |
just turn left. You don't have to worry about it. That's how you get 00:59:12.800 |
the traditional object-oriented programming coming out of Simula, 00:59:16.240 |
adopted by Smalltalk and C++ and all the other languages. 00:59:21.600 |
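The vehicle example in this passage can be sketched as follows. The class and member names are illustrative: the common operation (turn left) lives in the base class interface, the differences live in the derived classes, and callers need not know which kind of vehicle they hold.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Common interface: every vehicle can turn left, each in its own way.
struct Vehicle {
    virtual std::string turn_left() const = 0;  // virtual: resolved at run time
    virtual ~Vehicle() = default;
};

struct Bicycle : Vehicle {
    std::string turn_left() const override { return "lean and steer left"; }
};

struct Tank : Vehicle {
    std::string turn_left() const override { return "slow the left track"; }
};

// This code doesn't know or care about the differences between vehicles.
std::vector<std::string> turn_all_left(const std::vector<std::unique_ptr<Vehicle>>& vs) {
    std::vector<std::string> out;
    for (const auto& v : vs)
        out.push_back(v->turn_left());  // one indirect call per vehicle
    return out;
}
```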
The other kind of obvious similarity between types is that you can have, say, a vector 00:59:32.560 |
of doubles. But the minute you have a vector of doubles, you want a vector 00:59:39.360 |
of double-precision doubles, and of short floats for graphics. 00:59:45.680 |
Why should you not have a vector of integers while you're at it? 00:59:49.680 |
Or a vector of vectors, a vector of vectors of chess pieces. 00:59:55.280 |
Now you have a board, right? So, this is, you express the commonality 01:00:03.840 |
as the idea of a vector, and the variations come through parameterization. 01:00:10.080 |
And so, here we get the two fundamental ways of abstracting, 01:00:21.440 |
There's inheritance, and there's parameterization. 01:00:24.320 |
There's the object-oriented programming, and there's the generic programming. 01:00:28.480 |
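The parameterization side can be sketched just as briefly: one vector abstraction, many element types, including a vector of vectors for the chess-board idea. The Piece type here is a hypothetical stand-in, not a real chess library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Commonality expressed once (std::vector<T>), variation through the parameter.
using Piece = std::string;                      // illustrative stand-in for a chess piece
using Board = std::vector<std::vector<Piece>>;  // a board is a vector of vectors

// Build an n-by-n board of empty squares.
Board empty_board(std::size_t n) {
    return Board(n, std::vector<Piece>(n, ""));
}
```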
With the templates for the generic programming? 01:00:31.360 |
So, you've presented it very nicely, but now you have to make all that happen 01:00:38.880 |
and make it efficient. So, generic programming, 01:00:42.480 |
with templates, there's all kinds of magic going on, especially recently, 01:00:47.200 |
that you can help catch up on. But it feels to me like you can do way more 01:00:52.240 |
than what you just said, with templates. You can start doing 01:00:56.560 |
this kind of metaprogramming, this kind of... 01:00:58.320 |
You can do metaprogramming also. I didn't go there in that explanation. 01:01:04.240 |
We're trying to be very basic, but go back on to the implementation. 01:01:08.640 |
If you couldn't implement this efficiently, if you couldn't use it 01:01:13.840 |
so that it became efficient, it has no place in C++, 01:01:17.600 |
because it will violate the zero overhead principle. 01:01:20.880 |
So, when I had to get object-oriented programming inheritance, 01:01:27.440 |
I took the idea of virtual functions from Simula. 01:01:32.320 |
Virtual function is a Simula term. Class is a Simula term. 01:01:37.120 |
If you ever use those words, say thanks to Kristen Nygaard 01:01:40.960 |
and Ole-Johan Dahl. And I did the simplest implementation 01:01:50.800 |
So, you get the virtual function table, the function goes in, 01:01:55.760 |
does an indirection through a table, and get the right function. 01:01:59.120 |
That's how you pick the right thing there. And I thought that was trivial. 01:02:08.720 |
It turned out the simula had a more complicated way of doing it, 01:02:12.000 |
and therefore slower. And it turns out that most languages 01:02:16.480 |
have something that's a little bit more complicated, 01:02:21.280 |
And one of the strengths of C++ was that you could actually do 01:02:25.600 |
this object-oriented stuff, and your overhead compared to 01:02:31.200 |
ordinary functions, with the indirection, is sort of 5, 10, 25 percent of 01:02:37.440 |
the cost of the call. It's down there. It's not a factor of two. 01:02:41.120 |
And that means you can afford to use it. Furthermore, in C++, 01:02:47.120 |
you have the distinction between a virtual function and a non-virtual function. 01:02:51.840 |
If you don't want any overhead, if you don't need the indirection 01:02:56.000 |
that gives you the flexibility in object-oriented programming, 01:02:59.440 |
just don't ask for it. So the idea is that you only use 01:03:04.960 |
virtual functions if you actually need the flexibility. 01:03:07.840 |
So it's not zero overhead, but it's zero overhead compared 01:03:12.080 |
to any other way of achieving the flexibility. 01:03:18.720 |
Basically, the compiler looks at the template, 01:03:28.880 |
say the vector, and it looks at the parameter, 01:03:34.320 |
and then combines the two and generates a piece of code 01:03:38.800 |
that is exactly as if you've written a vector of that specific type. 01:03:43.440 |
So that's the minimal overhead. If you have many template parameters, 01:03:50.000 |
you can actually combine code that the compiler couldn't usually see 01:03:54.000 |
at the same time, and therefore get code that is faster 01:03:59.440 |
than if you had handwritten the stuff, unless you are very, very clever. 01:04:05.040 |
So the thing is, parameterized code, the compiler fills stuff in 01:04:11.200 |
during the compilation process, not during runtime. 01:04:14.720 |
That's right. And furthermore, it gives all the information it's gotten, 01:04:20.640 |
which is the template, the parameter, and the context of use. 01:04:26.640 |
It combines the three and generates good code. 01:04:32.480 |
Now, it's a little outside of what I'm even comfortable thinking about, 01:04:40.580 |
And how do you... I remember being both amazed at the power of that idea and puzzled about how you would ever debug it. 01:04:56.800 |
Come back to this, because I have a solution. 01:05:02.320 |
The code generated by C++ has always been ugly, 01:05:09.360 |
because there's these inherent optimizations. 01:05:12.080 |
A modern C++ compiler has front-end, middle-end, and back-end optimizations. 01:05:17.680 |
Even Cfront, back in '83, had front-end and back-end optimizations. 01:05:23.760 |
I actually took the code, generated an internal representation, 01:05:29.200 |
munched that representation to generate good code. 01:05:33.680 |
So when people say, "That's not a compiler, it just generates C," 01:05:36.640 |
The reason it generated C was I wanted to use C's code generators 01:05:41.040 |
that was really good at back-end optimizations. 01:05:46.640 |
and therefore the C I generated was optimized C. 01:05:51.280 |
The way a really good human hand-optimizer could have generated it. 01:06:01.120 |
It was the output of a program, and it's much worse today. 01:06:05.120 |
And with templates, it gets much worse still. 01:06:07.680 |
So it's hard to combine simple debugging with optimal code, 01:06:16.960 |
because the idea is to drag in information from different parts of the code 01:06:29.360 |
So what people often do for debugging is they turn the optimizer off. 01:06:35.920 |
And so you get code that, when something in your source code 01:06:42.720 |
looks like a function call, it is a function call. 01:06:45.920 |
When the optimizer is turned on, it may disappear, the function call. 01:06:51.120 |
And so one of the things you can do is you can actually get code that is faster than a function call, 01:07:01.520 |
because you eliminate the function preamble and return, 01:07:08.880 |
One of the key things when I did templates was 01:07:14.640 |
I wanted to make sure that if you have, say, a sort algorithm, 01:07:23.600 |
if that sorting criteria is simply comparing things with less than, 01:07:34.080 |
you get the less-than operation inlined, not an indirect function call to a comparison object. 01:07:44.240 |
We really want to get down to the single instruction. 01:07:47.200 |
But anyway, turn off the optimizer, and you can debug. 01:08:01.360 |
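The sort example in this passage is usually illustrated by contrasting C's qsort, which calls its comparison through a function pointer on every comparison, with a template-instantiated std::sort, where the compiler sees the element type and can inline the less-than down to a compare instruction. A minimal sketch, assuming only the standard library:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// C-style: qsort invokes this through a function pointer for every
// comparison; the compiler typically cannot inline across that boundary.
int cmp_int(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

// C++ style: std::sort is instantiated for int at compile time, so the
// comparison is the built-in operator< with no indirection.
void sort_cpp(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}
```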
- And then there's this idea of concepts that puts some... 01:08:11.840 |
I don't know if it was ever available in any form, 01:08:14.240 |
but it puts some constraints on the stuff you can parameterize, essentially. 01:08:27.680 |
We have had versions of it that actually work for the last four or five years. 01:08:33.840 |
It was designed by Gabriel Dos Reis, Andrew Sutton, and me. 01:08:40.960 |
We were professors and postdocs in Texas at the time. 01:08:44.240 |
And the implementation by Andrew Sutton has been available for that time. 01:09:06.240 |
It's available in Clang and GCC, GCC for a couple of years. 01:09:13.920 |
And I believe Microsoft is soon going to do it. 01:09:16.960 |
We expect all of C++20 to be available in all the major compilers in 20. 01:09:28.640 |
I'm just saying that because otherwise people might think 01:09:34.640 |
And so what I'm going to say is concrete, you can run it today. 01:09:41.840 |
So the basic idea is that when you have a generic component, like a sort function, 01:09:51.440 |
the sort function will require at least two parameters. 01:09:56.400 |
One, a data structure with a given type and a comparison criteria. 01:10:04.080 |
And these things are related, but obviously you can't compare things 01:10:08.880 |
if you don't know what the type of things you compare. 01:10:11.280 |
And so you want to be able to say, I'm going to sort something. 01:10:23.280 |
It has to be a sequence with a beginning and an end. 01:10:27.040 |
There has to be random access to that sequence. 01:10:31.120 |
And the element type has to be comparable, 01:10:37.120 |
which means the less-than operator 01:10:41.200 |
has to be able to operate on it. 01:10:42.800 |
Basically what concepts are, they're compile time predicates. 01:10:47.200 |
They're predicates you can ask, are you a sequence? 01:10:59.120 |
Is your element type something that has a less than? 01:11:07.360 |
And so instead of saying, I will take a parameter of any type, 01:11:11.760 |
it'll say, I'll take something that's sortable. 01:11:17.440 |
And so we say, okay, you can sort with less than. 01:11:24.240 |
So you have two parameters, the sortable thing and the 01:11:27.600 |
comparison criteria and the comparison criteria will say, 01:11:31.520 |
well, you can write it saying it should operate on the 01:11:37.040 |
element type, and it has the comparison operations. 01:11:48.880 |
So it specifies the requirements of the code on the parameters 01:11:54.560 |
that it gets, it's very similar to types actually. 01:12:04.080 |
The word concept was used by Alex Stepanov, who is sort of 01:12:10.560 |
the father of generic programming in the context of C++. 01:12:14.000 |
There's other places that use that word, but the way we do 01:12:22.400 |
generic programming in C++ is 01:12:25.120 |
Alex's and he called them concepts because he said they're 01:12:28.880 |
the sort of the fundamental concepts of an area. 01:12:36.480 |
If you look at the K&R book about C, C has arithmetic types 01:12:49.120 |
And then it lists what they are and they have certain properties. 01:12:53.360 |
The difference today is that we can actually write a concept 01:12:57.040 |
that will ask a type, are you an integral type? 01:13:00.560 |
Do you have the properties necessary to be an integral type? 01:13:07.600 |
So maybe the story of concepts, because I thought it might be 01:13:22.720 |
What was the, why didn't it, like what, we'll talk a little 01:13:28.160 |
bit about this fascinating process of standards because I 01:13:32.560 |
It's interesting for me, but why did it take so long? 01:13:48.240 |
- In 1987 or thereabouts, when I was designing templates, 01:13:52.880 |
obviously I wanted to express the notion of what is required 01:14:00.000 |
And so I looked at this and basically for templates, 01:14:10.400 |
It had to be able to express things I couldn't imagine 01:14:16.160 |
because I know I can't imagine everything and I've been 01:14:19.600 |
suffering from languages that try to constrain you to only do 01:14:28.000 |
Secondly, it had to run faster, as fast or faster than 01:14:35.200 |
So basically if I have a vector of T and I take a vector of 01:14:39.520 |
char, it should run as fast as you build a vector of char 01:14:46.640 |
And thirdly, I wanted to be able to express the constraints 01:14:53.680 |
of the arguments, have proper type checking of the 01:15:00.400 |
And neither I nor anybody else at the time knew how to get 01:15:06.320 |
And I thought for C++, I must have the first two. 01:15:13.680 |
And it bothered me for another couple of decades that I 01:15:18.560 |
I mean, I was the one that put function argument type checking 01:15:30.880 |
And I wanted to do the same for templates, of course, and I couldn't. 01:15:43.120 |
Gabriel Dos Reis and I started analyzing the problem, explained 01:15:52.000 |
A group in University of Indiana, an old friend of mine, 01:16:04.800 |
we thought we could get a good system of concepts in another 01:16:23.520 |
Well, it turns out that I think we got a lot of the fundamental 01:16:40.640 |
It didn't serve mixed-type arithmetic, mixed-type 01:16:48.000 |
computations. A lot of stuff came out of the functional community, and that 01:16:55.840 |
community didn't deal with multiple types in the same way 01:17:05.600 |
Had more constraints on what you could express and didn't have 01:17:19.200 |
We had some successes, but in the end it just wasn't right. It 01:17:23.840 |
didn't compile fast enough, was too hard to use, and didn't run 01:17:29.360 |
fast enough unless you had optimizers that were beyond the state of the art. 01:17:39.520 |
Basically, it was the idea that a set of parameters has defined 01:17:46.400 |
a set of operations and you go through an indirection table 01:17:50.800 |
just like for virtual functions, and then you try to optimize 01:18:05.520 |
We are standardizing C++ under ISO rules, which are very open 01:18:12.720 |
People come in, there's no requirements for education or 01:18:17.200 |
So you've started to develop C++, and there's a whole... 01:18:26.960 |
The ISO standard, is there a committee that you're referring to? 01:18:38.080 |
So sometime in early 1989, I think it was. 01:18:47.360 |
In 1989, two people, one from IBM, one from HP, turned up in 01:18:56.160 |
my office and told me they would like to standardize C++. 01:19:01.120 |
This was a new idea to me, and I pointed out that it wasn't 01:19:10.000 |
It wasn't ready for formal standardization and such. 01:19:13.120 |
And they say, "No, Bjarne, you haven't gotten it. 01:19:19.360 |
We cannot depend on something that's owned by another 01:19:26.880 |
Of course, we could rely on you, but you might get run over by a bus. 01:19:36.320 |
It has to be standardized under formal rules, and we are going 01:19:48.080 |
And you really want to be part of it because basically, 01:19:56.400 |
So through a combination of arm-twisting and flattery, it got started. 01:20:17.600 |
It was ANSI, the American national standard, that we were doing first. 01:20:24.000 |
We were lectured on the rules of how to do an ANSI standard. 01:20:28.720 |
There were about 25 of us there, which apparently was a new 01:20:35.600 |
record. And some of the old C guys that had been standardizing C were there. 01:20:42.320 |
So the way this works is that it's an open process. 01:20:46.800 |
Anybody can sign up if they pay the minimal fee, which is 01:20:52.320 |
It was less then, just a little bit more now. 01:21:07.280 |
We try two meetings a year for a couple of years that didn't 01:21:17.360 |
And you meet and you have technical discussions, and then 01:21:28.640 |
The votes are done one vote per organization. 01:21:35.200 |
So you can't have, say, IBM come in with 10 people and get 10 votes. 01:21:42.640 |
And these are organizations that use C++? 01:21:49.360 |
I mean, it's a bunch of people in a room deciding the design 01:21:54.720 |
of a language based on which a lot of the world's systems run. 01:22:00.740 |
Well, I think most people would agree it's better than if 01:22:05.440 |
Or better than if a single organization like AT&T decided it. 01:22:11.940 |
I don't know if everyone agrees to that, by the way. 01:22:28.000 |
As Churchill says, democracy is the worst way, except for all the others. 01:22:33.060 |
And it's, I would say, the same with formal standardization. 01:22:36.480 |
But anyway, so we meet and we have these votes, and that 01:22:44.800 |
A couple of years later, we extended this so it became international. 01:22:51.760 |
We have standard organizations that are active in currently 01:22:57.680 |
15 to 20 countries, and another 15 to 20 are sort of looking 01:23:06.560 |
and voting based on the rest of the work on it. 01:23:14.000 |
Next week, I'll be in Cologne, Germany, spending a week doing 01:23:20.480 |
standardization, and we'll vote out the committee draft of C++20, 01:23:25.520 |
which goes to the national standards committees for comments and votes. 01:23:35.120 |
Then we do that, and there's a second set of votes where 01:23:42.720 |
The first time around, we started in 1990, and the first standard was finished in '98. 01:23:54.640 |
That was the standard that people used till '11, or a little later. 01:24:05.440 |
It took longer with '11, and I'll mention why, but all the votes were essentially unanimous. 01:24:19.440 |
That is, we do not want something that passes 60/40, 01:24:23.440 |
because then we're getting dialects and opponents, and people 01:24:30.080 |
They won't complain too much, but basically it has no real 01:24:37.040 |
They have been working to make it easier to use many compilers, 01:24:43.040 |
many computers, and all of that kind of stuff. 01:24:45.200 |
And so the first, it was traditional with ISO standards to 01:24:54.640 |
And we thought we were going to do the next one in '06, because 01:25:08.160 |
Hoping that you would at least get it within the aughts, the 01:25:13.040 |
I thought we would get, I thought we would get six, seven, 01:25:20.240 |
Well, the point is that this was sort of like a second system 01:25:32.880 |
Furthermore, there is this tendency, because it's a 10-year 01:25:41.680 |
Just before you're about to ship, somebody has a bright 01:25:59.440 |
We got the standard library that gives us all the STL stuff. 01:26:08.640 |
And then people tried it with other things, and it didn't 01:26:13.600 |
They got things in, but it wasn't as dramatic. 01:26:18.800 |
So after C++11, which was a huge improvement, and basically 01:26:26.720 |
what most people are using today, we decided never again. 01:26:39.600 |
So that if you have a slip on a 10-year cycle, by the time 01:26:47.040 |
you know it's a slip, there's 11 years till you get it. 01:26:49.440 |
Now, with a three-year cycle, there is about four years 01:26:56.800 |
The delay between feature freeze and shipping. 01:27:18.160 |
And furthermore, this gives a predictability that allows 01:27:23.120 |
the implementers, the compiler implementers, the library 01:27:26.240 |
implementers, they have a target, and they deliver on it. 01:27:30.800 |
'11 took two years before most compilers were good enough. 01:27:35.920 |
'14, most compilers were actually getting pretty good in '14. 01:27:45.520 |
We are going to have at least almost everybody ship almost 01:27:53.200 |
And I know this because they're shipping in '19. 01:28:03.920 |
There's a lot of features that came in in C++11. 01:28:09.200 |
There's a lot of features at the birth of C++ that were 01:28:17.120 |
What to you is the most, just to you personally, beautiful 01:28:25.280 |
or just you sit back and think, wow, that's just a nice and 01:28:35.200 |
I have written two papers for the History of Programming 01:28:41.520 |
Languages Conference, which basically ask me such questions. 01:28:45.840 |
And I'm writing a third one, which I will deliver at the 01:28:50.320 |
History of Programming Languages Conference in London 01:28:59.520 |
The way a constructor can establish the environment for 01:29:04.880 |
the use of a type for an object and the destructor that 01:29:14.800 |
That's why we don't have to use garbage collection. 01:29:18.000 |
That's how we can get predictable performance. 01:29:21.600 |
That's how you can get the minimal overhead in many, many 01:29:28.960 |
It's the idea of constructor destructor pairs. 01:29:34.320 |
Sometimes it comes out under the name RAII, resource 01:29:40.160 |
acquisition is initialization, which is the idea that you 01:29:43.680 |
grab resources and the constructor and release them 01:29:47.200 |
It's also the best example of why I shouldn't be an 01:29:51.840 |
I get the best idea and I call it resource acquisition is 01:30:13.440 |
So types are an essential part of C++, and making them 01:30:20.880 |
And to you, this is almost getting philosophical, but the 01:30:26.640 |
construction and the destruction, the creation of an 01:30:30.320 |
instance of a type and the freeing of resources from that 01:30:35.680 |
instance of a type is what defines the object. 01:30:39.680 |
That's almost like birth and death is what defines human 01:30:50.000 |
You can't do good language design without philosophy 01:30:54.640 |
because what you are determining is what people can 01:31:00.320 |
By the way, constructors and destructors came into C++ in 01:31:05.600 |
'79, in about the second week of my work with what was then 01:31:15.120 |
Next comes the fact that you need to control copying, 01:31:18.400 |
because once you control, as you said, birth and death, 01:31:22.480 |
you have to control taking copies, which is another way 01:31:28.480 |
And finally, you have to be able to move things around. 01:31:35.200 |
And that's the set of key operations you can define on a 01:31:41.840 |
And so to you, those things are just a beautiful part of C++ 01:31:51.940 |
You mentioned that you hope there will be one unified set 01:31:55.760 |
of guidelines in the future for how to construct a programming 01:31:59.920 |
So perhaps not one programming language, but a unification 01:32:10.080 |
I have some trouble remembering it, but I know the origin 01:32:14.960 |
So maybe you can talk about sort of C++ has been improving. 01:32:20.800 |
Do you, where does the arc of history taking us? 01:32:24.080 |
Do you hope that there is a unification about the languages 01:32:27.840 |
with which we communicate in the digital space? 01:32:30.480 |
Well, I think that languages should be designed not by 01:32:39.840 |
clobbering language features together and doing slightly 01:32:47.040 |
But through the creation of a set of principles, rules of 01:32:58.640 |
And we're trying to teach people in the standards committee 01:33:04.720 |
about these rules because a lot of people come in and say, 01:33:09.760 |
And then you have to ask, why does it fit in the language? 01:33:15.600 |
It may fit in another language and not here, or it may fit 01:33:21.680 |
So you have to work from a set of principles and you have 01:33:26.160 |
And one example that I sometimes remember is I was sitting 01:33:35.280 |
down with some of the designers of Common Lisp and we were 01:33:41.840 |
talking about languages and language features. 01:33:45.520 |
And obviously we didn't agree about anything because, well, 01:33:58.640 |
I said, I had this problem and I developed it. 01:34:07.440 |
We had that problem, different problem, and we develop it 01:34:12.480 |
And so we worked through large chunks of C++ and large 01:34:19.840 |
chunks of Common Lisp and figured out we actually had 01:34:27.760 |
But the constraints on our designs were very different. 01:34:32.160 |
And the aims for the usage was very different. 01:34:35.120 |
But there was commonality in the way you reason about 01:34:41.600 |
language features and the fundamental principles you 01:34:45.760 |
- So do you think that's possible to, so just like there 01:34:50.160 |
is perhaps a unified theory of physics, of the fundamental 01:34:55.680 |
forces of physics, I'm sure there is commonalities among 01:35:01.120 |
the languages, but there's also people involved that help 01:35:07.440 |
Do you have a hope or an optimism that there will be 01:35:14.400 |
If you think about physics and Einstein towards a 01:35:18.640 |
simplified language, do you think that's possible? 01:35:22.720 |
- Let's remember sort of modern physics, I think, started 01:35:45.520 |
And furthermore, they are not as bothered with people 01:35:50.000 |
using physics the way we are worried about programming 01:35:57.040 |
So each have problems and constraints, the others have, 01:36:02.400 |
but we are very immature compared to physics. 01:36:04.640 |
So I would look at sort of the philosophical level and look 01:36:12.480 |
for fundamental principles, like you don't leak resources, 01:36:18.640 |
You don't take errors at runtime that you don't need to. 01:36:30.400 |
There's many kinds of type systems, but when you have one, 01:36:36.960 |
There will be quite a few, and it will not be the same 01:36:43.120 |
But I think if we step back at some kind of philosophical 01:36:47.680 |
level, we would be able to agree on sets of principles 01:36:56.240 |
And within an area of use, like in C++'s case, 01:37:07.520 |
the area between the hardware and the fluffier parts 01:37:12.480 |
of the system, you might very well see a convergence. 01:37:17.360 |
So these days you see Rust having adopted RAII, 01:37:22.080 |
and sometimes accuses me of having borrowed it 20 years 01:37:27.120 |
But we're seeing some kind of convergent here, 01:37:34.320 |
instead of relying on garbage collection all the time. 01:37:38.640 |
The garbage collection languages are doing things 01:37:41.680 |
like the dispose patterns and such that imitate 01:37:50.320 |
And they're trying not to use the garbage collection 01:37:57.120 |
But I think we have to step back to the philosophical level, 01:38:00.000 |
agree on principles, and then we'll see some convergences. 01:38:10.720 |
- So a crazy question, but I work a lot with machine 01:38:19.520 |
but you could think of programming as a thing 01:38:25.360 |
Programming is the task of creating a program, 01:38:28.640 |
and a program takes some input and produces some output. 01:38:35.520 |
in order to be able to take an input and produce output. 01:38:48.240 |
You know, we take some input, we make some output, 01:39:01.360 |
is a kind of way of programming, but just fuzzy. 01:39:16.160 |
it's, you know, you can measure, you can test 01:39:20.240 |
With biological systems or machine learning systems, 01:39:24.400 |
you can't say much except sort of empirically 01:39:28.160 |
saying that 99.8% of the time it seems to work. 01:39:31.920 |
What do you think about this fuzzy kind of programming? 01:40:21.840 |
Similarly, a language like C++ is not for everybody. 01:40:26.080 |
It is designed to be a sharp and effective tool 01:40:45.680 |
Counting on your fingers is not going to cut it 01:40:50.320 |
And so there are areas where an 84% accuracy rate, 01:41:00.400 |
16% false positive rate is perfectly acceptable. 01:41:06.720 |
And where people will probably get no more than 70. 01:41:16.320 |
And by really a lot of blood, sweat, and tears, 01:41:21.440 |
So this is fine if it is, say, pre-screening stuff 01:41:31.760 |
It is not good enough for life-threatening situations. 01:41:37.360 |
And so there are lots of areas where the fuzziness 01:41:41.760 |
is perfectly acceptable and good and better than humans, 01:42:17.840 |
that you have this AI system, machine learning, 01:42:28.800 |
And when there is something that is too complicated, 01:42:59.440 |
And for the designer of one of the most recent 01:43:04.240 |
designer of one of the most reliable, efficient, 01:43:08.960 |
I can understand why that world is actually unappealing. 01:43:18.960 |
because we do not know how to get that interaction right. 01:43:30.800 |
- I would much rather never rely on the human. 01:43:37.920 |
it is much better to design systems written in C++ 01:43:53.680 |
So that is one reason I have to keep a weather eye out 01:43:59.600 |
But I will never become an expert in that area. 01:44:14.560 |
No major system today is written in one language. 01:44:33.680 |
That you say, "Damn, I did pretty good there." 01:44:51.520 |
And I have tried to get away from it a few times, 01:44:57.200 |
partly because I am most effective in this area. 01:45:00.640 |
And partly because what I do has much more impact 01:45:11.200 |
that pick it up tomorrow if I get something right. 01:45:18.560 |
and then we'll see if anybody wants to use it. 01:45:34.960 |
And also, I get to see a lot of interesting stuff 01:45:41.040 |
I mean, if it has just been statements on paper, 01:45:51.520 |
But I get to see the telescopes up on Mauna Kea, 01:45:55.680 |
and I actually went and see how Ford built cars, 01:45:59.600 |
and I got to JPL and see how they do the Mars rovers. 01:46:09.600 |
and most of the cool stuff is done by pretty nice people. 01:46:27.360 |
- On top of the code are the people in very nice places. 01:46:32.240 |
Well, I think I speak for millions of people, 01:46:35.280 |
Bjarne, in saying thank you for creating this language