Bjarne Stroustrup: C++ | Lex Fridman Podcast #48


Chapters

0:00 Introduction
1:40 First program
2:18 First programming language
4:21 Type system
6:18 Programming languages
10:14 Object-oriented programming
13:20 Lisp
16:45 Languages
22:27 Larger code bases
25:07 Efficiency and reliability
27:32 Safety and reliability
29:27 Simplification
31:52 Code review
35:52 Static analysis
39:27 What is static analysis
41:16 How do you design
47:12 The magic of C
50:01 Different implementations of C
54:00 Key features of C
58:04 Inheritance

Whisper Transcript | Transcript Only Page

00:00:00.000 | The following is a conversation with Bjarne Stroustrup.
00:00:03.120 | He's the creator of C++, a programming language that after 40 years
00:00:08.240 | is still one of the most popular and powerful languages in the world.
00:00:12.480 | Its focus on fast, stable, robust code underlies many of the biggest systems in
00:00:17.440 | the world that we have come to rely on as a society.
00:00:20.720 | If you're watching this on YouTube, for example, many of the critical back-end
00:00:24.480 | components of YouTube are written in C++. Same goes for
00:00:28.560 | Google, Facebook, Amazon, Twitter, most Microsoft
00:00:32.240 | applications, Adobe applications, most database systems, and most physical
00:00:37.440 | systems that operate in the real world, like cars, robots, rockets that launch us
00:00:42.560 | into space, and one day will land us on Mars.
00:00:46.480 | C++ also happens to be the language that I use more than any other in my life.
00:00:52.640 | I've written several hundred thousand lines of C++ source code.
00:00:56.560 | Of course, lines of source code don't mean much, but they do give hints of my
00:01:01.520 | personal journey through the world of software. I've enjoyed watching the
00:01:05.440 | development of C++ as a programming language,
00:01:08.320 | leading up to the big update in the standard in 2011,
00:01:12.480 | and those that followed in '14, '17, and toward the new C++20
00:01:18.160 | standard, hopefully coming out next year. This is
00:01:21.840 | the Artificial Intelligence Podcast. If you enjoy it,
00:01:24.880 | subscribe on YouTube, give it five stars on iTunes,
00:01:28.000 | support it on Patreon, or simply connect with me on Twitter
00:01:31.440 | at Lex Fridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with Bjarne
00:01:38.240 | Stroustrup. What was the first program you've ever
00:01:42.400 | written? Do you remember? It was my second year in
00:01:46.880 | university, first year of computer science, and it
00:01:51.200 | was in Algol 60. I calculated the shape of a
00:01:56.720 | super ellipse and then connected points on the perimeter, creating star
00:02:03.600 | patterns. It was with wet ink
00:02:08.640 | on a paper printer. And that was in college?
00:02:12.800 | Yeah, I learned to program the second year in university.
00:02:18.400 | And what was the first programming language,
00:02:21.360 | if I may ask it this way, that you fell in love with?
00:02:26.480 | I think Algol 60, and after that I remember
00:02:34.000 | SNOBOL, I remember Fortran, didn't fall in love with that, I remember
00:02:40.320 | Pascal, didn't fall in love with that, it all
00:02:43.440 | got in the way of me. And then I discovered Assembler, and
00:02:48.160 | that was much more fun. And from there I went to
00:02:53.200 | Microcode. So you were drawn to the, you found the low-level stuff
00:03:00.240 | beautiful. I went through a lot of languages and
00:03:03.840 | then I spent significant time in Assembler and
00:03:08.080 | Microcode. That was sort of the first really
00:03:11.040 | profitable things, and it paid for my master's, actually.
00:03:15.360 | And then I discovered Simula, which was absolutely great.
00:03:19.520 | Simula? Simula was the extension of Algol 60,
00:03:25.920 | done primarily for simulation, but basically they invented object-oriented
00:03:30.400 | programming, inheritance, and runtime polymorphism
00:03:34.480 | while they were doing it.
00:03:37.520 | And that was the language that taught me that you could have
00:03:43.760 | the problems of a program grow with the size of the program rather
00:03:49.360 | than with the square of the size of the
00:03:52.160 | program. That is, you can actually modularize
00:03:55.520 | very nicely. And that was a surprise to me.
00:04:00.640 | It was also a surprise to me that a stricter type system than Pascal's
00:04:07.440 | was helpful, whereas Pascal's type system got in my way all the time.
00:04:13.040 | So you need a strong type system to organize your code well, but it has to
00:04:19.360 | be extensible and flexible. Let's get into the details a little bit.
00:04:23.200 | If you remember, what kind of type system did Pascal have?
00:04:27.120 | What type system, typing system, did Algol 60 have?
00:04:31.040 | Basically, Pascal was sort of the simplest language that
00:04:36.320 | Niklaus Wirth could define that served the needs of Niklaus
00:04:41.040 | Wirth at the time. And it has a sort of a highly
00:04:46.640 | moral tone to it. That is, if you can say it
00:04:50.560 | in Pascal, it's good. And if you can't, it's not so good.
00:04:54.560 | Whereas Simula allowed you basically to build your own type system.
00:05:02.880 | So instead of trying to fit yourself into
00:05:07.200 | Niklaus Wirth's world, Kristen Nygaard's language and
00:05:12.160 | Ole-Johan Dahl's language allowed you to build your own.
00:05:15.760 | So it's sort of close to the original idea of you build a domain-specific
00:05:23.200 | language. As a matter of fact, what you build is a
00:05:27.440 | set of types and relations among types that allows you
00:05:31.920 | to express something that's suitable for an
00:05:34.800 | application. So when you say types, stuff you're saying has echoes of
00:05:40.240 | object-oriented programming. Yes, they invented it. Every language
00:05:44.800 | that uses the word "class" for type is a descendant of
00:05:49.920 | Simula. Directly or indirectly.
00:05:55.040 | Kristen Nygaard and Ole-Johan Dahl were mathematicians,
00:05:59.600 | and they didn't think in terms of types.
00:06:03.680 | But they understood sets and classes of elements, and so they called their types
00:06:09.840 | "classes." And basically in C++, as in
00:06:14.640 | Simula, classes are user-defined types. So can you try the impossible task and
00:06:20.960 | give a brief history of programming languages from your
00:06:25.360 | perspective? So we started with Algol 60, Simula,
00:06:30.080 | Pascal, but that's just the 60s and 70s. I can try.
00:06:36.560 | The most sort of interesting and major improvement of programming languages was
00:06:43.600 | Fortran, the first Fortran. Because before that,
00:06:47.120 | all code was written for a specific machine, and each
00:06:50.720 | specific machine had a language, an assembly language or
00:06:56.240 | macro assembler or some extension of that idea.
00:07:00.000 | But you are writing for a specific machine
00:07:03.520 | in the language of that machine. And
00:07:09.280 | Backus and his team at IBM built a language that
00:07:13.680 | would allow you to write what you really wanted.
00:07:17.840 | That is, you could write it in a language that was natural for people.
00:07:23.280 | Now these people happened to be engineers and physicists, so
00:07:26.800 | the language that came out was somewhat unusual for the rest of the world.
00:07:30.560 | But basically they said "formula translation" because they wanted to have
00:07:34.240 | the mathematical formulas translated into the machine. And as a
00:07:39.120 | side effect, they got portability. Because now they are
00:07:44.240 | writing in the terms that the humans used, and
00:07:48.640 | the way humans thought. And then they had a program that
00:07:52.800 | translated it into the machine's needs. And that was new, and that was
00:07:58.320 | great. And it's something to remember. We want to
00:08:02.240 | raise the language to the human level, but we don't want to lose the efficiency.
00:08:09.200 | And that was the first step towards the human.
00:08:12.320 | That was the first step. And of course, they were very particular kind of
00:08:16.960 | humans. Business people were different, so they
00:08:20.160 | got COBOL instead, etc. And Simula came out.
00:08:25.840 | No, let's not go to Simula yet. Let's go to Algol.
00:08:30.720 | Fortran didn't have, at the time, the notions of...
00:08:36.640 | not a precise notion of type, not a precise notion of scope,
00:08:41.920 | not a set of translation phases that was what we have today.
00:08:49.520 | Lexical, syntax, semantics. It was sort of a bit of a muddle in the early days, but
00:08:56.560 | hey, they'd just done the biggest breakthrough in the history of
00:09:00.240 | programming, right? So you can't criticize them for not having gotten all the
00:09:04.560 | technical details right. So we got Algol. That was very pretty.
00:09:09.520 | And most people in commerce and science
00:09:14.400 | considered it useless, because it was not flexible enough, and it wasn't
00:09:18.800 | efficient enough, etc. But that was a breakthrough from a
00:09:24.480 | technical point of view. Then Simula came along to make that idea
00:09:30.640 | more flexible, and you could define your own types.
00:09:34.480 | And that's where I got very interested.
00:09:39.760 | Kristen Nygaard, who's the main idea man behind
00:09:43.040 | Simula. That was late 60s. This was late 60s.
00:09:46.720 | Well, he was a visiting professor in Aarhus, and so I learned object-oriented
00:09:52.480 | programming by sitting around and, well, in theory,
00:09:57.680 | discussing with Kristen Nygaard.
00:10:02.720 | But Kristen, once you get started and in full flow, it's very hard to get a
00:10:08.880 | word in edgeways. You're just listening. So it was great. I learned it from there.
00:10:14.160 | Not to romanticize the notion, but it seems like a big leap
00:10:17.840 | to think about object-oriented programming.
00:10:21.840 | It's really a leap of abstraction. Yes. And was that as
00:10:29.840 | big and beautiful of a leap as it seems from
00:10:33.360 | now in retrospect, or was it an obvious one
00:10:36.560 | at the time? It was not obvious, and many people have tried to do
00:10:43.200 | something like that, and most people didn't come up with
00:10:46.240 | something as wonderful as Simula. Lots of people got their
00:10:51.520 | PhDs and made their careers out of forgetting about Simula or never
00:10:56.800 | knowing it. For me, the key idea was basically I
00:11:00.640 | could get my own types. And that's the idea that goes further
00:11:06.400 | into C++, where I can get better types and
00:11:10.640 | more flexible types and more efficient types. But it's still the
00:11:14.160 | fundamental idea. When I want to write a program, I want to write it with my types
00:11:19.200 | that are appropriate to my problem and under the constraints that I'm under
00:11:26.480 | with hardware, software, environment, etc.
00:11:29.840 | And that's the key idea. People picked up on the class hierarchies
00:11:36.400 | and the virtual functions and the inheritance, and
00:11:41.760 | that was only part of it. It was an interesting and major part and still a
00:11:47.440 | major part in a lot of graphics stuff, but it was not the most fundamental.
00:11:53.600 | It was when you wanted to relate one type to another, you don't want
00:11:58.320 | them all to be independent. The classical example is that you
00:12:03.920 | don't actually want to write a city simulation with vehicles,
00:12:10.400 | where you say, well, if it's a bicycle, write the code for turning a bicycle to
00:12:15.360 | the left. If it's a normal car, turn right the
00:12:18.480 | normal car way. If it's a fire engine, turn right the fire engine way.
00:12:23.520 | You get these big case statements and bunches of if statements and such.
00:12:28.640 | Instead, you tell the base class that
00:12:34.880 | that's the vehicle and say, turn left the way you want to.
00:12:39.600 | And this is actually a real example. They used it to simulate
00:12:44.480 | and optimize the emergency services for
00:12:51.200 | somewhere in Norway back in the 60s. So this was one of the early examples
00:12:58.480 | for why you needed inheritance and you needed a runtime polymorphism.
00:13:06.000 | Because you wanted to handle this set of
00:13:09.520 | vehicles in a manageable way. You can't just rewrite your code each
00:13:16.400 | time a new kind of vehicle comes along.
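The vehicle example described here can be sketched in C++ roughly as follows. This is only an illustration: the class names (`Vehicle`, `Bicycle`, `Car`, `FireEngine`), the function `turn_all_left`, and the returned strings are assumptions made up for the sketch, not taken from the Simula program mentioned in the conversation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class: every vehicle can turn left "the way it wants to".
class Vehicle {
public:
    virtual ~Vehicle() = default;
    virtual std::string turn_left() const = 0;
};

class Bicycle : public Vehicle {
public:
    std::string turn_left() const override { return "bicycle leans left"; }
};

class Car : public Vehicle {
public:
    std::string turn_left() const override { return "car steers left"; }
};

class FireEngine : public Vehicle {
public:
    std::string turn_left() const override { return "fire engine swings wide left"; }
};

// The simulation talks only to the base class. There is no case statement
// over vehicle kinds, so a new kind of vehicle needs no change here.
std::vector<std::string> turn_all_left(const std::vector<std::unique_ptr<Vehicle>>& fleet) {
    std::vector<std::string> log;
    for (const auto& v : fleet) {
        assert(v != nullptr);            // precondition: no empty slots in the fleet
        log.push_back(v->turn_left());   // virtual call picks the right behavior at run time
    }
    return log;
}
```

Adding, say, an ambulance under these assumptions would mean writing one more derived class; `turn_all_left` would stay as it is, which is the point made above about not rewriting code each time a new kind of vehicle comes along.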
00:13:19.920 | Yeah, that's a beautiful, powerful idea. And of course, it stretches through your
00:13:24.080 | work with C++, as we'll talk about. But I think you've structured it nicely.
00:13:31.440 | What other breakthroughs came along in the history of programming languages?
00:13:35.920 | If we were to tell the history in that way?
00:13:39.440 | Obviously, I'm better at telling the part of the history that
00:13:42.800 | is the path I'm on, as opposed to all the paths.
00:13:46.560 | Yeah, you skipped the hippie John McCarthy and Lisp,
00:13:50.160 | one of my favorite languages. But Lisp is not one of my favorite
00:13:55.360 | languages. It's obviously important. It's
00:13:58.400 | obviously interesting. Lots of people write code in it and then
00:14:02.640 | they rewrite it into C or C++ when they want to go to production.
00:14:06.800 | In the world I'm in, which is constrained by performance,
00:14:14.400 | reliability issues, deployability,
00:14:19.680 | cost of hardware. I don't like things to be too dynamic.
00:14:26.480 | It is really hard to write a piece of code that's perfectly flexible,
00:14:32.480 | that you can also deploy on a small computer,
00:14:36.240 | and that you can also put in, say, a telephone switch
00:14:39.280 | in Bogota. What's the chance, if you get an error and you find yourself in the
00:14:44.480 | debugger, that the telephone switch in Bogota on
00:14:48.160 | late Sunday night has a programmer around?
00:14:51.360 | The chance is zero. And so a lot of things I think most about
00:14:58.240 | can't afford that flexibility. I'm quite aware that maybe
00:15:05.680 | 70, 80 percent of all code is not under the kind of constraints
00:15:11.120 | I'm interested in. But somebody has to do the job I'm
00:15:16.080 | doing, because you have to get from these
00:15:19.280 | high-level flexible languages to the hardware.
00:15:23.040 | The stuff that lasts for 10, 20, 30 years is robust,
00:15:27.280 | operates under very constrained conditions, yes, absolutely.
00:15:30.880 | That's right. And it's fascinating and beautiful in its own way.
00:15:34.080 | C++ is one of my favorite languages, and so is Lisp. So I can
00:15:40.880 | embody the two, for different reasons, as a programmer.
00:15:47.120 | I understand why Lisp is popular, and I can see
00:15:51.120 | the beauty of the ideas, and similarly with
00:15:55.200 | Smalltalk. It's just not as relevant in my world.
00:16:05.120 | And by the way, I distinguish between those and the functional languages
00:16:09.280 | where I go to, things like ML and Haskell.
00:16:13.600 | Different kind of languages, they have a different kind of beauty,
00:16:18.160 | and they're very interesting. And I actually try to learn from
00:16:23.680 | all the languages I encounter to see what is there that would make
00:16:29.920 | working on the kind of problems I'm interested in
00:16:33.680 | with the kind of constraints that I'm interested in,
00:16:38.800 | what can actually be done better, because we can surely do better than we do today.
00:16:45.760 | You've said that it's good for any professional programmer to know at
00:16:49.920 | least five languages, speaking about a variety of languages
00:16:54.240 | that you've taken inspiration from. And you've listed
00:16:58.800 | yours as being, at least at the time, C++, obviously, Java, Python,
00:17:05.440 | Ruby, and JavaScript. Can you, first of all, update that list, modify it?
00:17:12.240 | You don't have to be constrained to just five, but can you describe what
00:17:17.520 | you picked up also from each of these languages, how you
00:17:22.080 | see them as inspirations for even you working with C++?
00:17:25.920 | This is a very hard question to answer. So, about languages, you should know
00:17:33.680 | languages. I reckon I knew about 25 or thereabouts when I did C++.
00:17:41.040 | It was easier in those days because the languages were smaller
00:17:44.800 | and you didn't have to learn a whole programming environment and such to do
00:17:50.000 | it. You could learn the language quite easily.
00:17:52.800 | And it's good to learn so many languages.
00:17:55.920 | I imagine, just like with natural language for communication,
00:18:02.320 | there's different paradigms that emerge in all of them.
00:18:05.600 | Yeah. That there's commonalities and so on.
00:18:08.800 | So, I picked five out of a hat. You picked five out of a hat.
00:18:12.640 | Obviously. The important thing that the number is not one.
00:18:17.680 | That's right. It's like, I don't like, I mean, if you're a monoglot, you are
00:18:22.880 | likely to think that your own culture is the only one
00:18:26.000 | superior to everybody else's. A good learning of a foreign language and a
00:18:30.160 | foreign culture is important. It helps you think and be a
00:18:34.320 | better person. With programming languages, you become a
00:18:37.440 | better programmer, better designer with the second language.
00:18:41.680 | Now, once you've got two, the way to five is not
00:18:45.600 | that long. It's the second one that's most important. And then when I had to
00:18:51.920 | pick five, I was sort of
00:18:56.400 | thinking what kinds of languages are there. Well, there's the
00:19:00.080 | really low-level stuff. It's good. It's actually good to know machine code.
00:19:04.960 | Even still? Sorry to interrupt. Even today.
00:19:08.320 | The C++ optimizers write better machine code than I do.
00:19:13.760 | Yes. But I don't think I could appreciate them if I actually didn't
00:19:18.560 | understand machine code and machine architecture.
00:19:22.480 | At least in my position, I have to understand a bit of it.
00:19:26.240 | Because you mess up the cache and you're off
00:19:30.320 | in performance by a factor of a hundred. Right? It shouldn't be that if you are
00:19:35.760 | interested in either performance or the size of the computer you have to
00:19:40.080 | deploy. So I would go with assembler.
00:19:46.000 | I used to mention C, but these days going low level
00:19:50.240 | is not actually what gives you the performance.
00:19:53.280 | It is to express your ideas so cleanly that you can think about it and the
00:19:58.320 | optimizer can understand what you're up to.
00:20:01.200 | My favorite way of optimizing these days is to throw
00:20:04.720 | out the clever bits and see if it still runs fast.
00:20:09.120 | And sometimes it runs faster. So I need the abstraction mechanisms of
00:20:14.800 | something like C++ to write compact high performance code.
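One way to picture "throw out the clever bits" is to put a hand-tuned loop next to the direct statement of the same idea. The sketch below is illustrative only: the function names `sum_clever` and `sum_direct` are hypothetical, and nothing here is measured. Whether the direct version is actually as fast or faster depends on the compiler, the flags, and the hardware.

```cpp
#include <numeric>
#include <vector>

// "Clever" version: manual pointer arithmetic with hand unrolling by two.
// The intent (add everything up) is buried in the bookkeeping.
long long sum_clever(const std::vector<int>& v) {
    const int* p = v.data();
    const int* end = p + v.size();
    long long total = 0;
    while (end - p >= 2) {   // process two elements per iteration
        total += p[0];
        total += p[1];
        p += 2;
    }
    if (p != end) {          // at most one leftover element
        total += *p;
    }
    return total;
}

// Direct version: says what is meant and leaves unrolling or vectorizing
// to the optimizer.
long long sum_direct(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}
```

Both functions return the same value for any input; the difference is how much of the programmer's intent a reader, or an optimizer, can see directly in the code.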
00:20:20.480 | There was a beautiful keynote by Jason Turner at CppCon a couple of years
00:20:25.360 | ago where he decided he was going to
00:20:27.920 | program Pong on
00:20:32.880 | a Motorola 6800, I think it was. And he says, "Well this is relevant
00:20:39.840 | because it looks like a microcontroller. It has specialized hardware. It has not
00:20:44.960 | very much memory and it's relatively slow."
00:20:48.320 | And so he shows in real time how he writes Pong
00:20:52.480 | starting with fairly straightforward low level stuff,
00:20:57.200 | improving his abstractions. And what he's doing
00:21:00.720 | he's writing C++ and it translates
00:21:06.400 | into x86 assembler which you can do with Clang and you can see it in real
00:21:12.720 | time. It's the Compiler Explorer which you can
00:21:16.800 | use on the web. And then he wrote a little program that
00:21:19.600 | translated x86 assembler into
00:21:23.760 | Motorola Assembler. And so he types and you can see this thing
00:21:28.400 | in real time. Wow. You can see it in real time and even if you can't read the
00:21:32.320 | assembly code you can just see it. His code gets
00:21:35.840 | better. The code, the assembler gets smaller. He
00:21:40.480 | increases the abstraction level, uses C++11, as it were, better.
00:21:46.560 | His code gets cleaner. It gets more easily maintainable. The code shrinks
00:21:50.720 | and it keeps shrinking. And I could not in any reasonable amount
00:21:57.680 | of time write that assembler as good as the
00:22:01.680 | compiler generated from really quite nice modern C++.
00:22:06.560 | And I'll go as far as to say that the thing that looked like C
00:22:10.640 | was significantly uglier, and larger
00:22:18.960 | when it became machine code. So the abstractions that can be
00:22:24.720 | optimized are important. I would love to see that kind of
00:22:28.560 | visualization in larger code bases. Yeah. That might be beautiful. But you can't
00:22:32.960 | show a larger code base in a one hour talk and have it fit on
00:22:37.360 | screen. Right. So that's C and C++. So my two languages would be machine code
00:22:42.800 | and C++. And then I think you can learn a lot
00:22:47.440 | from the functional languages. So pick Haskell or ML. I don't care which.
00:22:53.440 | I think actually you learn the same lessons
00:22:57.200 | of expressing especially mathematical notions really clearly
00:23:03.280 | and having a type system that's really strict.
00:23:08.000 | And then you should probably have a language for
00:23:11.360 | sort of quickly churning out something. You could pick JavaScript. You could
00:23:17.600 | pick Python. You could pick Ruby. What do you make of JavaScript in general?
00:23:22.560 | So you're talking in the platonic sense about languages, about
00:23:27.600 | what they're good at, what their philosophy of design is. But
00:23:31.920 | there's also a large user base behind each of these languages and they use it
00:23:36.080 | in the way sometimes maybe it wasn't really designed
00:23:39.120 | for. That's right. JavaScript is used way beyond
00:23:42.080 | probably what it was designed for. Let me say it this way. When you build a
00:23:46.160 | tool, you do not know how it's going to be used.
00:23:49.520 | You try to improve the tool by looking at how it's being used and when people
00:23:54.320 | cut their fingers off, and trying to stop that from happening.
00:23:58.320 | But really you have no control over how something is used.
00:24:02.480 | So I'm very happy and proud of some of the things C++ is being used at
00:24:07.040 | and some of the things I wish people wouldn't do.
00:24:10.640 | Bitcoin mining being my favorite example. It uses as much energy as Switzerland
00:24:16.560 | and mostly serves criminals. Yeah. But back to the languages.
00:24:23.280 | I actually think that having JavaScript run in the browser
00:24:28.080 | was an enabling thing for a lot of things. Yes, you could have
00:24:33.440 | done it better, but people were trying to do it better
00:24:36.640 | and they were using
00:24:39.520 | sort of more principled language designs, but they just couldn't do it right.
00:24:44.800 | And the non-professional programmers that write
00:24:49.760 | lots of that code just couldn't understand them. So
00:24:53.840 | it did an amazing job for what it was. It's not the prettiest
00:25:00.000 | language and I don't think it ever will be
00:25:02.720 | the prettiest language, but let's not be bigots here.
00:25:07.680 | So what was the origin story of C++? You basically gave a few
00:25:14.800 | perspectives of your inspiration of object-oriented
00:25:19.840 | programming. You had a connection with C and
00:25:23.280 | performance efficiency was an important thing you
00:25:26.640 | were drawn to. Efficiency and reliability. Reliability.
00:25:30.720 | You have to get both. What's reliability? I
00:25:35.600 | really want my telephone calls to get through
00:25:39.200 | and I want the quality of what I am talking coming out at the other end.
00:25:44.720 | The other end might be in London or wherever.
00:25:49.840 | And you don't want the system to be crashing.
00:25:53.440 | If you're doing a bank, you mustn't crash.
00:25:57.040 | It might be your bank account that is in trouble. There's different
00:26:02.640 | constraints like in games, it doesn't matter too much if there's a crash.
00:26:06.240 | Nobody dies and nobody gets ruined, but I'm interested in the combination of
00:26:11.920 | performance, partly because of sort of
00:26:16.320 | speed of things being done, part of being able to do things that is
00:26:21.120 | necessary to have reliability
00:26:26.480 | of larger systems. If you spend all your time
00:26:31.520 | interpreting a simple function call, you are not going to have enough time to
00:26:37.280 | do proper signal processing to get the telephone calls to sound right.
00:26:42.560 | Either that or you have to have 10 times as many computers and you can't afford
00:26:46.880 | your phone anymore. It's a ridiculous idea in the modern
00:26:51.120 | world because we have solved all of those problems.
00:26:55.280 | I mean they keep popping up in different ways because we
00:26:58.640 | tackle bigger and bigger problems, so efficiency remains always an important
00:27:02.480 | aspect. But you have to think about efficiency not just
00:27:06.240 | as speed but as an enabler to important things and one of the things
00:27:12.160 | it enables is reliability, is dependability.
00:27:18.080 | When I press the pedal, the brake pedal of a car,
00:27:22.720 | it is not actually connected directly to
00:27:26.640 | anything but a computer. That computer better work.
00:27:31.680 | Let's talk about reliability just a little bit. So
00:27:34.960 | modern cars have ECUs, have millions of lines of code today.
00:27:42.560 | So this is certainly especially true of autonomous vehicles where some of the
00:27:46.640 | aspects of the control or driver assistance systems that steer
00:27:49.840 | the car, lane keeping, and so on. So how do you think, you know, I talk to
00:27:54.720 | regulators, people in government who are very nervous about testing the
00:27:59.440 | safety of these systems of software. Ultimately software that makes
00:28:04.720 | decisions that could lead to fatalities. So
00:28:09.280 | how do we test software systems like these?
00:28:13.600 | First of all, safety like performance and like security
00:28:21.200 | is a system property. People tend to look at one part of a system at a time
00:28:26.800 | and saying something like, "This is secure." That's
00:28:30.880 | all right. I don't need to do that. Yeah, that piece of code is secure. I'll
00:28:35.840 | buy your operator. If you want to have
00:28:39.760 | reliability, if you want to have performance, if you want to have
00:28:43.600 | security, you have to look at the whole system.
00:28:46.960 | I did not expect you to say that, but that's very true. Yes.
00:28:50.160 | I'm dealing with one part of the system and I want my part to be
00:28:53.920 | really good, but I know it's not the whole system.
00:28:57.280 | Furthermore, making an individual part perfect
00:29:04.000 | may actually not be the best way of getting the highest degree of
00:29:07.680 | reliability and performance and such. There's people who say C++ is
00:29:13.200 | not type safe. You can break it. Sure. I can break anything that runs on a
00:29:18.880 | computer. I may not go through your type system.
00:29:23.360 | If I wanted to break into your computer, I'll probably try SQL injection.
00:29:28.640 | It's very true. If you think about safety or even reliability at a system level,
00:29:34.400 | especially when a human being is involved,
00:29:38.160 | it starts becoming hopeless pretty quickly in terms of
00:29:45.040 | proving that something is safe to a certain level.
00:29:49.280 | Because there's so many variables, it's so complex. Well, let's get back to
00:29:53.200 | something we can talk about and actually make some progress on.
00:29:57.680 | We can look at C++ programs and we can
00:30:01.680 | try and make sure they crash less often. The way you do that
00:30:09.040 | is largely by simplification. It is not... the first step
00:30:15.760 | is to simplify the code, have less code, have code that is less likely to go
00:30:21.040 | wrong. It's not by runtime testing everything.
00:30:24.480 | It is not by big test frameworks that you're using.
00:30:29.040 | Yes, we do that also. But the first step is actually to make sure that when you
00:30:35.200 | want to express something, you can express it directly in code
00:30:40.560 | rather than going through endless loops and convolutions in your head
00:30:45.440 | before it gets down into the code. If the way you are thinking about a
00:30:51.520 | problem is not in the code, there is a missing
00:30:56.000 | piece that's just in your head. And the code,
00:30:59.360 | you can see what it does, but it cannot see what you
00:31:03.600 | thought about it unless you have expressed things
00:31:06.240 | directly. When you express things directly, you can
00:31:10.480 | maintain it. It's easier to find errors. It's easier
00:31:13.680 | to make modifications. It's actually easier to test it. And
00:31:18.240 | lo and behold, it runs faster. And therefore you can use a
00:31:23.760 | smaller number of computers, which means there's less
00:31:26.800 | hardware that could possibly break. So I think the key here is
00:31:32.480 | simplification. But it has to be, to use the Einstein
00:31:37.680 | quote, as simple as possible and no simpler.
00:31:40.080 | There are other areas with other constraints where you can be
00:31:44.880 | simpler than you can be in C++, but in the domain I'm dealing with,
00:31:50.240 | that's the simplification I'm after. So how do you inspire or
00:31:57.200 | ensure that the Einstein level of simplification is reached?
00:32:03.360 | So can you do code review? Can you look at code?
00:32:08.000 | Is there, if I gave you the code for the Ford F-150 and said,
00:32:13.760 | here, is this a mess or is this okay? Is it possible to tell?
00:32:20.080 | Is it possible to regulate? An experienced developer can look
00:32:26.160 | at code and see if it smells. I mix metaphors deliberately.
00:32:31.760 | The point is that it is hard to generate something that is
00:32:43.360 | really obviously clean and can be appreciated, but you can
00:32:50.000 | usually recognize when you haven't reached that point.
00:32:53.840 | And so if I, I've never looked at the F-150 code, so I wouldn't know.
00:33:03.360 | But I know what I would be looking for. I'll be looking for some
00:33:07.520 | tricks that correlate with bugs elsewhere.
00:33:10.720 | And I have tried to formulate rules for what good code looks like.
00:33:19.200 | And the current version of that is called the C++ Core Guidelines.
00:33:26.880 | One thing people should remember is there's what you can do
00:33:31.760 | in a language and what you should do. In a language, you have lots of things that
00:33:38.720 | are necessary in some contexts, but not in others.
00:33:42.080 | There's things that exist just because there's
00:33:45.040 | 30-year-old code out there and you can't get rid of it.
00:33:48.640 | But you can have rules that say, when you create it, try and follow these rules.
00:33:54.640 | These do not create good programs by themselves,
00:34:00.000 | but they limit the damage from mistakes,
00:34:03.600 | they limit the possibilities of mistakes. And basically, we are trying to
00:34:08.640 | say what is it that a good programmer does
00:34:12.720 | at the fairly simple level of where you use the language and how you use it.
00:34:17.760 | Now, I can put all the rules for chiseling in marble. It doesn't mean
00:34:24.640 | that somebody who follows all of those rules
00:34:27.760 | can do a masterpiece by Michelangelo. That is, there's something else to write
00:34:35.440 | a good program, just as there is something else to create
00:34:38.720 | an important work of art. That is, there's some kind of
00:34:44.400 | inspiration, understanding, gift. But we can
00:34:51.120 | approach the sort of technical, the craftsmanship level of it.
00:34:59.680 | The famous painters, the famous sculptors,
00:35:02.960 | were, among other things, superb craftsmen. They could express their ideas
00:35:10.640 | using their tools very well. And so these days, I think what I'm doing,
00:35:18.000 | what a lot of people are doing, we are still trying to figure out how it
00:35:21.600 | is to use our tools very well. For a really good piece of code,
00:35:28.960 | you need a spark of inspiration and you can't, I think, regulate that.
00:35:33.280 | You cannot say that I'll take a picture only,
00:35:38.640 | I'll buy your picture only if you're at least Van Gogh.
00:35:44.640 | There are things you can regulate, but not the inspiration.
00:35:50.400 | I think that's quite beautifully put. It is true
00:35:54.160 | that there is, as an experienced programmer, when you see code that's
00:36:00.000 | inspired, that's like Michelangelo,
00:36:05.920 | you know it when you see it. And the opposite of that is code that
00:36:11.680 | is messy, code that smells, you know when you see it.
00:36:14.960 | And I'm not sure you can describe it in words except
00:36:18.400 | vaguely through guidelines and so on. Yes, it's
00:36:21.520 | easier to recognize ugly than to recognize beauty
00:36:26.800 | in code. And the reason is that sometimes
00:36:30.080 | beauty comes from something that's innovative and unusual.
00:36:34.000 | And you have to sometimes think reasonably hard to appreciate that.
00:36:38.960 | On the other hand, the messes have things
00:36:42.720 | in common. And you can have static checkers and dynamic
00:36:48.800 | checkers that find a large number of the
00:36:55.200 | most common mistakes. You can catch a lot of sloppiness mechanically. I'm a
00:37:02.560 | great fan of static analysis in particular
00:37:07.840 | because you can check for not just the language rules but for the usage of
00:37:11.840 | language rules. And I think we will see much more
00:37:15.600 | static analysis in the coming decade. Can you describe
00:37:19.520 | what static analysis is? You represent a piece of code
00:37:25.840 | so that you can write a program that goes over
00:37:30.320 | that representation and look for things that are
00:37:35.680 | right and not right. So for instance you can
00:37:39.280 | analyze a program to see if
00:37:45.600 | resources are leaked. That's one of my favorite
00:37:50.640 | problems. It's not actually all that hard in modern C++ but you can do it.
00:37:56.960 | If you are writing in the C level you have to have a malloc and a free
00:38:01.360 | and they have to match. If you have them in a single function you can
00:38:07.760 | usually do it very easily. If there's a malloc here there should be a free there.
00:38:14.320 | On the other hand, in between can be Turing-complete code and then it becomes
00:38:18.400 | impossible. If you pass that pointer to the
00:38:22.640 | memory out of a function and then want to make sure that the free
00:38:29.200 | is done somewhere else. Now it gets really difficult.
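
[Editor's sketch] The malloc/free cases described above might look roughly like this. The function names (local_use, duplicate, duplicate_owned) are hypothetical and not from the conversation. The first function is the easy case for an analyzer, the second is the case where the pointer escapes, and the third shows one way modern C++ can put ownership into the type so the matching release is automatic.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

// Easy case: malloc and free are paired inside a single function.
std::size_t local_use(const char* text) {
    assert(text != nullptr);
    char* buf = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (buf == nullptr) return 0;
    std::strcpy(buf, text);
    std::size_t n = std::strlen(buf);
    std::free(buf);  // the matching free is visible right here
    return n;
}

// Hard case: the pointer escapes, so the free must happen somewhere else.
// Nothing in the signature tells a reader or an analyzer who is responsible.
char* duplicate(const char* text) {
    assert(text != nullptr);
    char* buf = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (buf != nullptr) std::strcpy(buf, text);
    return buf;  // caller must call std::free
}

// One modern C++ alternative: the return type owns the memory and
// releases it when the owner goes out of scope.
std::unique_ptr<char[]> duplicate_owned(const char* text) {
    assert(text != nullptr);
    std::size_t n = std::strlen(text) + 1;
    auto buf = std::make_unique<char[]>(n);
    std::memcpy(buf.get(), text, n);
    return buf;
}
```
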
00:38:33.360 | And so for static analysis you can run through a program
00:38:37.600 | and you can try and figure out if there's any leaks.
00:38:42.880 | And what you will probably find is that you will find some leaks
00:38:48.160 | and you'll find quite a few places where your analysis can't be complete.
00:38:53.760 | It might depend on runtime. It might depend
00:38:56.960 | on the cleverness of your analyzer. And it might take a long time. Some of
00:39:03.840 | these programs run for a long time. But if you combine
00:39:10.080 | such analysis with a set of rules that says how
00:39:14.480 | people could use it, you can actually see why the rules are violated.
00:39:19.760 | And that stops you from getting into the impossible complexities. You don't want
00:39:25.840 | to solve the halting problem.
00:39:28.720 | So static analysis is looking at the code without running the code.
00:39:32.240 | Yes. And thereby it's almost, not in production code, but it's almost
00:39:38.800 | like an educational tool of how the language should be used.
00:39:43.760 | It guides you. At its best, it would guide you in how you write future
00:39:50.160 | code as well and you learn together.
00:39:52.400 | Yes. So basically you need a set of rules for how you use the language.
00:39:56.960 | Then you need a static analysis that catches your mistakes when you violate
00:40:04.560 | the rules or when your code ends up doing things that it shouldn't, despite
00:40:09.760 | the rules, because there is the language rules. We can go further.
00:40:13.600 | And again, it's back to my idea that I would much rather find errors before I
00:40:18.960 | start running the code. If nothing else, once the code runs, if
00:40:24.160 | it catches an error at run times, I have to have an error handler.
00:40:28.240 | And one of the hardest things to write in code is error handling code, because
00:40:33.840 | you know something went wrong. Do you know really exactly what went
00:40:38.400 | wrong? Usually not. How can you recover when you
00:40:41.760 | don't know what the problem was? You can't be 100% sure what the problem
00:40:46.400 | was in many, many cases. And this is part of it. So yes, we need
00:40:54.240 | good languages with good type systems. We need rules for how to use them.
00:40:58.880 | We need static analysis. And the ultimate for static analysis is,
00:41:03.120 | of course, program proof, but that still doesn't scale to the kind
00:41:07.600 | of systems we deploy. Then we start needing testing and
00:41:13.280 | the rest of the stuff.
00:41:15.280 | So C++ is an object-oriented programming language that creates,
00:41:21.520 | especially with its newer versions, as we'll talk about, higher and higher
00:41:24.560 | levels of abstraction. So how do you design...
00:41:30.480 | Let's even go back to the origin of C++. How do you design something with so much
00:41:34.400 | abstraction that's still efficient and
00:41:39.520 | is still something that you can manage, do static analysis on, you can
00:41:47.200 | have constraints on, that can be reliable, all those things we've talked about.
00:41:51.840 | So to me, there's a slight tension between
00:41:58.000 | high-level abstraction and efficiency. That's a good question. I could probably
00:42:03.760 | have a year's course just trying to answer it.
00:42:08.400 | Yes, there's a tension between efficiency and abstraction,
00:42:12.240 | but you also get the interesting situation that you get the best
00:42:17.040 | efficiency out of the best abstraction. And my main tool
00:42:23.680 | for efficiency, for performance, actually is abstraction.
00:42:28.080 | So let's go back to how C++ got there.
00:42:32.000 | You said it was an object-oriented programming language. I actually never
00:42:35.520 | said that. It's always quoted, but I never did. I
00:42:39.600 | said C++ supports object-oriented programming and
00:42:44.080 | other techniques. And that's important, because I think
00:42:48.800 | that the best solution to most
00:42:52.800 | complex, interesting problems require ideas and techniques from
00:42:59.520 | things that have been called object-oriented,
00:43:04.560 | data abstraction, functional, traditional C-style code,
00:43:11.840 | all of the above. And so when I was designing C++,
00:43:18.720 | I soon realized I couldn't just add features.
00:43:23.520 | If you just add what looks pretty, or what people ask for,
00:43:27.200 | or what you think is good, one by one, you're not going to get a
00:43:31.840 | coherent whole. What you need is a set of guidelines
00:43:36.320 | that guides your decisions. Should this feature be in, or
00:43:41.760 | should this feature be out? How should a feature be modified before
00:43:46.640 | it can go in, and such? And there's a, in the book I wrote
00:43:50.960 | about that, "The Design and Evolution of C++," there's a
00:43:54.560 | whole bunch of rules like that. Most of them are not
00:43:57.520 | language technical. They're things like,
00:44:02.880 | "Don't violate static type system," because I like static type system
00:44:07.200 | for the obvious reason that I like things to be reliable on
00:44:13.760 | reasonable amounts of hardware. But one of these rules is,
00:44:19.360 | it's a zero overhead principle. The what kind of principle? The zero overhead
00:44:23.200 | principle. It basically says that if you
00:44:28.320 | have an abstraction, it should not cost anything compared to write
00:44:34.080 | the equivalent code at a lower level.
00:44:39.360 | So if I have, say, a matrix multiply, it should be written in such a way
00:44:48.720 | that you could not drop to the C level of abstraction and use arrays and
00:44:53.920 | pointers and such and run faster. And so people have written such
00:45:00.800 | matrix multiplications, and they've actually gotten
00:45:04.800 | code that ran faster than Fortran, because once you had the right
00:45:08.560 | abstraction, you can eliminate, you can eliminate temporaries, and you
00:45:13.520 | can do loop fusion and other good stuff like
00:45:17.680 | that. That's quite hard to do by hand and in a
00:45:20.400 | lower level language. And there's some really nice examples of
00:45:24.320 | that. And the key here is that that matrix
00:45:29.520 | multiplication, the matrix abstraction, allows you to write code that's simple
00:45:35.360 | and easy. You can do that in any language.
00:45:37.520 | But with C++, it has the features so that you can also
00:45:41.040 | have this thing run faster than if you hand-coded it.
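
[Editor's sketch] One technique commonly used to eliminate temporaries and fuse loops is expression templates. The example below applies it to vector addition, a simpler stand-in for the matrix multiplication discussed here, and the names Vec and Sum are hypothetical. It is an illustration under those assumptions, not the code from the talks mentioned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical expression node: records "lhs + rhs" without computing it.
// It stores references, so it must be consumed within the same full expression.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

class Vec {
public:
    explicit Vec(std::size_t n, double v = 0.0) : data_(n, v) {}
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Assigning from an expression evaluates it element by element in one loop,
    // so no intermediate Vec is ever allocated.
    template <typename L, typename R>
    Vec& operator=(const Sum<L, R>& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = e[i];
        return *this;
    }

private:
    std::vector<double> data_;
};

// a + b builds a lightweight Sum node instead of a new Vec.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

// (a + b) + c nests the nodes, still without touching the data.
template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) { return {a, b}; }
```

With these definitions, `x = a + b + c;` reads like ordinary arithmetic but runs as a single loop computing `a[i] + b[i] + c[i]`, which is the loop a careful programmer would write by hand.
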
00:45:45.360 | Now, people have given that lecture many times, I and others,
00:45:50.320 | and a very common question after the talk, where you have demonstrated that
00:45:54.640 | you can outperform Fortran for dense matrix multiplication, people
00:45:59.120 | come up and say, "Yeah, but that was C++.
00:46:01.680 | If I rewrote your code in C, how much faster would it run?"
00:46:06.000 | The answer is, much slower. This happened the first time, actually,
00:46:12.400 | back in the '80s, with a friend of mine called Doug McIlroy,
00:46:15.760 | who demonstrated exactly this effect.
00:46:20.080 | And so, the principle is, you should give programmers the tools so that their
00:46:26.720 | abstractions can follow the zero-overhead principle.
00:46:30.320 | Furthermore, when you put in a language feature in C++,
00:46:34.000 | or a standard library feature, you try to meet this.
00:46:38.000 | It doesn't mean it's absolutely optimal, but it means if you hand-code it
00:46:43.200 | with the usual facilities in the language,
00:46:46.880 | in C++, in C, you should not be able to better it.
00:46:51.040 | Usually, you can do better if you use embedded assembler for machine code,
00:46:57.920 | for some of the details to utilize part of a computer that the compiler doesn't
00:47:03.440 | know about. But you should get to that point before
00:47:06.640 | you beat the abstraction.
00:47:09.120 | So, that's a beautiful ideal to reach for.
00:47:12.960 | And we meet it quite often.
00:47:16.080 | So, where's the magic of that coming from?
00:47:19.040 | There's some of it is the compilation process, so the implementation of C++.
00:47:23.520 | Some of it is the design of the feature itself,
00:47:27.520 | the guidelines. So, I've recently, and often, talked to Chris Lattner,
00:47:33.440 | so, Clang. What, just out of curiosity, is your
00:47:40.320 | relationship in general with the different implementations of C++
00:47:44.800 | as you think about you and committee and other people in C++, think about the
00:47:49.280 | design of new features or design of previous features.
00:47:54.080 | In trying to reach the ideal of zero overhead,
00:47:59.840 | does the magic come from the design, the guidelines, or from the
00:48:05.120 | implementations?
00:48:06.480 | And. Not or.
00:48:09.520 | You go for programming technique,
00:48:13.840 | program language features, and implementation techniques. You need all
00:48:17.760 | three.
00:48:18.960 | And how can you think about all three at the same time?
00:48:23.680 | It takes some experience, takes some practice, and sometimes you get it wrong.
00:48:27.920 | But after a while, you sort of get it right.
00:48:31.280 | I don't write compilers anymore, but
00:48:36.560 | Brian Kernighan pointed out that one of the reasons C++
00:48:42.400 | succeeded was some of the craftsmanship I put into the
00:48:48.720 | early compilers. And, of course, I did the language
00:48:52.160 | design, and of course, I wrote a fair amount of code using this kind of
00:48:55.760 | stuff. And I think most of the successes
00:49:00.560 | involve progress in all three areas together.
00:49:05.600 | A small group of people can do that. Two, three people
00:49:09.600 | can work together to do something like that. It's ideal if it's one person
00:49:13.600 | that has all the skills necessary, but nobody has all the skills necessary
00:49:18.240 | in all the fields where C++ is used. So if you want to
00:49:22.560 | approach my ideal in, say, concurrent programming, you need to
00:49:27.200 | know about algorithms from concurrent programming.
00:49:30.400 | You need to know the trickery of lock-free programming.
00:49:34.240 | You need to know something about compiler techniques.
00:49:37.920 | And then you have to know some of the program areas,
00:49:41.600 | sorry, the application areas where this is,
00:49:45.520 | like some forms of graphics or some forms of
00:49:51.040 | what we call a web-serving kind of stuff. And that's very hard to get
00:49:57.600 | into a single head, but small groups can do it too.
00:50:01.440 | So is there differences in your view, not saying which is better or so on,
00:50:08.080 | but differences in the different implementations
00:50:10.400 | of C++? Why are there several? Sort of maybe a naive question from me.
00:50:18.720 | GCC, Clang, so on. This is a very reasonable question.
00:50:23.680 | When I designed C++,
00:50:27.680 | most languages had multiple implementations.
00:50:32.720 | Because if you run on an IBM, if you run on a Sun, if you run on a Motorola,
00:50:39.200 | there was just many, many companies and they each have their own compilation
00:50:42.960 | structure and their own compilers. It was just fairly common that there
00:50:47.360 | was many of them. And I wrote Cfront assuming that other
00:50:52.640 | people would write compilers for C++ if I was successful.
00:50:57.920 | And furthermore, I wanted to utilize all the back-end infrastructures that
00:51:04.400 | were available. I soon realized that my users were using
00:51:07.920 | 25 different linkers. I couldn't write my own linker.
00:51:12.800 | Yes, I could, but I couldn't write 25 linkers and also get any work done on
00:51:18.560 | the language. And so it came from a world where there
00:51:22.800 | was many linkers, many optimizers, many
00:51:27.120 | compiler front-ends, not to speak of many operating systems.
00:51:34.240 | The whole world was not an x86 and a Linux box or something,
00:51:39.440 | whatever is the standard today. In the old days, they said a
00:51:43.680 | VAX. So basically, I assumed there would be
00:51:47.840 | lots of compilers. It was not a decision that there should
00:51:51.520 | be many compilers. It was just a fact. That's the way the
00:51:55.120 | world is. And yes,
00:51:59.440 | many compilers emerged. And today, there's at least four front-ends,
00:52:08.480 | Clang, GCC, Microsoft, and EDG, the Edison Design Group.
00:52:15.200 | They supply a lot of the independent organizations and the embedded systems
00:52:22.000 | industry. And there's lots and lots of back-ends.
00:52:26.240 | We have to think about how many dozen back-ends there are.
00:52:31.680 | Because different machines have different things. Especially in the
00:52:35.280 | embedded world, the machines are very different. The architectures are very
00:52:39.600 | different. And so,
00:52:43.680 | having a single implementation was never an option. Now, I also happen to
00:52:50.880 | dislike monocultures. Monocultures?
00:52:54.400 | They are dangerous. Because whoever owns the
00:52:58.560 | monoculture can go stale, and there's no competition,
00:53:03.120 | and there's no incentive to innovate. There's a lot of incentive to put
00:53:08.320 | barriers in the way of change. Because, hey, we own the world, and it's
00:53:13.680 | a very comfortable world for us. And who are you to
00:53:16.960 | mess with that? So, I really am very happy that there's four
00:53:23.360 | front-ends for C++. Clang's great, but GCC was great.
00:53:30.800 | But then it got somewhat stale. Clang came along,
00:53:34.400 | and GCC is much better now. Competition is good.
00:53:38.480 | Microsoft is much better now. So, at least a low number of front-ends
00:53:46.400 | puts a lot of pressure on
00:53:49.920 | standards compliance, and also on performance, and error messages,
00:53:55.920 | and compile time, speed, all this good stuff that we want.
00:54:01.120 | Do you think, crazy question, there might come along,
00:54:05.760 | do you hope there might come along, implementation of C++
00:54:10.640 | written, given all its history, written from scratch? So, written today
00:54:18.240 | from scratch? Well, Clang and LLVM is more or less written from scratch.
00:54:24.880 | But there's been C++ 11, 14, 17, 20, you know, there's been a lot of...
00:54:30.880 | Sooner or later, somebody's going to try again.
00:54:33.920 | There has been attempts to write new C++ compilers, and
00:54:39.040 | some of them has been used, and some of them has been absorbed into others, and
00:54:42.960 | such. Yeah, it'll happen. So, what are the key features of C++?
00:54:49.600 | And let's use that as a way to sort of talk about
00:54:54.320 | the evolution of C++, the new feature. So,
00:54:57.600 | at the highest level, what are the features that were there in the
00:55:01.360 | beginning? What features got added? Let's first get a principle,
00:55:07.760 | an aim in place. C++ is for people who want to use
00:55:14.160 | hardware really well, and then manage the complexity of doing that through
00:55:19.280 | abstraction. And so, the first facility
00:55:24.720 | you have is a way of manipulating the machines at a fairly low level. That
00:55:31.280 | looks very much like C. It has loops, it has variables, it
00:55:38.000 | has pointers, like machine addresses, it can access memory directly, it can
00:55:44.160 | allocate stuff in the absolute minimum of space
00:55:49.360 | needed on the machine. There's a machine-facing part of C++,
00:55:54.320 | which is roughly equivalent to C. I said C++ could beat C, and it can.
00:56:00.080 | It doesn't mean I dislike C. If I disliked C,
00:56:03.280 | I wouldn't have built on it. Furthermore, after Dennis Ritchie, I'm
00:56:09.360 | probably the major contributor to modern C.
00:56:13.760 | And, well, I had lunch with Dennis most days for 16 years, and we never
00:56:21.040 | had a harsh word between us. So,
00:56:25.360 | these C versus C++ fights are for people who don't quite understand
00:56:30.400 | what's going on. Then the other part is the abstraction.
00:56:35.440 | And there, the key is the class, which is a user-defined type.
00:56:40.160 | And my idea for the class is that you should be able to build a type
00:56:44.560 | that's just like the built-in types, in the way you use them, in the way you
00:56:50.320 | declare them, in the way you get the memory,
00:56:53.840 | and you can do just as well. So, in C++, there's an int,
00:56:59.680 | as in C. You should be able to build an abstraction, a class, which we can call
00:57:05.760 | capital int, that you can use exactly like an integer
00:57:10.800 | and run just as fast as an integer. There's the idea right there. And, of
00:57:16.560 | course, you probably don't want to use the int
00:57:19.520 | itself, but it has happened. People have wanted integers that were
00:57:25.200 | range-checked so that you couldn't overflow and such, especially for very
00:57:28.960 | safety-critical applications, like the fuel injection for a
00:57:33.760 | marine diesel engine for the largest ships.
00:57:37.040 | This is a real example, by the way. This has been done.
00:57:40.640 | They built themselves an integer that was just like integer,
00:57:45.120 | except that it couldn't overflow. If there was an overflow, you went into the error
00:57:49.840 | handling. And then you built more interesting
00:57:54.080 | types. You can build a matrix, which you need to do graphics,
00:58:00.000 | or you could build a gnome for a video game.
00:58:04.960 | And all these are classes and they appear just like the built-in types?
00:58:08.400 | Exactly. In terms of efficiency and so on. So, what else is there?
00:58:11.760 | And flexibility. So, I don't know. For people who are not
00:58:18.080 | familiar with object-oriented programming, there's inheritance.
00:58:21.600 | There's a hierarchy of classes. You can,
00:58:24.960 | just like you said, create a generic vehicle that can turn left.
00:58:29.120 | So, what people found was that you don't actually...
00:58:36.880 | No, how do I say this? A lot of types are related.
00:58:43.760 | That is, the vehicles, all vehicles are related.
00:58:48.720 | Bicycles, cars, fire engines, tanks. They have some things in common and
00:58:55.440 | some things that differ. And you would like to have the common
00:58:58.800 | things common and having the differences specific.
00:59:03.600 | And when you didn't want to know about the differences, like,
00:59:06.560 | just turn left. You don't have to worry about it. That's how you get
00:59:12.800 | the traditional object-oriented programming coming out of Simula,
00:59:16.240 | adopted by Smalltalk and C++ and all the other languages.
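
[Editor's sketch] The vehicle example can be written as a small class hierarchy. The class and function names here (Vehicle, Bicycle, FireEngine, turn_all_left) are hypothetical. The point is that the caller says "turn left" without knowing which kind of vehicle it has.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// What all vehicles have in common goes in the base class.
class Vehicle {
public:
    virtual ~Vehicle() = default;
    virtual std::string turn_left() const = 0;
};

// The differences go in the derived classes.
class Bicycle : public Vehicle {
public:
    std::string turn_left() const override { return "bicycle leans left"; }
};

class FireEngine : public Vehicle {
public:
    std::string turn_left() const override { return "fire engine steers left, siren on"; }
};

// The caller only knows about Vehicle; each object picks its own behavior.
std::vector<std::string> turn_all_left(const std::vector<std::unique_ptr<Vehicle>>& fleet) {
    std::vector<std::string> log;
    for (const auto& v : fleet) {
        assert(v != nullptr);
        log.push_back(v->turn_left());
    }
    return log;
}
```
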
00:59:21.600 | The other kind of obvious similarity between types
00:59:25.280 | comes when you have something like a vector.
00:59:29.120 | Fortran gave us the vector, called array,
00:59:32.560 | of doubles. But the minute you have a vector of doubles, you want a vector
00:59:39.360 | of double precision doubles, and for short doubles, for graphics.
00:59:45.680 | Why should you not have a vector of integers while you're at it?
00:59:49.680 | Or a vector of vectors, a vector of vectors of chess pieces.
00:59:55.280 | Now you have a board, right? So, this is, you express the commonality
01:00:03.840 | as the idea of a vector, and the variations come through parameterization.
01:00:10.080 | And so, here we get the two fundamental ways of abstracting,
01:00:14.560 | of having similarities of types in C++.
01:00:21.440 | There's the inheritance, and there's a parameterization.
01:00:24.320 | There's the object-oriented programming, and there's the generic programming.
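
[Editor's sketch] Parameterization in practice: the same vector idea with different element types, including a vector of vectors of chess pieces as a board. The Piece enum and the functions largest and make_board are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Piece { Empty, Pawn, Knight, Bishop, Rook, Queen, King };

// One generic function serves every element type that supports '<'.
// Precondition: v is not empty.
template <typename T>
T largest(const std::vector<T>& v) {
    assert(!v.empty());
    T best = v[0];
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (best < v[i]) best = v[i];
    }
    return best;
}

// A vector of vectors of chess pieces: an 8x8 board with a king placed on it.
std::vector<std::vector<Piece>> make_board() {
    std::vector<std::vector<Piece>> board(8, std::vector<Piece>(8, Piece::Empty));
    board[0][4] = Piece::King;
    return board;
}
```
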
01:00:28.480 | With the templates for the generic programming?
01:00:31.360 | So, you've presented it very nicely, but now you have to make all that happen
01:00:38.880 | and make it efficient. So, generic programming,
01:00:42.480 | with templates, there's all kinds of magic going on, especially recently,
01:00:47.200 | that you can help catch up on. But it feels to me like you can do way more
01:00:52.240 | than what you just said, with templates. You can start doing
01:00:56.560 | this kind of metaprogramming, this kind of...
01:00:58.320 | You can do metaprogramming also. I didn't go there in that explanation.
01:01:04.240 | We're trying to be very basic, but go back on to the implementation.
01:01:08.000 | Implementation.
01:01:08.640 | If you couldn't implement this efficiently, if you couldn't use it
01:01:13.840 | so that it became efficient, it has no place in C++,
01:01:17.600 | because it will violate the zero overhead principle.
01:01:20.880 | So, when I had to get object-oriented programming inheritance,
01:01:27.440 | I took the idea of virtual functions from Simula.
01:01:32.320 | Virtual functions is a Simula term. Class is a Simula term.
01:01:37.120 | If you ever use those words, say thanks to Kristen Nygaard
01:01:40.960 | and Ole Johan Dahl. And I did the simplest implementation
01:01:46.560 | I knew of, which was basically a jump table.
01:01:50.800 | So, you get the virtual function table, the function goes in,
01:01:55.760 | does an indirection through a table, and get the right function.
01:01:59.120 | That's how you pick the right thing there. And I thought that was trivial.
01:02:04.960 | It's close to optimal. And it was obvious.
01:02:08.720 | It turned out that Simula had a more complicated way of doing it,
01:02:12.000 | and therefore slower. And it turns out that most languages
01:02:16.480 | have something that's a little bit more complicated,
01:02:18.800 | sometimes more flexible, but you pay for it.
01:02:21.280 | And one of the strengths of C++ was that you could actually do
01:02:25.600 | this object-oriented stuff, and your overhead compared to
01:02:31.200 | ordinary functions, there's no indirection, it's sort of in 5, 10, 25 percent.
01:02:37.440 | Just the call. It's down there. It's not two.
01:02:41.120 | And that means you can afford to use it. Furthermore, in C++,
01:02:47.120 | you have the distinction between a virtual function and a non-virtual function.
01:02:51.840 | If you don't want any overhead, if you don't need the indirection
01:02:56.000 | that gives you the flexibility in object-oriented programming,
01:02:59.440 | just don't ask for it. So the idea is that you only use
01:03:04.960 | virtual functions if you actually need the flexibility.
01:03:07.840 | So it's not zero overhead, but it's zero overhead compared
01:03:12.080 | to any other way of achieving the flexibility.
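
[Editor's sketch] The jump-table idea can be emulated by hand with a table of function pointers. This shows the general mechanism only. The actual layout a compiler uses is an implementation detail, and the names Shape, ShapeVtbl and area are hypothetical.

```cpp
#include <cassert>

struct Shape;

// The "virtual function table": one slot per virtual function.
struct ShapeVtbl {
    double (*area)(const Shape*);
};

// Each object carries a pointer to the table for its concrete type.
struct Shape {
    const ShapeVtbl* vptr;
    double a;
    double b;
};

double rect_area(const Shape* s) { return s->a * s->b; }
double tri_area(const Shape* s) { return 0.5 * s->a * s->b; }

// One table per type, shared by all objects of that type.
const ShapeVtbl rect_vtbl{rect_area};
const ShapeVtbl tri_vtbl{tri_area};

// A "virtual call": load the table pointer, load the slot, call through it.
// That one indirection is the whole cost compared with a direct call.
double area(const Shape* s) {
    assert(s != nullptr && s->vptr != nullptr);
    return s->vptr->area(s);
}
```
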
01:03:14.720 | Now, onto parameterization.
01:03:18.720 | Basically, the compiler looks at the template,
01:03:28.880 | say the vector, and it looks at the parameter,
01:03:34.320 | and then combines the two and generates a piece of code
01:03:38.800 | that is exactly as if you've written a vector of that specific type.
01:03:43.440 | So that's the minimal overhead. If you have many template parameters,
01:03:50.000 | you can actually combine code that the compiler couldn't usually see
01:03:54.000 | at the same time, and therefore get code that is faster
01:03:59.440 | than if you had handwritten the stuff, unless you are very, very clever.
01:04:05.040 | So the thing is, parameterized code, the compiler fills stuff in
01:04:11.200 | during the compilation process, not during runtime.
01:04:14.720 | That's right. And furthermore, it gives all the information it's gotten,
01:04:20.640 | which is the template, the parameter, and the context of use.
01:04:26.640 | It combines the three and generates good code.
01:04:29.520 | But it can generate...
01:04:32.480 | Now, it's a little outside of what I'm even comfortable thinking about,
01:04:37.680 | but it can generate a lot of code.
01:04:40.580 | And how do you... I remember being both amazed at the power of that idea,
01:04:48.160 | and how ugly the debugging looked.
01:04:52.800 | Yes, debugging can be truly horrid.
01:04:56.800 | Come back to this, because I have a solution.
01:04:58.880 | Anyway, the debugging was ugly.
01:05:02.320 | The code generated by C++ has always been ugly,
01:05:09.360 | because there's these inherent optimizations.
01:05:12.080 | A modern C++ compiler has front-end, middle-end, and back-end optimizations.
01:05:17.680 | Even Cfront, back in '83, had front-end and back-end optimizations.
01:05:23.760 | I actually took the code, generated an internal representation,
01:05:29.200 | munched that representation to generate good code.
01:05:33.680 | So people say, "This is not a compiler that generates C."
01:05:36.640 | The reason it generated C was I wanted to use C's code generators
01:05:41.040 | that was really good at back-end optimizations.
01:05:43.200 | But I needed front-end optimizations,
01:05:46.640 | and therefore the C I generated was optimized C.
01:05:51.280 | The way a really good handcrafted optimizer, human, could generate it,
01:05:59.680 | and it was not meant for humans.
01:06:01.120 | It was the output of a program, and it's much worse today.
01:06:05.120 | And with templates, it gets much worse still.
01:06:07.680 | So it's hard to combine simple debugging with optimal code,
01:06:16.960 | because the idea is to drag in information from different parts of the code
01:06:22.160 | to generate good code, machine code.
01:06:26.400 | And that's not readable.
01:06:29.360 | So what people often do for debugging is they turn the optimizer off.
01:06:35.920 | And so you get code that, when something in your source code
01:06:42.720 | looks like a function call, it is a function call.
01:06:45.920 | When the optimizer is turned on, it may disappear, the function call.
01:06:50.320 | It may inline.
01:06:51.120 | And so one of the things you can do is you can actually get code
01:06:57.280 | that is smaller than the function call,
01:07:01.520 | because you eliminate the function preamble and return,
01:07:05.760 | and there's just the operation there.
01:07:08.880 | One of the key things when I did templates was
01:07:14.640 | I wanted to make sure that if you have, say, a sort algorithm,
01:07:19.440 | and you give it a sorting criteria,
01:07:23.600 | if that sorting criteria is simply comparing things with less than,
01:07:30.560 | the code generated should be the less than,
01:07:34.080 | not an indirect function call to a comparison object,
01:07:40.960 | which is what it is in the source code.
01:07:44.240 | But we really want down to the single instruction.
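
[Editor's sketch] The contrast described here can be seen by sorting the same data two ways. With qsort the comparison is a function pointer called indirectly. With std::sort the comparison object's type is a template parameter, so the compiler can see the less-than and is free to inline it. Whether it actually does depends on the compiler and optimization level. The wrapper names sort_c_style and sort_template_style are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <functional>
#include <vector>

// C style: qsort only ever sees a pointer to this function.
int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

void sort_c_style(std::vector<int>& v) {
    if (v.empty()) return;
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    assert(std::is_sorted(v.begin(), v.end()));
}

// Template style: std::less<int> is part of the instantiated type, so the
// generated sort can contain the comparison itself rather than a call.
void sort_template_style(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), std::less<int>{});
    assert(std::is_sorted(v.begin(), v.end()));
}
```
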
01:07:47.200 | And but anyway, turn off the optimizer, and you can debug.
01:07:53.840 | The first level of debugging can be done,
01:07:56.800 | and I always do without the optimization on,
01:07:59.040 | because then I can see what's going on.
01:08:01.360 | - And then there's this idea of concepts that puts some...
01:08:06.880 | Now, I've never even...
01:08:11.840 | I don't know if it was ever available in any form,
01:08:14.240 | but it puts some constraints on the stuff you can parameterize, essentially.
01:08:19.600 | - Let me try and explain.
01:08:21.520 | So yes, it wasn't there 10 years ago.
01:08:27.680 | We have had versions of it that actually work for the last four or five years.
01:08:33.840 | It was a design by Gabby Dos Reis, Drew Sautin, and me.
01:08:40.960 | We were professors and postdocs in Texas at the time.
01:08:44.240 | And the implementation by Andrew Sautin has been available for that time.
01:08:53.120 | And it is part of C++20.
01:08:58.400 | And there's a standard library that uses it.
01:09:02.640 | So this is becoming really very real.
01:09:06.240 | It's available in Clang and GCC, GCC for a couple of years.
01:09:13.920 | And I believe Microsoft is soon going to do it.
01:09:16.960 | We expect all of C++20 to be available in all the major compilers in 20.
01:09:23.760 | But this kind of stuff is available now.
01:09:28.640 | I'm just saying that because otherwise people might think
01:09:32.080 | I was talking about science fiction.
01:09:34.640 | And so what I'm going to say is concrete, you can run it today.
01:09:38.000 | And there's production uses of it.
01:09:41.840 | So the basic idea is that when you have a generic component, like a sort function,
01:09:51.440 | the sort function will require at least two parameters.
01:09:56.400 | One, a data structure with a given type and a comparison criteria.
01:10:04.080 | And these things are related, but obviously you can't compare things
01:10:08.880 | if you don't know what the type of things you compare.
01:10:11.280 | And so you want to be able to say, I'm going to sort something.
01:10:18.400 | And it is to be sortable.
01:10:20.400 | What does it mean to be sortable?
01:10:21.840 | You look it up in the standard.
01:10:23.280 | It has to be a sequence with a beginning and an end.
01:10:27.040 | There has to be random access to that sequence.
01:10:31.120 | And there has to be, the element types has to be comparable.
01:10:37.120 | Which means less than operator can operate on it.
01:10:41.200 | Less than logical operator can operate on it.
01:10:42.800 | Basically what concepts are, they're compile time predicates.
01:10:47.200 | They're predicates you can ask, are you a sequence?
01:10:51.600 | I have a begin and end.
01:10:53.040 | Are you a random access sequence?
01:10:56.800 | I have a subscripting and plus.
01:10:59.120 | Is your element type something that has a less than?
01:11:02.800 | I have a less than.
01:11:03.600 | And so basically that's the system.
01:11:07.360 | And so instead of saying, I will take a parameter of any type,
01:11:11.760 | it'll say, I'll take something that's sortable.
01:11:13.920 | And it's well defined.
01:11:17.440 | And so we say, okay, you can sort with less than.
01:11:20.400 | I don't want less than.
01:11:21.760 | I want greater than or something I invent.
01:11:24.240 | So you have two parameters, the sortable thing and the
01:11:27.600 | comparison criteria and the comparison criteria will say,
01:11:31.520 | well, I can, you can write it saying it should operate on the
01:11:37.040 | element type and it has the comparison operations.
01:11:42.240 | So that's simply the fundamental thing.
01:11:45.440 | It's compile time predicates.
01:11:47.120 | Do you have the properties I need?
01:11:48.880 | So it specifies the requirements of the code on the parameters
01:11:54.560 | that it gets, it's very similar to types actually.
01:11:58.960 | But operating in the space of concepts.
01:12:03.280 | Concepts.
01:12:04.080 | The word concept was used by Alex Stepanov, who is sort of
01:12:10.560 | the father of generic programming in the context of C++.
01:12:14.000 | There's other places that use that word, but the way we do generic programming is
01:12:25.120 | Alex's, and he called them concepts because he said they're
01:12:28.880 | the sort of the fundamental concepts of an area.
01:12:32.000 | So they should be called concepts.
01:12:33.920 | And we've had concepts all the time.
01:12:36.480 | If you look at the K&R book about C, C has arithmetic types
01:12:41.760 | and it has integral types.
01:12:47.200 | It says so in the book.
01:12:49.120 | And then it lists what they are and they have certain properties.
01:12:53.360 | The difference today is that we can actually write a concept
01:12:57.040 | that will ask a type, are you an integral type?
01:13:00.560 | Do you have the properties necessary to be an integral type?
01:13:05.200 | Do you have plus minus divide and such?
01:13:07.600 | So maybe the story of concepts, because I thought it might be
01:13:16.000 | part of C++11.
01:13:17.600 | C++0x or whatever it was at the time.
01:13:22.720 | What was the, why didn't it, like what, we'll talk a little
01:13:28.160 | bit about this fascinating process of standards because I
01:13:30.960 | think it's really interesting for people.
01:13:32.560 | It's interesting for me, but why did it take so long?
01:13:37.280 | What shapes did the idea of concepts take?
01:13:40.000 | What were the challenges?
01:13:42.960 | - Back in '87 or thereabouts.
01:13:46.880 | - 1987?
01:13:48.240 | - In 1987 or thereabouts, when I was designing templates,
01:13:52.880 | obviously I wanted to express the notion of what is required
01:13:57.760 | by a template of its arguments.
01:14:00.000 | And so I looked at this and basically for templates,
01:14:04.960 | I wanted three properties.
01:14:06.400 | I wanted to be very flexible.
01:14:10.400 | It had to be able to express things I couldn't imagine
01:14:16.160 | because I know I can't imagine everything and I've been
01:14:19.600 | suffering from languages that try to constrain you to only do
01:14:24.240 | what the designer thought good.
01:14:26.160 | I didn't want to do that.
01:14:28.000 | Secondly, it had to run faster, as fast or faster than
01:14:33.680 | handwritten code.
01:14:35.200 | So basically if I have a vector of T and I take a vector of
01:14:39.520 | char, it should run as fast as you build a vector of char
01:14:44.160 | yourself without parameterization.
01:14:46.640 | And thirdly, I wanted to be able to express the constraints
01:14:53.680 | of the arguments, have proper type checking of the
01:14:59.680 | interfaces.
01:15:00.400 | And neither I nor anybody else at the time knew how to get
01:15:05.840 | all three.
01:15:06.320 | And I thought for C++, I must have the two first.
01:15:10.800 | Otherwise it's not C++.
01:15:13.680 | And it bothered me for another couple of decades that I
01:15:17.200 | couldn't solve the third one.
01:15:18.560 | I mean, I was the one that put function argument type checking
01:15:23.440 | into C.
01:15:24.000 | I know the value of good interfaces.
01:15:26.960 | I didn't invent that idea.
01:15:28.400 | It's very common, but I did it.
01:15:30.880 | And I wanted to do the same for templates, of course, and I
01:15:35.920 | couldn't.
01:15:36.480 | So it bothered me.
01:15:38.480 | Then we tried again, 2002, 2003.
01:15:43.120 | Gaby Dos Reis and I started analyzing the problem, explained
01:15:48.880 | possible solutions.
01:15:50.560 | It was not a complete design.
01:15:52.000 | A group in University of Indiana, an old friend of mine,
01:15:58.560 | they started a project at Indiana and
01:16:04.800 | we thought we could get a good system of concepts in another
01:16:13.040 | two or three years.
01:16:14.480 | That would have made C++11 into C++06 or 07.
01:16:23.520 | Well, it turns out that I think we got a lot of the fundamental
01:16:29.120 | ideas wrong.
01:16:31.600 | They were too conventional.
01:16:32.960 | They didn't quite fit C++ in my opinion.
01:16:37.120 | Didn't serve implicit conversions very well.
01:16:40.640 | It didn't serve mixed type arithmetic, mixed type
01:16:45.120 | computations very well.
01:16:48.000 | A lot of stuff came out of the functional community and that
01:16:55.840 | community didn't deal with multiple types in the same way
01:17:04.400 | as C++ does.
01:17:05.600 | Had more constraints on what you could express and didn't have
01:17:12.080 | the draconian performance requirements.
01:17:15.280 | And basically, we tried.
01:17:18.000 | We tried very hard.
01:17:19.200 | We had some successes, but it just in the end wasn't.
01:17:23.840 | Didn't compile fast enough, was too hard to use, and didn't run
01:17:29.360 | fast enough unless you had optimizers that was beyond the
01:17:35.520 | state of the art.
01:17:36.800 | They still are.
01:17:37.840 | So we had to do something else.
01:17:39.520 | Basically, it was the idea that a set of parameters has defined
01:17:46.400 | a set of operations and you go through an indirection table
01:17:50.800 | just like for virtual functions, and then you try to optimize
01:17:54.400 | the indirection away to get performance.
01:17:58.560 | And we just couldn't do all of that.
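[Editor's sketch, not part of the conversation: this is not the earlier concepts design itself, only a loose illustration of the trade-off described above. All names (`Comparator`, `Ascending`, `min_indirect`, `min_direct`) are hypothetical. The first function reaches its operation through an indirection, like a virtual function; the second takes the operation as a template parameter, so the compiler sees the exact call.]

```cpp
#include <cstddef>
#include <vector>

// Style 1: the operation is reached through an indirection (a virtual call).
struct Comparator {
    virtual ~Comparator() = default;
    virtual bool less(int a, int b) const = 0;
};

struct Ascending final : Comparator {
    bool less(int a, int b) const override { return a < b; }
};

// Each comparison goes through the vtable unless the optimizer can
// devirtualize it. Precondition: v is non-empty (at(0) throws otherwise).
inline int min_indirect(const std::vector<int>& v, const Comparator& cmp) {
    int best = v.at(0);
    for (std::size_t i = 1; i < v.size(); ++i)
        if (cmp.less(v[i], best)) best = v[i];
    return best;
}

// Style 2: the comparison type is a template parameter, so the call target
// is known when the template is instantiated and is easy to inline.
template <typename Cmp>
int min_direct(const std::vector<int>& v, Cmp cmp) {
    int best = v.at(0);
    for (std::size_t i = 1; i < v.size(); ++i)
        if (cmp(v[i], best)) best = v[i];
    return best;
}
```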
01:18:01.280 | But get back to the standardization.
01:18:05.520 | We are standardizing C++ under ISO rules, which are very open
01:18:11.600 | process.
01:18:12.720 | People come in, there's no requirements for education or
01:18:16.640 | experience.
01:18:17.200 | So you've started to develop C++, and there's a whole...
01:18:23.200 | What was the first standard established?
01:18:26.160 | What is that like?
01:18:26.960 | The ISO standard, is there a committee that you're referring to?
01:18:31.760 | Sure.
01:18:31.920 | There's a group of people.
01:18:32.960 | What's that like?
01:18:33.920 | How often do you meet?
01:18:35.840 | What's the discussion?
01:18:36.640 | I'll try and explain that.
01:18:38.080 | So sometime in early 1989, I think, I think it was.
01:18:47.360 | In 1989, two people, one from IBM, one from HP, turned up in
01:18:56.160 | my office and told me, "We would like to standardize C++."
01:19:01.120 | This was a new idea to me, and I pointed out that it wasn't
01:19:08.800 | finished yet.
01:19:10.000 | It wasn't ready for formal standardization and such.
01:19:13.120 | And they say, "No, Bjarne, you haven't gotten it.
01:19:15.680 | Our organizations depend on C++.
01:19:19.360 | We cannot depend on something that's owned by another
01:19:23.760 | corporation that might be a competitor.
01:19:26.880 | Of course, we could rely on you, but you might get run over by
01:19:31.200 | a bus."
01:19:31.600 | Right.
01:19:32.660 | We really need to get this out in the open.
01:19:36.320 | It has to be standardized under formal rules, and we are going
01:19:43.360 | to standardize it under ISO rules.
01:19:48.080 | And you really want to be part of it because basically,
01:19:51.600 | otherwise, we'll do it ourselves.
01:19:53.040 | And we know you can do it better.
01:19:56.400 | So through a combination of arm twisting and flattery, it got
01:20:05.040 | started.
01:20:05.760 | So in late '89, there was a meeting in D.C.
01:20:13.360 | Actually, no, it was not ISO then.
01:20:17.600 | It was ANSI, the American national standard we were doing.
01:20:20.640 | We met there.
01:20:24.000 | We were lectured on the rules of how to do an ANSI standard.
01:20:28.720 | There was about 25 of us there, which apparently was a new
01:20:32.720 | record for that kind of meeting.
01:20:35.600 | And some of the old C guys that has been standardizing C was
01:20:40.480 | there, so we got some expertise in.
01:20:42.320 | So the way this works is that it's an open process.
01:20:46.800 | Anybody can sign up if they pay the minimal fee, which is
01:20:51.200 | about $1,000.
01:20:52.320 | There was less then, just a little bit more now.
01:20:55.040 | And I think it's $1,280.
01:20:58.800 | It's not going to kill you.
01:21:01.600 | And we have three meetings a year.
01:21:05.360 | This is fairly standard.
01:21:07.280 | We try two meetings a year for a couple of years that didn't
01:21:12.000 | work too well.
01:21:12.640 | So three one-week meetings a year.
01:21:17.360 | And you meet and you have technical discussions, and then
01:21:25.040 | you bring proposals forward for votes.
01:21:28.640 | The votes are done one vote per organization.
01:21:35.200 | So you can't have, say, IBM come in with 10 people and
01:21:40.480 | dominate things.
01:21:41.440 | That's not allowed.
01:21:42.640 | And these are organizations that use C++?
01:21:46.660 | Or individuals.
01:21:48.640 | Or individuals.
01:21:49.360 | I mean, it's a bunch of people in a room deciding the design
01:21:54.720 | of a language based on which a lot of the world's systems run.
01:22:00.180 | That's right.
01:22:00.740 | Well, I think most people would agree it's better than if
01:22:04.480 | I decided it.
01:22:05.440 | Or better than if a single organization like AT&T decided it.
01:22:11.940 | I don't know if everyone agrees to that, by the way.
01:22:14.320 | Bureaucracies have their critics too.
01:22:18.020 | Look, standardization is not pleasant.
01:22:23.360 | It's horrifying.
01:22:25.680 | It's like democracy.
01:22:26.640 | But we, exactly.
01:22:28.000 | As Churchill says, democracy is the worst way except for all
01:22:31.920 | the others.
01:22:32.560 | Right?
01:22:33.060 | And it's, I would say, the same with formal standardization.
01:22:36.480 | But anyway, so we meet and we have these votes, and that
01:22:43.280 | determines what the standard is.
01:22:44.800 | A couple of years later, we extended this so it became
01:22:49.520 | worldwide.
01:22:51.760 | We have standard organizations that are active in currently
01:22:57.680 | 15 to 20 countries, and another 15 to 20 are sort of looking
01:23:06.560 | and voting based on the rest of the work on it.
01:23:11.360 | And we meet three times a year.
01:23:14.000 | Next week, I'll be in Cologne, Germany, spending a week doing
01:23:20.480 | standardization, and we'll vote out the committee draft of C++20,
01:23:25.520 | which goes to the National Standards Committees for comments
01:23:30.720 | and requests for changes and improvements.
01:23:35.120 | Then we do that, and there's a second set of votes where
01:23:38.640 | hopefully everybody votes in favor.
01:23:40.480 | This has happened several times.
01:23:42.720 | The first time we finished, we started in the first technical
01:23:47.840 | meeting was in 1990.
01:23:50.880 | The last was in '98.
01:23:53.600 | We voted it out.
01:23:54.640 | That was the standard that people used till '11, or a little
01:23:58.160 | bit past '11.
01:23:59.200 | And it was an international standard.
01:24:02.880 | All the countries voted in favor.
01:24:05.440 | It took longer with '11, and I'll mention why, but all the
01:24:12.320 | nations voted in favor.
01:24:15.120 | And we work on the basis of consensus.
01:24:19.440 | That is, we do not want something that passes 60/40,
01:24:23.440 | because then we're getting dialects and opponents and people
01:24:28.160 | will complain too much.
01:24:30.080 | They won't complain too much, but basically it has no real
01:24:34.240 | effect.
01:24:35.040 | The standards have been obeyed.
01:24:37.040 | They have been working to make it easier to use many compilers,
01:24:43.040 | many computers, and all of that kind of stuff.
01:24:45.200 | And so the first, it was traditional with ISO standards to
01:24:51.040 | take 10 years.
01:24:51.920 | We did the first one in eight. Brilliant.
01:24:54.640 | And we thought we were going to do the next one in '06, because
01:24:58.880 | now we are good at it.
01:24:59.840 | Right.
01:25:01.120 | It took 13.
01:25:03.840 | Yeah, it was named 0x.
01:25:06.800 | It was named 0x.
01:25:08.160 | Hoping that you would at least get it within the aughts, the
01:25:12.480 | single digits.
01:25:13.040 | I thought we would get, I thought we would get six, seven,
01:25:17.040 | or eight.
01:25:17.600 | The confidence of youth.
01:25:19.200 | Yeah, that's right.
01:25:20.240 | Well, the point is that this was sort of like a second system
01:25:24.720 | effect.
01:25:25.280 | That is, we now knew how to do it.
01:25:27.120 | And so we're going to do it much better.
01:25:29.280 | And we got more ambitious.
01:25:30.880 | Ambitious.
01:25:31.440 | And it took longer.
01:25:32.880 | Furthermore, there is this tendency, because it's a 10-year
01:25:37.440 | cycle, or eight, doesn't matter.
01:25:41.680 | Just before you're about to ship, somebody has a bright
01:25:46.240 | idea.
01:25:46.740 | Yeah.
01:25:49.600 | And so we really, really must get that in.
01:25:53.680 | We did that successfully with the STL.
01:25:58.400 | Yeah.
01:25:59.440 | We got the standard library that gives us all the STL stuff.
01:26:04.240 | That basically, I think it saved C++.
01:26:07.120 | It was beautiful.
01:26:07.920 | Yeah.
01:26:08.640 | And then people tried it with other things, and it didn't
01:26:12.800 | work so well.
01:26:13.600 | They got things in, but it wasn't as dramatic.
01:26:16.960 | And it took longer and longer and longer.
01:26:18.800 | So after C++ 11, which was a huge improvement, and basically
01:26:26.720 | what most people are using today, we decided never again.
01:26:31.600 | And so how do you avoid those slips?
01:26:36.480 | And the answer is that you ship more often.
01:26:39.600 | So that if you have a slip on a 10-year cycle, by the time
01:26:47.040 | you know it's a slip, there's 11 years till you get it.
01:26:49.440 | Now, with a three-year cycle, there is about four years
01:26:55.760 | till you get it.
01:26:56.800 | The delay between feature freeze and shipping.
01:27:02.400 | So you always get one or two years more.
01:27:05.680 | And so we shipped 14 on time.
01:27:09.040 | We shipped 17 on time.
01:27:11.840 | And we will ship 20 on time.
01:27:14.640 | It'll happen.
01:27:18.160 | And furthermore, this gives a predictability that allows
01:27:23.120 | the implementers, the compiler implementers, the library
01:27:26.240 | implementers, they have a target, and they deliver on it.
01:27:30.800 | 11 took two years before most compilers were good enough.
01:27:35.920 | 14, most compilers were actually getting pretty good in 14.
01:27:42.000 | 17, everybody shipped in 17.
01:27:45.520 | We are going to have at least almost everybody ship almost
01:27:51.840 | everything in 20.
01:27:53.200 | And I know this because they're shipping in 19.
01:27:57.520 | Predictability is good.
01:27:59.120 | Delivery on time is good.
01:28:01.040 | And so, yeah.
01:28:02.320 | That's great.
01:28:02.960 | So that's how it works.
01:28:03.920 | There's a lot of features that came in in C++11.
01:28:09.200 | There's a lot of features at the birth of C++ that were
01:28:13.600 | amazing and ideas with concepts in 2020.
01:28:17.120 | What to you is the most, just to you personally, beautiful
01:28:25.280 | or just you sit back and think, wow, that's just a nice and
01:28:33.280 | clean feature of C++?
01:28:35.200 | I have written two papers for the History of Programming
01:28:41.520 | Languages Conference, which basically ask me such questions.
01:28:45.840 | And I'm writing a third one, which I will deliver at the
01:28:50.320 | History of Programming Languages Conference in London
01:28:52.800 | next year.
01:28:54.160 | So I've been thinking about that.
01:28:55.760 | And there is one clear answer.
01:28:57.440 | Constructors and destructors.
01:28:59.520 | The way a constructor can establish the environment for
01:29:04.880 | the use of a type for an object and the destructor that
01:29:09.680 | cleans up any messes at the end of it.
01:29:12.000 | That is the key to C++.
01:29:14.800 | That's why we don't have to use garbage collection.
01:29:18.000 | That's how we can get predictable performance.
01:29:21.600 | That's how you can get the minimal overhead in many, many
01:29:25.840 | cases and have really clean types.
01:29:28.960 | It's the idea of constructor destructor pairs.
01:29:34.320 | Sometimes it comes out under the name RAII, resource
01:29:40.160 | acquisition is initialization, which is the idea that you
01:29:43.680 | grab resources in the constructor and release them
01:29:46.320 | in the destructor.
01:29:47.200 | It's also the best example of why I shouldn't be in
01:29:51.120 | advertising.
01:29:51.840 | I get the best idea and I call it resource acquisition is
01:29:56.720 | initialization.
01:29:57.760 | Not the greatest naming I've ever heard.
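[Editor's sketch, not part of the conversation: a minimal RAII example. The constructor acquires a resource and the destructor releases it, so cleanup happens on every exit path without a garbage collector. `Pool`, `Slot`, and `peak_usage` are hypothetical names; a real resource would be a file, lock, socket, or memory block.]

```cpp
// Hypothetical resource: a pool that counts how many slots are in use.
struct Pool {
    int in_use = 0;
};

// RAII handle: the constructor acquires a slot, the destructor releases it.
class Slot {
public:
    explicit Slot(Pool& p) : pool_(p) { ++pool_.in_use; }  // acquire
    ~Slot() { --pool_.in_use; }                            // release
    Slot(const Slot&) = delete;             // copying would double-release
    Slot& operator=(const Slot&) = delete;
private:
    Pool& pool_;
};

// Reports how many slots are held while two handles are alive. Both slots
// are released when the function returns, including on an exception.
inline int peak_usage(Pool& pool) {
    Slot a(pool);
    Slot b(pool);
    return pool.in_use;  // read before the destructors of a and b run
}
```

The release point is fixed by scope, which is what makes the cost predictable.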
01:30:01.440 | So it's types, abstraction of types.
01:30:08.800 | You said, I want to create my own types.
01:30:13.440 | So types is an essential part of C++ and making them
01:30:17.920 | efficient is the key part.
01:30:20.880 | And to you, this is almost getting philosophical, but the
01:30:26.640 | construction and the destruction, the creation of an
01:30:30.320 | instance of a type and the freeing of resources from that
01:30:35.680 | instance of a type is what defines the object.
01:30:39.680 | That's almost like birth and death is what defines human
01:30:45.440 | life.
01:30:46.240 | Yeah, that's right.
01:30:47.600 | By the way, philosophy is important.
01:30:50.000 | You can't do good language design without philosophy
01:30:54.640 | because what you are determining is what people can
01:30:57.040 | express and how.
01:30:58.080 | This is very important.
01:31:00.320 | By the way, constructors destructors came into C++ in
01:31:05.600 | '79, in about the second week of my work with what was then
01:31:11.040 | called C with classes.
01:31:12.480 | It is a fundamental idea.
01:31:15.120 | Next comes the fact that you need to control copying,
01:31:18.400 | because once you control, as you said, birth and death,
01:31:22.480 | you have to control taking copies, which is another way
01:31:27.280 | of creating an object.
01:31:28.480 | And finally, you have to be able to move things around.
01:31:32.000 | So you get the move operations.
01:31:35.200 | And that's the set of key operations you can define on a
01:31:39.440 | C++ type.
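[Editor's sketch, not part of the conversation: the key operations named above (construction, destruction, copy, move) written out for one type. `Buffer` is a hypothetical class used only for illustration; in practice `std::vector<int>` already provides all of this.]

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical type owning a heap array, with its key operations spelled out.
class Buffer {
public:
    // Construction: acquire the array, zero-initialized.
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    // Destruction: release it. delete[] on nullptr is a no-op.
    ~Buffer() { delete[] data_; }

    // Copy: make an independent duplicate of the elements.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            Buffer tmp(other);  // copy, then swap
            swap(tmp);
        }
        return *this;
    }

    // Move: steal the array and leave the source empty.
    Buffer(Buffer&& other) noexcept
        : size_(std::exchange(other.size_, 0)),
          data_(std::exchange(other.data_, nullptr)) {}
    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            Buffer tmp(std::move(other));
            swap(tmp);
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }

private:
    void swap(Buffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }
    std::size_t size_;
    int* data_;
};
```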
01:31:41.840 | And so to you, those things are just a beautiful part of C++
01:31:49.520 | that is at the core of it all.
01:31:51.940 | You mentioned that you hope there will be one unified set
01:31:55.760 | of guidelines in the future for how to construct a programming
01:31:59.040 | language.
01:31:59.920 | So perhaps not one programming language, but a unification
01:32:05.040 | of how we build programming languages.
01:32:07.280 | If you remember such statements.
01:32:10.080 | I have some trouble remembering it, but I know the origin
01:32:13.760 | of that idea.
01:32:14.960 | So maybe you can talk about sort of C++ has been improving.
01:32:18.800 | There's been a lot of programming languages.
01:32:20.800 | Where is the arc of history taking us?
01:32:24.080 | Do you hope that there is a unification about the languages
01:32:27.840 | with which we communicate in the digital space?
01:32:30.480 | Well, I think that languages should be designed not by
01:32:39.840 | cobbling language features together and doing slightly
01:32:44.240 | different versions of somebody else's ideas.
01:32:47.040 | But through the creation of a set of principles, rules of
01:32:54.560 | thumbs, whatever you call them.
01:32:56.080 | I made them for C++.
01:32:58.640 | And we're trying to teach people in the standards committee
01:33:04.720 | about these rules because a lot of people come in and say,
01:33:07.440 | I've got a great idea.
01:33:08.560 | Let's put it in the language.
01:33:09.760 | And then you have to ask, why does it fit in the language?
01:33:13.840 | Why does it fit in this language?
01:33:15.600 | It may fit in another language and not here, or it may fit
01:33:19.840 | here and not the other language.
01:33:21.680 | So you have to work from a set of principles and you have
01:33:24.400 | to develop that set of principles.
01:33:26.160 | And one example that I sometimes remember is I was sitting
01:33:35.280 | down with some of the designers of Common Lisp and we were
01:33:41.840 | talking about languages and language features.
01:33:45.520 | And obviously we didn't agree about anything because, well,
01:33:50.880 | Lisp is not C++ and vice versa.
01:33:53.440 | It's too many parentheses.
01:33:54.960 | But suddenly we started making progress.
01:33:58.640 | I said, I had this problem and I developed it
01:34:05.040 | according to these ideas. And they said, why,
01:34:07.440 | we had that problem, a different problem, and we developed it
01:34:10.960 | with the same kind of principles.
01:34:12.480 | And so we worked through large chunks of C++ and large
01:34:19.840 | chunks of Common Lisp and figured out we actually had
01:34:23.520 | similar sets of principles of how to do it.
01:34:27.760 | But the constraints on our designs were very different.
01:34:32.160 | And the aims for the usage was very different.
01:34:35.120 | But there was commonality in the way you reason about
01:34:41.600 | language features and the fundamental principles you
01:34:45.040 | are trying to do.
01:34:45.760 | - So do you think that's possible to, so just like there
01:34:50.160 | is perhaps a unified theory of physics, of the fundamental
01:34:55.680 | forces of physics, I'm sure there is commonalities among
01:35:01.120 | the languages, but there's also people involved that help
01:35:05.760 | drive the development of these languages.
01:35:07.440 | Do you have a hope or an optimism that there will be
01:35:13.360 | a unification?
01:35:14.400 | If you think about physics and Einstein towards a
01:35:18.640 | simplified language, do you think that's possible?
01:35:22.720 | - Let's remember sort of modern physics, I think, started
01:35:28.720 | with Galileo in the 1300s.
01:35:31.040 | So they've had 700 years to get going.
01:35:34.480 | Modern computing started in about '49.
01:35:39.440 | We've got, what is that, 70 years.
01:35:43.760 | They have 10 times.
01:35:45.520 | And furthermore, they are not as bothered with people
01:35:50.000 | using physics the way we are worried about programming
01:35:55.120 | is done by humans.
01:35:57.040 | So each have problems and constraints, the others have,
01:36:02.400 | but we are very immature compared to physics.
01:36:04.640 | So I would look at sort of the philosophical level and look
01:36:12.480 | for fundamental principles, like you don't leak resources,
01:36:18.000 | you shouldn't.
01:36:18.640 | You don't take errors at runtime that you don't need to.
01:36:26.320 | You don't violate some kind of type system.
01:36:30.400 | There's many kinds of type systems, but when you have one,
01:36:33.200 | you don't break it, et cetera, et cetera.
01:36:36.960 | There will be quite a few, and it will not be the same
01:36:42.000 | for all languages.
01:36:43.120 | But I think if we step back at some kind of philosophical
01:36:47.680 | level, we would be able to agree on sets of principles
01:36:52.640 | that applied to sets of problem areas.
01:36:56.240 | And within an area of use, like in C++'s case,
01:37:04.320 | what used to be called systems programming,
01:37:07.520 | the area between the hardware and the fluffier parts
01:37:12.480 | of the system, you might very well see a convergence.
01:37:17.360 | So these days you see Rust having adopted RAII,
01:37:22.080 | and sometime accuses me for having borrowed it 20 years
01:37:25.520 | before they discovered it.
01:37:27.120 | But we're seeing some kind of convergence here,
01:37:34.320 | instead of relying on garbage collection all the time.
01:37:38.640 | The garbage collection languages are doing things
01:37:41.680 | like the dispose patterns and such that imitates
01:37:47.280 | some of the construction destruction stuff.
01:37:50.320 | And they're trying not to use the garbage collection
01:37:52.960 | all the time, things like that.
01:37:55.040 | So there's a convergence.
01:37:57.120 | But I think we have to step back to the philosophical level,
01:38:00.000 | agree on principles, and then we'll see some convergences.
01:38:05.280 | And it will be application domain specific.
01:38:10.720 | - So a crazy question, but I work a lot with machine
01:38:16.000 | learning, with deep learning.
01:38:17.280 | I'm not sure if you touch that world much,
01:38:19.520 | but you could think of programming as a thing
01:38:24.240 | that takes some input.
01:38:25.360 | Programming is the task of creating a program,
01:38:28.640 | and a program takes some input and produces some output.
01:38:31.200 | So machine learning systems train on data
01:38:35.520 | in order to be able to take an input and produce output.
01:38:39.600 | But they're messy, fuzzy things, much like
01:38:46.320 | we as children grow up.
01:38:48.240 | You know, we take some input, we make some output,
01:38:51.360 | but we're noisy, we mess up a lot.
01:38:53.360 | We're definitely not reliable.
01:38:54.720 | Biological systems are a giant mess.
01:38:56.560 | So there's a sense in which machine learning
01:39:01.360 | is a kind of way of programming, but just fuzzy.
01:39:04.480 | It's very, very, very different than C++.
01:39:08.080 | Because C++ is, just like you said,
01:39:12.960 | it's extremely reliable, it's efficient,
01:39:16.160 | it's, you know, you can measure, you can test
01:39:19.040 | in a bunch of different ways.
01:39:20.240 | With biological systems or machine learning systems,
01:39:24.400 | you can't say much except sort of empirically
01:39:28.160 | saying that 99.8% of the time it seems to work.
01:39:31.920 | What do you think about this fuzzy kind of programming?
01:39:36.640 | Do you even see it as programming?
01:39:39.920 | Is it totally another kind of world?
01:39:42.960 | - I think it's a different kind of world.
01:39:46.400 | And it is fuzzy.
01:39:47.680 | And in my domain, I don't like fuzziness.
01:39:50.160 | That is, people say things like
01:39:55.840 | they want everybody to be able to program.
01:39:57.840 | But I don't want everybody to program
01:40:01.200 | my airplane controls or the car controls.
01:40:07.120 | I want that to be done by engineers.
01:40:09.920 | I want that to be done with people
01:40:12.720 | that are specifically educated and trained.
01:40:15.840 | For doing, building things.
01:40:19.440 | And it is not for everybody.
01:40:21.840 | Similarly, a language like C++ is not for everybody.
01:40:26.080 | It is designed to be a sharp and effective tool
01:40:30.400 | for professionals, basically.
01:40:34.320 | And definitely for people who aim
01:40:38.560 | at some kind of precision.
01:40:40.000 | You don't have people doing calculations
01:40:43.280 | without understanding math.
01:40:45.680 | Counting on your fingers is not going to cut it
01:40:48.880 | if you want to fly to the moon.
01:40:50.320 | And so there are areas where an 84% accuracy rate,
01:41:00.400 | 16% false positive rate is perfectly acceptable.
01:41:06.720 | And where people will probably get no more than 70.
01:41:11.040 | You said 98%.
01:41:12.080 | What I have seen is more like 84.
01:41:16.320 | And by really a lot of blood, sweat, and tears,
01:41:19.600 | you can get up to 92 and a half.
01:41:21.440 | So this is fine if it is, say, pre-screening stuff
01:41:29.920 | before the human look at it.
01:41:31.760 | It is not good enough for life-threatening situations.
01:41:37.360 | And so there are lots of areas where the fuzziness
01:41:41.760 | is perfectly acceptable and good and better than humans,
01:41:45.600 | cheaper than humans.
01:41:46.720 | But it is not the kind of engineering stuff
01:41:49.840 | I am mostly interested in.
01:41:52.000 | I worry a bit about machine learning
01:41:56.000 | in the context of cars.
01:41:58.160 | You know much more about this than I do.
01:42:00.080 | - I worry too.
01:42:00.880 | - But I am sort of an amateur here.
01:42:04.400 | I have read some of the papers,
01:42:05.840 | but I have not ever done it.
01:42:07.760 | And the idea that scares me the most
01:42:12.160 | is the one I have heard,
01:42:15.920 | and I do not know how common it is,
01:42:17.840 | that you have this AI system, machine learning,
01:42:25.600 | all of these trained neural nets.
01:42:28.800 | And when there is something that is too complicated,
01:42:33.840 | they ask the human for help.
01:42:36.240 | But the human is reading a book or asleep,
01:42:39.120 | and he has 30 seconds or three seconds
01:42:44.400 | to figure out what the problem was
01:42:46.320 | that the AI system could not handle
01:42:48.080 | and do the right thing.
01:42:49.120 | This is scary.
01:42:50.960 | I mean, how do you do the cutover
01:42:54.400 | between the machine and the human?
01:42:56.160 | - It is very, very difficult.
01:42:59.440 | And for the
01:43:04.240 | designer of one of the most reliable, efficient,
01:43:06.720 | and powerful programming languages, C++,
01:43:08.960 | I can understand why that world is actually unappealing.
01:43:15.040 | It is for most engineers.
01:43:16.560 | To me, it is extremely appealing
01:43:18.960 | because we do not know how to get that interaction right.
01:43:22.880 | But I think it is possible.
01:43:24.080 | But it is very, very hard.
01:43:25.360 | - It is.
01:43:25.920 | - And--
01:43:26.640 | - I was stating a problem, not a solution.
01:43:28.720 | - Yes, that is impossible.
01:43:30.160 | - I mean--
01:43:30.800 | - I would much rather never rely on the human.
01:43:33.280 | If you are driving a nuclear reactor,
01:43:35.280 | or an autonomous vehicle,
01:43:37.920 | it is much better to design systems written in C++
01:43:42.240 | that never ask a human for help.
01:43:44.000 | - Let us just get one fact in.
01:43:47.120 | - Yes.
01:43:47.620 | - All of this AI stuff is on top of C++.
01:43:51.360 | (laughing)
01:43:53.680 | So that is one reason I have to keep a weather eye out
01:43:58.000 | on what is going on in that field.
01:43:59.600 | But I will never become an expert in that area.
01:44:02.560 | But it is a good example of how you separate
01:44:04.880 | different areas of applications
01:44:08.160 | and you have to have different tools,
01:44:10.000 | different principles, and they interact.
01:44:14.560 | No major system today is written in one language.
01:44:17.760 | And there are good reasons for that.
01:44:19.360 | - When you look back at your life work,
01:44:24.080 | what is a moment, what is an event,
01:44:30.880 | creation that you are really proud of?
01:44:33.680 | That you say, "Damn, I did pretty good there."
01:44:35.680 | Is it as obvious as the creation of C++?
01:44:40.080 | - It is obvious.
01:44:41.680 | I have spent a lot of time with C++
01:44:44.960 | and it is a combination of a few good ideas,
01:44:49.040 | a lot of hard work, and a bit of luck.
01:44:51.520 | And I have tried to get away from it a few times,
01:44:55.760 | but I get dragged in again,
01:44:57.200 | partly because I am most effective in this area.
01:45:00.640 | And partly because what I do has much more impact
01:45:05.760 | if I do it in the context of C++.
01:45:08.400 | I have four and a half million people
01:45:11.200 | that pick it up tomorrow if I get something right.
01:45:14.400 | If I did it in another field,
01:45:16.160 | I would have to start learning,
01:45:17.600 | then I have to build it,
01:45:18.560 | and then we'll see if anybody wants to use it.
01:45:20.480 | One of the things that has kept me going
01:45:26.160 | for all of these years is,
01:45:27.600 | one, the good things that people do with it.
01:45:30.720 | And the interesting things they do with it.
01:45:34.960 | And also, I get to see a lot of interesting stuff
01:45:39.040 | and talk to a lot of interesting people.
01:45:41.040 | I mean, if it had just been statements on paper,
01:45:47.280 | on a screen,
01:45:48.720 | I don't think I could have kept going.
01:45:51.520 | But I get to see the telescopes up on Mauna Kea,
01:45:55.680 | and I actually went and saw how Ford built cars,
01:45:59.600 | and I got to JPL and see how they do the Mars rovers.
01:46:06.080 | There's so much cool stuff going on,
01:46:09.600 | and most of the cool stuff is done by pretty nice people.
01:46:12.640 | And sometimes in very nice places,
01:46:14.720 | Cambridge, Sophia Antipolis, Silicon Valley.
01:46:21.520 | There's more to it than just code,
01:46:26.080 | but code is central.
01:46:27.360 | - On top of the code are the people in very nice places.
01:46:32.240 | Well, I think I speak for millions of people,
01:46:35.280 | Bjarne, in saying thank you for creating this language
01:46:40.000 | that so many systems are built on top of
01:46:43.840 | that make a better world.
01:46:46.800 | So thank you.
01:46:47.840 | And thank you for talking today.
01:46:49.120 | - Yeah, thanks. - I really appreciate it.
01:46:50.160 | - And we'll make it even better.
01:46:51.360 | - Good.
01:46:52.720 | (upbeat music)