
Emilio Frazzoli, CTO, nuTonomy - MIT Self-Driving Cars


Chapters

0:00 Introduction
1:15 Why self-driving cars
3:57 Why self-driving vehicles
6:31 Cost of self-driving vehicles
11:58 What is a self-driving car
15:48 Value of self-driving cars
22:16 Consumer vs service
29:16 HD Maps
31:01 Advantages
32:23 Mobility paradox
35:14 When will self-driving cars arrive
36:02 State of the art for autonomous technology today
37:28 Driving in traffic
39:24 Technical challenges
40:45 Traffic light example
41:33 Rules of the road
43:48 The industry standard
44:46 Caltech's mistake
45:30 Too many rules of the road
46:41 Learning the wrong thing
48:51 The reality
53:27 Formal methods
53:54 Automatic trajectories
54:20 Automatic parking
55:11 Hierarchy of rules
56:54 Total order
57:31 Other lane
58:05 The problem
58:28 The fundamental norm

Transcript

Today we have Emilio Frazzoli. He's the CTO of nuTonomy, one of the most successful autonomous vehicle companies in the world. He's the inventor of the RRT* algorithm, formerly a professor at MIT, where he directed the research group that put the first autonomous vehicles on the road in Singapore. And now he returns to MIT to talk with us.

Give him a warm welcome. (audience applauding) - Thank you, Lex. It's a great opportunity, it's a great pleasure to be back here. I spent 15 years of my life here at MIT, first as a graduate student, and then as a faculty member. And this is where nuTonomy, the company, essentially was born.

And we did a lot of the research that led us to start this company and eventually develop all this technology. What I will talk about today is a little bit about our vision on autonomous vehicles, why we want to have autonomous vehicles, some of the guidelines on the technology development, why we are doing things in a certain way.

Let's get started. But I really would like to tell you a number of stories about why I started doing this and why I think this is an important technology, why we ended up starting this company. I've been a faculty member here for 10 years. I was happily working with my UAVs, and I was in AeroAstro.

At some point around 2005, there were the DARPA Grand Challenges, which sounded cool, so I started working on cars as well. But the work I was doing then was mostly about the technology: I was working on airplanes and cars to make them fly or drive by themselves because it was cool. Just look, no hands, it drives.

As a controls guy, as a roboticist, that's all I needed. But then in 2009, there was this new project that was starting, it was a team that was getting together to write a proposal for a project on future urban mobility in Singapore. Okay, I'm not telling you the whole story but essentially I got interested in that project just because I wanted to go to Singapore.

And then I called the person who was putting together the team, and she said, okay, thank you for your interest, but what do you think that you bring to the table? And we had just done the DARPA Urban Challenge, so I said, I know how to make autonomous cars. So what? This is a project on future urban mobility, so what do cars, autonomous cars, have to do with urban mobility?

And there was the phone call, the five-minute phone call that changed my life, because she asked me this question. That was actually Cynthia Barnhart, who is now the chancellor. And then I had to come up with an excuse. So why? Well, imagine that you have a smartphone and a smartphone app, and you use this app to call a car. The car comes to you, you get on the car, it drives you wherever you want to go, you step off the car, and the car goes to pick up somebody else or goes to park, or something, right?

So this was 2009. Uber was Travis Kalanick and a couple of guys with black cars in San Francisco, right? And essentially she bought it, so I joined the team and we started this activity. But the important thing is that I started thinking about it. You know, it was just an excuse that I made up in those five minutes, okay?

But then I said, you know what, kind of sounds like a good idea. And I started thinking more about this and I started thinking more about why do we want to have self-driving vehicles, okay? So the number one reason that you typically hear is we want to have self-driving vehicles so that we make roads safer, okay?

A very large number of people die on the road, in road accidents, every year. What people do not realize is that most of those people are actually fairly young, like in their 20s and 30s, okay? And clearly what people usually say is, you know, Sebastian Thrun, back in the day, gave all these TED Talks talking about his best friend from when he was young who died in a road accident, right?

And then he made a mission for his life to reduce road accidents, right? But anyway, so the idea is that most of the road accidents are due to human errors, you remove the human, you remove the errors, right? And then you save lives, okay. So this is typically the number one reason that people mention when they talk about why you want to have self-driving vehicles.

Second reason is convenience. Essentially, if the car is driving by itself, you can do other things, you can sleep, you can read, you can text legally to your heart's content, you can check your emails, so on and so forth, right? This is also great. Third thing is improved access to mobility.

People who cannot drive, maybe because they have some physical impairment, or maybe they are too young, they are too old, or maybe too intoxicated to drive, right? So then, you know, the computer can take them home. Another thing is increased efficiency throughput in a city as cars can communicate beyond visual range, for example.

Another one is reduced environmental impact, okay? Now, these are all fantastic reasons why we may want to have self-driving vehicles. The problem with me is that if you think about this, these are all good reasons, but these are all ways that you take the status quo, you know, how cars are used today, and you make it a little bit better, maybe a lot better, but you do not make it different, okay?

And really, that is what I am mostly, what I was mostly interested in. Can we use this technology, leverage this technology to change the way that we think of mobility, okay? So how do you compare all these different things, okay? So this is quick, back-of-the-envelope kind of calculation that you can do on your own.

You can question the numbers, but I think that the orders of magnitude are right, okay? So, you know, the first thing is, okay, so fine, we heard that a big reason for self-driving cars is to increase safety, you know, save lives. Great, now, how much is your life worth?

Well, to yourself, to your loved ones, to your friends, your family, is probably priceless. To the government, it's worth about $9 million, okay? So this is what is called the, (laughs) this is what is called the cost of a statistical life. There was a report that was released a few years ago, probably, you know, there is an update now, but I haven't seen it.

The economic cost of road accidents in the United States is evaluated to be about $300 billion a year. The societal harm, you know, of road accidents is another, you know, all the pain and suffering is evaluated to be another $600 billion a year, so what we are getting to is about almost $1 trillion, okay?

It's a big number, okay? But let's look at where the other effects are, okay? What is the cost of congestion? It's an estimate, $100 billion a year. The health cost of congestion, of the extra pollution, it's another $50 billion a year, so you see that these are just a small change, right?

The next effect is actually important, right? So what is the value of the time that we, as everybody in society, will get back from not having to drive, okay? Simple calculation, what I did is I multiplied one half the median wage of workers in the United States, which is an embarrassingly low number, multiplied by the number of hours that Americans spend behind the wheel, okay?

And what you get is about, you know, what was it? $1.2 trillion a year. So something that you may notice is that the value to society of getting the time back from having to drive is actually more than the value to society of increased safety, okay? Of course, it's a little bit cynical, okay?
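The time-value arithmetic above can be sketched as follows. The wage and driver-hours figures below are illustrative assumptions of my own, chosen to land near the talk's order of magnitude, not the exact inputs the speaker used.

```python
# Back-of-the-envelope value of the time Americans spend driving.
# All inputs are rough illustrative assumptions.
median_hourly_wage = 18.0                   # USD/hour, assumed
value_of_time = 0.5 * median_hourly_wage    # half the median wage, per the talk
driver_hours_per_year = 133e9               # total US driver-hours/year, assumed

time_value = value_of_time * driver_hours_per_year
print(f"Annual value of driving time: ${time_value / 1e12:.1f} trillion")
# prints: Annual value of driving time: $1.2 trillion
```

The point is not the precise inputs but that the product plausibly lands around a trillion dollars a year, comparable to the safety figure.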

So take it with a grain of salt, but you start seeing how these things compare. And what you may notice from this pie chart is that there is still half of that is missing. What is the other half? The other half is actually the value that you provide to society, to all individuals, okay?

By essentially making car sharing finally something that is convenient to use, affordable, reliable, okay? So for me, car sharing, or vehicle sharing in general, is a concept that everybody loves, but nobody uses, okay? Or not as many people as we would like to use these kind of services. Examples, when I was here at MIT, I really like using Hubway, the bicycle sharing.

But you have to be very careful. If you wait too long in the afternoon, sorry, there are no more bikes on campus, right? Or maybe very often you cannot find a bike, or maybe you cannot find a parking spot for your bike. So then you have to dock it somewhere else and then walk.

So that defeats the purpose of using that bike. Same thing with cars, right? So typically with car sharing systems, what you have is either a two-way system, which is essentially hourly rental, right? Or a one-way system, but in one-way systems the distribution of cars tends to get skewed, right?

And unless the company repositions cars in some clever way, then you're not guaranteed that you will get a car where you need it, and you're not guaranteed that you will get a spot, a parking spot, where you don't need the car anymore, okay? If you think of that, these are both friction points for using vehicle sharing, and these are both friction points that are actually addressed by if the car can drive itself, okay?

So if you bring in all the economic benefits of a car sharing system that actually works, that's something that we estimate it to be, it's about $2 trillion a year. So you see that this actually has a big chunk in this pie chart, okay? And that is using an estimate of what we call the sharing factor of four, meaning that one of the shared vehicles can essentially substitute for four privately-owned vehicles, okay?

There are some studies that get to this sharing factor up to 10, and of course the benefits are even more. Now, every time I see a round number like that, I get suspicious, right? 10 is a little bit too convenient to be true, right? But anyway, so that's something that you can find in the literature, okay?

So this is really where I think that the major impact of autonomous driving or self-driving cars will come from. Now, I think that there is also a lot of confusion in the community, in the world, about what a self-driving car means. Now, what I'm doing here is I've just listed these levels; it's actually six levels of automation.

These are the Society of Automotive Engineers levels, okay? So level zero is no automation, that's your great-grandfather's car, right? Driver assistance, level one, there is, for example, cruise control or some simple single-channel automation. Partial automation, you have something like, for example, lane keeping and cruise control, but you still require the driver to pay attention and intervene.

Conditional automation, level three: the driver is not required to pay attention all the time, but needs to be able to intervene given some notice, okay? And that "some notice," I think, is kind of an ill-defined concept. And then you have level four and level five, which are higher automation: essentially no driver needed, in some conditions for level four, and in all conditions for level five, okay?

Now, my first reaction when I started seeing these levels, and there is also a similar version by NHTSA, is that they seemed to me a horrible idea. And a horrible idea in the sense that they are given numeric levels. So you have levels zero, one, two, three, four, five. Whenever you have a sequence of numbers, you are led to believe that these are actually sequential, right? That you do level zero, then level one, then level two, three, four, five.

I think this is an enormously bad idea, because I think that level two and level three, that is, anything where you require the human to pay attention, supervise the automation, and be ready to intervene with no notice, or with some ambiguously defined, you know, sufficiently long notice, just goes against human nature.

And, you know, this is especially painful for me as a former aeronautics and astronautics professor where we saw in the airline industry that as soon as autopilots were being introduced and everybody thought that accidents would go down, actually there were more accidents because now you have new failure modes induced by autopilots, okay?

You have mode confusion, pilots lose situational awareness, pilots lose the ability to react in case of an emergency. Okay, so the airline industry had to essentially educate itself on how to deal with automation in a good way. And think of pilots, you know, pilots are highly trained professionals, which is not the same that you can say about your everyday driver, right?

So how do you train people who probably, you know, the last time they sat with an instructor in a car was, you know, when they were 16, right? How do you train people to use the automation technology and do it safely, right? So I think that this is something that I find very scary.

On the other hand, I think that full automation, where the car is essentially able to drive itself and does not rely on a human to take over, is something that in a sense is easier. And this is what we are doing. But the point is that not only is it easier, I think it is essential to capture the value of the technology.

Now if you think of it, so how do we realize the value of these self-driving vehicles? Okay, so the first thing that people say is safety. I think it is true that eventually, asymptotically, self-driving cars will be safer than their human-driven counterparts. However, at what point can we be confident that that is the case?

Are we there yet? Not sure, okay? So how do you demonstrate the reliability of these self-driving cars? So Waymo, they've driven their cars for three million miles, right? With a relatively small number of accidents. If I remember correctly, only one was their fault, right?

But actually, humans drive for many times that without accidents. So even though the number sounds impressive, it really doesn't have that much statistical significance, right? And then every time you make an update, a change to your system, to your software, you really have to validate again, right?

So I think that making the case for safety is actually a very challenging issue. And we may not be positive that these self-driving cars are actually safer than their human counterparts until a fairly long time from now, okay? So safety for me remains kind of an open question at this point.

How do you get back the time value of driving? If you had, at least I'm speaking for myself, if I had to constantly pay attention to what the car is doing, excuse me, but I'd rather drive myself, okay? Because if the car is driving, and this is the paradox, right?

So the better the car drives, the harder it is for me to keep paying attention, right? And this is where the whole problem is, right? So it would be very hard for me not to fall asleep or not to get distracted. So if I want to get the time back, really the car must be able to drive itself without requiring me to pay attention.

Car sharing, again, is, in order to make car sharing really convenient and reliable and so on and so forth, you need the car to come to you with nobody inside. And for that, you need level four or level five, okay? Anything else just doesn't cut it. Everything else for me is just a nice gadget that you have on your car that you show off to your friends or to your girlfriend, okay, so that's about it, right?

It's not that useful. So my point is that level four or five automation is really essential to capture the value of this technology, okay? And in fact, the one game-changing feature of these cars is the fact that these cars now can move around with nobody inside. You know, that's really the game-changing feature, okay?

Good, and this is really what we would like to do. Now, there are many paths that you can go after this target, okay? I usually show this figure, okay? So on this figure, what I show on the horizontal axis is the scale or the scope of the kind of driving that you can do, okay?

So on the left is like a small, you know, pilot, maybe a closed course. On the right is driving everywhere, okay? You know, in complex environments, right? Mass deployments and so on and so forth. And on the vertical axis is the level of automation, okay?

Now, really what we would like to do is get to the top right corner, right? So we have millions of cars driving all over the world completely, you know, in a completely automated way, okay? What I see is there are two different paths that the industry is taking, okay?

What I show here is what I call, this is the OEM path, okay? So this is the automakers, right? So they're used to thinking of production of cars in the orders of many, many millions, okay? And essentially what they do is they make a lot of cars and they are adding features to these cars, you know, advanced driver assistance systems and so on and so forth, right?

And essentially, they're following this level zero, one, two, three, four, five progression, okay? And today, you can buy cars which, even though they claim a fully autonomous package for $5,000 plus another $4,000 or something, in the fine print they say it's level two, right? So level two or level three, right?

So, you know, Tesla, Mercedes, I think Audi with the new A8, they're coming out with this kind of feature. Cadillac, I think, has a similar thing, okay? The problem with that, I think, is that you have to cross this red band, okay? This red band is where you're actually requiring human supervision of your automation system.

Another path that people are following is this other path, okay? This is what we are doing, and from all indications what Waymo is doing (of course, they're not telling me exactly what they do), and similarly for Uber, right? So essentially, what they're doing is working on cars that will be fully automated from the beginning, and they start with a small, maybe geofenced application, and then scale those operations up, right?

But always remaining at the full, you know, high automation level, okay? Another thing that is important, where people make a lot of confusion and don't seem to realize the big difference, is the following. When people ask me, "When do you think that we will see autonomous vehicles everywhere on the street?" You know, when autonomous vehicles will be common.

I ask them, "Okay, but what do you mean exactly by that?" Because if you ask me, "When is it that you will be able to walk into a car dealership and get out with the keys to a car where you just push a button and it takes you home?" That's not happening for another 20 years, at least, okay?

On the other hand, if you ask me, "When will you be able to go to some new city and summon one of these vehicles that picks you up and takes you to your destination?" That thing is happening within a couple of years, okay? What is the difference? There is a big difference between autonomous vehicles, self-driving cars, as a consumer product, versus a service that you provide to passengers.

So, for example, what is the scope? Where do these cars need to be able to drive? If it's a product and I pay $10,000 for it, then I want this thing to work everywhere, right? So, take me home, take me to this little alley, drive me through the countryside.

On the other hand, if I'm a service provider and I'm offering a service, I can decide, I'm offering this service in this particular location. And by the way, I'm offering this service under these weather conditions and maybe under these traffic conditions, okay? So, just the problem becomes much more, much easier.

What are the financials, right? So, if I have to sell you a car with an autonomy package, how much can it cost? What are my cost constraints on that autonomy package? If I sell it to you, first of all, the cost of the autonomy package must be comparable to the cost of the vehicle, okay?

You will not buy a $20,000 car with a half-a-million-dollar autonomy package, right? Also, there is another back-of-the-envelope calculation that I did. Okay, so what is the value to you as the buyer of this autonomy package? Let's say that the value to you is the fact that now, instead of driving for the next 10 years, you can have the computer drive for you.

What is the value of your time when you are not driving, right? So, do a quick calculation again: total number of hours that Americans spend behind the wheel, median wage, or value of time. What I get is that the net present value of the driver's time over the next 10 years is about $20,000, okay?

So then, a rational buyer will not pay more than that to buy this autonomy package, right? So now you are constrained to $20,000, okay? Or actually, if you want to make a profit out of it, your autonomy package cannot cost more than a few thousand dollars, okay?
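The $20,000 figure can be reproduced with a standard net-present-value calculation. The hours, hourly value, and discount rate below are my own assumptions, picked to illustrate how the order of magnitude comes out, not the speaker's actual inputs.

```python
# NPV of the time an individual driver gets back over 10 years.
# Illustrative assumptions throughout.
hours_behind_wheel_per_year = 300   # per driver, assumed
value_of_time_per_hour = 9.0        # half an ~$18 median wage, assumed
discount_rate = 0.05                # assumed
years = 10

annual_value = hours_behind_wheel_per_year * value_of_time_per_hour  # $2,700/yr
npv = sum(annual_value / (1 + discount_rate) ** t for t in range(1, years + 1))
print(f"NPV of driver's time over {years} years: ${npv:,.0f}")
# prints: NPV of driver's time over 10 years: $20,849
```

With these inputs the NPV comes out right around $20,000, which is the ceiling a rational buyer would pay for the package.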

On the other hand, if you're thinking of this as a service, then what you are comparing to is the cost of providing the same service using a carbon-based life form like a human behind the wheel, okay? So now you want to provide 24/7 service, you need to hire at least, say, three drivers per car, okay?

And then the cost is on the order of $100K a year, okay? So now I'm comparing the cost of my automation package to something that is going to cost me $100,000 a year over the life of the car, okay? So now the cost of that LiDAR, or that fancy computer, or that fancy radar, or something, doesn't matter that much, okay?
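The service-side comparison above can be made concrete. The per-driver wage, vehicle lifetime, and sensor-suite price below are assumptions of mine, in the spirit of the talk's round numbers:

```python
# Cost comparison: human drivers vs. an autonomy package, per vehicle.
# Illustrative assumptions, not nuTonomy's actual figures.
drivers_per_car = 3             # for 24/7 coverage, per the talk
driver_annual_cost = 33_000     # USD per driver per year, assumed
vehicle_life_years = 5          # assumed

human_cost = drivers_per_car * driver_annual_cost * vehicle_life_years
autonomy_package_cost = 150_000  # even a very expensive sensor suite, assumed

print(f"Human drivers over vehicle life: ${human_cost:,}")       # $495,000
print(f"Autonomy package (one-time):     ${autonomy_package_cost:,}")
```

Even a generously priced autonomy package comes in well under the cumulative driver cost, which is why sensor price is not the binding constraint in the service model.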

So I have much more freedom in buying the sensors that I need. Infrastructure, for example. People talk about maps, HD maps, right? Now, again, if I want to sell it as a product, then I want to sell it on a global scale, where global could mean all of the United States, for example, or all of Europe.

Then I need to have maps, HD maps, of the whole of Europe, or the continental United States, or wherever I want to sell the cars. If I'm providing a service, then I only need to map the area where I want to provide the service. And by the way, how does the complexity of the maps scale with the customer base that you're serving?

If you think of a uniform population density, okay? Then you can assume that the complexity and the cost of generating maps scales with the length of the road network. Then the cost of the maps scales with the square root of my customer base, meaning that it will become manageable as I serve more people, okay?
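Under the talk's assumption that total mapping cost grows with the square root of the customer base, the cost per customer shrinks as the service scales. A toy sketch, where the cost constant `k` is arbitrary:

```python
import math

def total_map_cost(customers: int, k: float = 1_000.0) -> float:
    """Total mapping cost, assuming (per the talk) it scales with the
    square root of the customer base. k is an arbitrary cost constant."""
    return k * math.sqrt(customers)

# Per-customer cost falls like 1/sqrt(n) as the customer base grows.
for n in (10_000, 1_000_000, 100_000_000):
    print(f"{n:>11,} customers -> ${total_map_cost(n) / n:.4f} per customer")
```

Growing the customer base by a factor of 100 cuts the per-customer mapping cost by a factor of 10, which is the sense in which the maps "become manageable" at scale.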

So HD maps, yes, it's a pain in the neck to collect them and to maintain them, but it's much less of a pain in the neck than actually operating the logistics of a fleet serving the population of a city, okay? And servicing and maintenance, how would you calibrate your cameras and your sensors?

That's not something that you would do as a normal consumer, right? We are not used to that. When I was little, my father was tinkering with the car all the time, checking the timing belt or changing the oil. You don't do any of that nowadays, right?

So you just sit in the car and switch it on. If the yellow check-engine light comes up, you take it to the dealership; that's all you do. Now imagine that, if you want to use your autonomy package, you have to calibrate the sensors every time you go out, or you have to upload a new version of the drivers and things like that. You don't want to do that.

On the other hand, in the service model, I have the maintenance crew that can take care of it in a professional way, okay? So big difference between the two models. So there are a couple of important takeaways, right? So one thing is that the cost of the autonomy package is not really an issue.

Clearly, the cheaper I can make it, the better it is, right? But that is not really the main driver. In particular, if you need a LiDAR sensor, for example, to detect a big truck that is crossing your path, buy the LiDAR sensor, okay? So that is not making the difference and maybe can save some lives, okay?

Any reference to other things is intentional. The other thing is HD maps that people worry about very much today. From my point of view, HD maps, my expectation is that HD maps, within a few years, will be a dime a dozen, okay? What is complicated, what is expensive now in generating all these HD maps?

The mapping companies need to put these sensors on a car and send these cars around. Now imagine that I have a fleet of 1,000 cars with these sensors on board, and these cars are just driving around the city all the time. They're generating gigantic amounts of data that I can just use to make and maintain my HD maps.

So I think that, especially from the point of view of the operators, the providers of these mobility services, it will be very easy to collect data to essentially make and maintain their own maps, okay? So if you need HD maps, that's fine, because as soon as you start offering this service, you will be able to collect all the data you need to generate and maintain the HD maps.

Oh, by the way, this is showing, an animation showing, like a simulation of a fleet of, I think it's a couple of hundred vehicles in Zurich, in Switzerland, right? So that's where I was based until a few days ago. And as you see, essentially you have vehicles that go through most of the city every few hours, okay?

I think that, for example, the Uber fleet goes through 95% of Manhattan every two hours or so, okay? Cost advantages, of course, most of the cost of taxi services nowadays is the driver, it's about half. Of course, you remove the driver from the picture, you don't have to pay them.

Of course, the automation costs you a little bit more, servicing costs you a little bit more, but you see that you still have, you can get a really significant increase in the margin, meaning that you can pass some of those savings to customers, but also you can make a very strong business case.

However, this is also misleading. Now, if you think of it, okay, so typically the reaction that you get is the following. Oh my goodness, now you make this thing and then all taxi drivers, all truck drivers will be out of a job, okay? And in fact, one day I was summoned by the Singapore Ministry of Manpower, okay?

And I was terrified, oh my goodness, they're gonna shut me down because they're afraid that I will put all of their taxi drivers on the street, on the street in the sense of being unemployed. Turns out it was the opposite. What most people do not realize is that actually mobility services worldwide are actually manpower limited.

Okay, in Singapore, they would like to buy more buses, but they don't have enough people who are able and willing to drive the buses, okay? This is true pretty much across the board: same for trucking, same for taxis. Now, there's another back-of-the-envelope calculation that you can do on your own.

Now, imagine, so as we know, Uber's been widely successful, very high valuation. A lot of this valuation is predicated on the fact that everybody in the world will eventually use Uber, right? Or something similar. Now, something that people don't think about is the following. Now, if everybody in the world uses Uber for their mobility needs, how many people in the world need to be drivers for Uber?

Do the calculation. What you see is that one person out of seven must drive for Uber if Uber is serving the whole world. Do you see that happening? No way, right? So people still need to be teachers, doctors, policemen, firemen, or some people need to be kids. So that is something that cannot happen.
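The one-in-seven figure follows from dividing the ride hours the population consumes by the hours a full-time driver can supply. The specific inputs below are my assumptions, chosen to match the talk's conclusion:

```python
# If everyone's mobility were provided by human-driven ride services,
# what fraction of the population must be drivers? Illustrative assumptions.
ride_hours_per_person_per_day = 1.1   # mobility demand per person, assumed
driver_hours_per_day = 8              # a full-time driving shift, assumed

drivers_needed_per_person = ride_hours_per_person_per_day / driver_hours_per_day
print(f"About 1 in {1 / drivers_needed_per_person:.0f} people must drive")
# prints: About 1 in 7 people must drive
```

This mirrors the observation that follows in the talk: today we each effectively "double up" as our own driver for roughly a seventh of our productive day.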

How are we facing this paradox, in a sense? So today, what you have is people who drive around, but what is happening today is that we are all doubling up as drivers for ourselves, and in fact, we do spend about 1/7, 1/8 of our productive day behind the wheel, very often, okay?

So for me, the big change will be more on the supply of mobility rather than on job loss. I mean, of course, if you increase supply of mobility, the cost of mobility will probably go down, wages for drivers will go down, right? So that is an issue, but maybe added, maybe balanced by like an added value in service or other things that you can imagine.

Another thing about truck drivers, something that I recently learned: 25% of all job-related deaths in the US are actually truck drivers, okay? It's the single most dangerous industry that you can be in. So maybe if you can take some of those people out of those trucks, and have them supervise or remotely control a truck sitting in their office instead of sitting in the truck, that may actually be a benefit to them.

Back to the question of when will autonomous vehicles arrive, and in a sense, this is what our prediction, our vision is, right? So what we will see is that, what we think is that we will have a fairly rapid adoption of self-driving vehicles in this mobility as a service model, okay?

As a fleet of shared autonomous vehicles that people can use to go from point to point, right? Rather than own. Of course, eventually, people will be able to buy these cars and maybe own them if they really want, but that is something that is much later in time for a number of reasons, some of which I discussed, okay?

So this is what we expect in terms of the timeline for this. Now, what is the state of the art for autonomous technology today? You do see a lot of demos from a number of companies doing a number of things, right? But a lot of the things that you see are not too much different from this video.

I don't know if any of you recognizes this video, but look at the cars. This was actually done by Ernst Dickmanns in the late '90s in Germany, okay? No fancy GPUs; it was just cameras and some basic computer vision algorithms, but essentially he was able to drive for hundreds of miles on the German highways, okay?

If you're not showing something that goes beyond that, you have not made any progress over the past 20 years, okay? Yeah, you're using fancy deep learning and GPUs and things nowadays, but you're doing what people were doing 20 years ago. You know, come on, okay? So you see, clearly there is a lot of hype in these things, but if you see something like that, I don't think it's very impressive, okay?

People knew how to do that for a very long time. Something that I find a little bit, I may be biased, clearly, but this is something that I find a little bit more exciting is actually footage from our daily drives in Singapore, okay? This is four times the real time.

We don't drive that fast, okay? But essentially what we are doing in Singapore is driving on public roads, in normal traffic. What you will see is that we have construction zones, intersections, traffic on both sides. We will get to a pretty interesting intersection. Okay, so it's a red light.

It'll turn green in a second. Keep in mind, in Singapore they drive on the left, right? So making the right turn is what is hard, because you have to cross traffic, right? And here you have a lot of traffic, and the car is making the right decision. All of this is without any human intervention, right?

So I think that in this day and age, if you're not showing the capability of driving in traffic in an urban situation like that, you're not really showing any advance over what people were able to do 20 years ago, okay? And as you can see, right, so intersections, other cars, pedestrians, all kind of like crazy interactions, cars parked in the middle of the street that you had to avoid, go to the other lane, things like that, okay?

So this is what you have to do every day. And this is what we are doing every day in Singapore. We are doing it every day here in the Seaport area. I don't know if you're aware, but we are driving cars. We are allowed by the city of Boston to drive our cars autonomously in the Seaport area.

So what are the technical challenges? Okay, so actually this is a slide that I'm reusing from a talk that Amnon Shashua, the founder and CEO of Mobileye, gave here at MIT a few months ago, okay? So this is what he said, okay? So this is not what I say.

What he says is that the big challenges are sensing, you know, perception, mapping, and then what he called driving policy, right, which I would call more like decision-making, okay? Now, what he said is that sensing, perception, is a challenge, but it's a challenge that we are aware of, and we are making rapid progress on getting better and better sensing and perception algorithms, okay?

Second, it's HD maps. What he said is that it's a huge logistical nightmare, so he didn't want to deal with that, you know, like Mobileye tries to avoid that. From my point of view, as I said, you know, for me, HD maps, it is a big pain in the neck to get those maps, but in a few years, maps will be a dime a dozen, okay?

So we'll get all the mapping data that we want and we need. So the big problem is driving policy, okay? So the remaining problem is driving policy. So how do you deal with that? And this is a typical example of things that we encounter in any kind of urban driving situation.

So you will see a video. So this is a case where we are at a traffic light, we are stopping, the traffic, you know, the light turns green, we are making the turn, there's a pedestrian crossing the street, we wait for the pedestrian, go through it, and then we see that there is a truck that is parked in the middle of our lane.

So we need to go to the other lane, which is in the opposite direction. There is a motorcycle coming, so we have to handle all that kind of situation, right? So how do you write your software in such a way that your car is able to deal with this kind of complicated situation by itself, okay?

And my point is that, you know, this is not really about negotiation. It's not about policy. Why do you have rules of the road? My claim, I have not proved it mathematically yet, but my claim is the following, that actually the rules of the road were introduced exactly to avoid the need for negotiation when you drive.

Okay, when you're walking as a person, you're just walking down the hallway, you know, walking down the infinite corridor, and there is a person coming in the other direction, there's always that awkward moment, right? So when you're trying to, I go left, I go right, right? With cars, you don't do that, right?

So in cars, they say, everybody go right, or in other places, everybody go left, period, and you don't negotiate that, okay? You get to an intersection, the light is red, you stop. You don't say, I'm really in a rush, do you mind if I go? No, you don't do that, right?

So it's red, and you stop, okay? So the rules of the road have been invented by humans in order to minimize the amount of negotiation, okay? And in particular, okay, so this is a slightly, I mean, this is actually a very old video, but I kind of like it.

So now our car is a little bit more aggressive, but what you see here is this case. This is how the car behaved in that particular situation. So you see it's raining, red light, turns green. There's a pedestrian crossing our path, so we yield to the pedestrian. You see that there is a, you will see that there is a truck that is parked on the left lane, in the middle of the lane, so we had to go around it, but there's a motorcycle that is approaching, so we had to be careful in going to the other lane.

Okay, so we squeeze through the motorcycle. We try to go very slowly next to squishy targets, right? But then as soon as we pass the truck, the truck driver decides to get moving, okay? So then what we do is we wait for the truck to get going, and then go back to our lane.

Now, imagine writing a script, or if-then-else: if there is a truck, but the truck is moving, then do this, and so on. This is a nightmare, so you don't want to do that. So how do you handle these kinds of situations? Okay, so the industry standard approach to this was, and by the way, this is what we did at the time of the DARPA Urban Challenge.

So we had a lot of if-then-else statements, or finite-state machines, some logic encoded in finite-state-machine kinds of things. The problem with that is it's very hard to come up with this logic, and it's essentially impossible to debug it and verify it, right? So I spent many miserable months sitting at the naval air base in Weymouth, right?

So here, in a rental car, just playing interference with our autonomous car, trying to adjust all this logic and parameters and things. So I vowed that I would never do it again. Okay, it was just a miserable experience. I'm happy to say that actually we did come up with a much better way of doing it.

And by the way, this is a video from the Caltech team at the DARPA Urban Challenge. As you can see, they're trying to go through an intersection. They decide to go, then for some reason they decide to back up out of the intersection. The director of DARPA, Tony Tether at the time, he was there, and he went like that.

So they were out of the race, okay? As soon as he saw that, that was it. What happened here: there was essentially a bug in their logic. Caltech, a team of very smart, very capable, dedicated people, worked on this for months. They didn't catch this bug, and they were out of the race.

So it's very easy to make mistakes, and it's very hard to find those bugs. Okay, so as a reaction to that, there is this new (mumbles) Is it possible to cut the sound? Oh, thank you. So now what you hear people saying is, well, there are too many rules of the road.

It's impossible to code all of them correctly. So let's not do that. Just feed the car a lot of data, and let the car learn by itself how to behave. And this is what you see in a number of startups and other efforts that are trying to use all these deep learning approaches to get end-to-end driving of cars; you see a video from NVIDIA.

I understand this is a course on deep learning for cars. So I don't want to sound too negative. On the other hand, I will try to be honest in what I think. So there are a number of problems. And actually, this happened to us. So one of our developers, super bright lady from Caltech, you know, the first version of the code for dealing with traffic lights, essentially the reaction that they had for the yellow light was, if you see a yellow light, speed up.

I was looking at this, what the heck? Oh, this is what my brother does. So there is always the danger that you learn the wrong thing, okay, the wrong behavior in a sense. Of course, there are some situations in which accelerating when you see a yellow light is actually the right response, but it is not always the case, right?

So there are some other features of the situation that you need to examine, right? Also, the other thing is, it's a cartoon, right? So you want to be able to explain why the car did something, and I would say that more than explaining, because now you also see articles in which people say, oh, we have found a way of explaining why the neural network in the car decided to do something, right?

And what they show you is some, okay, so these are the neurons that were activated. Okay, this is just saying that, you know, if I do a fast MRI of the brain, and I see what neurons, what areas of the brain are activated when I watch a movie, then I know how the brain works.

No, I have no idea, okay? The point is that, yes, you want to trace the reason, the cause for why the car behaved in a certain way, but you also want to be able to act on that cause, right? So you want that information to be actionable in some sense, right?

So you want to know that, okay, this happened because of this reason, and this is how I fix it, okay? And that is something that is hard to do with purely learning-based algorithms. On the other hand, you can, well, let me actually skip that in the interest of time, okay?

The reality is the following, that it is simply not true that there are too many rules of the road. In fact, any 16-year-old in the States can go to the DMV, get the booklet, study the booklet, do a written test, and be given a learner's permit, okay? And actually, this is what we require of every single licensed driver in the United States, okay?

We don't say, just drive with your dad or mom for a few thousand miles, and they will give you the license. No, we ask them, you know, show me that you studied the rules, and you understand the rules, okay? So how many rules of the road are there? Actually, I went through an exercise of counting, okay?

And what I did, I kind of clustered them. So essentially, you have rules on who can drive, when and where; what can be driven, when and where, at what speed, in what direction; who yields to whom, right? How you use your signals, active signaling; how you interpret the signals that you see on the road, right; and where you can park and where you can stop.

That's essentially it. You know, these are all the rules, okay? So not that many, it's kind of like 12 categories. What is true is that the number of possible combinations of rules and the instantiation of the rules, given the context of the scenario, where other actors are, or pedestrians are, and where other cars are, that is a humongous number.

Okay? So you don't want to code, essentially, a generative model that gives you the right response to all possible combinations of rules and instantiations of actors. That is something that is just combinatorially intractable. I mean, you just cannot do that. But the point is that not only is it hard to code the good behavior, what to do in every one of these situations; I claim that it's also hard to learn the good behavior.

Because now you need to have enough training data for every possible combination of rules and instantiations. Good luck with that. Okay? On the other hand, it is very easy to assess what is a good behavior. And that's why I was showing these slides on NP-hardness. So what is a problem that is NP-hard?

For an NP-hard problem, there is no known efficient way to generate a solution directly; but the defining property of NP is that if you have a non-deterministic system generating candidate solutions, it is very easy to check whether or not a candidate is actually a solution of your problem, something you can do in polynomial time. Okay? So in a sense, what I claim is that if you have an engine that is able to generate a very large number of candidates, and all you do is check whether or not each one of those candidates is good with respect to the rules, then that's all you need.

And it turns out that the algorithms that I worked on during my academic career were doing exactly this generation, you know, RRT, RRT*. These are algorithms that work by generating a very large graph exploring all the potential, reasonable trajectories that a robotic system can take. And then what you do is check them for whether they satisfy the rules or not.

You see that it's very different: rather than, given the rules, generating something that satisfies everything, you take a candidate and check whether or not it satisfies the rules. Generating candidates that satisfy all the constraints is a combinatorial problem. Checking a single candidate for compliance with a number of rules is a linear operation in the number of rules.
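To make that asymmetry concrete, here is a minimal sketch (not nuTonomy's actual code) of the checking side: given candidate trajectories from any generator, compliance checking is a simple loop, linear in the number of rules per candidate. The trajectory format and the example rule are invented for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A trajectory is just a sequence of (x, y, t) states here; a rule is any
# predicate on a whole trajectory. Both are illustrative simplifications.
Trajectory = Sequence[Tuple[float, float, float]]
Rule = Callable[[Trajectory], bool]

def first_compliant(candidates: List[Trajectory],
                    rules: List[Rule]) -> Optional[Trajectory]:
    """Return the first candidate satisfying every rule.

    Checking one candidate costs O(len(rules)); no combinatorial search
    over rule combinations is ever performed.
    """
    for traj in candidates:
        if all(rule(traj) for rule in rules):
            return traj
    return None

# Hypothetical rule: stay left of x = 50 (standing in for any road rule).
stays_in_bounds: Rule = lambda traj: all(x < 50 for (x, _, _) in traj)

candidates = [[(60.0, 0.0, 0.0)], [(30.0, 0.0, 1.0)]]
print(first_compliant(candidates, [stays_in_bounds]))  # → [(30.0, 0.0, 1.0)]
```

In a real planner the candidate set would come from an RRT*-style graph; the point of the sketch is only that the checker never has to reason about combinations of rules.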

So that's something that you can do very easily. And that's essentially what we have in our cars. Now we are using these formal methods. So essentially we write down all the rules in a formal language, very precise, with a syntax. And then you can verify whether your trajectory satisfies all these rules, written in this language that can be automatically translated by a computer into something like a finite-state machine, okay?

But that's not something that you do by hand. It's something that is done automatically. And then what happens is that what we have is we generate trajectories. These trajectories are, you know, you can think of these as trajectories that now are not only trajectories in the physical space and time, but are also trajectories evolving in this logical space, telling me whether or not and to what extent I am satisfying the rules.
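As a toy illustration of what evaluating a rule over a trajectory in this logical space might look like (the real system compiles rules from a formal language into automata; this sketch checks only a single "always" property, and all names and numbers are invented):

```python
def always(pred, trajectory):
    """Evaluate an 'always'-style property over a trajectory of states:
    return whether the predicate holds at every state, plus a violation
    count that says *to what extent* the rule was broken, rather than
    just a bare yes/no."""
    failures = sum(1 for state in trajectory if not pred(state))
    return failures == 0, failures

# Hypothetical rule: "always keep speed below 15 m/s in this zone".
traj = [{"v": 10.0}, {"v": 14.0}, {"v": 16.0}]
ok, violation = always(lambda s: s["v"] < 15.0, traj)
print(ok, violation)  # → False 1
```

The violation count is what lets a trajectory live in the "logical space" the talk describes: it tells you not only whether a rule was satisfied, but by how much it was missed.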

Okay, and that's all there is. Okay, so here is a little example. This was the very early days of nuTonomy, when we were still working on research projects with industry, with customers. So our customer in this case wanted us to do an automated parking application.

And what you see on the left is our original plan, which is just trying to park the car, right, avoiding hitting other cars. But you see it's kind of ignoring the fact that you have lanes and directions of travel, right? So you put in the rules, and what you get is what is on the right, where now the car is not only finding the trajectory to go park, but doing so while obeying all the rules that are imposed in that particular parking structure.

Okay? Something else that is very important, and you know this is something that we as humans do every day, is to deal with infeasibility, okay? So very often you're doing your planning, you're trying to plan your trajectory, you have a number of constraints, and well, sorry, but turns out that there is no trajectory, there is no possible behavior that you can do that will satisfy all the rules.

So what do you do? The computer returns: sorry, infeasible. But I'm still driving this car; I need to do something, right? So you do need a way of dealing with infeasibility, okay? The way that we approach this problem is with this idea of a hierarchy of rules, okay? And my claim is that all bodies of rules generated by humans are actually organized hierarchically.

A typical example is the three laws of robotics by Asimov. So the first law is that a robot will not harm a human, right, or allow a human to come to harm. The second law is that a robot will obey orders given by a human, unless they violate the first law.

And the third law is a robot will try to preserve its own life, or preserve itself, unless it violates the first two laws, right? Same thing when you drive, right? So there are some rules that are more important than others. So for example, do not hit people, do not hit other cars.

And then lower priority level is maybe driving your lane, lower priority level is maybe maintaining the speed, or something like that, okay? And then what we do is come up with, now we have this product graph of trajectories in the physical and logical space. On top of that, you can give them a cost, right?

What we need is essentially a total order. What we use is a lexicographic ordering, okay? Where violating an important rule even by a tiny amount is much worse than violating a less important rule by a large amount, okay? So that gives a total order structure to the cost.
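That total order can be sketched directly with tuple comparison: Python compares tuples lexicographically, which is exactly the ordering described. The rule names and numbers here are made up for illustration.

```python
def violation_vector(traj):
    """Violations ordered from most to least important:
    (collision, lane_departure, speed_deviation). Hypothetical keys."""
    return (traj["collision"], traj["lane_dev"], traj["speed_dev"])

a = {"collision": 0.0, "lane_dev": 3.0, "speed_dev": 10.0}  # big minor violations
b = {"collision": 0.1, "lane_dev": 0.0, "speed_dev": 0.0}   # tiny major violation

# Lexicographic tuple comparison: any nonzero high-priority violation
# outweighs arbitrarily large lower-priority ones, so a is preferred.
best = min([a, b], key=violation_vector)
print(best is a)  # → True
```

With this ordering as the edge cost, "minimum violation planning" reduces to the shortest-path search mentioned next.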

And then essentially what we do is we solve a shortest path problem on this graph, okay? Which is exactly what you do in Robotics 101 when you try to do any kind of motion planning, okay? And well, this is a collection of a few interesting things. So here we need to go to the other lane, but you see that there is the other vehicle coming, so technically we could not go to the other lane, but you see that as long as it is safe to do so, the car will go into the other lane, okay?

And again, you have a lot of difficult situations that the car was able to handle by itself without any scripting or without any special instruction for that particular case, okay? So what is the problem here? The problem here is that, okay, so you can do all of this, right?

But then assuming that everybody is running this minimum violation planning, everything will be okay. The problem is that humans introduce a lot of uncertainty in the whole thing, okay? Now you can think of this as asking the question. So when I was young and naive, that is two years ago, I thought that I take all the rules of the road and you convert them to this formal language, you put them in your software, and you're done.

And then you go and look at these rules of the road, and you see that they're a mess, okay? These rules are just not a sound theory, in the sense that they're not complete, they do not cover every possible case, and they're not consistent. You know, they kind of tell you to do different things in different cases.

My favorite rule is this one. It's actually called the fundamental norm in the Swiss rules of the road. Look at that. All road users must behave in such a way not to pose an obstacle or danger to other road users that behave according to the rules. Do you see a problem there?

Okay, that doesn't mean that if I see somebody who is violating the rule, I can just hit them, right? So you can imagine that you have a fleet of vigilantes, you know, autonomous cars that just go around and if you run the red light, bam, gonna kill you. I mean, technically, the autonomous cars will be right, right?

So the other guy will be the one to blame, right? But do we really want that? Probably not, right? In defense of the Swiss, they actually have, that rule continues to say special care must be exerted in case you have evidence that other people are not following the rules, but still doesn't tell you what you're supposed to do when somebody else is violating the rule, okay?

And you have trolley problems, right? So probably you've heard, you know, you hear about all these trolley problems to no end, right? And most of these I find, you know, I mean, totally stupid, you know, in the sense it's like a big waste of time. In the sense that, yeah, sorry, I think it's extremely unlikely that you will be given the choice of killing either Mother Teresa or Hitler, right?

So, I mean, for sure that will never happen, right? But anything remotely similar will never happen to you, okay? On the other hand, there are versions of the trolley problem which are actually meaningful, okay? So this is one that my collaborator, Andrea Cenci, came up with, okay? Look at this case.

So you're driving down the road, and you see a pedestrian that is jaywalking in front of you, okay? If we stay our current course, we will kill the pedestrian, probability one, okay? But it's not our fault, okay? It's his fault that, you know, his or her fault that they stepped in the road when they shouldn't have.

On the other hand, what we could do is we can try to swerve, right? But then with some probability P, we may kill another person who had nothing to do with this thing. You know, they were just walking around, you know, peacefully, right? So the reason why I like this is because this problem actually has clear solutions in the two extreme cases, right?

So if P is one, okay, in the sense that if we swerve, we kill somebody else, then we clearly kill the guy who was jaywalking, right? If P is zero, that is, I'm sure that I'm not killing anybody if I swerve, then clearly I will swerve. What is the boundary?

So I know that the solution exists for P is equal zero. I know the solution exists with P equal one. By some continuity argument, you know, I must have some value of P at which the solution changes. What is that value? Nobody knows. How do you evaluate that P?

Nobody knows. But you know, these are the kind of questions that we actually need to answer somehow. So it's a more, you know, a little bit more sophisticated case. Now what we, and you know, this is what happens every day in our cars, right? So when our computer vision system is telling me that there is a pedestrian in front of us, it's not telling me that there is a pedestrian for sure, right, so it's telling me that I think that there is a pedestrian in front of us, and you know, I'm 80% confident, you know, some probability Q, okay?

Now you have a combination of the probability of the pedestrian actually being there and the probability of killing somebody else were I to swerve, right? Because if I swerve and kill somebody because of just a ghost, you know, a false positive, then I'll be in serious trouble, right? So how do we explain that?

Well, I thought there was somebody in front of me. There's nobody there, right? So again, you know, you do have solutions for some extreme cases, but then you have this whole two-dimensional domain in which there will be a boundary. Where do you put the boundary, okay?

And this is something that somebody will need to answer. Okay, I don't think it should be me. You know, of course I can come up with an answer when I write my code, but I actually think it should be you, right, in the sense this should be a community effort in which the community agrees on how the car should behave or, you know, in these kind of situations.
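The two-dimensional dilemma above can be written down as a toy threshold model. The weight below is a pure placeholder: choosing its value is exactly the community decision the speaker is asking for, not something this sketch answers.

```python
def should_swerve(q: float, p: float, w: float = 1.0) -> bool:
    """q: perception confidence that the pedestrian is really there.
    p: probability of hitting an uninvolved bystander if we swerve.
    w: relative-harm weight; a placeholder whose right value is unknown.
    Swerve when expected harm from staying exceeds that from swerving."""
    return q > w * p

# The two clear extreme cases from the talk:
print(should_swerve(q=1.0, p=0.0))  # → True  (swerving is certainly safe)
print(should_swerve(q=1.0, p=1.0))  # → False (swerving surely kills a bystander)
```

Everything between those extremes is the boundary nobody knows how to place.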

So let me conclude by saying, you know, when people ask me, what do you think is the biggest challenge in autonomous vehicles? And something that I've come to realize only recently is that I think that the biggest challenge in the development of autonomous vehicle technology is that we do not understand in a very precise way, rigorous way, how we want vehicles in general, including human-driven vehicles to behave, okay?

A lot of these rules of the road are just like a giant pile of, I wouldn't say garbage, but almost, you know, it's very uncertain language, very, you know, no rigorous laws, right, or rules. For example, a lot of the rules are predicated on a concept of right-of-way. You know, I looked everywhere.

There is not a single definition of what right-of-way means in mathematical terms. I know that it has something to do with distance, has something to do with relative speed, maybe with absolute speed, but I don't know what are the values. I don't know what are the numbers. If I had to write a function, so if you see this car approaching and this car is farther away than this distance and the relative speed is more than this, then stop, otherwise go.

There's nobody who's telling me what that relationship should be, and I think, again, what we need is we need to develop a sound theory for these rules of the road, okay, that cover precisely any kind of situation and tells me, you know, any kind of situation, what is the right behavior, what is the wrong behavior, or a little bit more, maybe what is, if I have two behaviors, which one is better, okay?
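The hypothetical right-of-way function just described might look like the sketch below. Every threshold is an invented placeholder, which is the point: no rulebook actually supplies these numbers.

```python
# All thresholds here are made up; no traffic code specifies them.
D_STOP = 20.0   # metres: assumed distance threshold
V_CLOSE = 5.0   # m/s: assumed closing-speed threshold

def must_yield(distance_m: float, closing_speed_mps: float) -> bool:
    """One possible form of a right-of-way predicate: yield when the
    other vehicle is both near and closing fast (assumed relationship)."""
    return distance_m < D_STOP and closing_speed_mps > V_CLOSE

print(must_yield(10.0, 8.0))   # → True
print(must_yield(50.0, 8.0))   # → False
```

A sound theory of the rules of the road would, among other things, pin down the form and the constants of functions like this one.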

I need to be able to do the comparison. Now, we can use formal methods. I think that there is a lot of room here for statistical or learning-based methods, you know, like looking at what people actually do: at what point will people honk at you, or feel that you cut them off, versus feeling that you yielded to them, okay?

So we need to develop this sound theory. We need to assess the behaviors on realized space and time trajectories. What you thought that you had seen, that doesn't matter, okay, you know, because if you say, well, if I didn't see the pedestrian, then it's not my fault that I hit them, well, then people will start removing sensors, right?

So if you don't see anything, you can hit anything you want, and you're not to blame, right? But I really think that compliance with the rules, once we have these precise, rigorous rules, will actually drive a lot of requirements for the sensing and perception system, for the planning and control system, okay?

So from my point of view, the main message today is that I think the biggest challenge is that we don't know precisely how we want human-driven vehicles to behave, okay? Once we answer that question, I think that designing automated vehicles will also be much, much easier, okay?

So let me stop here, okay, so I'm just giving a few references to some of our published work on these topics. And let me just conclude, okay, so this is the company, what we are trying to do. Allow me, we are also hiring, so if anybody's interested, feel free to send me an email or contact us.

We want to double our size in the next couple of years, so we are hiring a couple hundred people. Okay, thank you for your attention. - Thank you, Emilio. (audience applauding)