Self-Driving Cars: State of the Art (2019)


Chapters

0:00 Introduction
1:53 2018 in review
4:49 Fatalities
8:29 Taxi services
10:54 Predictions
16:55 Human-centered autonomy
19:42 Levels of autonomy and proliferation strategies
24:48 Out-of-the-box ideas
27:28 Who will be first?
29:26 Historical context
31:05 Underlying beliefs of the industry and public
32:32 Driving is hard
35:32 Humans are amazing
37:10 Humans and automation don't mix well?
41:55 Two approaches: Lidar vs Vision
49:54 In the meantime… data
52:49 The road ahead

Transcript

Today I'd like to talk about the state of the art of autonomous vehicles, how I see the landscape, how others see the landscape, what we're all excited about, ways to solve the problem, and what to look forward to in 2019, as we also get to hear the different perspectives of the various leaders in industry and autonomous vehicles over the next few days and the next couple of weeks.

So the problem, the mission, the dream, the thing that we're trying to solve, for many it may be about entrepreneurial possibilities of making money and so on, but really it's about improving access to mobility, moving people around the world that don't have that ability, whether it has to do with age or purely access of where you live.

We want to increase the efficiency of how people move about, the ability to be productive in the time we spend in traffic and transportation. One of the most hated things in terms of stress and emotion, the thing in our lives that we would remove with a snap of a finger if we could, is traffic.

So the ability to convert that into efficiency, into a productive aspect, into a positive aspect of life. And really the most important thing, at least for me and for many of us working in this space, is to save lives, prevent crashes that lead to injuries, prevent crashes that lead to fatalities.

Here's a counter: every 23 seconds, somebody in the world dies in an auto crash. It should be sobering; it is for me, something that I think about every single day. You go to bed, you wake up, you work on all the deep learning methods, all the different papers we're publishing; everything we're trying to push forward is really to save lives. At the beginning and at the end, that is the main goal.

So with that groundwork, with that idea, with that base of mission that we're all working towards from the different ideas and different perspectives, I would like to review what happened in 2018. So first, Waymo has done incredible work in deploying and testing their vehicles in various domains, and in October reached the mark of 10 million miles driven autonomously, which is an incredible accomplishment.

It's truly a big step for fully autonomous vehicles in terms of deployment and obviously is growing and growing by day and we'll have Drago here from Waymo to talk about their work there. Then on the L2, on the semi-autonomous side, that's the pair, that's the mirror side of this equation.

The other incredible number that's perhaps less talked about is the one billion mile mark reached by Tesla in the semi-autonomous driving of Autopilot. Autopilot is a system that's able to control its position in the lane, center itself in the lane, and it's able to control the longitudinal movement, so it can follow a vehicle when there's a vehicle in front and so on.

But the degree of its ability to do so is the critical thing here, is the ability to do so for many minutes at a time, even hours at a time, especially on highway driving, that's the critical thing. And the fact that they've reached one billion with a B miles is an incredible accomplishment.

All of that from the machine learning perspective is data, that's data and all of the Autopilot miles are driven with the primary sensor being a camera, so it's computer vision. And how does computer vision work in modern day? Especially with the second iteration of Autopilot hardware, there's a neural network, there's a set of neural networks behind it.

That's super exciting. That is probably the largest deployment of neural networks in the world that has a direct impact on a human life, that's able to make life-critical decisions many times a second, over and over. That's incredible. You go from the step of image classification on ImageNet, where you sit there with TensorFlow and you're very happy, you're able to achieve 99.3% accuracy with a state of the art algorithm.

You take from that a step towards a system where there's a human life, your parents driving, your grandparents driving this, your children driving the system, and there's a neural network making the decision of whether they live. So that one billion mark is an incredible accomplishment. And on the sobering side, from various perspectives, the fatalities.

There were two fatalities that happened in March of 2018. One on the fully autonomous side of things, with Uber in Tempe, Arizona hitting a pedestrian and leading to a pedestrian fatality. And on the semi-autonomous side of Tesla Autopilot, the third fatality that Tesla Autopilot has led to, and the one in 2018, was in Mountain View, California, when a Tesla slammed into a divider, killing its driver.

Now, the aspect here that is sobering and really important to think about, as we talk about the proliferation of autonomous vehicles in our world, is our response, from the general public to the engineers to the media and so on: how we think about these fatalities.

And obviously there's a disproportionate amount of attention given to these fatalities. And that's something as engineers you have to also think about, that the bar is much higher on every level in terms of performance. So in order to succeed, as I'll argue, in order to design successful autonomous vehicles, those vehicles will have to take risks.

And when the risks don't pan out, if the public doesn't understand the general problem that we're tackling, the goal, the mission, then the risks that are taken can have a significant detrimental effect on progress in this autonomous vehicle space. So that's something we really have to think about.

That's our role as engineers and so on. Question, yeah. So the question was, do we know the rate of fatalities per mile of vehicle driven, which is at the crudest level how people think about safety. So there's about 80, 90, 100 million miles driven in manually controlled cars for every fatality.

So one fatality per, depending on which numbers you look at, it's 80 to 100 million miles. And the Tesla vehicle, for example, the fatality is, well, we could just take the one billion and divide it by three. Now it's apples and oranges in comparison. And that's something actually that we're working on to make sure that we compare it correctly, compare the aspects of manual miles that directly are comparable to the autopilot miles.

So a Tesla is a modern vehicle that's much safer than the general population of manually driven vehicles. Autopilot is driven on only particular kinds of roads, primarily on the highway for most of the miles. The kinds of people that drive with Autopilot, all these kinds of factors need to be considered when you compare the two.

But when you just look at the numbers, Tesla autopilot is three times safer than manually driven vehicles. But that's not the right way to look at it. And for anyone that's ever taken a statistics class, three fatalities is not a large number by which to make any significant conclusions.
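To make the small-sample point concrete, here is a minimal sketch that takes the talk's rough figures (one billion Autopilot miles, three fatalities) and computes an exact 95% Poisson interval for the fatality count. Treating fatalities as a Poisson process is an assumption made for illustration, and the helper names (`poisson_cdf`, `bisect_root`, `exact_poisson_interval`) are hypothetical. The sketch does nothing to address the apples-and-oranges problem of road type, vehicle age, or driver population.

```python
import math

def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))

def bisect_root(f, lo: float, hi: float, iters: int = 200) -> float:
    """Root of a continuous function that changes sign exactly once on [lo, hi]."""
    f_lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_interval(k: int, alpha: float = 0.05):
    """Exact two-sided interval for the expected count, given k observed events."""
    lower = 0.0 if k == 0 else bisect_root(
        lambda mu: (1 - poisson_cdf(k - 1, mu)) - alpha / 2, 0.0, float(k))
    upper = bisect_root(
        lambda mu: poisson_cdf(k, mu) - alpha / 2, float(k), 10.0 * k + 10.0)
    return lower, upper

AUTOPILOT_MILES = 1e9       # "one billion miles", rough figure from the talk
AUTOPILOT_FATALITIES = 3    # "divide it by three", rough figure from the talk

point_estimate = AUTOPILOT_MILES / AUTOPILOT_FATALITIES
low_count, high_count = exact_poisson_interval(AUTOPILOT_FATALITIES)

print(f"point estimate: {point_estimate / 1e6:.0f}M miles per fatality")
print(f"95% interval:   {AUTOPILOT_MILES / high_count / 1e6:.0f}M "
      f"to {AUTOPILOT_MILES / low_count / 1e6:.0f}M miles per fatality")
```

With only three events, the two ends of the interval differ by more than a factor of ten, which is why the "three times safer" reading of the raw numbers should not be taken as a conclusion.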

Nevertheless, that doesn't stop the media, the New York Times and everybody, from responding to a single fatality, which the PR and marketing sides of these different companies are very sensitive to. That is of course troubling and concerning for an engineer that wants to save lives. But it's something that we have to think about.

Okay, 2018 in review continued. There's been a lot of announcements, or rather actual launches of public testing of autonomous taxi services. So companies that are on public roads have been delivering real people from one location to another. Now there's a lot of caveats. In many of these cases, it's very small scale, just a few vehicles.

In most cases, it's very low speed in a constrained environment, in a constrained community. And almost always, really always with a safety driver, there's a few exceptions for demonstration purposes. But there's always an actual driver in the seat. Some of the brilliant folks representing these companies will speak in this course.

Voyage is doing it in an isolated community; awesome work they're doing in The Villages in Florida. Optimus Ride here in Boston is doing it in the community in Union Point. Drive.ai in Texas. May Mobility is expanding beyond Detroit, but really most operations are in Detroit. Waymo has launched its service, Waymo One, that's gotten some publicity, in Phoenix, Arizona.

Nuro is doing zero-occupancy deliveries of groceries autonomously. So we didn't say it has to be delivering humans; it's delivering groceries autonomously. Uber has quietly, or not so quietly, resumed its autonomous vehicle taxi service testing in Pittsburgh in a very careful, constrained way. Aptiv, after acquiring Karl Iagnemma's nuTonomy, has been doing extensive, large-scale taxi service testing everywhere from Vegas to Boston here, to Pittsburgh and in Singapore, of course.

Aurora, which spoke here last time, launched by the former head of Tesla Autopilot and with Chris Urmson behind it, is a young upstart company doing testing in San Francisco and Pittsburgh. And then Cruise from GM, Kyle will be here to talk, is doing testing in San Francisco, Arizona, and Michigan.

So when we talk about predictions, I'll talk about a few people predicting when we're going to have autonomous vehicles. And when you yourself think about what it means, when will they be here? When will autonomous vehicles arise such that the Uber that you call will be autonomous and not populated by a driver?

So the thing we have to think about is how we define autonomous, what that experience looks like. And most importantly, in these discussions, we have to think about scale. So we here at MIT, our group, MIT Human-Centered Autonomous Vehicle, have a fully autonomous vehicle that people can get in if they would like, and it will give you a ride in a particular location.

But that's one vehicle, it's not a service, and it only works on particular roads, it's extremely constrained. In some ways, it's not much different than most of the companies that we're talking about today. Now, scale here, there's a magic number, I'm not sure what that is, but for the purpose of this conversation, let's say 10,000, where there's a meaningful deployment.

When it's truly going beyond that prototype demo mode where everything's under control, to where it's really touching the general population in a fundamental way, scale is everything here. And it starts, let's say, at 10,000. Just to give you a reference, there are 46,000 active Uber drivers in New York City.

So that's what 10,000 feels like. 25, 30% of the Uber drivers in New York City all of a sudden become passengers. So the predictions. I'm not a marketing PR person, so I don't understand why everybody has to make a prediction, but they all seem to. All the major automakers have made a prediction of when they will be able to deploy autonomous vehicles.

Tesla made, in early 2017, a prediction that they will have autonomous vehicles in 2018. In 2018, they adjusted the prediction to 2019. Nissan, Honda, and Toyota have made predictions for 2020 under certain constraints in highway and urban driving. Hyundai and Volvo have 2021, along with BMW and Ford, Ford saying at scale.

So a large scale deployment in 2021. And Chrysler in '21, and Daimler saying in the early '20s. So there are the predictions that are extremely optimistic, that are perhaps driven by the instinct that the company has to declare that they're at the cutting edge of innovation. And then there are many of the leading engineers behind these teams, including Karl Iagnemma and Gill Pratt from MIT, who inject a little bit of caution and grounded ideas about how difficult it is to remove the human from the loop of automation.

So Karl basically says teleoperation, and gives this analogy of an elevator. You know, an elevator is fully autonomous, but there's still a button to call for help if something happens. And that's how he thinks about autonomous vehicles: even with a greater and greater degree of automation, there's still going to have to be a human in the loop.

There's still going to be a way to contact a human to get help. And Gill Pratt and Toyota are making some announcements at CES, basically saying that the human in the loop is the fundamental aspect with which we need to approach this problem. And removing the human from consideration is really, really far away.

And Gill, historically and currently, is one of the great roboticists in the world, who defined a lot of the DARPA challenges and a lot of our progress, historically speaking, up to this point. So really the full spectrum, we can think of it as the Elon-Rodney spectrum of optimism versus pessimism.

There's Elon Musk, who's extremely bold and optimistic about his predictions. I often connect with this kind of thinking because sometimes you have to believe the impossible is possible in order to make it happen. And then there's Rodney, also one of the great roboticists, the former head of CSAIL, the AI laboratory here, who is a little bit on the pessimistic side.

So for Elon, a fully autonomous vehicle will be here in 2019. For Rodney, vehicles that are really fully autonomous are beyond 2050. But he believes that in the 30s, a major city will be able to allocate a significant region of that city where manual driving is fully banned, which is the way he believes autonomous vehicles can really proliferate: when you ban manually driven vehicles in certain parts.

And in the 40s, 2045 or beyond, the majority of US cities will ban manually driven vehicles. Of course, the quote from Elon Musk in 2017 is, "My guess is that in probably 10 years, it will be very unusual for cars to be built that are not fully autonomous." So we also have to think about the long tail, the fact that many people drive cars that are 10 years old, 20 years old.

So even when every car that's built is fully autonomous, it's still gonna take time for that turnover of vehicles to happen. And so, my own view beyond predictions. To explain the view, let's take a little pause into the ridiculous and the fun. Yes, that is me playing guitar in our autonomous vehicle.

Now the point of this ridiculous and embarrassing video, I should never have played it. Yeah, okay, I think it's gonna be over soon. Now for those of you born in the 90s, that's classic rock. So the point I'm trying to make beyond predictions is that autonomous vehicles will not be adopted by human beings in the near term, in the next 10, 15 years, because they're safer.

They may be safer, but they're not going to be so much safer that that's going to be the reason you adopt. It's not gonna be because they get you to the location faster. Everything we see with autonomy is that they're going to be slower until the majority of the fleet is autonomous.

They're cautious and therefore slower and therefore more annoying in the way we think about actually how we navigate this world. We take risk, we drive assertively with speed over the speed limit all the time. That is not how autonomous vehicles today operate. So they're not gonna get us there faster.

And for every promise, every hope that they're going to be cheaper, really there's still significant investment going into them. And there's not good economics in the near term of how to make them obviously significantly cheaper. I think Uber and Lyft have taken over the taxi service because of the human experience.

In the same way, autonomy will only take over, or if not take over, be adopted by human beings, if it creates a better human experience. If there's something about the experience that you enjoy the heck out of. This video and many others that we're putting out show natural language communication, the interaction with the car, the ability of the car to sense everything you're doing, from the activity of the driver to the driver's attention, and being able to transfer control back and forth in a playful way, but really in a serious way.

Also that's personalized to you. That's really the human experience, the efficiency of the human experience, the richness of the human experience, that is what we need to also solve. That's something you have to think about because many of the people that'll be speaking at this class and many of the people that are working on this problem are not focused on the human experience.

It's a kind of afterthought that once we solve the autonomous vehicle problem, it'll be fun as hell to be in that car. I believe you first have to make it fun as hell to be in the car and then solve the autonomous vehicle problem jointly. In the language that we're talking about here, there are several levels of autonomy that are defined, from level zero to level five: level zero, no automation; levels three, four and five, increasing automation.

So level two is when the driver is still responsible. Level three, four, five is when there's less and less responsibility, but really in three, four, five there's parts of the driving where the liability is on the car. So there's only really two, as far as I'm concerned, levels, human-centered autonomy and full autonomy.

Human-centered means the human is responsible. Full autonomy means the car is responsible, both on the legal side, the experience side, and the algorithm side. That means full autonomy does not allow for teleoperation. So it doesn't allow for the human to step in and remotely control the vehicle because that means the human is still in the loop.

It doesn't allow for the 10-second rule that it's gonna be fully autonomous, but once it starts warning you, you have 10 seconds to take over. No, it's not fully autonomous. We cannot guarantee safety in any situation. It has to be able to, if the driver doesn't respond in 10 seconds, it has to be able to find safe harbor.

It has to be able to pull off to the side of the road without hurting anybody else to find safety. So that's the fully autonomous challenge. And so how do we envision these two levels of automation proliferating through society, getting deployed at a mass scale? The 10,000, 10 million, beyond.
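The two-level distinction described above can be written down as a small decision rule. This is only a sketch of the definition as stated in the talk, not any standard's classification; the type and field names (`SystemDesign`, `relies_on_teleoperation`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    HUMAN_CENTERED = "human-centered autonomy"  # the human is responsible
    FULL = "full autonomy"                      # the car is responsible

@dataclass(frozen=True)
class SystemDesign:
    relies_on_teleoperation: bool      # a remote human may step in and drive
    relies_on_driver_takeover: bool    # e.g. a "10 seconds to take over" warning
    can_reach_safe_harbor_alone: bool  # can pull over safely with no human response

def categorize(design: SystemDesign) -> Category:
    """Full autonomy only if no human is in the loop and the car can find safety itself."""
    human_in_loop = design.relies_on_teleoperation or design.relies_on_driver_takeover
    if not human_in_loop and design.can_reach_safe_harbor_alone:
        return Category.FULL
    return Category.HUMAN_CENTERED

# Illustrative designs, not descriptions of any real product.
highway_assist = SystemDesign(False, True, False)
taxi_with_remote_operator = SystemDesign(True, False, True)
no_human_fallback = SystemDesign(False, False, True)

for name, design in [("highway assist", highway_assist),
                     ("taxi with remote operator", taxi_with_remote_operator),
                     ("no human fallback", no_human_fallback)]:
    print(f"{name}: {categorize(design).value}")
```

Under this rule a system that can pull over on its own but still depends on a remote operator counts as human-centered, which matches the point that teleoperation keeps the human in the loop.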

On the fully autonomous side, the way to think about it with the predictions that we're talking about here is that there are several different possibilities of how to deploy these vehicles. One is last mile delivery of goods and services, like the groceries. These are zero occupancy vehicles delivering groceries or delivering human beings at the last mile.

What the last mile means is it's slow moving transport to the destination where most of the tricky driving along the way is done manually, and then the last mile delivery in the city, in the urban environment is done by zero occupancy autonomous vehicles. Trucking on the highway, possibly with platooning where a sequence of trucks follow each other.

So this is what people think about as a pretty well-defined problem: highway driving, with lanes well-marked and routes well-mapped throughout the United States and globally, is automatable. Then the specific urban routes, kind of like what a lot of these companies are working on, defining this taxi service as a personalized public transport.

There are certain pickup locations you're allowed to go to, there are certain drop-off locations, that's it. It's kind of like taking the train here, but as opposed to getting on the train or bus with 100 other people, you're getting in a car where you're alone or with one other person.

The closed communities, something Oliver Cameron with Voyage is working on defining and Optimus Ride, defining a particular community that you now have a monopoly over, that you define the constraints, you define the customer base, and then you just deliver the vehicles, you map the entire road, you have slow moving transport that gets people from A to B, anywhere in that community.

And then there's the world of zero occupancy ride sharing delivery. So the Uber that comes to you has no driver: it comes to you autonomously with nobody in there, and then you get in and drive it. So imagine a world where we have empty vehicles driving around, delivering themselves to you.

Semi-autonomous side is thinking about a world where teleoperation plays a really crucial role, where it's fully autonomous under certain constraints in the highway, but a human can always step in. High autonomy on the highway, kind of like what Tesla is working towards most recently, is on-ramp to off-ramp. Now the driver is still responsible, liability-wise and in terms of just observing the vehicle and algorithmically speaking, but the autonomy is pretty high level to a point where much of the highway driving can be done fully autonomously.

And low autonomy, unrestricted travel as an advanced driver assistance system, meaning cars like the Tesla, the Volvo S90s, or the Super Cruise in the Cadillacs, all these kinds of L2 systems that are able to keep you in the lane for 10 to 30% of the miles that you drive and, some fraction of the time, take some of the stress of driving off.

And then there are some out-there ideas: the idea of connected vehicles, vehicle-to-vehicle communication and vehicle-to-infrastructure communication, enabling us to navigate, for example, intersections efficiently without stopping, removing all traffic lights. So shown here on the bottom is our conventional approach. There's a queuing system that forms because of traffic lights that turn red, green, yellow, and without traffic lights and with communication to the infrastructure and between the vehicles, you can actually optimize that to significantly increase the traffic flow through a city.

Of course, there's the boring solution of tunnels under cities, layers of tunnels under cities, tunnels all the way down. The design of the tunnel constrains the problem to such a degree that the idea of autonomy is completely transformed: a car is basically able to transform itself into a mini train, into a mini public transit entity, for a particular period of time.

So you get into that tunnel, you drive at 200 miles an hour, or not necessarily drive, you are driven at 200 miles an hour, and then you get out of the tunnel. Of course, there are the flying cars, personalized flying vehicles. I will not, I mean, (audio cuts out) Rodney, as I mentioned before, does believe that we'll have them in 2050.

There are a lot of people that are actually seriously thinking about this problem. There's a level of autonomy, obviously, that's required here for a regular person, like, I don't know, somebody without a pilot's license, for example, to be able to take off and land. Making that experience accessible to regular people means that there's going to be a significant amount of autonomy involved.

One of the companies really seriously working on this is Uber, with Uber Elevate, Uber Air, I think it's called, and the idea is that you would meet your vehicle not on the street, but on a roof: you take an elevator, you meet them on the roof of a building.

This video is from Uber, and they're seriously addressing this problem. Many of the great solutions to the world's problems have been laughed at at some point. So, let's not laugh too loud at these possibilities. Back in my day, we used to drive in the street. Okay, so, 10,000 vehicles, if that's the bar.

Out of curiosity, I did a little public poll, 3,000 people responded, asking who will be first to deploy 10,000 fully autonomous cars operating on public roads without a safety driver. And several options percolated, with Tesla getting 57% of the vote, Waymo getting 21% of the vote, 14% someone else, and 8% the curmudgeons and the engineers saying no one in the next 50 years will do it.

And again, in 1998, when Google came along, the leaders of the space were Ask Jeeves and InfoSeek and Excite, all services I've used, and probably some people in this room have used, Lycos, Yahoo. Obviously, they were the leaders in the space, and Google disrupted that space completely. So, this poll shows the current leaders, but it's wide open to ideas, and that's why there's a lot of autonomous vehicle companies.

Some companies are taking advantage of the hype and the fact that there's a lot of investment in the space, but some companies, like some of the speakers visiting this course, are really trying to solve this problem. They wanna be the next Google, the next billion, multi-billion, next trillion dollar company, by solving the problem.

So, it's wide open. But currently, Tesla, with the semi-autonomous vehicle approach, working towards trying to become fully autonomous, and Waymo, starting with the fully autonomous, working towards achieving scale at the fully autonomous, are the leaders in the space. Given that ranking in 2019, let's take a quick step back to 2005 with the DARPA challenge, when the story began.

The race through the desert, when Stanley from Stanford won a race through the desert. That really captivated people's imagination about what's possible. And a lot of people said that the autonomous vehicle problem was solved in 2005. They really said that, especially because in 2004, nobody finished that race.

2005, four cars finished the race. It was like, well, we cracked it. This is it. And then some critics said that urban driving is really nothing comparable to desert driving. Desert is very simple. There's no obstacles and so on. It's really a mechanical engineering problem. It's not a software problem.

It's not really an autonomous driving problem as it would be delivered to consumers. And then, of course, in 2007, DARPA put together the Urban Grand Challenge, and several people finished that, with CMU's Boss winning. And so, the thought was, at that point, that's it, we're done.

As Ernest Rutherford, a physicist, said, physics is the only real science. The rest is just stamp collecting. All the biology, chemistry, certainly, boy, I wouldn't wanna know what he thinks about computer science. It's just all the stupid, silly details. Physics is the fundamentals. And that was the idea with the DARPA Grand Challenge: that by solving that, we solved the fundamental problem of autonomy.

And the rest is just for industry to figure out some of the details of how to make an app and make a business out of it. So that could be true. And the underlying beliefs there are that driving is an easy task, that it's solvable. That the thing we do as human beings is pretty formalizable, it's pretty easy to solve with autonomy. The other idea is that humans are bad at driving.

This is a common belief. Not me, not you, but everybody else. Nobody in this room, but everybody else is a terrible driver. The kind of intuition that we have about our experience in traffic leads us to believe that humans are just really bad at driving. And from the human factors psychology side, there's been over 70 years of research showing that humans are not able to maintain vigilance when monitoring a system.

So when you put a human in a room with a robot and say, "Watch that robot," they start texting like 15 seconds in. So that's the fundamental psychology. There's thousands of papers on this. People are, they tune out, they overtrust the system, they misinterpret the system, and they lose vigilance.

Those are the three underlying beliefs. They very well could be true, but what if they are not? So we have to consider that they are not. Take "the driving task is easy": if you think the driving task is easy and formalizable and solvable by autonomous vehicles, you have to solve this problem.

The subtle vehicle-to-vehicle, vehicle-to-pedestrian nonverbal communication that happens here in a dramatic sense, but really happens in a subtle sense, millions of times every single day in Boston. Subtle nonverbal communication between vehicles. You go, no you go. You have to solve all the crazy road conditions where in a split second, you have to make a decision. So in snowy, icy weather, rain, limited visibility conditions, you have 100, 200 milliseconds to make a decision.

Your algorithm, based on the perception, has to make a control decision. Then you have to deal with the nonverbal communication with pedestrians. These unreasonable, irrational creatures, us human beings. You have to not only understand the intent of their movement, so anticipating the trajectory of the pedestrian, you also have to assert yourself in a game theoretic way.

As crazy as it might sound, you have to threaten yourself, you have to take a risk. You have to take a risk that if I don't slow down, like that ambulance didn't slow down, that the pedestrian will slow down. Algorithmically, we're afraid to do that. The idea that a pedestrian that's moving, we anticipate their trajectory based on the simple physics of the current velocity, the momentum, they're gonna keep going with some probability.
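The constant-velocity anticipation just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: a flat 2D world, a lane modeled as a band of `y` values, and hypothetical names (`Pedestrian`, `predict_constant_velocity`, `enters_lane`). Note what the sketch leaves out: the predicted path does not depend on anything the vehicle does, so it cannot represent a pedestrian who stops because the car asserts itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pedestrian:
    x: float   # position in metres
    y: float
    vx: float  # velocity in metres per second
    vy: float

def predict_constant_velocity(p: Pedestrian, horizon_s: float, dt: float = 0.5):
    """Future (x, y) positions assuming the pedestrian keeps their current velocity."""
    steps = int(round(horizon_s / dt))
    return [(p.x + p.vx * dt * k, p.y + p.vy * dt * k) for k in range(1, steps + 1)]

def enters_lane(path, lane_y_min: float, lane_y_max: float) -> bool:
    """True if any predicted position falls inside the lane band."""
    return any(lane_y_min <= y <= lane_y_max for _, y in path)

# A pedestrian 3 m from the lane edge, walking toward it at 1.5 m/s.
walker = Pedestrian(x=10.0, y=-3.0, vx=0.0, vy=1.5)
path = predict_constant_velocity(walker, horizon_s=3.0)

# The vehicle's own planned action appears nowhere in this prediction.
print("predicted to enter lane:", enters_lane(path, lane_y_min=0.0, lane_y_max=3.5))
```

A planner built only on this kind of prediction will always yield to the walking pedestrian, since it has no way to model the pedestrian reacting to the car.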

The fact that by us accelerating, we might make that pedestrian stop is something that we have to incorporate into algorithms, and we don't today. And we don't really know how to. So if driving is easy, we have to solve that too. And of course, there's the thing I showed yesterday with CoastRunners and the boat going around, and everything from the ethical dilemmas of the Moral Machine to the more serious engineering aspects: the unintended consequences that arise from having to formalize the objective function under which a planning algorithm operates.

With any learning system, as I showed yesterday: the boat on the left, driven by a human, wants to finish the race; the boat on the right figures out that it doesn't have to finish the race, it can pick up turbos along the way and get much more reward.

So if the objective function is to maximize the reward, you can slam into the wall over and over and over again, and that's actually the way to optimize the reward. And those are the unintended consequences of an algorithm whose goal has to be formalized as an objective function without a human in the loop.
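A toy version of that failure mode, as a hedged sketch: the reward values, track length, and the two hand-written policies below are invented for illustration and are not taken from CoastRunners or from any experiment in the talk. The point is only that a naively specified reward can rank "never finish" above "finish".

```python
def episode_return(policy: str, steps: int = 100, finish_bonus: float = 50.0,
                   turbo_reward: float = 3.0, track_length: int = 20,
                   loop_length: int = 4) -> float:
    """Total reward over one episode for two hand-written policies.

    'finish': drive straight to the finish line and collect the bonus once.
    'loop':   circle a respawning turbo pickup and never finish the race.
    """
    if policy == "finish":
        # One-time bonus, available only if the finish line is reachable in time.
        return finish_bonus if track_length <= steps else 0.0
    if policy == "loop":
        # One turbo pickup per completed loop, for the whole episode.
        return turbo_reward * (steps // loop_length)
    raise ValueError(f"unknown policy: {policy!r}")

returns = {name: episode_return(name) for name in ("finish", "loop")}
best = max(returns, key=returns.get)
print(returns, "-> reward-maximizing policy:", best)
```

With these made-up numbers, an optimizer that only sees the reward prefers looping, even though the designer's intent was to win the race.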

Humans are bad at driving. As I showed yesterday, if humans are bad at anything, it's at having a good intuition about what's hard and what's easy. The fact that we have 540 million years worth of data on our visual perception system means we don't understand how damn impressive it is to be able to perceive and understand the scene in a split second, maintain context, and perform all the visual localization tasks, anticipating the physics of the scene and so on.

And then there's the control side. We humans don't give enough credit to ourselves. We're incredible. A state of the art soccer player on the left and a state of the art robot on the right. (audience laughing) I think there's like four or five times he scores. All right. And that's all the movement and so on involved with that.

Of course here, that's the humanoid robot. That's really incredible work that's done for the DARPA Robotics Challenge with the humanoid robots on the right, and incredible work by the humans doing the same kind of tasks, much more impressive tasks I would say. So that's where we stand.

And the ones on the right are actually not fully autonomous. There's still some human in the loop. There's just noisy, broken communication. So humans are incredible in terms of our ability to understand the world and in terms of our ability to act in that world. And then there's the idea, the popular view grounded in psychology, that humans and automation don't mix well: overtrust, misunderstanding, loss of vigilance, vigilance decrement and so on.

That's not an obvious fact. It happens a lot in the lab. Most of the experiments are actually in the lab. This is the difference. You put, like many of you, an undergrad or grad student in a lab and say, "Here, watch this screen and wait for the dot to appear." They'll tune out immediately.

But when it's your life and you're on the road, it's just you in the car, it's a different experience. It's not completely obvious that vigilance will be lost. When it's just you and the robot, it's not completely obvious what the psychology, what the attentional mechanism, what the vigilance there looks like.

So one of the things we did is we instrumented 22 Teslas and observed people, now over a period of two years, to see what they actually do when they're driving on Autopilot, driving these systems. Shown in red is manual control, and in cyan is vehicle control on Autopilot. Now there are a lot of details here and we have a lot of presentations on this, but really the fundamentals are that they drive 34%, a large percentage, of the miles on Autopilot.

And in 26,000 moments of transfer of control, they are always vigilant. There's not a single moment in this dataset where they respond too late to a critical situation, to a challenging road situation. Now the dataset, 22 vehicles, that's 0.1% or less of the full Tesla fleet that has Autopilot, but it's still an inkling.
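As a rough sanity check on how far zero late responses in 26,000 transfers can be pushed, here is a small sketch. It assumes each transfer of control is an independent trial with the same failure probability, which is a simplification not made in the talk (the same drivers and cars recur), so treat the result as an order-of-magnitude illustration only.

```python
# Numbers taken from the talk.
n_transfers = 26_000     # observed transfers of control
n_vehicles = 22          # instrumented vehicles
sample_fraction = 0.001  # "0.1% or less" of the Autopilot fleet

# Exact one-sided 95% upper bound on the failure probability p when zero
# failures are seen in n independent trials: solve (1 - p)**n = 0.05.
upper_exact = 1 - 0.05 ** (1 / n_transfers)

# The common "rule of three" approximation to the same bound.
upper_rule_of_three = 3 / n_transfers

# If 22 vehicles are at most 0.1% of the fleet, the fleet has at least:
min_fleet_size = round(n_vehicles / sample_fraction)

print(f"95% upper bound (exact):         {upper_exact:.2e}")
print(f"95% upper bound (rule of three): {upper_rule_of_three:.2e}")
print(f"implied minimum fleet size:      {min_fleet_size}")
```

Under those assumptions, the data are consistent with a late-response rate anywhere from zero up to roughly one in ten thousand transfers, which is why the result is an inkling rather than a proof.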

It's not obvious that it's not possible to build a system that works together with a human being. And that system essentially looks like this. Some percentage, 90%, maybe less, maybe more. When it can solve the problem of autonomous driving, it solves it and when it needs human help, it asks for help.
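The "solve it when you can, ask for help when you can't" loop can be sketched as a simple arbitration rule. This is a minimal sketch, not the system described in the talk: the `confidence` score, the `threshold` value, and the `Decision` type are hypothetical names introduced for illustration, and the threshold is unrelated to the 90% share of driving mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    controller: str  # "automation" or "human"
    reason: str


def arbitrate(confidence: float, threshold: float = 0.8) -> Decision:
    """Keep control while the system is confident; otherwise ask the human.

    `confidence` is a hypothetical self-assessed score in [0, 1].
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence >= threshold:
        return Decision("automation", "system is confident it can handle the scene")
    return Decision("human", "system requests help from the driver")


for c in (0.95, 0.50):
    print(c, arbitrate(c))
```

A real system would also need to check that the driver is actually able to take over, which is the driver-sensing side of the problem.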

That's the trade-off, that's the balance. On the fully autonomous side, on the right, shown here with citations (there are always references at the bottom), all the problems have to be solved exceptionally, perfectly: from mapping and localization, to scene perception, to control, to planning, to being able to find a safe harbor at any moment, to also being able to do external HMI, communication with the pedestrians and other vehicles in the scene.

And then there's teleoperation, vehicle to vehicle, vehicle to infrastructure. You have to solve those perfectly if you want to solve the fully autonomous problem. As I said, including all the crazy things that happen in driving. And if you approach the shared autonomy side, the semi-autonomous, where you're only responsible for a large percentage, but not 100% of the driving, then you have to solve the human side: the human interaction, sensing what the driver is doing, collaborating and communicating with the driver, and the personalization aspect that learns with the driver.

As I said, you can go online, we have a lot of demonstrations of these kinds of ideas. But the natural language, the communication, I think is critical for all of us, as we're tweeting, as all of us do. So this is just a demonstration of a vehicle taking control when the attention over time, the driver is being, (man speaking off mic) Okay, we got it, thank you.

Okay, so basically, smartphone use, which has gone up year by year, and we're doing a lot of analysis on that: what people really do in the car is use their phone, whether it's manual, autonomous, or semi-autonomous driving. So being able to manage that, to communicate with the driver about when they should be paying attention, which may not be always, you're sort of balancing when it's a critical time to pay attention and when it's not. Communicating effectively, learning with the driver, that problem is a fundamental machine learning problem.

There's a lot of data, visible light, everything about the driver, and it's a psychology problem. So we have data, we have complicated human beings, and it's a human-robot interaction problem that deserves solving. But as you'll hear, beyond the human side, looking out into the world, for people that are trying to solve the fully autonomous vehicle, it's really a two-approach consideration.

One approach is vision, cameras, and deep learning, right? Collect a huge amount of data. Cameras have this aspect that they're the highest resolution of information available. It's rich texture information. And there's a lot of it, which is exactly what neural networks love, right? So to be able to cover all the crazy edge cases, vision data, camera data, visible light data is exactly the kind of data you need to collect a huge amount of, to be able to generalize over all the crazy, countless edge cases that happen.

It's also feasible in terms of cost, interest, and scale: all the major data sets are of visible light cameras. That's another pro, and they're cheap. And the world, as it happens, whoever designed the simulation that we're all living in made it such that our roads and our world are designed for human eyes.

Eyes are the way we perceive the world. And so the landmarks are also visual: most of the road textures that you use to navigate, to drive, are visible, are made for human eyes. The cons are that without a ton of data, and we don't know how much, they're not accurate.

You make errors, because driving is ultimately about 99.999999% accuracy. And so that's what I mean by not accurate. It's really difficult to reach that level. And then the second approach is LiDAR: taking a very particular constrained set of roads, mapping the heck out of them, understanding them fully in different weather conditions and so on, and then using the most accurate sensors available, a suite of sensors, but really LiDAR at the forefront, being able to localize yourself effectively.
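To put the 99.999999% figure in perspective, here is a short worked calculation. The accuracy number is from the talk; the rate of ten decisions per second is purely a hypothetical assumption used to convert it into driving time, and exact fractions are used to avoid floating-point rounding at this scale.

```python
from fractions import Fraction

# 99.999999% accuracy, written exactly.
accuracy = Fraction(99_999_999, 100_000_000)
failure_rate = 1 - accuracy              # probability of an error per decision
decisions_per_error = 1 / failure_rate   # expected decisions between errors

# Hypothetical assumption: the system makes 10 safety-relevant decisions per second.
DECISIONS_PER_SECOND = 10
hours_per_error = decisions_per_error / DECISIONS_PER_SECOND / 3600

print("failure rate per decision:", failure_rate)
print("decisions between errors: ", decisions_per_error)
print(f"hours between errors at {DECISIONS_PER_SECOND} decisions/s: {float(hours_per_error):.0f}")
```

That is one error per hundred million decisions, which is the level a vision-only system would have to reach from data alone.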

The pros there are that it's consistent, especially when machine learning is not involved, it's consistent and reliable. And it's explainable. If it fails, you can understand why, you can account for those situations. That's not so much true for machine learning methods. It's not so explainable why it failed in a particular situation.

The accuracy is higher, as we'll talk about. The cons of LiDAR are that it's expensive, and most of the approaches to perceiving the world using LiDAR primarily are not deep learning based. And therefore they're not learning over time. And there's a reason they're not deep learning based: it's 'cause you need a lot of LiDAR data.

And only a tiny percentage of cars in the world, quite obviously, are equipped with LiDAR in order to collect that data. So quickly running through the sensors, radar is kind of like the offensive line in football. They're actually the ones that do all the work and they never get the credit.

So radar is that. It's always behind, there to catch, to actually do the detection in terms of the most safety-critical obstacle avoidance. It's cheap, it does well in extreme weather, but it's low resolution. So it cannot stand on its own to achieve any kind of degree of high autonomy.

Now on the LiDAR side, it's expensive. It's extremely accurate depth information, 3D point cloud information. Its resolution is much higher than radar, but still lower than visible light. And there is, depending on the sensor, 360 degree visibility that's built in. So there's a difference in resolution here, visualized with LiDAR on the right, radar on the left.

The resolution is just much higher and is improving, and the cost is going down and so on. Now on the camera side, it's cheap. Everybody's got one. The resolution is extremely high in terms of the amount of information transferred per frame. And really, the number of vehicles that have this equipped is humongous.

So it's ripe for application of deep learning. And the challenge is it's noisy, it's bad at depth estimation, and it's not good in extreme weather. So we can use this plot to compare these sensors, to compare these different approaches. LiDAR works in the dark and in variable lighting conditions, has pretty good resolution, has pretty good range, but it's expensive, it's huge, and it doesn't provide rich textural contrast information.

And it's also sensitive to fog and rain conditions. Now, ultrasonic sensors catch a lot of those problems. They're better at detecting proximity. They're high resolution in objects that are close, which is why they're often used for parking, but they can still also be integrated in the sensor fusion package for an autonomous vehicle.

They really catch a lot of the problems that radar has. They complement each other well. And radar: cheap, tiny, detects speed, and has pretty good range, but has terrible resolution. There's very little information being provided. And then cameras: a lot of rich information. They're cheap, they're small, range is great.

It's actually the best range of all the sensors, and it works in bright conditions, but it doesn't work in the dark or in extreme conditions, and it's just susceptible to all these kinds of problems. And it doesn't detect speed, unless you do some tricky structure-from-motion kind of things. So here's where sensor fusion steps in, and everybody works together to build an entire picture.

That's how this plot works. You can stack the sensors on top of each other. So if you look at a suite that, for example, Tesla is using, which is ultrasonic, radar, and camera, and you compare it to just LiDAR, you see that the suite of camera, radar, and ultrasonic is actually comparable to LiDAR.
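The stacking idea behind the plot can be sketched as taking, for each capability, the best score any sensor in the suite provides. The 1 to 5 scores below are illustrative assumptions loosely following the qualitative descriptions in the talk, not measured values, and the attribute names and the `fuse` and `covers_at_least` helpers are hypothetical.

```python
# Hypothetical capability scores (1 = poor, 5 = excellent); illustrative only.
ATTRS = ["proximity", "range", "resolution", "works_in_dark",
         "works_in_bright", "detects_speed"]

SENSORS = {
    "ultrasonic": dict(proximity=5, range=1, resolution=2, works_in_dark=5,
                       works_in_bright=5, detects_speed=1),
    "radar":      dict(proximity=2, range=4, resolution=1, works_in_dark=5,
                       works_in_bright=5, detects_speed=5),
    "camera":     dict(proximity=2, range=5, resolution=5, works_in_dark=1,
                       works_in_bright=5, detects_speed=1),
    "lidar":      dict(proximity=2, range=4, resolution=4, works_in_dark=5,
                       works_in_bright=5, detects_speed=1),
}


def fuse(names):
    """Stack sensors: for each attribute keep the best score in the suite."""
    return {a: max(SENSORS[n][a] for n in names) for a in ATTRS}


def covers_at_least(suite, other):
    """True if the suite matches or beats `other` on every attribute."""
    return all(suite[a] >= other[a] for a in ATTRS)


suite = fuse(["ultrasonic", "radar", "camera"])
print("fused suite:", suite)
print("comparable to LiDAR alone:", covers_at_least(suite, SENSORS["lidar"]))
```

Taking the per-attribute maximum is an optimistic simplification: it does not capture, for example, that the camera's high resolution is unavailable in the dark, where only radar and ultrasonic are contributing.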

So those are the two comparisons that we have. You have the costly, non-machine-learning way of LiDAR, and you have the cheap vision-based approach, which needs a lot of data and is not explainable or reliable in the near term. And those are the two competing approaches. Now, of course, Waymo, as we'll talk about, and others are trying to use both, but ultimately the question is, who catches, who is the fail-safe?

In the semi-autonomous way, when there's a camera-based method, the human is the fail-safe. When the system says, oh, crap, I don't know what to do, the human catches. In the fully autonomous mode, what Waymo's working on, and others, the fail-safe is LiDAR. The fail-safe is maps. You can't rely on the human, but you know this road so well that if the cameras freak out, if any of the sensors freak out, you have such good maps, you have such accurate sensors, that the fundamental problem of obstacle avoidance, which is what safety is about, can be solved.

The question is, what kind of experience that creates. In the meantime, as the people debate, try to make money, start companies, there's just lots of data. Ford F-150 is still the most popular car in America. Manually driven cars are still happening, so there's a lot of data happening. Semi-autonomous cars, every company is now releasing more and more semi-autonomous technology, so that's all data.

And what that boils down to is that the two paths they're walking towards are vision versus LiDAR, L2 versus L4, semi-autonomous versus fully autonomous. Tesla, on the semi-autonomous front, has reached one billion miles. Waymo, the leader on the autonomous front, has reached 10 million miles. The pros and cons I've outlined.

One, the vision one, is the one I'm obviously very excited about, and machine learning researchers are excited about, which fundamentally relies on huge data and deep learning. The neural networks that are running inside the Tesla, and with their new hardware, it's kind of the same kind of path as Google is taking from the GPU to the TPU.

Tesla is taking from the NVIDIA Drive PX2 system, sort of more general GPU based system to creating their own ASIC, and having a ton of awesome neural networks running on their car. That kind of path that others are beginning to embrace, is really interesting to think about for machine learning engineers.

And then people that are maybe more grounded, and actually wanna really value safety, reliability, and sort of from the automotive world, are thinking well we need, machine learning is not explainable, it's difficult to work with, it's not reliable, and so in that sense we have to have a sensor suite that is extremely reliable.

Those are the two paths. Yep, question. The question is, there's all kinds of things you need to perceive. Stop signs and traffic lights, pedestrians and so on. Some of them, if you hit them, it's a problem. Some of them are a bag flying through the air, and all have different visual characteristics, all have different characteristics for all the different sensors.

So LiDAR can detect solid body objects, and camera is better at detecting, as Sacha Arnoud talked about last year, I think, fog or smoke. These are interesting things. They might look like an object to certain sensors and not to others. But the traffic light detection problem, luckily, with cameras, is pretty much solved at this point.

So that's luckily the easy part. The hard part is when you have a green light, and there's a drunk, drugged, drowsy, or distracted pedestrian, the four Ds that NHTSA outlined, trying to cross: what to do? That's the hard part. So the road ahead for us, as engineers, as scientists, the thing I'm super excited about, the possibility of artificial intelligence having a huge impact, is taking this step from having these, even if they're large, toy data sets, toy problems, toy benchmarks, of ImageNet classification and COCO. All the exciting deep RL stuff that we'll talk about in the future weeks really are toy examples.

The game of Go and chess and so on. But taking those algorithms and putting them in cars where they can save people's lives, and they actually directly touch and impact our entire civilization. That's actually the defining problem for artificial intelligence in the 21st century, is AI that touches people in a real way.

And I think cars, autonomous vehicles, is one of the big ways that that happens. We get to deal with the psychology, the philosophy, the sociology aspects of it, how we socially think about it, to the robotics problem, to the perception problem. It's a fascinating space to explore. And we have many guest speakers exploring that in different ways, and that's really exciting to see how these people are trying to change the world.

So with that, I'd like to thank you very much. Go to deeplearning.mit.edu, and the code is always available online. (audience applauding) Thanks for watching.