Dmitri Dolgov: Waymo and the Future of Self-Driving Cars | Lex Fridman Podcast #147
Chapters
0:00 Introduction
2:16 Computer games
7:23 Childhood
9:55 Robotics
10:44 Moscow Institute of Physics and Technology
12:56 DARPA Urban Challenge
23:16 Waymo origin story
38:58 Waymo self-driving hardware
47:31 Connected cars
53:23 Waymo fully driverless service in Phoenix
57:45 Getting feedback from riders
1:05:58 Creating a product that people love
1:11:49 Do self-driving cars need to break the rules like humans do?
1:18:33 Waymo Trucks
1:24:11 Future of Waymo
1:37:23 Role of lidar in autonomous driving
1:50:23 Machine learning is essential for autonomous driving
1:54:25 Pedestrians
2:01:02 Trolley problem
2:05:30 Book recommendations
2:16:56 Meaning of life
00:00:00.000 |
The following is a conversation with Dmitri Dolgov, 00:00:07.100 |
that started as Google's self-driving car project in 2009 00:00:19.480 |
in that they actually have an at-scale deployment 00:00:25.820 |
driving passengers around with no safety driver, 00:00:32.940 |
This, to me, is an incredible accomplishment of engineering 00:00:38.900 |
and exciting artificial intelligence challenges 00:00:45.080 |
followed by some thoughts related to the episode. 00:00:49.640 |
a company that helps businesses apply machine learning 00:01:02.620 |
And Cash App, the app I use to send money to friends. 00:01:05.920 |
Please check out the sponsors in the description 00:01:08.400 |
to get a discount and to support this podcast. 00:01:19.120 |
that I find fascinating and full of open questions 00:01:22.240 |
from both a robotics and a human psychology perspective. 00:01:28.760 |
about my experiences in academia on this topic 00:01:38.000 |
But I choose to focus on the positive, on solutions, 00:01:41.160 |
on brilliant engineers like Dmitry and the team at Waymo 00:01:46.880 |
and to build amazing technology that will define our future. 00:01:55.640 |
And who knows, perhaps I too will help contribute 00:02:01.760 |
If you enjoy this thing, subscribe on YouTube, 00:02:11.960 |
And now, here's my conversation with Dmitri Dolgov. 00:02:15.700 |
When did you first fall in love with robotics 00:02:21.720 |
- Computer science first, at a fairly young age. 00:02:27.060 |
I think my first interesting introduction to computers 00:02:33.200 |
was in the late '80s when we got our first computer. 00:02:44.320 |
Remember those things that had like a turbo button 00:02:58.800 |
So go on something, then five inches and three inches. 00:03:03.600 |
Maybe that was before that was the giant plates 00:03:07.240 |
But it was definitely not the three inch ones. 00:03:13.400 |
I spent the first few months just playing video games, 00:03:21.600 |
So I started messing around and trying to figure out 00:03:32.120 |
And a couple of years later, it got to a point 00:03:36.640 |
where I actually wrote a game, a little game. 00:03:40.400 |
And a game developer, a Japanese game developer 00:03:43.120 |
actually offered to buy it from me for a few hundred bucks. 00:03:55.840 |
- Yes, that was not the most astute financial move 00:04:09.040 |
It was not open source, but you could upload the binaries, 00:04:25.080 |
As I said, not the best financial decision of my life. 00:04:27.280 |
- You're already playing with business models 00:04:36.240 |
And it had a graphical component, so it's not text-based? 00:04:43.160 |
whatever it was, I think it was kind of the earlier version. 00:04:47.200 |
And I actually think the reason why this company 00:04:49.120 |
wanted to buy it is not like the fancy graphics 00:05:05.720 |
and something about the simplicity of the music, 00:05:28.680 |
and thereby creating a more magical experience. 00:05:33.280 |
it feels like your imagination doesn't get to create worlds, 00:05:42.280 |
like waving at kids these days that have no respect, 00:05:58.200 |
I don't, yeah, but that's more about games that, 00:06:06.720 |
like, create a fun, short-term dopamine experience 00:06:12.600 |
versus, I'm more referring to, like, role-playing games 00:06:45.280 |
It's one of the things that suck about being an adult 00:06:48.560 |
is there's no, you have to live in the real world 00:06:57.920 |
You create, like, it's not the fancy graphics, 00:07:03.720 |
You know, one of the pitches for being a parent 00:07:18.640 |
but really you just get to play video games with your kids. 00:07:23.040 |
did you have any ridiculous, ambitious dreams 00:07:26.440 |
of where as a creator you might go as an engineer? 00:07:30.640 |
What did you think of yourself as an engineer, 00:07:33.960 |
as a tinker, or did you want to be like an astronaut 00:07:38.160 |
- You know, I'm tempted to make something up about, 00:07:44.600 |
but that's not the actual memory that pops into my mind 00:08:00.480 |
you know, what I wanted to do when I grow up, 00:08:08.720 |
You know, they don't have those today, I think, 00:08:11.320 |
but, you know, back in the '80s and, you know, in Russia, 00:08:19.080 |
that would stand in the middle of an intersection all day, 00:08:27.760 |
I was strangely infatuated with this whole process, 00:08:35.080 |
And, you know, my parents, both physics profs, by the way, 00:08:42.480 |
with that level of ambition coming from their child, 00:08:52.840 |
I have an OCD nature that I think lends itself 00:09:09.280 |
like a set of rules, a set of rules you can follow, 00:09:15.360 |
I don't know if that's, it was of that nature. 00:09:19.160 |
There's like SimCity and factory building games, 00:09:29.440 |
I think it was the uniform and the, you know, 00:09:31.520 |
the striped baton that made cars go in the right directions. 00:09:40.400 |
I guess, you know, working on the transportation industry 00:09:46.440 |
- Maybe it was my, you know, deep inner infatuation 00:09:50.600 |
with the traffic control batons that led to this career. 00:09:58.120 |
when was the leap from programming to robotics? 00:10:03.800 |
After, and actually, then it was self-driving cars 00:10:05.840 |
that was, I think, my first real hands-on introduction 00:10:10.600 |
But I never really had that much hands-on experience 00:10:24.120 |
And it was after grad school that I really got involved 00:10:28.160 |
in robotics, which was actually self-driving cars. 00:10:37.840 |
which is, that was the postdoc where I got to play 00:10:46.720 |
So, you know, for episode 100, I talked to my dad. 00:11:10.440 |
And to me, that was like, I met some super interesting, 00:11:15.320 |
as a child, I met some super interesting characters. 00:11:18.280 |
It felt to me like the greatest university in the world, 00:11:23.480 |
And just the people that I met that came out of there 00:11:26.880 |
were like, not only brilliant, but also special humans. 00:11:31.880 |
It seems like that place really tested the soul. 00:11:37.240 |
Both technically and spiritually. 00:11:41.520 |
So that could be just the romanticization of that place, 00:11:46.840 |
But is it correct to say that you spent some time 00:11:52.600 |
I got my bachelor's and master's in physics and math there. 00:11:56.160 |
And it's actually interesting, 'cause my dad, 00:12:04.680 |
just like you, Lex, growing up about the place 00:12:07.920 |
and how interesting and special and magical it was, 00:12:10.400 |
I think that was a significant, maybe the main reason 00:12:16.680 |
Enough so that I actually went back to Russia from the US. 00:12:26.720 |
- Exactly the reaction most of my peers in college had, 00:12:36.280 |
- Yeah, yeah, going back to your previous question. 00:12:38.200 |
They supported me in letting me pursue my passions 00:12:47.640 |
Definitely fairly hardcore on the fundamentals 00:12:50.640 |
of math and physics, and lots of good memories 00:13:04.960 |
to join Stanford's DARPA Urban Challenge Team in 2006. 00:13:09.960 |
This was a third in the sequence of the DARPA challenges. 00:13:13.600 |
There were two grand challenges prior to that, 00:13:16.320 |
and then in 2007, they held the DARPA Urban Challenge. 00:13:22.280 |
I joined the team and worked on motion planning 00:13:31.680 |
- So, okay, so for people who might not know, 00:13:38.320 |
In a certain circle of people, everybody knows everything, 00:13:41.200 |
and in a certain circle, nobody knows anything 00:13:50.560 |
but I do think that the Urban Challenge is worth revisiting. 00:13:57.600 |
one that first sparked so many incredible minds 00:14:02.600 |
to focus on one of the hardest problems of our time 00:14:12.640 |
But can you talk about what did the challenge involve? 00:14:31.960 |
as I mentioned, this was the third competition 00:14:47.760 |
So, then DARPA followed with what they called 00:14:50.240 |
the Urban Challenge, where the goal was to have, 00:14:57.080 |
and, you know, share them with other vehicles. 00:15:08.400 |
And they had a bunch of robots, you know, cars, 00:15:13.320 |
that were autonomous in there, all at the same time, 00:15:16.280 |
mixed in with other vehicles driven by professional drivers. 00:15:24.040 |
And so, there's a crude map that they received 00:15:27.880 |
And they had a mission, you know, go here and then there 00:15:31.480 |
And they kind of all were sharing this environment 00:15:34.800 |
at the same time they had to interact with each other, 00:15:38.560 |
So, it's this very first, very rudimentary version 00:15:43.080 |
of a self-driving car that, you know, could operate 00:15:51.600 |
That, as you said, you know, really, in many ways, 00:15:57.120 |
- Okay, so who was on the team and how'd you do? 00:16:04.360 |
Perhaps that was my contribution to the team. 00:16:08.640 |
in the DARPA challenge, but then I joined the team. 00:16:11.720 |
- You were the only one with a bug in the code. 00:16:18.200 |
you know, one of the cool things, it's not, you know, 00:16:21.840 |
this isn't a product, this isn't a thing that, you know, 00:16:26.240 |
it, there's, you have a little bit more freedom 00:16:35.040 |
Is there interesting challenges that stand out to you 00:16:37.120 |
as something that taught you a good technical lesson 00:16:41.200 |
or a good philosophical lesson from that time? 00:16:47.680 |
Not really a challenge, but like one of the most vivid 00:16:53.960 |
and I think that was actually one of the days 00:16:57.680 |
that really got me hooked on this whole field was 00:17:01.520 |
the first time I got to run my software on the car. 00:17:06.800 |
And I was working on a part of our planning algorithm 00:17:16.800 |
So the very first version of that was, you know, 00:17:19.600 |
we tried on the car, it was on Stanford's campus 00:17:22.560 |
in the middle of the night, and you had this little, 00:17:31.960 |
compile and turn over, and, you know, it drove. 00:17:36.160 |
It actually did something quite reasonable. 00:17:38.120 |
And, you know, it was of course very buggy at the time 00:17:52.680 |
and just, you know, being unable to fall asleep 00:17:54.200 |
for the rest of the night, just my mind was blown. 00:18:06.280 |
interesting memories, like on the day of the competition, 00:18:10.760 |
I remember standing there with Mike Montemerlo, 00:18:12.640 |
who was the software lead and wrote most of the code. 00:18:15.920 |
I think I did one little part of the planner, 00:18:17.800 |
Mike, you know, incredibly did pretty much the rest of it 00:18:24.320 |
But I remember standing on the day of the competition, 00:18:27.360 |
you know, watching the car, you know, with Mike, 00:18:29.080 |
and, you know, cars are completely empty, right? 00:18:32.320 |
They're all there lined up in the beginning of the race. 00:18:34.760 |
And then, you know, DARPA sends them, you know, 00:18:43.680 |
Each siren had its own personality, if you will. 00:18:46.400 |
So, you know, off they go and you don't see them. 00:18:48.640 |
You just kind of, and then every once in a while, 00:19:03.360 |
sending your kids to college and like, you know, 00:19:19.040 |
that was my bug that cost us first place, 00:19:24.640 |
occasionally have with people on the CMU team. 00:19:36.120 |
they happen to succeed at something robotics related. 00:19:46.960 |
- So for people, yeah, that's true, unlike Stanford. 00:19:51.680 |
and sort of artificial intelligence universities 00:20:06.120 |
and the way that part of the mission worked is, 00:20:17.840 |
that defined the perimeter of the parking lot. 00:20:21.400 |
and maybe multiple entrances or access to it. 00:20:23.880 |
And then you would get a goal within that open space. 00:20:36.920 |
So it had to navigate a completely free space 00:20:51.840 |
about all the obstacles that it encounters in real time. 00:21:00.800 |
was that you had to reverse out of the parking spot. 00:21:31.120 |
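[Aside for the reader: the parking-lot task described here comes down to searching for a collision-free path through open space to a goal pose, replanning as obstacles are discovered; the approach Dolgov and colleagues later published from that era is hybrid-state A*. Below is only a minimal grid-based A* sketch of the same idea, with the grid, names, and toy obstacle layout being illustrative assumptions, not Stanford or Waymo code.]

```python
import heapq

def astar(grid, start, goal):
    """Minimal 4-connected A* over an occupancy grid (True = obstacle).

    Illustrative only: the DARPA-era planner (hybrid-state A*) searched over
    continuous vehicle poses with kinematic constraints, not grid cells.
    """
    rows, cols = len(grid), len(grid[0])

    def h(p):  # admissible Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), start)]
    came_from = {start: None}
    cost_so_far = {start: 0}
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:  # reconstruct the path from goal back to start
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nbr in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nbr
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                new_cost = cost_so_far[node] + 1
                if new_cost < cost_so_far.get(nbr, float("inf")):
                    cost_so_far[nbr] = new_cost
                    came_from[nbr] = node
                    heapq.heappush(frontier, (new_cost + h(nbr), nbr))
    return None  # no path: boxed in by obstacles

# Toy "parking lot": a 5x5 grid with a row of parked cars to drive around.
lot = [[False] * 5 for _ in range(5)]
for c in range(4):
    lot[2][c] = True
print(astar(lot, start=(0, 0), goal=(4, 0)))
```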
and I have to apologize to Mike Montemerlo 00:21:35.880 |
but it is actually one of the more memorable ones. 00:21:39.320 |
And it's something that's kind of become a bit of a metaphor, 00:21:43.480 |
I don't know, a label in the industry since then. 00:21:47.440 |
it's called the victory circle or victory lap. 00:21:53.800 |
So in one of the missions in the urban challenge, 00:22:02.720 |
So with DARPA, a lot of the missions would finish 00:22:12.640 |
and then, you know, go on and finish the rest of it. 00:22:25.720 |
Our car, on the other hand, would hit the checkpoint, 00:22:28.040 |
and then it would do an extra lap around the oval 00:22:30.360 |
and only then, you know, leave and go on its merry way. 00:22:32.920 |
So over the course of, you know, the full day, 00:22:46.880 |
it was too late for us to consider the next one 00:22:52.080 |
So, you know, that's the Stanford victory lap. 00:23:04.000 |
everybody is a winner in that kind of competition. 00:23:07.240 |
And it sort of famously led to the creation 00:23:16.720 |
So can we give an overview of how is Waymo born? 00:23:21.280 |
How's the Google self-driving car project born? 00:23:27.320 |
What is the engineering kind of set of milestones 00:23:46.880 |
And then Larry and Sergey, Larry Page and Sergey Brin, 00:23:56.040 |
So the Google self-driving car project was born. 00:23:59.240 |
You know, at that time, and we started in 2009, 00:24:10.760 |
At that time, we saw that incredible early result 00:24:19.280 |
I think we're all incredibly excited about where we got to. 00:24:24.200 |
And we believed in the future of the technology, 00:24:26.360 |
but we still had a very rudimentary understanding 00:24:35.680 |
was to really better understand what we're up against. 00:24:39.840 |
And with that goal in mind, when we started the project, 00:24:52.160 |
one was to drive 100,000 miles in autonomous mode, 00:25:01.040 |
And the second milestone was to drive 10 routes. 00:25:08.520 |
They were specifically chosen to be kind of extra spicy, 00:25:12.400 |
extra complicated, and sampled the full complexity 00:25:18.560 |
And you had to drive each one from beginning to end 00:25:25.200 |
So you would get to the beginning of the course, 00:25:26.880 |
you would press the button that would engage in autonomy, 00:25:43.280 |
We had one route that went all through all the freeways 00:25:47.160 |
You know, we had some that went around Lake Tahoe 00:25:52.440 |
We had some that drove through dense urban environments 00:25:56.360 |
like in downtown Palo Alto and through San Francisco. 00:26:20.440 |
probably the most fun I had in my professional career. 00:26:27.160 |
You're not yet starting to build a production system. 00:26:49.600 |
just picking 10 different difficult, spicy challenges 00:26:59.960 |
So like not saying gradually we're going to like, 00:27:08.440 |
and gradually reduce the number of interventions. 00:27:20.840 |
it's unclear that whether that takes two years 00:27:28.160 |
- And I guess that speaks to a really big difference 00:27:31.200 |
between doing something once and having a prototype 00:27:39.320 |
versus how you go about engineering a product that, 00:28:04.640 |
that amount of time to achieve that milestone 00:28:08.120 |
of 10 routes, 100 miles each and no interventions. 00:28:12.360 |
And, you know, it took us a little bit longer 00:28:16.880 |
to get to, you know, a full driverless product 00:28:30.400 |
about the problem of driving from that experience? 00:28:33.480 |
I mean, we can now talk about like what you learned 00:28:38.720 |
but I feel like you may have learned some profound things 00:28:52.160 |
how to make sure it's like safety and all those things, 00:28:56.600 |
But like you were facing the more fundamental, 00:28:59.840 |
philosophical problem of driving in those early days. 00:29:03.800 |
Like what the hell is driving as an autonomous vehicle? 00:29:26.240 |
and we've gotten far enough into the problem that, 00:29:39.240 |
climbing a mountain where you kind of, you know, 00:29:40.680 |
see the next peak and you think that's kind of the summit, 00:29:47.560 |
But we've tried, we've sampled enough of the problem space 00:29:54.200 |
even, you know, with technology of 2009, 2010, 00:29:57.240 |
that it gave us confidence to then, you know, 00:30:05.640 |
So the next step, you mentioned the milestones 00:30:18.080 |
And, you know, Waymo came a little bit later. 00:30:22.280 |
Then, you know, we completed those milestones in 2010. 00:30:41.200 |
maybe, you know, an L3 driver assist program. 00:30:44.600 |
Then around 2013, we've learned enough about the space 00:30:49.600 |
and have thought more deeply about, you know, 00:30:52.880 |
the product that we wanted to build that we pivoted. 00:31:00.880 |
building a driver and deploying it fully driverless vehicles 00:31:05.120 |
And that's the path that we've been on since then. 00:31:07.240 |
And it was exactly the right decision for us. 00:31:10.680 |
- So there was a moment where you also considered like, 00:31:27.520 |
that we've, that became very clear and we made that pivot. 00:31:40.760 |
fundamentally different from how you go building 00:31:43.600 |
So, you know, we've pivoted towards the latter 00:31:47.800 |
and that's what we've been working on ever since. 00:31:53.120 |
Then there's a sequence of really meaningful for us, 00:31:58.040 |
really important, defining milestones since then. 00:32:06.080 |
actually the world's first fully driverless ride 00:32:15.560 |
It was in a custom built vehicle that we had. 00:32:23.240 |
And we put a passenger, his name was Steve Mahan, 00:32:28.240 |
a great friend of our project from the early days. 00:32:38.600 |
It was an uncontrolled environment, you know, 00:32:44.960 |
And, you know, we did that trip a few times in Austin, Texas. 00:32:52.200 |
- And, you know, we only, but at that time we're only, 00:33:07.280 |
but it was a fixed route and we only did it a few times. 00:33:10.520 |
Then in 2016, end of 2016, beginning of 2017, 00:33:20.640 |
That's when we, kind of, that was the next phase 00:33:30.880 |
And it made sense to create an independent entity, 00:33:52.040 |
where when we started regular driverless operations 00:33:57.040 |
on public roads, that first day of operations, 00:34:11.080 |
but that it was the start of kind of regular, 00:34:15.480 |
- And when you say driverless, it means no driver. 00:35:01.040 |
And then Waymo is really one of the only companies, 00:35:20.880 |
of where the driver's not supposed to disengage. 00:35:32.400 |
about there being nobody in the driver's seat. 00:35:38.520 |
you mentioned the first time you wrote some code 00:35:42.840 |
for free space navigation of the parking lot, 00:35:47.200 |
To me, just sort of as an observer of robots, 00:35:59.800 |
like apply sufficient torque to the steering wheel 00:36:14.080 |
that here's a being with power that makes a decision. 00:36:18.200 |
There's something about like the steering wheel, 00:36:38.600 |
And then there's this leap of trust that you give, 00:36:42.760 |
in the hands of this thing that's in control. 00:36:51.840 |
So I got a chance to last year to take a ride 00:37:09.720 |
and the steering wheel is turning on its own, 00:38:27.560 |
One of the first ones, less about technology, 00:38:34.120 |
We raised our first round of external financing this year. 00:38:40.440 |
so obviously we have access to significant resources. 00:38:44.560 |
But as kind of on the journey of Waymo maturing as a company, 00:38:47.640 |
it made sense for us to partially go externally 00:38:52.200 |
So we raised about $3.2 billion from that round. 00:38:57.200 |
We've also started putting our fifth generation 00:39:33.440 |
on the fifth generation in terms of hardware specs, 00:39:40.640 |
that we are driving from the self-driving hardware 00:39:45.880 |
so this is, as I mentioned, the fifth generation. 00:39:49.880 |
we started building our own hardware many, many years ago. 00:39:54.880 |
And that Firefly vehicle also had the hardware suite 00:40:00.080 |
that was mostly designed, engineered, and built in-house. 00:40:04.320 |
Lidars are one of the more important components 00:40:24.160 |
in terms of sensing, we have lidars, cameras, and radars. 00:40:31.720 |
and makes decisions in real time on board the car. 00:40:41.800 |
in terms of the capabilities and the various parameters 00:40:48.240 |
and compared to what you can kind of get off the shelf 00:40:52.280 |
- Meaning from fifth to fourth or from fifth to first? 00:40:55.360 |
- Definitely from first to fifth, but also from the fourth. 00:40:59.440 |
- Definitely, definitely from fourth to fifth. 00:41:02.200 |
As well as this, the last step is a big step forward. 00:41:18.080 |
There's some components that we get from our manufacturing 00:41:28.880 |
We do a lot of custom design on all of our sensing modalities. 00:41:36.800 |
Exactly, there's, lidars are almost exclusively in-house 00:41:49.200 |
That is also largely true about radars and cameras. 00:41:58.080 |
- Is there something super sexy about the computer 00:42:03.520 |
Like for people who enjoy computers for, I mean, 00:42:08.520 |
see, there's a lot of machine learning involved, 00:42:13.880 |
There's, you have to probably do a lot of signal processing 00:42:22.040 |
There's probably some kind of redundancy type of situation. 00:42:27.400 |
about the computer for the people who love hardware? 00:42:34.560 |
Redundancy, very beefy compute for general processing 00:43:08.000 |
and you can kind of get a feel for the amount of data 00:43:26.760 |
like GPU wise, is that something you can get into? 00:43:29.960 |
I know that Google works with GPUs and so on. 00:43:40.680 |
I've been talking to people in the government about UFOs 00:43:45.640 |
So this is how I feel right now asking about GPUs. 00:43:50.520 |
But is there something interesting that you could reveal 00:43:55.140 |
or leave it up to our imagination, some of the compute? 00:43:59.680 |
Is there any, I guess, is there any fun trickery? 00:44:02.880 |
Like I talked to Chris Lattner for a second time 00:44:08.240 |
and there's a lot of fun stuff going on in Google 00:44:11.320 |
in terms of hardware that optimizes for machine learning. 00:44:18.080 |
in terms of how much, you mentioned customization, 00:44:25.640 |
- I'm gonna be like that government, you know, 00:44:29.920 |
But I, you know, I guess I will say that it's really, 00:44:38.400 |
We have very data hungry and compute hungry ML models 00:44:46.960 |
both being part of Alphabet as well as designing 00:44:51.320 |
our own sensors and the entire hardware suite together 00:44:54.440 |
where on one hand you get access to like really rich, 00:44:59.440 |
raw sensor data that you can pipe from your sensors 00:45:09.120 |
from sensor, raw sensor data to the big compute 00:45:11.560 |
as then have the massive compute to process all that data. 00:45:15.160 |
This is where we're finding that having a lot of control 00:45:18.080 |
of that hardware part of the stack is really advantageous. 00:45:22.560 |
- One of the fascinating magical places to me, 00:45:25.440 |
again, might not be able to speak to the details, 00:45:28.360 |
but it is the other compute, which is like, you know, 00:45:40.760 |
And you have a huge amount of data coming in on the car 00:45:47.760 |
some of that data to then train or to analyze or so on. 00:45:58.320 |
I don't understand how you pull it all together 00:46:02.720 |
in terms of the challenges of seeing the network of cars 00:46:07.720 |
and then bringing the data back and analyzing things 00:46:13.960 |
be able to learn on them to improve the system, 00:46:15.880 |
to see where things went wrong, where things went right 00:46:30.360 |
of really interesting work that's happening there, 00:46:32.600 |
both in the real time operation of the fleet of cars 00:46:36.480 |
and the information that they exchange with each other 00:46:41.960 |
as well as on the kind of the off-board component 00:46:45.720 |
where you have to deal with massive amounts of data 00:46:48.520 |
for training your ML models, evaluating the ML models, 00:47:00.720 |
has once again been tremendously advantageous. 00:47:04.280 |
I think we consume an incredible amount of compute 00:47:09.280 |
We build a lot of custom frameworks to get good 00:47:11.840 |
on data mining, finding the interesting edge cases 00:47:17.640 |
for training and for evaluation of the system 00:47:20.400 |
for both training and evaluating sub-components 00:47:24.040 |
and sub parts of the system and various ML models, 00:47:27.080 |
as well as evaluating the entire system and simulation. 00:47:31.200 |
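[Aside for the reader: a toy sketch of the kind of "data mining for the interesting edge cases" mentioned above, under the assumption that logged driving segments are scored by how surprising they look to the current models and the rarest ones are kept for labeling, training, and evaluation. The fields, scores, and threshold are invented for illustration; nothing here is Waymo's actual pipeline.]

```python
# Toy "edge case mining": keep the logged segments the current model finds
# most surprising, so training and evaluation focus on the long tail.
# Every field, score, and threshold here is illustrative, not Waymo's.
segments = [
    {"id": "seg-001", "model_confidence": 0.98, "disagreement": 0.01},
    {"id": "seg-002", "model_confidence": 0.55, "disagreement": 0.40},  # hairy
    {"id": "seg-003", "model_confidence": 0.91, "disagreement": 0.05},
]

def surprise(seg):
    # Low model confidence and high disagreement between redundant models or
    # sensors both suggest a segment worth a human label or a regression test.
    return (1.0 - seg["model_confidence"]) + seg["disagreement"]

mined = sorted(segments, key=surprise, reverse=True)[:1]
print([s["id"] for s in mined])  # -> ['seg-002']
```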
- Okay, is that first piece that you mentioned 00:47:32.880 |
that cars communicating to each other, essentially, 00:47:36.240 |
I mean, through perhaps through a centralized point, 00:47:48.800 |
I don't know if you can talk to what that number is, 00:47:51.160 |
but it's not in the hundreds of millions yet. 00:47:54.240 |
And imagine if the whole world is Waymo vehicles, 00:47:57.840 |
like that changes potentially the power of connectivity. 00:48:03.280 |
Like the more cars you have, I guess, actually, 00:48:05.840 |
if you look at Phoenix, 'cause there's enough vehicles, 00:48:08.680 |
there's enough, when there's like some level of density, 00:48:12.520 |
you can start to probably do some really interesting stuff 00:48:23.560 |
Is there something interesting there that you can talk to 00:48:27.240 |
about like, how does that help with the driving problem 00:48:41.800 |
and it helps in various ways, but it's not required. 00:49:07.720 |
All right, so the way we do this today is, you know, 00:49:12.440 |
whenever one car encounters something interesting 00:49:15.640 |
in the world, whether it might be an accident 00:49:25.120 |
So, and that's kind of how we think about maps 00:49:27.200 |
as priors in terms of the knowledge of our drivers, 00:49:37.040 |
across the fleet and it's updated in real time. 00:49:52.360 |
and start influencing how they interact with each other, 00:49:56.960 |
potentially sharing some observations, right? 00:50:00.680 |
if you have enough density of these vehicles where, 00:50:07.440 |
You know, it's not part of kind of your updating 00:50:15.720 |
So you can see them exchanging that information 00:50:27.520 |
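[Aside for the reader: a minimal sketch of the "maps as priors, updated in real time across the fleet" idea described above. One car reports a discrepancy between the prior map and what it actually observed, and every other car planning through that region then sees the fresher prior. The class names, fields, and region ID scheme are hypothetical, not Waymo's.]

```python
import time
from dataclasses import dataclass, field

@dataclass
class MapUpdate:
    region_id: str
    description: str  # e.g. "construction zone, right lane closed"
    timestamp: float = field(default_factory=time.time)

class FleetMapService:
    """Toy stand-in for a shared map-prior store: one car reports a
    discrepancy between the prior map and what it actually saw, and the
    update becomes visible to every other car in the fleet."""

    def __init__(self):
        self._updates = {}  # region_id -> latest MapUpdate

    def report(self, update: MapUpdate) -> None:
        current = self._updates.get(update.region_id)
        if current is None or update.timestamp > current.timestamp:
            self._updates[update.region_id] = update

    def prior_for(self, region_id: str):
        return self._updates.get(region_id)

service = FleetMapService()
service.report(MapUpdate("phx-123", "construction zone, right lane closed"))
# Any other car planning through region "phx-123" now sees the fresher prior.
print(service.prior_for("phx-123").description)
```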
So when I got a chance to go for a ride in a Waymo, 00:50:36.440 |
So like there is somebody that's able to dynamically 00:50:46.960 |
What role does the human play in that picture? 00:50:58.440 |
like frictionless, like a human being able to, 00:51:08.520 |
I don't know if they're able to actually control the vehicle. 00:51:15.920 |
I got to believe in teleoperation for a reason 00:51:28.600 |
for our rider experience, especially if it's your first trip, 00:51:34.360 |
or only Waymo vehicle, you get in, there's nobody there. 00:51:37.200 |
And so you can imagine having all kinds of questions 00:51:42.040 |
So we've put a lot of thought into kind of guiding 00:51:44.080 |
our riders, our customers through that experience, 00:51:54.840 |
to service their trip, when you get into the car, 00:51:58.000 |
we have an in-car screen and audio that kind of guides them 00:52:09.600 |
a real life human being that they can talk to, right, 00:52:15.800 |
There is, you know, I should mention that there is 00:52:19.040 |
another function that humans provide to our cars, 00:52:24.640 |
You can think of it a little bit more like, you know, 00:52:28.000 |
traffic control that you have, where our cars, 00:52:39.560 |
They, you know, anything that is safety or latency critical 00:52:43.520 |
is done, you know, purely autonomously by onboard, 00:52:52.560 |
and a car encounters a particularly challenging situation, 00:52:54.600 |
you can imagine like a super hairy scene of an accident, 00:52:59.920 |
They will recognize that it's an off-nominal situation. 00:53:02.400 |
They will, you know, do their best to come up, you know, 00:53:10.160 |
they can ask for confirmation from, you know, 00:53:20.720 |
of kind of contextual information and guidance. 00:53:23.040 |
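[Aside for the reader: a toy sketch of the division of labor described above, assuming the car handles everything safety- and latency-critical onboard and only asks a remote human to confirm its own proposed plan in off-nominal, non-urgent situations. The confidence threshold, names, and fallback behavior are invented for illustration, not Waymo's actual protocol.]

```python
from dataclasses import dataclass

@dataclass
class Plan:
    description: str
    confidence: float  # planner's own confidence in the proposed maneuver

def onboard_decision(plan: Plan, confirm_remotely) -> str:
    """Everything safety- and latency-critical stays onboard; in off-nominal,
    non-urgent cases the car may ask a remote human to confirm its own plan.
    (Threshold, names, and fallback are made up for illustration.)"""
    if plan.confidence >= 0.9:
        return f"execute: {plan.description}"
    # Off-nominal: the car still drives itself, but asks for a thumbs-up.
    approved = confirm_remotely(plan)
    return f"execute: {plan.description}" if approved else "pull over safely"

# Example: a hairy accident scene ahead, the planner proposes a wide detour.
plan = Plan("detour around the blocked lane when oncoming traffic is clear", 0.6)
print(onboard_decision(plan, confirm_remotely=lambda p: True))
```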
- So October 8th was when you're talking about the, 00:53:36.320 |
that's the right term, I think, service in Phoenix. 00:53:43.480 |
rider-only vehicles into our public Waymo One service. 00:53:48.640 |
So it's like anybody can get into Waymo in Phoenix? 00:54:06.320 |
we opened that mode of operation to the public. 00:54:09.280 |
So I can download the app and go on the ride. 00:54:12.280 |
There is a lot more demand right now for that service 00:54:42.160 |
kind of transformational technologies of the 21st century. 00:54:48.440 |
Like it's fun to, you know, to be a part of it. 00:54:51.400 |
So it'd be interesting to see like, what do people say? 00:54:54.320 |
What do people, what have been the feedback so far? 00:55:05.520 |
They, you know, we asked them for feedback during the ride. 00:55:24.840 |
they're also giving us feedback on, you know, 00:55:27.760 |
And, you know, that's one of the main reasons 00:55:33.960 |
we are just learning a tremendous amount of new stuff 00:55:38.160 |
There's no substitute for actually doing the real thing, 00:55:47.640 |
paying us money to get from point A to point B. 00:55:53.840 |
- And the idea is you use the app to go from point A 00:56:05.440 |
- It's an area of geography where that service is enabled. 00:56:09.040 |
It's a decent size of geography of territory. 00:56:16.400 |
And, you know, within that, you have full freedom 00:56:19.720 |
of, you know, selecting where you want to go. 00:56:25.760 |
you tell the car where you want to be picked up, 00:56:29.440 |
you know, where you want the car to pull over 00:56:34.240 |
And of course there are some exclusions, right? 00:56:37.080 |
where in terms of where the car is allowed to pull over, 00:56:46.240 |
maybe that's what's the question behind your question, 00:56:49.840 |
- Yes, I guess, so within the geographic constraints 00:56:54.720 |
it can be picked up and dropped off anywhere. 00:56:57.440 |
- That's right, and, you know, people use them 00:57:00.960 |
We have an incredible spectrum of riders. 00:57:03.720 |
I think the youngest, we actually have car seats in them, 00:57:05.840 |
and we have, you know, people taking their kids on rides 00:57:09.800 |
that are, you know, one or two years old, you know, 00:57:12.800 |
People, you can take them to, you know, schools, 00:57:19.060 |
to restaurants, to bars, you know, run errands, 00:57:27.240 |
and people, you know, use them in their daily lives 00:57:31.480 |
to get around, and we see all kinds of, you know, 00:57:41.880 |
that we then, you know, use to improve our product. 00:57:44.520 |
- So as somebody who's been on, done a few long rants 00:57:58.080 |
I believe that most people are good and kind and intelligent 00:58:09.520 |
So on a product side, it's fascinating to me, 00:58:12.600 |
like, how do you get the richest possible user feedback, 00:58:25.020 |
that's one of the magical things about autonomous vehicles, 00:58:29.160 |
is it's not, like, it's frictionless interaction 00:58:43.800 |
So as part of the normal flow, we ask people for feedback. 00:58:49.040 |
you know, we have on the phone and in the car, 00:58:54.480 |
and provide real-time feedback on how the car is doing 00:58:58.560 |
and how the car is handling a particular situation, 00:59:03.680 |
We have, as we discussed, customer support or live help, 00:59:08.800 |
has a question or some sort of concern, 00:59:14.960 |
So that is another mechanism that gives us feedback. 00:59:22.760 |
They give us comments and, you know, a star rating. 00:59:31.800 |
went well and, you know, what could be improved. 00:59:49.640 |
You know, specific that are kind of more, you know, 00:59:53.000 |
go more in depth and we will run both kind of lateral 00:59:55.760 |
and longitudinal studies where we have deeper engagement 01:00:01.440 |
You know, we have our user experience research team 01:00:05.680 |
That's when you say about longitudinal, it's cool. 01:00:08.720 |
And, you know, that's another really valuable feedback, 01:00:12.840 |
And we're just covering a tremendous amount, right? 01:00:16.440 |
People go grocery shopping and they like want to load, 01:00:21.480 |
And like that's one workflow that you maybe don't, 01:00:24.040 |
you know, think about, you know, getting just right 01:00:30.280 |
I have people like, you know, who bike as part 01:00:36.920 |
then they get in our cars, they take apart their bike, 01:00:42.240 |
where we want to pull over and how that, you know, 01:00:50.760 |
In terms of, you know, what makes a good pickup 01:00:53.520 |
and drop off location, we get really valuable feedback. 01:00:57.320 |
And in fact, we had to do some really interesting work 01:01:12.400 |
If you just drop a pin at a current location, 01:01:14.920 |
which is maybe in the middle of a shopping mall, 01:01:20.000 |
And you can have simple heuristics where you just kind of 01:01:24.040 |
and find the nearest spot where the car can pull over 01:01:28.320 |
But oftentimes that's not the most convenient one. 01:01:30.000 |
You know, I have many anecdotes where that heuristic 01:01:33.680 |
One example that I often mention is somebody wanted to be, 01:01:39.000 |
you know, dropped off in Phoenix and, you know, 01:01:49.600 |
where the pin was dropped on the map in terms of, 01:01:53.640 |
But it happened to be on the other side of a parking lot 01:01:57.680 |
that had this row of cacti and the poor person had to like 01:02:00.640 |
walk all around the parking lot to get to where they wanted 01:02:06.160 |
So then, you know, we took all, take all of these, 01:02:08.680 |
all of that feedback from our users and incorporate it 01:02:27.040 |
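[Aside for the reader: a small sketch of the pickup and drop-off problem behind the cacti anecdote, assuming candidate pull-over spots are scored by walking distance to the requested pin, plus a bit of driving detour, rather than by straight-line proximity alone. The weights, fields, and candidates are illustrative, not Waymo's actual logic.]

```python
from dataclasses import dataclass

@dataclass
class Spot:
    name: str
    drive_detour_s: float   # extra driving time for the car to reach this spot
    walk_distance_m: float  # walking distance from the spot to the requested pin

def score(spot: Spot, walk_weight: float = 2.0) -> float:
    """Lower is better: trade a little extra driving for a much shorter walk.
    Purely illustrative; a real system would also weigh legality, safety,
    blocking traffic, and how well the spot has worked historically."""
    return spot.drive_detour_s + walk_weight * spot.walk_distance_m

# The cacti anecdote, roughly: the spot nearest to the pin as the crow flies
# is across a parking lot, so its walking distance is actually terrible.
candidates = [
    Spot("curb nearest the pin, across the lot", drive_detour_s=5, walk_distance_m=250),
    Spot("mall entrance, slight driving detour", drive_detour_s=40, walk_distance_m=20),
]
print(min(candidates, key=score).name)  # -> mall entrance, slight driving detour
```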
And then you call the, like the Waymo from there, right? 01:02:42.720 |
I'm gonna, in order to solve, okay, the alternative, 01:02:47.320 |
which I think the Google search engine has taught 01:03:04.600 |
That seems to be like, in terms of Google search, 01:03:09.720 |
that you could see which, like when they recommend 01:03:15.120 |
they suggest based on not some machine learning thing, 01:03:19.880 |
on what was successful for others in the past 01:03:22.720 |
and finding a thing that they were happy with. 01:03:32.200 |
So there's a real, it's an interesting problem. 01:03:35.560 |
Naive solutions have interesting failure modes. 01:03:56.400 |
from getting richer data and getting more information 01:04:02.280 |
But you're absolutely right that there's something, 01:04:05.880 |
that in terms of the effect that they have on users, 01:04:09.000 |
some are much, much, much better than others, right? 01:04:10.680 |
And predictability and understandability is important. 01:04:16.200 |
but is very natural and predictable to the user 01:04:31.840 |
is that the car actually arrives where you told it to, right? 01:04:36.200 |
Like you can always change it, see it on the map 01:04:37.960 |
and you can move it around if you don't like it. 01:04:40.360 |
But like that property that the car actually shows up 01:04:45.760 |
which, compared to some of the human-driven analogs, 01:04:59.360 |
I think the fact that it's your phone and the car, 01:05:04.280 |
can lead to some really interesting things we can do 01:05:10.880 |
like the car actually shows up exactly where you told it 01:05:18.880 |
as you call it and it's on the way to come and pick you up. 01:05:21.920 |
And of course you get the position of the car 01:05:25.280 |
but, and they actually follow that route, of course, 01:05:27.960 |
but it can also share some really interesting information 01:05:31.120 |
So, you know, our cars, as they are coming to pick you up, 01:05:35.800 |
if it's come, if a car is coming up to a stop sign, 01:05:38.280 |
it will actually show you that like it's there sitting 01:05:40.440 |
because it's at a stop sign or a traffic light 01:05:44.000 |
So, you know, they're like little things, right? 01:06:09.000 |
Rev.com where I like for this podcast, for example, 01:06:25.360 |
and they do the captioning and transcription. 01:06:28.840 |
And it like, I remember when I first started using them, 01:06:34.560 |
Like, because it was so painful to figure that out earlier. 01:06:38.600 |
The same thing with something called iZotope RX, 01:06:47.520 |
and it just cleans everything up very nicely. 01:07:04.200 |
like I'm a fan of design, I'm a fan of products, 01:07:08.000 |
that you can just create a really pleasant experience. 01:07:18.880 |
I mean, it's exactly what we've been talking about. 01:07:23.280 |
that somehow make you fall in love with the product. 01:07:25.640 |
Is that, we went from like urban challenge days 01:07:29.560 |
where love was not part of the conversation probably, 01:07:38.720 |
and you want them to fall in love with the experience. 01:07:42.600 |
Is that something you're trying to optimize for, 01:07:45.440 |
like how do you create an experience that people love? 01:07:49.760 |
I think that's the vision is removing any friction 01:08:09.480 |
making things and goods get to their destination 01:08:20.560 |
And for our riders, that's what we're trying to get to 01:08:23.040 |
is you download an app and you click and a car shows up. 01:08:38.160 |
very convenient, frictionless way to where you wanna be. 01:08:54.360 |
because they don't control the experience, I think, 01:08:57.920 |
they can't make people fall in love necessarily 01:09:07.200 |
to the ride sharing experience I currently have, 01:09:13.320 |
But there's a lot of room for falling in love with it. 01:09:22.840 |
And be like a loyal car person, like whatever. 01:09:26.520 |
Like I like badass hot rods, I guess '69 Corvette. 01:09:33.480 |
Cars are so, owning a car is so 20th century, man. 01:09:37.680 |
But is there something about the Waymo experience 01:09:41.800 |
where you hope that people will fall in love with it? 01:09:46.320 |
Or is it just about making a convenient ride, 01:09:51.400 |
not ride sharing, I don't know what the right term is, 01:09:53.400 |
but just a convenient A to B autonomous transport? 01:09:58.400 |
Or like, do you want them to fall in love with Waymo? 01:10:05.400 |
I mean, almost like from a business perspective, 01:10:28.920 |
And that means building it, building our product 01:10:34.920 |
and building our service in a way that people do. 01:10:37.480 |
Kind of use in a very seamless, frictionless way 01:10:48.960 |
in some way falling in love in that product, right? 01:10:51.440 |
It just kind of becomes part of your routine. 01:10:58.480 |
predictability of the experience and privacy, I think. 01:11:14.320 |
I think if you're gonna use it in your daily life. 01:11:21.760 |
where you're spending a significant part of your life. 01:11:24.600 |
And so not having to share it with other people 01:11:40.760 |
as well as the physical safety of not having to share 01:11:49.380 |
- What about the idea that when there's somebody, 01:11:54.380 |
like a human driving and they do a rolling stop 01:12:01.400 |
you get an Uber or Lyft or whatever, like human driver, 01:12:04.400 |
and they can be a little bit aggressive as drivers. 01:12:09.120 |
It feels like there is, not all aggression is bad. 01:12:18.480 |
Maybe it's possible to create a driving experience. 01:12:21.320 |
Like if you're in the back, busy doing something, 01:12:26.640 |
It's a very different kind of experience perhaps. 01:12:29.320 |
But it feels like in order to navigate this world, 01:12:38.200 |
You need to kind of bend the rules a little bit, 01:12:42.120 |
I don't know what language politicians use to discuss this, 01:12:49.600 |
But like you sort of have a bit of an aggressive way 01:12:56.000 |
of driving that asserts your presence in this world, 01:13:11.200 |
but like how does that fit into the experience 01:13:19.560 |
This is, you're hitting on a very important point 01:13:26.360 |
and parameters that make your driving feel assertive 01:13:38.400 |
They will do the safest thing possible in all situations, 01:13:42.160 |
But if you think of really, really good drivers, 01:13:46.800 |
just think about professional limo drivers, right? 01:13:50.480 |
They're very, very smooth, and yet they're very efficient. 01:13:57.200 |
They're comfortable for the people in the vehicle. 01:14:00.640 |
They're predictable for the other people outside the vehicle 01:14:04.640 |
And that's the kind of driver that we want to build. 01:14:07.360 |
And if you think about it, maybe there's a sports analogy there, 01:14:21.160 |
So they don't do like hectic flailing, right? 01:14:29.680 |
So that's the kind of driver that we want to build. 01:14:37.160 |
Typically doesn't get you to your destination faster. 01:14:39.280 |
Typically not the safest or most predictable, 01:14:57.480 |
- Yeah, that's a really interesting distinction. 01:14:59.640 |
I think in the early days of autonomous vehicles, 01:15:02.280 |
the vehicles felt cautious as opposed to efficient. 01:15:05.720 |
And still probably, but when I rode in the Waymo, 01:15:17.520 |
And like, yeah, that was one of the surprising feelings 01:15:32.960 |
and everything I've ever built felt awkward, 01:15:41.240 |
or like awkwardly cautious is the way I would put it. 01:15:51.320 |
and I think efficient is like the right terminology here. 01:15:57.680 |
It wasn't, and I also like the professional limo driver. 01:16:28.160 |
then there's a certain way that mastery looks like. 01:16:37.000 |
And perhaps what we associate with like aggressiveness 01:16:51.960 |
you can be, you can create a good driving experience 01:16:59.060 |
That's, I mean, you're the first person to tell me this. 01:17:05.800 |
but that's exactly what it felt like with Waymo. 01:17:11.420 |
that you have to break the rules in life to get anywhere. 01:17:14.480 |
But maybe, maybe it's possible that that's not the case 01:17:23.280 |
But it certainly felt that way on the streets of Phoenix 01:17:36.240 |
- Yeah, I mean, that's what we're going after. 01:17:44.920 |
but also comfort and predictability and safety. 01:18:03.200 |
and then it's very easy when you have some curve 01:18:05.620 |
of precision and recall, you can move things around 01:18:12.680 |
But then, and you can tune things on that curve 01:18:15.720 |
and be kind of more cautious or more aggressive, 01:18:17.520 |
but then aggressive is bad or cautious is bad. 01:18:19.920 |
But true capabilities come from actually moving 01:18:33.460 |
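[Aside for the reader: a small illustration of the point being made here, that sliding a threshold along a fixed precision/recall curve only trades caution against assertiveness, while a genuinely better model moves the whole curve. The synthetic detectors and scores below are made up purely to show the effect.]

```python
import numpy as np

def precision_recall(scores, labels, threshold):
    """Precision/recall of a detector at one operating point (threshold)."""
    preds = scores >= threshold
    tp = np.sum(preds & (labels == 1))
    fp = np.sum(preds & (labels == 0))
    fn = np.sum(~preds & (labels == 1))
    return tp / max(tp + fp, 1), tp / max(tp + fn, 1)

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=1000)
# Two synthetic detectors scoring the same examples: a weak one and a better one.
weak = labels * 0.4 + rng.random(1000) * 0.6
strong = labels * 0.7 + rng.random(1000) * 0.3

for name, scores in (("weak", weak), ("strong", strong)):
    for t in (0.3, 0.5, 0.7):  # sliding the threshold = moving along one curve
        p, r = precision_recall(scores, labels, t)
        print(f"{name} model, threshold {t}: precision={p:.2f} recall={r:.2f}")
# Tuning the threshold only trades caution against assertiveness on one curve;
# a genuinely better model shifts the whole curve outward.
```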
- Before I forget, let's talk about trucks a little bit. 01:18:42.000 |
I'm not sure if we wanna go too much into that space, 01:18:53.040 |
And how different, like philosophically and technically 01:20:04.640 |
the simulator, ML infrastructure, those carry over. 01:20:15.440 |
all of that carries over between the domains. 01:20:25.160 |
And then there is specialization of that core technology 01:20:39.160 |
the configuration of the sensors is different. 01:20:47.640 |
So for example, we have two of our main lasers 01:20:47.640 |
Whereas on the JLR I-PACE, we have one of them, 01:20:55.680 |
custom radars, pulling the whole system together, 01:21:22.320 |
whether it's object detection, classification, 01:21:24.720 |
tracking, semantic understanding, all that carries over. 01:21:34.480 |
but again, the fundamentals carry over very, very nicely. 01:21:48.280 |
to find the long tail, to improve your system 01:21:51.440 |
in that long tail of behavior prediction and response, 01:22:01.920 |
using the smaller vehicles for transporting goods? 01:22:01.920 |
So, one is moving humans, one is moving goods, 01:22:15.720 |
and one is like moving nothing, zero occupancy, 01:22:19.680 |
meaning like you're going to the destination, 01:22:29.520 |
it's the less exciting from the commercial perspective. 01:22:36.920 |
if you think about what's inside a vehicle as it's moving, 01:22:57.640 |
policies that are applied for zero occupancy vehicle? 01:23:03.400 |
or is it just move as if there is a person inside, 01:23:09.680 |
- As a first order approximation, there are no differences. 01:23:13.000 |
And if you think about, you know, safety and, you know, 01:23:17.000 |
comfort and quality of driving, only part of it, 01:23:32.080 |
not purely for the benefit of, you know, 01:23:32.080 |
It's also for the benefit of the people outside 01:23:39.200 |
kind of feeding, fitting naturally and predictably 01:23:43.240 |
So, you know, yes, there are some second order things 01:23:47.680 |
and, you know, optimize maybe kind of your fleet, 01:23:52.480 |
and you would take into account whether, you know, 01:23:58.320 |
serving a useful trip, whether with people or with goods, 01:24:04.760 |
to that next valuable trip that they're going to provide, 01:24:08.860 |
but that those are mostly second order effects. 01:24:42.920 |
- So I think the heart of your question is, you know, what-- 01:24:47.360 |
- Can you ask a better question than I asked? 01:24:54.200 |
phrase it in the terms that I want to answer. 01:25:11.840 |
to more places and more people around the world, right? 01:25:20.960 |
is exactly that, larger scale commercialization 01:25:35.200 |
and, you know, Phoenix gives us that platform 01:25:39.120 |
and gives us that foundation upon which we can build. 01:25:44.200 |
And it's, there are a few really challenging aspects 01:25:49.200 |
of this whole problem that you have to pull together 01:26:02.440 |
to go from a driverless car to a fleet of cars 01:26:10.440 |
and then all the way to, you know, commercialization. 01:26:13.160 |
So, and, you know, this is what we have in Phoenix. 01:26:15.720 |
We've taken the technology from a proof point 01:26:21.120 |
and have taken our driver, you know, from, you know, 01:26:24.840 |
one car to a fleet that can provide a service. 01:26:27.140 |
Beyond that, if I think about what it will take 01:26:31.040 |
to scale up and, you know, deploy in, you know, 01:26:49.640 |
the hardware and software core capabilities of our driver. 01:26:53.640 |
The second dimension is evaluation and deployment. 01:27:01.960 |
product, commercial, and operational excellence. 01:27:05.960 |
So I can talk a bit about where we are along, you know, 01:27:17.880 |
on, you know, the hardware and software, you know, 01:27:26.920 |
that is providing fully driverless trips to our customers 01:27:34.320 |
And we've learned a tremendous amount from that. 01:27:38.480 |
So now what we're doing is we are incorporating 01:27:42.120 |
all those lessons into some pretty fundamental improvements 01:27:45.920 |
in our core technology, both on the hardware side 01:27:50.160 |
to build a more general, more robust solution 01:28:10.960 |
And that's the platform, the fourth generation, 01:28:13.720 |
the thing that we have right now driving in Phoenix, 01:28:15.560 |
it's good enough to operate fully driverless, 01:28:38.200 |
It is designed to be manufacturable at very large scale 01:28:41.840 |
and, you know, provides the right unit economics. 01:28:43.440 |
So that's the next big step for us on the hardware side. 01:28:48.080 |
- That's already there for scale, the version five. 01:28:55.360 |
that it's the same version as the Pixel phone? 01:29:15.040 |
from the fourth generation hardware to the fifth, 01:29:16.880 |
we're making similar improvements on the software side 01:29:21.080 |
and allow us to kind of quickly scale beyond Phoenix. 01:29:24.640 |
So that's the first dimension of core technology. 01:29:26.560 |
The second dimension is evaluation and deployment. 01:29:35.040 |
How do you build a release and deployment process 01:29:45.480 |
How do you get good at it so that it is not, you know, 01:29:49.040 |
a huge tax on your researchers and engineers that, you know, 01:29:52.480 |
so you can, how do you build all of these, you know, 01:29:54.240 |
processes, the frameworks, the simulation, the evaluation, 01:29:59.760 |
so that, you know, people can focus on improving the system 01:30:02.440 |
and kind of the releases just go out the door 01:30:06.640 |
So we've gotten really good at that in Phoenix. 01:30:09.640 |
That's been a tremendously difficult problem, 01:30:16.240 |
And now we're working on kind of incorporating 01:30:19.880 |
to make it more efficient, to go to new places, 01:30:22.320 |
you know, and scale up and just kind of, you know, 01:30:25.200 |
So that's that second dimension of evaluation and deployment. 01:30:28.080 |
And the third dimension is product, commercial, 01:30:40.160 |
You know, that's why we're doing things end-to-end 01:30:49.240 |
from our users getting really incredible feedback. 01:31:00.080 |
even better and more convenient for our users. 01:31:01.760 |
- So you're converting this whole process 01:31:04.800 |
in Phoenix into something that could be copied 01:31:09.440 |
So like, perhaps you didn't think of it that way 01:31:12.640 |
when you were doing the experimentation Phoenix, 01:31:23.080 |
but you've taken the full journey in Phoenix, right? 01:31:32.360 |
But I imagine it can encompass the entirety of Phoenix 01:31:42.400 |
like as long as it's a large enough geographic area. 01:31:45.160 |
So what, how copy-pastable is that process currently? 01:31:57.600 |
like when you copy and paste in Google Docs, I think, 01:32:02.600 |
no, or in Word, you can like apply source formatting 01:32:09.440 |
So when you copy and paste the Phoenix into like, 01:32:14.440 |
say, Boston, how do you apply the destination formatting? 01:32:19.840 |
Like how much of the core of the entire process 01:32:32.800 |
is there in Phoenix that you understand enough 01:32:40.600 |
We're not at a point where we're kind of massively 01:32:48.840 |
and we very intentionally have chosen Phoenix 01:33:09.200 |
force ourselves to learn all those hard lessons 01:33:15.000 |
on operating a service, operating a business, 01:33:23.800 |
about the most difficult, most important challenges 01:33:28.500 |
to get us to that next step of massive copy and pasting, 01:33:39.080 |
We're incorporating all those things that we learned 01:33:41.120 |
into that next system that then will allow us 01:33:46.040 |
and to massively scale to more users and more locations. 01:33:49.700 |
I mean, you know, I just talked a little bit about, 01:33:51.100 |
what does that mean along those different dimensions? 01:34:01.800 |
- Can you say what other cities you're thinking about? 01:34:16.680 |
People are not being very nice about San Francisco currently. 01:34:23.440 |
But Austin seems, I visited there and it was, 01:34:28.960 |
It's funny, these moments like turn your life. 01:34:39.800 |
"You look so handsome in that tie, honey," to me. 01:34:51.540 |
but even in San Francisco where people wouldn't, 01:35:00.260 |
And since Waymo does have a little bit of a history there, 01:35:05.160 |
- Is this your version of asking the question of like, 01:35:12.020 |
but I'm thinking about moving to San Francisco, Austin, 01:35:15.180 |
like in a blink twice, if you think I should move to. 01:35:24.060 |
I think we've been testing more than 25 cities. 01:35:26.700 |
We drive in San Francisco, we drive in Michigan for snow. 01:35:31.100 |
We are doing significant amount of testing in the Bay Area, 01:35:37.100 |
'cause we're talking about a very different thing, 01:35:39.020 |
which is like a full on large geographic area, 01:35:58.860 |
They're doing, you know, there's a lot of fun. 01:36:11.560 |
or like, is there tricky things with government and so on? 01:36:15.680 |
Is there other friction that you've encountered 01:36:27.480 |
Is there other stuff that you have to overcome 01:36:37.800 |
So we put significant effort in creating those partnerships 01:36:42.800 |
and those relationships with governments at all levels, 01:36:52.400 |
We've been engaged in very deep conversations 01:36:59.840 |
whenever we go to test or operate in a new area, 01:37:22.200 |
Okay, so Mr. Elon Musk said that LIDAR is a crutch. 01:37:30.760 |
- I wouldn't characterize it exactly that way. 01:37:37.920 |
It is a key sensor that we use just like other modalities. 01:37:42.920 |
As we discussed, our cars use cameras, LIDARs and radars. 01:37:57.520 |
They have very different physical characteristics. 01:38:00.440 |
Cameras are passive, LIDARs and radars are active, 01:38:05.540 |
So that means they complement each other very nicely. 01:38:13.980 |
to build a much safer and much more capable system. 01:38:27.540 |
and not use one or more of those sensing modalities 01:38:41.980 |
or one business might not make sense for another one. 01:38:48.580 |
So if you're talking about driver assist technologies, 01:38:53.780 |
and you make different ones if you're building a driver 01:38:56.820 |
that you deploy in fully driverless vehicles. 01:39:00.640 |
And LIDAR specifically, when this question comes up, 01:39:10.700 |
typically the counterpoints are cost and aesthetics. 01:39:15.120 |
And I don't find either of those, honestly, very compelling. 01:39:28.620 |
before people made certain advances in technology 01:39:32.700 |
and started to manufacture them at massive scale 01:39:36.020 |
and deploy them in vehicles, similar with LIDARs. 01:39:39.060 |
And this is where the LIDARs that we have on our cars, 01:39:44.220 |
we've been able to make some pretty qualitative 01:39:47.700 |
discontinuous jumps in terms of the fundamental technology 01:39:52.780 |
at very significant scale and at a fraction of the cost 01:40:36.940 |
you can make it look, I mean, you can make it look beautiful. 01:41:00.860 |
like, oh man, I'm gonna start so much controversy with this. 01:41:15.500 |
But everyone, it's beauty is in the eye of the beholder. 01:41:24.900 |
- The form and function and my take on the beauty 01:41:31.060 |
You know, I will not comment on your Porsche monologue. 01:41:37.940 |
But there's an underlying like philosophical question 01:41:51.700 |
So I think without sort of disagreements and so on, 01:42:01.940 |
because Waymo's doing a lot of machine learning as well. 01:42:04.900 |
It's interesting to think how much of driving, 01:42:06.980 |
if we look at five years, 10 years, 50 years down the road, 01:42:29.420 |
and they're just collecting huge amounts of data 01:42:41.940 |
We were off mic talking about Hunter S. Thompson. 01:42:43.860 |
He's the Hunter S. Thompson of autonomous driving. 01:42:49.980 |
but they're like really trying to do end to end. 01:42:52.660 |
Like looking at the machine learning problem, 01:43:09.380 |
this level two system, this driver assistance 01:43:18.700 |
There's an underlying deep philosophical question there, 01:43:20.780 |
technical question of how much of driving can be learned. 01:43:29.420 |
for actually deploying a successful service in Phoenix, 01:43:33.140 |
right, that's safe, that's reliable, et cetera, et cetera. 01:43:39.380 |
and I'm not saying you can't do machine learning on LiDAR, 01:43:41.700 |
but the question is that like how much of driving 01:43:53.420 |
and plays a key role in every part of our system. 01:43:56.740 |
I, as you said, I would decouple the sensing modalities 01:44:05.860 |
LiDAR, radar, cameras, it's all machine learning. 01:44:09.780 |
All of the object detection classification, of course, 01:44:15.700 |
You feed them raw data, massive amounts of raw data. 01:44:18.900 |
And that's actually what our custom-built LiDARs 01:44:24.220 |
And radars, they don't just give you point estimates 01:44:26.980 |
they give you raw, like physical observations. 01:44:29.620 |
And then you take all of that raw information, 01:44:37.140 |
And angle and distance is much richer information 01:44:40.500 |
plus really rich information from the radars. 01:44:44.620 |
into those massive ML models that then, you know, 01:44:48.460 |
lead to the best results in terms of, you know, 01:44:51.540 |
object detection, classification, you know, state estimation. 01:44:55.940 |
- So there's a, sorry to interrupt, but there is a fusion. 01:44:58.540 |
I mean, that's something that people didn't do 01:45:01.140 |
which is like at the sensor fusion level, I guess, 01:45:04.620 |
like early on fusing the information together, 01:45:10.780 |
that the vehicle receives from the different modalities 01:45:15.900 |
before it is fed into the machine learning models. 01:45:18.460 |
- Yeah, so I think this is one of the trends. 01:45:20.860 |
You're seeing more of that, you mentioned end-to-end, 01:45:22.700 |
there's different interpretations of end-to-end. 01:45:29.660 |
that goes from raw sensor data to like, you know, 01:45:37.580 |
There's more, you know, smaller versions of end-to-end 01:45:40.700 |
where you're kind of doing more end-to-end learning 01:45:51.900 |
It gets into some fairly complex design choices 01:46:01.740 |
you don't wanna create interfaces that are too narrow 01:46:05.900 |
where you're giving up on the generality of a solution 01:46:08.060 |
or you're unable to properly propagate signal, 01:46:10.460 |
you know, rich signal forward and losses and, you know, 01:46:14.220 |
back so you can optimize the whole system jointly. 01:46:18.820 |
And I guess what you're seeing in terms of the fusion 01:46:21.220 |
of the sensing data from different modalities, 01:46:24.100 |
as well as kind of fusion in the temporal level, 01:46:31.580 |
that would do frame by frame detection in camera. 01:46:33.260 |
And then, you know, something that does frame by frame 01:46:39.740 |
Like the field over the last decade has been evolving 01:46:42.460 |
in more kind of joint fusion, more end-to-end models 01:46:45.220 |
that are solving some of these tasks, you know, jointly. 01:47:06.300 |
and how do you inject inductive bias into your system? 01:47:16.700 |
So, you know, we have, there's no part of our system 01:47:20.740 |
that is not heavily, that does not heavily, you know, 01:47:24.300 |
leverage data-driven development or, you know, 01:47:32.020 |
or there's perception, you know, object level, 01:47:34.180 |
you know, perception, whether it's semantic understanding, 01:47:35.980 |
prediction, decision-making, you know, so forth and so on. 01:47:38.980 |
It's, and of course, object detection and classification, 01:47:44.940 |
like you're finding pedestrians and cars and cyclists 01:47:47.100 |
and, you know, cones and signs and vegetation 01:47:51.900 |
kind of detection classification and state estimation, 01:48:00.980 |
but that's just, you know, that's table stakes. 01:48:03.740 |
Beyond that, you get into the really interesting challenges 01:48:05.780 |
of semantic understanding, the perception level. 01:48:12.180 |
that have to do with prediction and joint prediction 01:48:16.220 |
between all of the actors in the environment, 01:48:22.980 |
So we leverage ML very heavily in all of these components. 01:48:27.340 |
I do believe that the best results you achieve 01:48:35.140 |
having different models with different degrees 01:48:53.140 |
So, you know, one example I can give you is traffic lights. 01:48:56.180 |
There's a problem of the detection of traffic light state, 01:49:07.860 |
But then the interpretation of a traffic light, 01:49:14.260 |
red means stop, so you don't need to build some complex ML model 01:49:31.100 |
whether it's a constraint or a cost function in your stack, 01:49:50.340 |
whether you wanna stop for a red light or not. 01:49:52.580 |
But if you think about how other people treat traffic lights, 01:49:59.940 |
As you know, they're supposed to stop for a red light, 01:50:03.020 |
So then you're back in the like very heavy ML domain 01:50:06.820 |
where you're picking up on like very subtle cues about, 01:50:11.540 |
you know, that have to do with the behavior of objects 01:50:21.380 |
on whether they will in fact stop or run a red light. 01:50:27.140 |
like machine learning is a huge part of the stack. 01:50:33.260 |
so obviously the first level zero or whatever you said, 01:50:42.700 |
but also starting to do prediction, behavior and so on 01:50:57.900 |
I think we've been going back to the earliest days, 01:51:04.340 |
and team was leveraging, you know, machine learning. 01:51:09.820 |
but, and I think actually it was before my time, 01:51:12.220 |
but the Stanford team during the Grand Challenge 01:51:14.980 |
had a very interesting machine learned system 01:51:17.460 |
that would, you know, use lidar and camera, 01:51:33.820 |
kind of sort of looks like this stuff in lidar. 01:51:35.820 |
And I know this stuff that I've seen in lidar, 01:51:43.100 |
that would allow the vehicle to drive faster." 01:51:46.260 |
and kind of staying and pushing the state of the art in ML, 01:52:00.060 |
got pretty heavily involved in machine learning, 01:52:06.420 |
And at that time, it was probably the only company 01:52:08.140 |
that was very heavily investing in kind of state of the art ML 01:52:31.540 |
published some interesting research papers in the space, 01:52:39.420 |
It's just super active. - Super active learning as well. 01:52:43.020 |
And of course there's kind of the more mature stuff 01:52:46.260 |
like, you know, ConvNets for object detection, 01:52:52.860 |
and kind of more, you know, and bigger models 01:53:12.860 |
You know, transformers, you know, GPT-3 inference. 01:53:20.140 |
to those problems of, you know, behavior prediction, 01:53:23.020 |
as well as, you know, decision-making and planning, right? 01:53:25.180 |
You can think about it, kind of the behavior, 01:53:33.020 |
a lot of the fundamental structure, you know, of this problem. 01:53:38.860 |
There's a lot of structure in this representation. 01:53:41.980 |
There is a strong locality, kind of like in sentences, 01:53:52.340 |
What, you know, is happening in the scene as a whole 01:54:06.820 |
you're building generative models of, you know, 01:54:09.580 |
humans walking, cyclists riding, and other cars driving. 01:54:15.660 |
transformer models and all the breakthroughs in language 01:54:19.500 |
and NLP that might be applicable to like driving 01:54:34.300 |
But unfortunately, they're also a source of joy 01:54:37.700 |
and love and beauty, so let's keep them around. 01:54:41.540 |
- Oh, from your perspective, yes, yes, for sure. 01:55:01.500 |
of like a game theoretic dance of what to do. 01:55:17.620 |
- From that perspective, I mean, I don't know, 01:55:20.420 |
I'm joking a lot, but I think in seriousness, 01:55:24.180 |
like, you know, pedestrians are a complicated, 01:55:26.580 |
computer vision problem, a complicated behavioral problem. 01:55:36.220 |
from also an autonomous vehicle and a product perspective 01:55:40.060 |
about just interacting with the humans in this world? 01:55:44.980 |
we care deeply about the safety of pedestrians, 01:55:47.140 |
you know, even the ones that don't have Twitter accounts. 01:56:04.740 |
pedestrians or cyclists is one of our highest priorities. 01:56:08.780 |
We do a tremendous amount of testing and validation 01:56:20.620 |
around those unprotected vulnerable road users. 01:56:22.980 |
You know, our cars, as we discussed earlier in Phoenix, 01:56:32.220 |
And you know, some people use them to go to school, 01:56:34.780 |
so they will drive through school zones, right? 01:56:48.380 |
And you know, what does it take to be good at it? 01:57:04.220 |
And again, you wanna use all sensing modalities 01:57:07.700 |
Imagine driving on a residential road at night 01:57:11.380 |
and you don't have headlights covering some part 01:57:14.180 |
of the space and like, you know, a kid might run out. 01:57:33.700 |
And in fact, we oftentimes, in these kinds of situations, detect them 01:57:38.620 |
in some cases even earlier than our trained operators would. 01:57:42.420 |
Especially in conditions like very dark nights. 01:57:48.220 |
Then, you know, perception has to be incredibly good. 01:58:17.620 |
and very low latency in terms of your reactions 01:58:26.060 |
And we've put a tremendous amount of engineering 01:58:30.620 |
in to make sure our system performs properly. 01:58:33.780 |
And oftentimes it does require a very strong reaction 01:58:37.860 |
And we actually see a lot of cases like that. 01:58:45.500 |
that contribute to the safety around pedestrians. 01:59:09.180 |
right next to the sidewalk, it was a multi-lane road. 01:59:11.780 |
So as we got close to the cyclist on the sidewalk, 01:59:17.300 |
they just fell right into the path of our vehicle. 01:59:19.780 |
And our car, you know, this was actually with a test driver. 01:59:26.180 |
Our test drivers did exactly the right thing. 01:59:35.900 |
And then we simulated what our system would have done 01:59:37.860 |
in that situation and it did exactly the same thing. 01:59:44.060 |
of really good state estimation and tracking. 01:59:50.900 |
and they're doing that right in front of you. 01:59:52.340 |
So you have to be really like, things are changing. 01:59:53.700 |
The appearance of that whole thing is changing. 01:59:56.100 |
And a person goes one way, they're falling on the road, 01:59:58.220 |
they end up flat on the ground in front of you. 02:00:22.740 |
that our system is performing exactly the way 02:00:42.140 |
for somebody who loves artificial intelligence, 02:00:49.220 |
That's kind of exciting as a problem, like to wake up. 02:00:54.180 |
It's terrifying probably for an engineer to wake up 02:01:04.220 |
that's often brought up about autonomous vehicles. 02:01:13.300 |
So a trolley problem is an interesting philosophical construct 02:01:19.220 |
that highlights, and there's many others like it, 02:01:35.300 |
if you were forced to choose to kill a group X of people 02:01:44.060 |
If you did nothing, you would kill one person, 02:01:53.180 |
Do you do nothing or you choose to do something? 02:02:19.620 |
'cause it's just an exciting thing to think about. 02:02:32.060 |
this like literally never comes up in reality. 02:03:00.940 |
- On the specific version of the trolley problem, 02:03:09.580 |
because we humans ourselves cannot answer it. 02:03:27.220 |
And actually oftentimes I think that freezing 02:03:32.060 |
because like you've taken a few extra milliseconds 02:03:43.420 |
and it can be a bit of a kind of a red herring. 02:03:57.900 |
it's not how you go about building a system, right? 02:04:02.380 |
We've talked about how you engineer a system, 02:04:04.580 |
how you go about evaluating the different components 02:04:09.860 |
How do you kind of inject the various model-based, 02:04:18.660 |
you reason about the probability of a collision, 02:04:26.020 |
and you have to properly reason about the uncertainty 02:04:35.820 |
but they tend to be more of like the emergent behavior. 02:04:38.300 |
And what you see, like you're absolutely right, 02:04:39.900 |
that these clear theoretical problems that they, 02:04:46.060 |
and really kind of being back to our previous discussion, 02:05:00.740 |
If you build a very good, safe and capable driver, 02:05:09.460 |
so you don't put yourself in that situation, right? 02:05:15.180 |
like, okay, you can make a very hard trade-off, 02:05:24.020 |
and then you focus on building the right capability 02:05:27.420 |
so that you don't put yourself in a situation like this. 02:05:30.180 |
- I don't know if you have a good answer for this, 02:05:33.140 |
but people love it when I ask this question about books. 02:05:41.100 |
that you've enjoyed, philosophical, fiction, technical, 02:05:51.980 |
Is there three books that stand out that you can think of? 02:06:09.660 |
I think in the US or kind of internationally, 02:06:21.520 |
it's a novel by Russian author Mikhail Bulgakov. 02:06:32.400 |
And like, it's, you know, the plot is interesting. 02:06:43.760 |
and you enjoy it for different, very different reasons. 02:06:47.960 |
And you keep finding like deeper and deeper meaning. 02:06:57.720 |
probably kind of the cultural stylistic aspect. 02:07:05.640 |
silly, quirky, dark sense of, you know, humor. 02:07:11.640 |
On that like slight note, just out of curiosity, 02:07:14.440 |
one of the saddest things is I've read that book 02:07:18.320 |
Did you by chance read it in English or in Russian? 02:07:26.160 |
Kind of posed to myself every once in a while. 02:07:35.680 |
So I, and actually I'm not sure if, you know, 02:07:54.400 |
of Dostoevsky, Tolstoy, of most of Russian literature 02:07:59.280 |
There's a couple, they're famous, a man and a woman. 02:08:01.680 |
And I'm going to sort of have a series of conversations 02:08:08.200 |
So I'm really embarrassed to say that I've read this in translation, 02:08:16.760 |
even though I can also read, I mean, obviously in Russian, 02:08:44.120 |
So from what I understand, Dostoevsky translates easier. 02:08:49.680 |
Obviously the poetry doesn't translate as well. 02:08:52.520 |
Also the music, I'm a big fan of Vladimir Vysotsky. 02:09:15.480 |
and also because I want to do a couple of interviews 02:09:28.760 |
It's a fascinating question that ultimately communicates 02:09:48.960 |
at some point in your life that it feels like 02:09:56.400 |
I think about like Chinese scientists or even authors 02:10:00.040 |
that like, that we in the English speaking world 02:10:04.600 |
don't get to appreciate, like the depth of the culture 02:10:09.040 |
And I feel like I would love to show that to the world. 02:10:13.240 |
Like I'm just some idiot, but because I have this, 02:10:17.000 |
like at least some semblance of skill in speaking Russian, 02:10:25.080 |
I feel like I wanna catch like Grigori Perelman, 02:10:30.320 |
I wanna talk to him, like he's a fascinating mind 02:10:32.720 |
and to bring him to a wider audience in English speaking, 02:10:36.960 |
But that requires being rigorous about this question 02:10:46.840 |
but it's a fundamental one because how do you translate? 02:10:50.440 |
And that's the thing that Google Translate is also facing 02:10:57.000 |
But I wonder as a more bigger problem for AI, 02:11:00.600 |
how do we capture the magic that's there in the language? 02:11:10.600 |
If you do read it, Master and Margarita in English, 02:11:14.160 |
sorry, in Russian, I'd be curious to get your opinion. 02:11:30.880 |
The second one, I would probably pick the science fiction 02:11:42.600 |
The Strugatsky brothers kind of appealed more to me. 02:11:46.680 |
I think it made more of an impression on me growing up. 02:11:49.920 |
- Can you, I apologize if I'm showing my complete ignorance. 02:12:20.280 |
And some of the language is just completely hilarious. 02:12:34.040 |
You put that in the category of science fiction? 02:12:36.680 |
- That one is, I mean, this was more of a silly, 02:12:44.640 |
- Science fiction, right, is about this research institute. 02:12:47.600 |
It has deep parallels to like serious research, 02:12:52.160 |
but the setting of course is that they're working on magic. 02:12:56.120 |
And there's a lot of, so I, and that's their style, right? 02:12:59.560 |
They go, and other books are very different, right? 02:13:04.360 |
It's about kind of this higher society being injected 02:13:07.120 |
into this primitive world and how they operate there. 02:13:10.040 |
Some of the very deep ethical questions there, right? 02:13:15.520 |
Some is more about kind of more adventure style. 02:13:20.200 |
There's probably a couple, actually one I think 02:13:22.560 |
that they considered their most important work. 02:13:31.480 |
I still don't get it, but everything else I fully enjoyed. 02:13:40.200 |
And then I'll like over the holidays, I just like, 02:13:48.880 |
- And what's the one more, for the third one, 02:14:03.480 |
That one I think falls in the category of both. 02:14:06.080 |
Definitely it's one of those books that you read 02:14:15.560 |
I think there's lessons there people should not ignore. 02:14:20.560 |
And nowadays with everything that's happening in the world, 02:14:25.800 |
I can't help but have my mind jump to some parallels 02:14:32.520 |
And like there's this whole concept of doublethink 02:14:36.520 |
and ignoring logic and holding completely contradictory 02:14:39.640 |
opinions in your mind and not have that bother you 02:14:52.640 |
which is a kind of friendly, is a friend of "1984" 02:15:07.480 |
But if anything has been kind of heartbreaking 02:15:07.480 |
to an optimist about 2020, it's that society's kind of fragile. 02:15:12.480 |
Like we have this, this is a special little experiment 02:15:34.680 |
I mean, I think "1984" and these books, "Brave New World", 02:15:38.800 |
they're helpful in thinking like stuff can go wrong 02:15:57.480 |
And I feel like I have like now somehow responsibility 02:16:05.440 |
And it dawned on me that like me and millions of others 02:16:10.440 |
are like the little ants that maintain this little colony. 02:16:24.960 |
And there's interesting, complicated ways of doing that 02:16:47.580 |
Unfortunately, I think it's less of a flamethrower 02:17:00.320 |
a world-class roboticist, engineer and leader 02:17:04.280 |
uncomfortable with a ridiculous question about life. 02:17:20.040 |
- I don't know if that makes it more difficult 02:17:26.040 |
one of the stories by Isaac Asimov, actually. 02:17:39.360 |
where the plot is that humans build this supercomputer, 02:17:49.720 |
How can the entropy in the universe be reduced? 02:18:16.040 |
and billions of years, and like at some point, 02:18:22.920 |
and it keeps posing the same question to itself. 02:18:32.280 |
then the heat death of the universe has occurred, 02:18:35.720 |
and there's nobody else to provide that answer to. 02:18:41.600 |
So it like, you know, recreates the Big Bang, right? 02:18:47.120 |
But I can try to give kind of a different version 02:18:51.480 |
of the answer, you know, maybe not on the behalf 02:19:01.480 |
but at least, you know, personally, it changes, right? 02:19:05.040 |
I think if you think about kind of what gives, 02:19:10.360 |
you know, you and your life meaning and purpose 02:19:19.840 |
And the lifespan of, you know, kind of your existence, 02:19:24.840 |
you know, when you just enter this world, right? 02:19:27.240 |
It's all about kind of new experiences, right? 02:19:28.920 |
You get like new smells, new sounds, new emotions, right? 02:19:35.600 |
You're experiencing new, amazing things, right? 02:19:44.480 |
you start more intentionally learning about things, right? 02:19:49.480 |
I guess actually before you start intentionally learning, 02:19:52.300 |
Fun is a thing that gives you kind of meaning and purpose 02:20:05.720 |
and discovery is another thing that, you know, 02:20:08.800 |
gives you meaning and purpose and drives you, right? 02:20:17.080 |
And so impact and contributions back to technology 02:20:19.880 |
or society, people, you know, local or more globally 02:20:24.880 |
becomes a new thing that drives a lot of kind 02:20:27.720 |
of your behavior and is something that gives you purpose 02:20:31.540 |
and that you derive positive feedback from, right? 02:20:35.360 |
You know, then you go and so on and so forth. 02:20:39.400 |
If you have kids, like that definitely changes 02:20:45.880 |
You know, I have three that definitely flips some bits 02:20:48.800 |
in your head in terms of kind of what you care about 02:20:51.640 |
and what you optimize for and, you know, what matters, 02:20:57.240 |
And it seems to me that, you know, it's all of those things. 02:21:01.840 |
And as you kind of go through life, you know, 02:21:28.040 |
I might have one or two more to add to the list. 02:21:50.240 |
I truly believe it will be one of the most exciting things 02:21:52.520 |
we descendants of apes have created on this earth. 02:21:56.240 |
So I'm a huge fan and I can't wait to see what you do next. 02:22:18.440 |
apply machine learning to solve real world problems. 02:22:25.080 |
BetterHelp, online therapy with a licensed professional 02:22:28.600 |
and Cash App, the app I use to send money to friends. 02:22:31.840 |
Please check out these sponsors in the description 02:22:34.000 |
to get a discount and to support this podcast. 02:22:37.120 |
If you enjoy this thing, subscribe on YouTube, 02:22:47.160 |
And now let me leave you with some words from Isaac Asimov. 02:22:54.360 |
but it is engineering that changes the world. 02:22:57.080 |
Thank you for listening and hope to see you next time.