Boris Sofman: Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics | Lex Fridman Podcast #241
Chapters
0:00 Introduction
1:08 Robots in science fiction
6:49 Cozmo
32:04 AI companions
38:59 Anki
64:33 Waymo Via
96:10 Sensor suites for long haul trucking
106:06 Machine learning
124:03 Waymo vs Tesla
134:38 Safety and risk management
143:42 Societal effects of automation
154:47 Amazon Astro
159:12 Challenges of the robotics industry
163:39 Humanoid robotics
170:42 Advice for getting a PhD in robotics
178:13 Advice for robotics startups
189:19 Advice for students
00:00:00.000 |
The following is a conversation with Boris Sofman, 00:00:08.920 |
formerly the Google Self-Driving Car Project. 00:00:12.420 |
Before that, Boris was the co-founder and CEO of Anki, 00:00:21.120 |
the company that created Cozmo, which is one of the most incredible social robots ever built, 00:00:28.240 |
a robot that creates a fun and engaging human-robot interaction. 00:00:32.080 |
It was truly sad for me to see Anki shut down when it did. 00:00:47.220 |
I spoke with Steve Viscelli recently on episode 237 00:00:53.240 |
This episode looks more at the robotics side. 00:01:00.280 |
please check out our sponsors in the description. 00:01:02.960 |
And now, here's my conversation with Boris Sofman. 00:01:07.160 |
Who is your favorite robot in science fiction, 00:01:13.640 |
where they were able to convey such an incredible degree 00:01:16.520 |
of intent, emotion, and kind of character attachment 00:01:38.560 |
where you have this, like, incredible Terminator that was outmatched, 00:01:38.560 |
you know, in terms of kind of specs by the new one, 00:01:49.440 |
but, you know, still kind of held his own. 00:01:53.360 |
And so it was kind of interesting where you realize 00:01:58.680 |
from human to kind of potentials in AI and robotics 00:02:05.440 |
as much as it was like kind of a direct world in a way, 00:02:08.640 |
was actually quite fascinating, gets the imagination going. 00:02:21.480 |
maybe not with like the realism in terms of skin and so on, 00:02:25.200 |
but that humanoid form, we have that humanoid form. 00:02:30.400 |
Maybe the challenge is it's super expensive to build, 00:02:33.600 |
but you can imagine, maybe not a machine of war, 00:02:36.640 |
but you can imagine Terminator type robots walking around. 00:02:42.520 |
you've basically, so for people who don't know, 00:02:46.800 |
co-founded Anki, the company that created a small robot with a big personality 00:02:50.160 |
called Cozmo that just does exactly what WALL-E does, 00:02:53.440 |
which is somehow with very few basic visual tools 00:03:02.400 |
But then again, the humanoid form is super compelling. 00:03:07.000 |
So like Cozmo is very distant from a humanoid form. 00:03:17.520 |
And it's interesting because it was very intentional 00:03:23.080 |
when you think about a character like Cozmo or like WALL-E, 00:03:34.200 |
and then how you actually create a personality 00:03:39.520 |
that actually matches the constraints that you're under, 00:03:43.240 |
whether it's mechanical or sensors or AI of the day. 00:03:51.560 |
towards trying to replicate human form in a robot, 00:03:54.200 |
because you actually take on some pretty significant 00:03:57.880 |
kind of constraints and downsides when you do that. 00:04:02.800 |
where it's just the articulation of a human body 00:04:11.320 |
that to replicate that even in its reasonably close form 00:04:14.280 |
takes like a giant amount of joints and actuators 00:04:16.280 |
and motion and sensors and encoders and so forth. 00:04:20.280 |
But then you're almost like setting an expectation 00:04:23.840 |
that the closer you try to get to human form, 00:04:37.440 |
and embrace your strengths and bypass your weaknesses. 00:04:41.360 |
the human form has way too many degrees of freedom 00:04:46.280 |
It's kind of counterintuitive, just as you're saying, 00:04:52.640 |
it's almost harder to master the communication of emotion. 00:04:57.320 |
Like you see this with cartoons, like stick figures, 00:05:00.240 |
you can communicate quite a lot with just very minimal, 00:05:03.640 |
like two dots for eyes and a line for a smile. 00:05:07.200 |
I think you can almost communicate arbitrary levels 00:05:31.400 |
Like dimensionality, voice, all these sorts of things, 00:05:37.160 |
And so some of the best animators that we've worked with, 00:05:46.400 |
by forcing these projects where all you have is like a ball 00:05:51.400 |
that can like kind of jump and manipulate itself 00:05:53.880 |
or like really, really like aggressive constraints 00:06:04.640 |
If we had to like describe it in like one small phrase, 00:06:07.160 |
it was bringing a Pixar character to life in the real world. 00:06:11.800 |
And in a lot of ways, what was interesting is that 00:06:14.200 |
with like WALL-E, which we studied incredibly deeply, 00:06:16.840 |
and in fact, some of our team were, you know, 00:06:19.600 |
kind of had worked previously at Pixar and on that project, 00:06:23.520 |
they intentionally constrained WALL-E as well, 00:06:29.000 |
because it forced you to like really saturate 00:06:34.080 |
But you sometimes end up getting a far more beautiful output 00:06:41.720 |
of this emotional space in a way that you just wouldn't 00:07:03.920 |
- What was the vision behind this incredible little robot? 00:07:12.200 |
we were PhD students in the Robotics Institute 00:07:16.800 |
And so we were studying robotics, AI, machine learning, 00:07:23.040 |
One of my co-founders was working on walking robots, 00:07:31.400 |
kind of a deeper passion for applications of robotics and AI 00:07:36.680 |
where there's people that get like really fascinated 00:07:38.400 |
by the theory of AI and machine learning robotics, 00:07:40.920 |
where whether it gets applied in the near future or not 00:07:46.080 |
but they love the pursuit of, like, the challenge. 00:07:48.920 |
And there's a lot of incredible breakthroughs 00:07:50.920 |
We're probably closer to the other end of the spectrum 00:07:52.520 |
where we love the technology and all the evolution of it, 00:08:03.880 |
that wouldn't have been possible without these approaches? 00:08:11.720 |
where we like got to see the applied side of robotics. 00:08:15.600 |
there were actually relatively few applications of robotics 00:08:15.600 |
So maybe, you know, iRobot was like one exception 00:08:31.000 |
but for the most part, there weren't that many. 00:08:32.680 |
And so we got excited about consumer applications 00:08:37.640 |
way higher levels of intelligence through software 00:08:41.040 |
to create value and experiences that were just not possible 00:08:46.760 |
And we saw kind of a pretty wide range of applications 00:08:52.640 |
that varied in the complexity of what it would take 00:09:05.800 |
and then build up off of that and move into other areas. 00:09:25.160 |
And there was a really fascinating transition technically 00:09:38.880 |
of microphones, cameras was dropping by orders of magnitude. 00:09:43.880 |
And then on top of that, with the iPhone coming out in 2007, 00:09:50.240 |
It started to become apparent within a couple of years 00:09:55.040 |
that this could become a really incredible interface device 00:10:14.920 |
but putting huge amounts of complexity into the AI side. 00:10:19.640 |
and then the one that we're probably most proud of. 00:10:21.520 |
The idea there was to create a physical character 00:10:28.480 |
and the context that mattered to feel like he was alive. 00:10:32.960 |
And to be able to have these emotional connections 00:10:38.600 |
that you would typically only find inside of a movie. 00:10:44.320 |
We had an incredible respect and appreciation 00:10:57.880 |
It was very fixed and it obviously had a magic to it, 00:11:01.360 |
but where you really start to hit a different level 00:11:10.280 |
So basically you take a toy, you add intelligence into it 00:11:21.480 |
but you're now bringing it into the physical space. 00:11:25.200 |
which is you're basically bringing video games to life. 00:11:45.000 |
that matches what's happening in the physical world. 00:11:47.200 |
And then you can have a video game off of that 00:11:51.440 |
different traits for the cars, weapons and interactions 00:11:55.800 |
and special abilities and all these sorts of things 00:12:00.440 |
And one of the things that we were really surprised by 00:12:09.440 |
is that things that feel like they're really constrained 00:12:23.680 |
we were creating a video game engine for the physical world. 00:12:26.360 |
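A minimal sketch of that "video game engine for the physical world" idea: physical car poses stream into a virtual game state, and the game logic runs on that mirror. Every name and the positioning scheme here are illustrative assumptions, not Anki's actual engine.

```python
# Sketch: syncing physical car positions into a virtual game state.
from dataclasses import dataclass

@dataclass
class CarPose:
    car_id: str
    track_position_mm: float  # arc-length position along the track
    lane_offset_mm: float     # lateral offset from the lane center

class PhysicalGameEngine:
    def __init__(self):
        self.world = {}  # car_id -> latest CarPose

    def on_pose_update(self, pose: CarPose):
        """Called whenever a car reports where it is on the track."""
        self.world[pose.car_id] = pose

    def fire_weapon(self, shooter_id: str, range_mm: float = 300.0):
        """Game logic runs on the virtual mirror of the physical world:
        any car within range ahead of the shooter takes a 'hit'."""
        shooter = self.world[shooter_id]
        return [
            car_id for car_id, pose in self.world.items()
            if car_id != shooter_id
            and 0 < pose.track_position_mm - shooter.track_position_mm < range_mm
        ]  # a real system would then slow the hit physical cars
```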
And then with Cozmo, we expanded that video game engine 00:12:29.480 |
to create a character and kind of an animation 00:12:41.000 |
And a lot of those elements were almost like a proving ground 00:12:44.240 |
for what would human-robot interaction feel like 00:13:07.760 |
it builds even more empathy with the character. 00:13:14.320 |
You have so much more freedom to fail, to explore. 00:13:28.120 |
into the direction of fun is a brilliant move. 00:13:50.440 |
but it's actually affordable by a large number of people. 00:14:05.200 |
and this is the part that I worked on out of the gate, 00:14:08.120 |
was a lot of the AI systems where you have these vehicles 00:14:16.480 |
And you realize very quickly that that's actually not fun 00:14:23.040 |
And so you start to intentionally almost add noise 00:14:25.560 |
to the system in order to create more of a sense of realism, 00:14:25.560 |
might start really ineffective and inefficient 00:14:37.400 |
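A minimal sketch of that noise-injection idea: a "perfect" racing controller deliberately degraded with jitter and under-correction so the opponent reads as believable. The names and constants are illustrative assumptions.

```python
import random

def ideal_steering(target_offset: float, current_offset: float) -> float:
    """An optimal proportional correction toward the racing line."""
    return 0.8 * (target_offset - current_offset)

def humanized_steering(target_offset: float, current_offset: float,
                       skill: float) -> float:
    """skill in [0, 1]: lower skill means more hand jitter and sloppier
    pursuit of the racing line, which feels more realistic than an
    opponent that drives perfectly."""
    correction = ideal_steering(target_offset, current_offset)
    jitter = random.gauss(0.0, (1.0 - skill) * 0.5)  # injected noise
    laziness = 0.5 + 0.5 * skill                     # under-correction
    return laziness * correction + jitter
```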
And there is a really, really aggressive constraint 00:14:40.160 |
that's forced on you by being a consumer product 00:14:46.360 |
where you can't make a thousand dollar product 00:14:56.360 |
your cost of goods had to be well under a hundred dollars. 00:15:04.600 |
- And it was under $200 at retail. 00:15:04.600 |
what Cosmo looks like from a design perspective 00:15:32.740 |
Did you have, 'cause there's a lot of unique qualities 00:15:38.480 |
There's a display, there's eyes on the display 00:15:41.080 |
and those eyes can, they're pretty low-resolution eyes, right? 00:15:44.640 |
But they're still able to convey a lot of emotion. 00:15:52.200 |
- Lift stuff, but there's something about arm movement 00:15:59.640 |
It's like the face communicates emotion and sadness 00:16:05.920 |
And then the arms kind of communicate, I'm trying here. 00:16:11.280 |
- I'm doing my best in this complicated world. 00:16:23.400 |
And so you literally have only a head that goes up and down, 00:16:27.160 |
a lift that goes up and down, and then your two wheels. 00:16:34.500 |
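Those few degrees of freedom are exactly what the public Cozmo Python SDK exposed. A minimal sketch using that (now-archived) SDK; the API names are from memory and may differ slightly by SDK version.

```python
import cozmo
from cozmo.util import degrees

def expressive_dof_demo(robot: cozmo.robot.Robot):
    # Head: a single tilt axis, up and down.
    robot.set_head_angle(degrees(30)).wait_for_completed()
    # Lift: one axis, from 0.0 (down) to 1.0 (up).
    robot.set_lift_height(1.0).wait_for_completed()
    # Two wheels: differential drive; here, a quick in-place turn.
    robot.drive_wheels(60, -60, duration=1.0)

cozmo.run_program(expressive_dof_demo)
```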
And with that, it's actually pretty incredible 00:16:36.160 |
what you can come up with, where, like you said, 00:16:42.120 |
because there's a lot of ideas far beyond that, obviously, 00:16:44.200 |
as you can imagine, where, like you said, how big is it? 00:16:54.380 |
This is the formula for human-robot interfaces 00:16:54.380 |
more generally, is you almost have this triangle 00:16:57.480 |
what's mass producible, the cost constraints and so forth. 00:17:07.520 |
You have the AI side of how do you understand 00:17:10.800 |
the world around you, interact intelligently with it, 00:17:14.560 |
so perceive the environment, make intelligent decisions, 00:17:24.640 |
in human-robot interaction really miss the mark 00:17:30.280 |
They over-invest in the mechanical side of it, 00:17:32.540 |
and then varied results on the AI side of it. 00:17:46.600 |
your expectations go up, and if the AI can't meet it 00:17:49.560 |
or the overall experience isn't there, you miss the mark. 00:17:53.360 |
- So who, like how did you, through those conversations, 00:17:56.040 |
get the cost down so much and make it so simple? 00:18:04.680 |
which is Carnegie Mellon University, robotics. 00:18:10.080 |
that come from there or just from the world experts 00:18:19.240 |
And so where did that come from, the simplicity? 00:18:22.840 |
- It came from this combination of a team that we had. 00:18:37.040 |
It was because the experience had to kind of meet the mark. 00:18:41.440 |
but there was a pressure where the higher you go, 00:18:43.800 |
the more seasonal you become and the tougher it becomes. 00:18:46.240 |
And so on the cost side, we very quickly partnered up 00:18:50.160 |
with some previous contacts that we worked with 00:18:55.560 |
was one of the earliest heads of engineering at Logitech 00:19:07.840 |
We had a really great mechanical engineering team 00:19:11.680 |
where we were not gonna compromise on feasibility 00:19:20.840 |
we're gonna use cheap, noisy motors and sensors, 00:19:28.040 |
Then we found on the design and character side, 00:19:28.040 |
there was a camp that thought Cozmo should be very games-driven, 00:19:33.720 |
where you create a whole bunch of games experiences 00:19:41.000 |
And then there was a faction, which my co-founder 00:19:41.000 |
like really believed in, which was character-driven. 00:19:45.520 |
And the argument is that you will never compete 00:19:49.840 |
with what you can do virtually from a game standpoint, 00:19:59.080 |
has a massively higher impact physically than virtually. 00:20:07.200 |
For people who don't know, Cozmo plays games with you, 00:20:07.200 |
I wondered exactly what is the compelling aspect of this? 00:20:25.000 |
but to me, the character, what I enjoyed most, honestly, 00:20:29.040 |
or what got me to return to it is the character. 00:20:33.720 |
- That's a fascinating discussion of, you're right. 00:20:44.880 |
and you don't have a graphics engine, it's like all this. 00:20:58.480 |
we immediately went towards Pixar and Carlos Baena, 00:20:58.480 |
who had been at Pixar for nine years. 00:21:02.320 |
and just immediately kind of spoke the language 00:21:11.680 |
and it just clicked on how you think about that 00:21:18.120 |
as like a really kind of prominent kind of driver of this 00:21:27.800 |
but then got them to really try to create magic 00:21:34.440 |
that was at the overlap of character and the character AI 00:21:38.840 |
that were, if you imagine the dimensionality of emotions, 00:21:41.760 |
happy, sad, angry, surprised, confused, scared, 00:21:52.200 |
to kind of populate this library of responses 00:21:58.040 |
that like goes to the extreme spectrum on angry 00:22:02.320 |
And so that gave us a lot of intuition and learnings. 00:22:09.160 |
but they were parameterized and had randomness to them 00:22:11.240 |
where you could have infinite permutations of happy 00:22:27.480 |
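A minimal sketch of what a parameterized emotional response might look like: one "happy" template, infinitely many concrete permutations. All fields and numbers here are illustrative assumptions, not Anki's actual animation system.

```python
import random

def happy_animation(intensity: float) -> dict:
    """intensity in [0, 1] scales the response; randomness keeps two
    playbacks of the same emotion from ever looking identical."""
    return {
        "head_angle_deg": 20 + 25 * intensity + random.uniform(-3, 3),
        "lift_bounces": 1 + int(2 * intensity + random.random()),
        "eye_shape": random.choice(["wide", "squint_smile"]),
        "sound_pitch": 1.0 + 0.3 * intensity + random.uniform(-0.05, 0.05),
        "duration_s": 0.8 + 0.6 * intensity,
    }

# Seeing you after a long gap spikes much harder than the tenth
# sighting in an hour:
first_sighting_today = happy_animation(intensity=0.95)
routine_sighting = happy_animation(intensity=0.3)
```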
And so if Cozmo saw you for the first time in a day, 00:22:27.480 |
in the same way that the first time you walk in 00:22:34.680 |
and like your toddler sees you, they're so happy, 00:22:40.640 |
But like you have this like spike in response, 00:22:50.480 |
the most enjoyable emotions are him getting frustrated 00:22:57.160 |
I had to let him win because I don't want him to be upset. 00:22:59.440 |
And so you start to like create this feedback loop 00:23:03.320 |
where you see how powerful those emotions are. 00:23:13.720 |
but that's not really a prominent source of interaction. 00:23:17.160 |
What happens when a physical character like Cozmo, 00:23:17.160 |
We increased the play time engagement by 40%. 00:23:50.920 |
Like you see these sort of like kind of interactions 00:23:53.720 |
And so we studied pets, we studied virtual characters. 00:24:01.080 |
most perfect influencers behind these sort of interactions. 00:24:11.520 |
And if you think about the types of games that you played, 00:24:15.640 |
but they were always ones to create scenarios 00:24:17.880 |
of either tension or winning or losing or surprise 00:24:22.520 |
And they were purely there to just like create context 00:24:25.600 |
to where an emotion could feel intelligent and not random. 00:24:28.800 |
And in the end, it was all about the character. 00:24:30.880 |
- So yeah, there's so many elements to play with here. 00:24:43.720 |
And so you could almost like in the early explorations, 00:24:46.240 |
we thought that it would be really incredible 00:24:50.440 |
where you almost help encourage which direction it goes, 00:24:54.960 |
And you had, like, think of like the seven dwarves sort of. 00:24:59.600 |
And initially we even thought that it would be amazing 00:25:15.240 |
some are super warm and like kind of friendly. 00:25:26.160 |
It's almost like how long can you maintain a fiction 00:25:30.280 |
to where the person's explorations don't hit a boundary, 00:25:33.560 |
which happens almost immediately with typical toys. 00:25:39.480 |
how long can we create that immersive experience 00:25:47.520 |
when something has a personality and it's physical. 00:25:51.080 |
That is the key that unlocks robotics interacting 00:26:06.800 |
When you have a character and you make a mistake, 00:26:14.920 |
- It actually builds the depth of the personality, 00:26:26.600 |
of something that will obviously be prevalent 00:26:29.040 |
throughout society at a scale that we cannot even imagine. 00:26:38.520 |
that these kinds of characters will permeate society 00:26:43.920 |
We'll be interacting with them in different ways. 00:26:45.640 |
In a way we, I mean, you don't think of it this way, 00:26:55.400 |
but even then you think about role-playing games, 00:26:59.560 |
you become friends with certain characters in that game. 00:27:14.400 |
if the characters in the game remember that you exist, 00:27:26.000 |
It seems like there's going to be like billions, 00:27:40.760 |
need to be to create fulfilling relationships 00:27:54.840 |
because what's good enough to be a magical experience 00:28:07.540 |
in an office environment or in a home and so forth. 00:28:07.540 |
And so, and the idea was that you build on that 00:28:19.400 |
But you're absolutely right at the end of the day, 00:28:30.480 |
where you have much richer physical interaction 00:28:36.360 |
And it shows itself in a few kind of really obvious places. 00:28:39.520 |
So just take something as simple as a voice assistant. 00:28:42.240 |
You will never, most people will never tolerate 00:28:54.640 |
and then now you're kind of, it feels intrusive. 00:28:59.680 |
like a cat that touches and gets your attention, 00:29:01.800 |
or a toddler, like you never think twice about it. 00:29:09.960 |
And we had a future version that was always on 00:29:17.080 |
in a way that is not acceptable for machines. 00:29:24.800 |
but it makes people who are skeptical of technology 00:29:29.240 |
There was like, there were a couple of really, 00:29:34.720 |
and so we were in, I think like a dozen countries, 00:29:39.200 |
but like we went pretty aggressively in launching 00:29:48.520 |
a socially higher bar for privacy and security 00:29:53.920 |
have had troubles on things that might've been okay 00:30:08.080 |
like where you have cameras, you have microphones, 00:30:10.040 |
it's kind of connected and like you're playing with kids 00:30:14.720 |
and you're like, this is like ripe to be like a nightmare 00:30:22.440 |
like really, really tough on these sorts of things. 00:30:34.840 |
did we have any complaints beyond like a really casual 00:30:46.000 |
it was almost like, well, of course he has to see and hear 00:30:48.400 |
how else is he gonna be alive and interacting with you? 00:30:50.920 |
And it completely disarmed this like fear of technology 00:30:54.800 |
that enabled this interaction to be much more fluid. 00:30:57.680 |
And again, like entertainment was a proving ground, 00:30:59.520 |
but that is like, you know, there's like ingredients there 00:31:02.560 |
that carry over to a lot of other elements down the road. 00:31:06.200 |
- That's hilarious that we're a lot less concerned 00:31:06.200 |
about privacy if the thing has value and charisma. 00:31:08.440 |
I mean, that's true for all of human to human interactions. 00:31:16.040 |
- It's an understanding of intent where like, 00:31:20.920 |
Right, so it's almost like you're communicating intent 00:31:24.520 |
and with that intent, people were like kind of 00:31:29.360 |
And it's interesting, it was just the earliest 00:31:32.080 |
kind of version of starting to experiment with this, 00:31:35.360 |
And then you have like completely different dimensions 00:31:41.360 |
that just went beyond anything we'd ever seen. 00:31:46.520 |
And we had some research projects kind of going on 00:31:54.200 |
that got unlocked that just hadn't existed before 00:31:56.880 |
that has these really interesting kind of links 00:32:07.840 |
do you think we will have beyond a particular game, 00:32:12.840 |
you know, a companion like Her, like the movie Her, 00:32:26.160 |
How many years away from that do you think we are? 00:32:31.720 |
So I think the idea of a different type of character, 00:32:34.720 |
like more closer to like kind of a pet style companionship, 00:32:47.640 |
And the bar is so high that if you miss it by a bit, 00:32:49.880 |
you hit the uncanny valley where it just becomes creepy 00:32:57.200 |
in form and interface and voice, the harder it becomes. 00:33:12.400 |
and also why, like, a lot of video game characters, 00:33:12.400 |
like the Sims, for example, don't have a voice 00:33:15.280 |
people interpret what they want to interpret. 00:33:38.640 |
a very different interpretation than a 40 year old, 00:33:42.480 |
And so you just, you can lean into these advantages 00:33:44.840 |
much more in something that doesn't resemble a human. 00:33:56.120 |
The chat interfaces are getting way more interesting 00:34:09.800 |
So Google is a very large company 00:34:09.800 |
that wants to provide a service to a lot of people. 00:34:15.320 |
it requires general intelligence to be a successful 00:34:30.880 |
This is very, but the, I honestly want to push back 00:34:44.240 |
If you just understand what creates compelling characters, 00:34:50.080 |
and you exist in the world, and other people find you, 00:34:56.600 |
I like this character, this character's kind of shady, 00:35:06.520 |
I don't know what it is, but the Freudian thing, 00:35:08.920 |
but there's some kind of connection that happens, 00:35:15.920 |
and that's, so I guess the statement I'm trying to make, 00:35:19.040 |
is it possible to achieve a depth of friendship 00:35:27.320 |
And just, you set expectations and constraints, 00:35:31.160 |
such that in the space that's left, you can be successful. 00:35:33.440 |
And so, you can do that by having a very focused domain 00:35:39.280 |
for a particular product, and you create intelligence 00:35:43.040 |
Or, you know, kind of in the personal companionship side, 00:35:46.360 |
you can't be everything to everyone, across the board, 00:35:46.360 |
that has picked up on where kind of Cozmo left off, 00:35:58.520 |
And so, I don't know if it's the sort of thing 00:36:05.680 |
where, similar to like how, you know, in dot-com, 00:36:08.360 |
there were all these concepts that we considered, 00:36:10.240 |
like, you know, that didn't work out, or like, failed, 00:36:14.520 |
and then 20 years later, you have these, like, 00:36:16.160 |
incredible successes on almost the same concept. 00:36:24.000 |
But it does feel like that appreciation of that, 00:36:35.240 |
it's hard to, I'm not aware of anywhere right now 00:36:39.080 |
where, like, that same kind of aggressive drive 00:36:42.000 |
with the value on the character is happening. 00:36:48.440 |
something that looks an awful lot like Cozmo, 00:36:48.440 |
but in the three-legged stool, something like that, 00:36:55.680 |
in some number of years, will be a trillion-dollar company. 00:37:01.600 |
that, like, character, not just as robotic companions, 00:37:10.160 |
It's like, Clippy was, like, two legs of that stool 00:37:19.680 |
What's really confusing to me is they're born, 00:37:36.040 |
with just enough brilliance and vision to create this thing 00:37:47.440 |
And then when the timing is right, it just blows up. 00:37:53.480 |
until it just blows up, and I guess everything 00:38:02.240 |
in another five years or 10 years or whatnot? 00:38:10.680 |
and computation's gonna become more and more prevalent, 00:38:12.600 |
as well as cloud as a big tool to offload cost. 00:38:23.360 |
to just kind of a broader contextual understanding 00:38:28.360 |
and mapping of semantics and understanding scenes 00:38:43.120 |
of the tapping and animation and these other areas 00:38:46.920 |
where that was a big unlock in film, obviously. 00:38:51.920 |
And so I think, yeah, the pieces can reconnect 00:38:54.360 |
and the building blocks are actually gonna be 00:38:55.640 |
way more impressive than they were five years ago. 00:38:58.160 |
- So in 2019, Anki, the company that created Cozmo, 00:38:58.160 |
the company that you started, had to shut down. 00:39:21.600 |
because we were kind of life or death on many, many moments, 00:39:26.600 |
just navigating these insane kind of just ups and downs 00:39:48.840 |
So millions of units, got to like pretty serious revenue, 00:39:53.320 |
like kind of close to a hundred million annual revenue, 00:39:56.760 |
number one kind of product in kind of various categories. 00:40:02.040 |
ended up being very seasonal where something like 85% 00:40:04.960 |
of our volume was in Q4 because it was a present 00:40:13.280 |
And even though the volume was like really sizable 00:40:21.040 |
and managing the cash operations was just brutal. 00:40:25.840 |
You don't think about this when you're starting a company 00:40:32.520 |
are kind of just your head count and operations 00:40:48.600 |
to do your sales in mostly November, December 00:40:50.480 |
and then get paid in December, January by retailers. 00:41:14.600 |
- Yeah, and it's not a business that like is recurring 00:41:18.400 |
And it's just, and then you're locking in your forecast 00:41:18.400 |
And it's also like very hit driven and seasonal 00:41:28.080 |
where like you don't have the sort of continued 00:41:31.840 |
in some other consumer electronics industries. 00:41:42.720 |
at like 1X revenue oftentimes, which is tough, right? 00:41:46.440 |
And so we effectively kind of got caught in the middle 00:41:49.720 |
where we were trying to quickly evolve out of entertainment 00:42:00.320 |
But there was no path to kind of pure profitability 00:42:07.840 |
And so we tried really hard to make that transition. 00:42:19.800 |
and get to the next kind of like holiday season. 00:42:22.080 |
And so we ended up selling some of the assets 00:42:30.280 |
like in the team while we were going through it, 00:42:32.880 |
where actually despite how challenging that period was, 00:42:36.840 |
I mean, like people love the vision, the team, 00:42:56.720 |
like in the year, you know, kind of before it, 00:43:08.400 |
and how to get like a bridge loan from an investor. 00:43:12.080 |
that like as hard as things might be anywhere else, 00:43:20.960 |
And so we were very transparent during our fundraise 00:43:23.160 |
on who we're talking to, the challenges that we have, 00:43:26.400 |
how it's going and when things are going well, 00:43:29.880 |
And so it wasn't a complete shock when it happened, 00:43:37.160 |
that like, you know, we basically were just like watching 00:43:48.640 |
and, you know, we could like kind of take care of people 00:43:50.520 |
the best we could, but they like broke down crying 00:43:53.160 |
at all hands and somebody else had to step in for a bit. 00:44:08.600 |
for many people it was like the best kind of work experience 00:44:11.160 |
that they had and there was a lot of pride in what we did. 00:44:14.040 |
And it wasn't anything obvious we could point to that like, 00:44:23.080 |
And, but the experience was pretty incredible, 00:44:32.400 |
in both the technology and products and the team that, 00:44:36.360 |
you know, there's a lot there that like in the, 00:44:41.040 |
you know, right context could have been pretty incredible, 00:44:54.240 |
the implementation, you got the cost down very low. 00:44:58.160 |
And the compelling nature of the product was great. 00:45:11.880 |
like a sufficient value to justify the price. 00:45:17.320 |
every single other robotics company, or most of them, 00:45:17.320 |
they, like, go in the category of social robotics 00:45:20.360 |
I'm not sure if I talked to you before that happened or not, 00:45:34.680 |
but I remember, you know, I'm distant from this. 00:45:54.880 |
Cost is down, it's just like the most minimal design 00:46:03.340 |
It's really compelling, the balance of games. 00:46:08.080 |
It's a great gift for all kinds of age groups, right? 00:46:12.200 |
It's just, it's compelling in every single way. 00:46:24.400 |
just as an external observer is I was thinking, 00:46:34.040 |
because it's obviously not like, yeah, it's business. 00:46:39.040 |
Maybe it's some aspect of the manufacturing and so on, 00:46:41.920 |
but I'm now realizing it's also not just that, 00:46:52.400 |
And so like, it had some of the hardest elements of, 00:47:04.120 |
you got to convince both the child that wanted it and the parent paying for it. 00:47:04.120 |
So you're having like this dual prong marketing challenge. 00:47:12.320 |
you have like really high precision on the components 00:47:19.320 |
it was just really great alignment of unique strength 00:47:22.720 |
across kind of like all these different areas, 00:47:26.600 |
kind of character and animation team between this, 00:47:28.520 |
like Carlos, who was like a character director, 00:47:28.520 |
And actually, you know, he kind of hit that quality. 00:47:48.140 |
we had so much like fan mail from kind of kids, 00:48:02.280 |
of like a stack this much of like letters from, 00:48:06.420 |
just like every, you got a permutation you can imagine. 00:48:24.820 |
maybe Waymo and Google will be somehow involved 00:48:28.220 |
that will carry this flag forward and will make you proud, 00:48:34.740 |
I think this is one of the greatest robotics companies 00:48:52.520 |
that were just on the verge of failure several times 00:48:57.440 |
And they just, it's almost like a roll of the dice, 00:49:00.960 |
And here's a roll of the dice that just happened to go. 00:49:05.560 |
when you really like talk to a lot of the founders, 00:49:09.980 |
and sometimes it really is a matter of like, you know, 00:49:14.560 |
like some things are just out of your control 00:49:20.560 |
for just the dimensionality of that challenge. 00:49:24.120 |
But the great thing is, is that like a lot of the team 00:49:30.120 |
a couple of companies that we kind of kept big chunks 00:49:32.760 |
of the team together and we actually kind of helped align 00:49:38.020 |
And one of them was Waymo where a majority of the AI 00:49:42.960 |
and robotics team actually had the exact background 00:49:46.920 |
that you would look for in like kind of AV space. 00:49:48.560 |
And it was a space that a lot of us like, you know, 00:49:57.280 |
serendipitous timings from another perspective 00:49:59.280 |
where like kind of landed in a really unique circumstance 00:50:05.880 |
- So it's interesting to ask you just your thoughts. 00:50:09.100 |
Cozmo still lives on under Digital Dream Labs, I think. 00:50:09.100 |
just out of curiosity and obviously just kind of care 00:50:28.340 |
for the product line, I think it's deceptive how complex it is 00:50:28.340 |
And the amount of experiences that are required 00:50:40.340 |
to complete the picture and be able to move that forward. 00:50:50.080 |
in the way it was, was able to be manufactured. 00:50:55.760 |
But I think it's deceptive how tricky that is 00:50:59.800 |
on like everything from the quality control, the details, 00:51:03.100 |
and then like technology changes that force you to adapt, 00:51:03.100 |
it's deceptively difficult, just as you're saying. 00:51:18.340 |
For example, those same folks, and I've spoken with them, 00:51:23.340 |
they're, they partnered up with Rick and Morty creators 00:51:36.940 |
but now I just watched like the first season. 00:51:41.180 |
I like, I did not understand how brilliant that show is. 00:51:51.900 |
But I just fell in love with the Butter Robot. 00:51:58.220 |
you can create personalities, you can create, 00:52:00.100 |
and that particular robot who's doing a particular task 00:52:12.300 |
the myth of Sisyphus question that Camus writes about. 00:52:23.140 |
that's a beautiful little realization for a robot 00:52:25.380 |
that my purpose is very limited to this particular task. 00:52:38.780 |
that to do the same depth of personality as Cozmo had, 00:52:38.780 |
the same richness, it would be on the manufacturing, 00:52:47.940 |
on the AI, on the storytelling, on the design, 00:52:53.300 |
It could be a cool sort of toy for Rick and Morty fans, 00:52:58.580 |
but to create the same depth of existential angst 00:53:08.500 |
that's the brave effort you've succeeded at with Cozmo, 00:53:08.500 |
- You can fail in almost any one of the kind of dimensions 00:53:20.980 |
yeah, you need convergence of a lot of different skill sets 00:53:25.860 |
On this topic, let me ask you for some advice, 00:53:28.700 |
because as I've been watching Rick and Morty, 00:53:31.620 |
as I told myself, I have to build the Butter Robot, 00:53:36.100 |
And so I got a nice platform for it with treads 00:53:38.860 |
and there's a camera that moves up and down and so on. 00:53:41.900 |
I'll probably paint it, but the question I'd like to ask, 00:53:48.020 |
there's obvious technical questions I'm fine with, 00:53:50.620 |
communication, the personality, storytelling, 00:54:02.260 |
So with Cozmo, how did you know this is great? 00:54:17.260 |
Or like, I guess if we think of it as an optimization space, 00:54:33.420 |
where it didn't try to look like a dog or a human 00:54:37.220 |
And so you avoided having like a weird pseudo similarity, 00:54:45.540 |
where just like a personality or a character emotion 00:54:51.060 |
to kind of the iterations that a character director 00:54:52.740 |
at Pixar would have, where you're running through it 00:54:56.060 |
and you can virtually kind of like see what it'll look like. 00:55:05.420 |
And then we created a plugin that perfectly matched it, 00:55:05.420 |
so you could author the motion virtually and then push a button and see it physically play out. 00:55:11.900 |
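A minimal sketch of that author-then-play workflow: keyframes built in an animation tool get exported through a plugin and replayed on the physical robot. Everything named here is hypothetical; it only illustrates the shape of the bridge described.

```python
import json

def export_clip(keyframes: list[dict], path: str):
    """Plugin side: dump tool keyframes to a robot-readable clip file."""
    with open(path, "w") as f:
        json.dump({"fps": 30, "keyframes": keyframes}, f)

def play_on_robot(path: str, robot):
    """Robot side: step through the exported keyframes in real time.
    `robot` is assumed to expose head/lift/wheel setters and a wait()."""
    with open(path) as f:
        clip = json.load(f)
    for frame in clip["keyframes"]:
        robot.set_head_angle(frame["head_deg"])
        robot.set_lift_height(frame["lift"])
        robot.drive_wheels(frame["left_mmps"], frame["right_mmps"])
        robot.wait(1.0 / clip["fps"])
```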
And then sometimes like you would just feel it 00:55:39.900 |
like we were pretty decent about not, like, 00:55:39.900 |
you know, geeking out or getting too attached 00:55:42.860 |
put a customer hat on and does it truly kind of feel magical? 00:55:53.380 |
we just give a lot of autonomy to the character team 00:55:59.900 |
character board and mood boards and storyboards 00:56:02.060 |
and like what's the background of this character 00:56:08.260 |
but now had to operate under these unique constraints. 00:56:13.380 |
kind of took a fairly similar journey to what 00:56:17.060 |
a character in an animated film would, actually. 00:56:19.500 |
- Well, the thing that's really important to me, 00:56:23.140 |
well, I hope it's possible, pretty sure it's possible, 00:56:29.620 |
to make sure there's sufficient randomness in the process, 00:56:33.740 |
probably because it would be machine learning based, 00:56:47.700 |
were you surprised by certain things Cozmo did, 00:56:47.700 |
in how he'd respond in certain circumstances. 00:57:10.380 |
of like parameterized kind of emotional responses 00:57:12.700 |
and an emotional engine that would like kind of map 00:57:15.340 |
your current state of the game, your emotions, the world, 00:57:28.900 |
but still within the bounds of what felt like very realistic 00:57:33.580 |
And then what was really neat is that we could get statistics 00:57:35.620 |
on how much of that space we were saturating, 00:57:38.380 |
and then add more animations and more diversity 00:57:45.020 |
and maximize the chance that it stays feeling alive. 00:57:51.060 |
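A minimal sketch of those saturation statistics: log which (emotion, variant) pairs actually play, then measure how much of the authored space a user has seen, so you know where to add animations first. The structure is an illustrative assumption, not Anki's system.

```python
from collections import Counter

class AnimationCoverage:
    def __init__(self, library: dict[str, int]):
        # library: emotion name -> number of authored variants
        self.library = library
        self.seen = Counter()

    def record_playback(self, emotion: str, variant: int):
        self.seen[(emotion, variant)] += 1

    def saturation(self, emotion: str) -> float:
        """Fraction of this emotion's variants the user has seen.
        Emotions nearing 1.0 are where new animations pay off most."""
        distinct = sum(1 for (e, _v) in self.seen if e == emotion)
        return distinct / self.library[emotion]

coverage = AnimationCoverage({"happy": 40, "frustrated": 25})
coverage.record_playback("happy", 7)
print(coverage.saturation("happy"))  # 0.025
```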
like the permutations and kind of like the combinations 00:57:56.420 |
sometimes surprised us because you see them in isolation, 00:57:59.340 |
but when you actually see them and you see them live, 00:58:01.900 |
relative to some event that happened in the game 00:58:06.700 |
And not too different than other robotics applications 00:58:10.060 |
where like you get so used to thinking about like 00:58:12.780 |
the modules of a system and how things progress 00:58:16.220 |
that the real magic is when all the pieces come together 00:58:19.100 |
and you start getting the right emergent behavior 00:58:23.980 |
when you just kind of go too deep into any one piece of it. 00:58:26.020 |
- Yeah, when the system is sufficiently complex, 00:58:30.780 |
As a human being, you could still appreciate the beauty 00:58:35.860 |
First of all, thank you for humoring me on this. 00:58:41.740 |
I'd love to just, one last thing on the Butter Robot, 00:58:41.740 |
Do you think speech is too much of a degree of freedom? 00:59:01.180 |
Like is speech a feature or a bug of deep interaction? 00:59:09.860 |
- Yeah, for a product, it's too deep right now. 00:59:16.460 |
because the state of the art is just not good enough. 00:59:19.340 |
And that's on top of just narrowing down the demographic 00:59:24.740 |
versus the way you speak to a child is very different. 00:59:34.060 |
that is like rich enough and subtly realistic enough 00:59:43.300 |
Now, speech understanding is a different matter 00:59:45.780 |
where understanding intent, that's a really valuable input. 00:59:49.300 |
But giving it back requires like a way, way higher bar 00:59:57.620 |
And so that realization that you can do surprisingly much 01:00:08.700 |
It's quite powerful and it generalizes across cultures 01:00:15.100 |
I think we're gonna be in that world for a little while 01:00:17.940 |
where it's still very much an unsolved problem 01:00:23.780 |
So if you have legs and you're a big humanoid looking thing, 01:00:28.140 |
and a much narrower degree of what's gonna be acceptable 01:00:30.540 |
by society than if you're a robot like Cozmo or WALL-E 01:00:35.540 |
and you can, or some other form where you can kind of like 01:00:42.020 |
is so well understood in terms of expectations by humans 01:00:45.900 |
that you have far less flexibility on how to deviate 01:00:48.660 |
from that and lean into your strengths and avoid weaknesses. 01:00:52.020 |
- But I wonder if there is, obviously there's certain kinds 01:01:01.780 |
So I guess my intuition is we will solve certain, 01:01:06.780 |
we would be able to create some speech-based personalities 01:01:16.940 |
that doesn't know English and is learning English, right? 01:01:22.020 |
- A fiction where you're like, you're intentionally 01:01:24.700 |
kind of like getting a toddler level of speech. 01:01:28.500 |
So you can, like, tie it into the experience 01:01:36.580 |
or the lack of, sorry, dynamic range in the speech 01:01:42.500 |
And you've seen that in like kind of fictional characters 01:01:48.300 |
- And yeah, and like, and you kind of had that with like, 01:01:51.980 |
I don't know, I guess like Data and some of the other, 01:01:55.460 |
- But yeah, so you have to, and that becomes a constraint 01:02:01.380 |
- See, I honestly think like also if you add drunk 01:02:12.220 |
that allow you to be dumber from an NLP perspective. 01:02:12.220 |
We, if you just look at the full range of humans, 01:02:28.980 |
I think we, there's certain situations where we put up 01:02:33.980 |
with like lower level of intelligence in our communication. 01:02:39.940 |
Like if somebody is drunk, we understand the situation, 01:02:48.540 |
I'm sure there's a lot of other kind of situations. 01:02:52.700 |
- So yeah, again, language, loss in translation, 01:02:55.580 |
that kind of stuff that I think if you play with that, 01:03:00.340 |
what is it, the Ukrainian boy that passed the Turing test, 01:03:06.180 |
And then you can create compelling characters, 01:03:08.380 |
but you're right, that's a dangerous sort of road to walk 01:03:14.260 |
- Yeah, and that's why like you have these big pushes 01:03:19.860 |
like where you'd have like full like human replicas 01:03:38.860 |
- It's not in terms of a rich, deep, fulfilling experience. 01:03:43.900 |
- Yeah, and creating a minefield of potential places 01:03:51.380 |
where like the biggest kind of functional AI challenges 01:04:00.340 |
And that's part of the challenges is like, yeah, 01:04:01.980 |
like robots are gonna get to like thousands of dollars, 01:04:06.140 |
but you can imagine what sort of expectation of value 01:04:10.020 |
And so that's where you wanna be able to invest 01:04:15.740 |
And so going down the full human replica route 01:04:34.300 |
one of the greatest at this point roboticists ever 01:04:43.620 |
that created the little guy with a deep personality 01:04:58.020 |
which is autonomous driving and more specifically, 01:05:26.460 |
and how in one company you almost have to create, 01:05:40.660 |
So Waymo is focused on building what we call a driver, 01:05:45.580 |
which is creating the ability to have autonomous driving 01:05:49.500 |
across different environments, vehicle platforms, 01:05:54.180 |
You know, as you know, it got started in 2009. 01:05:58.220 |
It was almost like an immediate successor 01:06:10.580 |
and so what Waymo is doing is creating the systems, 01:06:22.500 |
This hits on consumer transportation and ride sharing 01:06:28.620 |
And as you mentioned, it hits on autonomous trucking 01:06:34.060 |
So in a lot of ways, it's transporting people 01:06:35.540 |
and transporting goods, but at the end of the day, 01:06:38.380 |
the underlying capabilities that are required to do that 01:06:40.860 |
are better aligned than one might expect, 01:06:45.260 |
where it's the fundamentals of being able to understand 01:06:50.140 |
make intelligent decisions and prove that we are 01:06:53.140 |
at a level of safety that enables large-scale autonomy. 01:07:07.860 |
You have a set of sensors that perceive the world, 01:07:16.420 |
And so in the same way that you have a driver's license 01:07:34.180 |
that kind of add some extra additive challenges. 01:07:38.580 |
It's the underlying systems that enable a physical vehicle 01:07:47.420 |
accomplish the tasks that previously weren't possible 01:07:57.140 |
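A minimal sketch of that "driver" abstraction: one shared perceive-predict-plan loop, with the vehicle platform swapped in as data rather than as a separate stack. Everything here is an illustrative assumption, not Waymo's architecture.

```python
from dataclasses import dataclass

@dataclass
class Platform:
    name: str
    length_m: float
    max_brake_decel_mps2: float  # a loaded truck brakes far more gently

def perceive(sensor_frame):
    """Fuse lidar/camera/radar into tracked objects (stubbed)."""
    return {"objects": [], "lanes": []}

def predict(scene):
    """Forecast where other agents will be (stubbed)."""
    return []

def drive_tick(sensor_frame, platform: Platform):
    scene = perceive(sensor_frame)
    futures = predict(scene)
    # The same core loop serves both products; the platform's dynamics
    # constrain how aggressively the plan may brake or merge.
    return {"plan": "keep_lane",
            "decel_limit": platform.max_brake_decel_mps2,
            "scene": scene, "futures": futures}

minivan = Platform("ride-hailing minivan", 5.2, 7.0)
class8_truck = Platform("class 8 truck", 21.0, 3.5)
```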
which is the transporting-people side, from a brand perspective. 01:08:01.420 |
And just in case we refer to it so people know, 01:08:10.420 |
Is it just like a cool sounding name that just, 01:08:19.460 |
I mean, when you think about it, it's just like, 01:08:21.140 |
well, we're gonna transport it via this and that. 01:08:31.060 |
And the interesting thing is that even the groupings 01:08:32.660 |
kind of blur where Waymo One is like human transportation 01:08:35.660 |
and there's a fully autonomous service in the Phoenix area 01:08:40.940 |
And it's pretty incredible to like, just, you know, 01:08:46.580 |
And then on the Via side, it doesn't even have to be, 01:08:50.380 |
like long haul trucking is a, like a major focus of ours, 01:08:56.980 |
the vehicle transportation as well for local delivery. 01:09:00.620 |
Also, a lot of the requirements for local delivery 01:09:03.100 |
overlap very heavily with consumer transportation. 01:09:06.660 |
Obviously, you know, given that you're operating 01:09:14.300 |
And so, yeah, and Waymo very much is a, you know, 01:09:18.940 |
multi-product company that has ambitions in both. 01:09:26.260 |
But the cool thing is that there's a huge amount 01:09:29.220 |
of leverage and this kind of core technology stack 01:09:36.540 |
but the success case is that the challenges that you push on, 01:09:41.140 |
they get leveraged across all platforms and all domains. 01:09:44.100 |
- From an engineering perspective, are the teams integrated? 01:09:47.980 |
So there's a huge amount of centralized kind of core teams 01:09:52.260 |
And so you think of something like the hardware team 01:09:57.780 |
This is an experience that carries over across, you know, 01:10:03.580 |
Then there's like really unique perception challenges, 01:10:06.820 |
planning challenges, like other types of challenges 01:10:10.060 |
where there's a huge amount of leverage on a core tech stack, 01:10:12.540 |
but then there's like dedicated teams that think of 01:10:16.060 |
For example, an articulated trailer with varying loads 01:10:19.860 |
that completely changes the physical dynamics of a vehicle 01:10:33.180 |
via the autonomous trucking effort that Waymo's doing? 01:10:43.420 |
These are 53 foot trailers that capture like a big, 01:10:47.060 |
a pretty sizable percentage of the goods transportation 01:10:49.980 |
Long-term, the opportunity is obviously to expand 01:10:58.500 |
and start to really expand in both the volume 01:11:08.180 |
with a very specific operating kind of domain 01:11:11.260 |
and constraints that allow you to solve the problem. 01:11:14.060 |
But then over time, you start to really try to push 01:11:17.940 |
against those boundaries and open up deeper feasibility 01:11:37.940 |
today you have already a giant shortage of truck drivers. 01:11:45.180 |
that's expected to grow to hundreds of thousands 01:11:48.700 |
You have really, really quickly increasing demand 01:11:56.700 |
You have one of the deepest safety challenges 01:12:03.340 |
where there's a huge, huge, huge kind of challenge 01:12:07.900 |
around fatigue and around kind of the long routes 01:12:11.500 |
And even beyond kind of the cost and necessity of it, 01:12:21.820 |
and regulatory constraints that are tied to trucking today. 01:12:37.500 |
on how far jumps with a single driver could be 01:12:40.540 |
and makes you very subject to availability of drivers, 01:12:48.980 |
And so you start to have an opportunity on everything 01:12:53.020 |
from plugging into existing fleets and brokerages 01:12:58.260 |
and just immediately start to have a huge opportunity 01:13:01.740 |
to add value on cost of driving, fuel, insurance, 01:13:09.220 |
all the way to completely reinventing the logistics network 01:13:16.500 |
- Yeah, I had, it'll be published before this, 01:13:26.020 |
but we talked about much of the fascinating human stories 01:13:31.460 |
He also was a truck driver for a bit as a grad student 01:13:35.580 |
to try to understand the depth of the problem. 01:13:39.140 |
we have some drivers that have 4 million miles 01:13:43.620 |
And yeah, it's, you know, learning from them, 01:13:47.540 |
like some of them are on the road for 300 days a year. 01:13:53.140 |
Just like you said, there's a shortage of actually people, 01:13:56.780 |
truck drivers taking the job counter to what is, 01:14:06.060 |
and a shortage of people to take up those jobs. 01:14:08.540 |
And just like you said, it's such a difficult problem. 01:14:17.660 |
to understand, you know, how hard is this problem? 01:14:20.860 |
And that's the question I wanna ask you from a perception, 01:14:25.700 |
What's your sense of how difficult is autonomous trucking? 01:14:32.420 |
are super difficult, which are more manageable. 01:14:42.380 |
So there's, and as you can expect, it's a mix. 01:14:53.860 |
And so, you know, on the things that are like, 01:15:05.780 |
where that's where a majority of the value is captured. 01:15:09.820 |
and a lot more consistency across freeways across the US, 01:15:20.620 |
compared to the lack of consistency and variability across cities. 01:15:23.160 |
So you can leverage that consistency to tackle, 01:15:27.140 |
at least in that respect, a more constrained AI problem, 01:15:32.720 |
You can itemize much more of the sort of things 01:15:50.780 |
Like when people talk about traveling across country, 01:15:58.820 |
Like, is there a stretch of road that's like nice and clean? 01:16:03.300 |
And then there's like cities with difficulties in them 01:16:06.260 |
that you kind of think of as the canonical problem 01:16:11.800 |
Waymo very intentionally picked the Phoenix area 01:16:19.400 |
where when you think of consumer transportation 01:16:28.540 |
And so really pushing at and solving San Francisco 01:16:31.300 |
becomes a really huge opportunity and importance. 01:16:34.420 |
And places one dot on kind of like the spectrum 01:16:45.820 |
it's I believe the fastest growing city in the US. 01:16:53.940 |
a really wide range of kind of like complexities. 01:16:58.380 |
actually exposes you to a lot of the building blocks 01:17:00.180 |
you need for the more complicated environments. 01:17:05.320 |
that if you start to kind of place a few of these 01:17:07.300 |
kind of dots where San Francisco has these types 01:17:12.980 |
especially when you get into the downtown areas 01:17:14.660 |
and so forth, and Phoenix has like a really interesting 01:17:18.880 |
maybe other ones like LA kind of add freeway focus 01:17:23.180 |
You start to kind of cover the full set of features 01:17:25.820 |
that you might expect, and it becomes faster and faster 01:17:28.860 |
if you have the right systems and the right organization 01:17:31.460 |
to then open up the fifth city and the 10th city 01:17:37.380 |
where obviously there's uniquenesses in freeways 01:17:42.120 |
and then the real opportunity to then get even more value 01:17:52.920 |
we have a big facility that we're finishing building 01:17:55.660 |
in Q1 in the Dallas area that'll allow us to do testing 01:17:59.820 |
from the Dallas area on routes like Dallas to Houston, 01:18:10.460 |
- Well, Waymo, the car side was in Austin for a while. 01:18:23.300 |
On trucking, a huge opportunity is Port of LA going East. 01:18:34.180 |
where you have the biggest port in the United States. 01:18:37.540 |
And the amount of goods going East from there 01:18:40.940 |
And then obviously there's kind of channels everywhere, 01:18:45.660 |
as you get into like snow and inclement weather and so forth. 01:18:57.700 |
And so there's all these dimensions that we think about. 01:19:16.300 |
that level four refers to kind of the first step 01:19:20.980 |
that you could recognize as fully autonomous driving. 01:19:24.300 |
Level five is really fully autonomous driving anywhere. 01:19:27.260 |
And level four is fully autonomous driving within a defined domain, 01:19:32.420 |
depending on who you ask what that actually means. 01:19:40.140 |
Let's say like there's three parts of long haul trucking. 01:19:43.060 |
Maybe I'm wrong in this, but there's freeway driving, 01:19:57.580 |
Which of them do you include under level four? 01:20:03.060 |
Where's the biggest impact to be had in the short term? 01:20:27.220 |
you can get to market in so many different ways. 01:20:38.300 |
that are at the entry points to metropolitan areas, 01:20:44.780 |
which does a few things that are very valuable. 01:20:49.860 |
you can automate transfer hub to transfer hub, 01:21:01.940 |
that you feel you wanna handle at that point in time. 01:21:09.580 |
like you need to come out in January and check it out, 01:21:12.900 |
It's not only is it our main operating headquarters 01:21:21.380 |
designed driverless hub for autonomous trucks 01:21:24.580 |
in terms of where do they enter, where do they depart? 01:21:27.180 |
How do you think about the flow of people, goods, everything? 01:21:30.540 |
and it's really beautiful on how it's thought through. 01:21:34.260 |
it is totally reasonable to do the last five miles manually 01:21:40.340 |
to avoid having to solve the general surface street problem, 01:21:57.180 |
And we have probably the best advantages in the world 01:21:59.380 |
because of all the Waymo experience on surface streets, 01:22:09.460 |
L4 can be applied to any domain, operating domain or scope, 01:22:13.220 |
but it's effectively for the places where we say 01:22:17.620 |
We are 100% operating there as a self-driving truck 01:22:17.620 |
And it doesn't mean that you operate in every condition. 01:22:36.700 |
operating conditions, routes, kind of domain, 01:22:45.180 |
L5 is just not even really worth thinking about 01:22:47.540 |
because there's always gonna be these extremes. 01:22:55.860 |
of expanded capabilities that create the most value 01:22:59.140 |
and teach us the most and create this feedback loop 01:23:07.660 |
So first of all, I have to, when I'm allowed, 01:23:10.740 |
visit the Dallas facility 'cause it's super cool. 01:23:13.180 |
It's like robot on the giving and the receiving end. 01:23:17.340 |
The truck is a robot and the hub is a robot. 01:23:17.340 |
Like, so we have our servers where we know exactly 01:23:50.220 |
And so you can imagine like a large backend system 01:23:52.620 |
that over time starts to manage timings, goods, 01:24:02.980 |
there might be special cases where that is valuable 01:24:06.660 |
but a majority of the intelligence is gonna be on the truck 01:24:19.220 |
But there's a distinct type of workflow where, 01:24:33.660 |
so that you minimize the interaction between humans 01:24:38.580 |
And then how do you even intelligently select 01:24:53.940 |
to lean into your current capabilities and strengths 01:24:56.100 |
so that you minimize the amount of work that's necessary 01:25:02.380 |
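A minimal sketch of intelligently selecting routes to lean into current capabilities: filter candidate freight lanes against the driver's operating domain before dispatch. All fields and numbers are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Route:
    name: str
    freeway_fraction: float   # share of miles on freeway
    has_snow_risk: bool
    surface_street_miles: float

@dataclass
class Capability:
    min_freeway_fraction: float
    handles_snow: bool
    max_surface_miles: float  # last-mile tolerated before a transfer hub

def dispatchable(routes: list[Route], cap: Capability) -> list[Route]:
    return [
        r for r in routes
        if r.freeway_fraction >= cap.min_freeway_fraction
        and (cap.handles_snow or not r.has_snow_risk)
        and r.surface_street_miles <= cap.max_surface_miles
    ]

lanes = [Route("Dallas-Houston", 0.97, False, 5.0),
         Route("Denver-SLC", 0.95, True, 8.0)]
today = Capability(min_freeway_fraction=0.9, handles_snow=False,
                   max_surface_miles=6.0)
print([r.name for r in dispatchable(lanes, today)])  # ['Dallas-Houston']
```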
So first, is the goal to have no human in the truck? 01:25:16.980 |
and obviously these trucks can also be manually driven. 01:25:20.500 |
So sometimes like we talk with our fleet partners 01:25:29.060 |
and on the routes that are autonomous, it's autonomous. 01:25:31.620 |
On the routes that are not, it's human driven. 01:25:52.340 |
the largest value proposition is where you're able 01:25:55.500 |
to have no constraints on how you can operate this truck. 01:26:05.340 |
'cause you mentioned that also opportunity to revamp 01:26:15.660 |
my understanding is logistics is not perhaps as great 01:26:19.500 |
as it could be in the current trucking environment. 01:26:29.420 |
Maybe some of it is literally just it's old school. 01:26:39.620 |
There's an independence and there's not a nice interface 01:26:42.980 |
where they can communicate where they're going, 01:26:46.940 |
And so it just feels like there's so much opportunity 01:26:49.700 |
to digitize everything to where you could optimize 01:26:57.380 |
How much are you thinking about that problem? 01:27:03.940 |
how much opportunity is there to revolutionize 01:27:06.220 |
the space of logistics in autonomous trucking, 01:27:10.660 |
It's, this is one of the most motivating aspects 01:27:14.740 |
there's like a mountain of problems that are like, 01:27:16.580 |
you want to, you have to solve to get to like 01:27:18.100 |
the first checkpoints and first driverless and so forth. 01:27:22.500 |
you plug in initially into the existing kind of system 01:27:29.700 |
And so a couple of the factors that play into it. 01:27:34.220 |
there's obviously just the physical constraints 01:27:41.020 |
you know, right now because of just these demands 01:27:58.340 |
the percentage of the overall trucking market 01:28:07.380 |
is like a one to five truck, you know, family business. 01:28:11.340 |
And so there's just like a huge amount of like fragmentation, 01:28:15.140 |
which makes for really interesting challenges 01:28:18.300 |
in kind of stitching together through like bulletin boards 01:28:21.700 |
and brokerages and some people run their own fleets. 01:28:27.460 |
but it is one of the less digitized and optimized worlds 01:28:47.060 |
if I were to predict that whilst trying to solve 01:28:51.940 |
Waymo might solve the logistics problem first. 01:28:55.500 |
Like, 'cause that would already be a huge impact. 01:28:59.220 |
So on the way to solving autonomous trucking, 01:29:02.020 |
the human driven, like there's so much opportunity 01:29:05.260 |
to significantly improve the human driven trucking, 01:29:15.660 |
Well, even the, I mean, you get really ambitious, 01:29:28.460 |
And a lot of the inefficiency today is because like, 01:29:31.980 |
you have a delay, like Port of LA has a bunch of ships 01:29:35.740 |
right now waiting outside of it because they can't dock 01:29:37.980 |
because there's not enough labor inside of the Port of LA. 01:29:43.100 |
which means there's a big backlog of deliveries, 01:29:44.940 |
which means the drivers aren't where they need to be. 01:29:46.660 |
And so you have this like huge chain reaction 01:29:49.100 |
and your feasibility of readjusting in this network is low 01:29:54.580 |
and manual kind of processes or distributed processes 01:30:03.540 |
yes, we have to solve autonomous trucking first. 01:30:05.460 |
And that, by the way, that's not like an overnight thing. 01:30:07.380 |
That's decades of continued kind of expansion and work. 01:30:10.940 |
But the first checkpoint in the first route is like, 01:30:16.100 |
But once you start enabling and you start to learn 01:30:17.940 |
about how the constraints of autonomous trucking, 01:30:40.420 |
and the sort of learnings that they have about the industry 01:30:50.820 |
Or what if you enabled this other constraint? 01:30:53.060 |
That actually drives the roadmap in a lot of ways 01:30:54.860 |
because this is not like an all or nothing problem. 01:31:02.060 |
which functionality most enables this optimization 01:31:05.340 |
ends up being kind of part of the discussion. 01:31:12.980 |
and you think about like very generalized capability 01:31:23.420 |
The efficiency goes far beyond just direct cost 01:31:28.460 |
They go towards reinventing the entire system 01:31:33.180 |
you see, you know, these other industries, 01:31:45.740 |
that autonomous trucking is like email versus mail. 01:31:48.700 |
And then with email, you're still doing the communication, 01:31:53.900 |
but you unlock varieties of communication that you didn't anticipate. 01:31:57.420 |
- That's right, constraints are just completely different. 01:31:59.540 |
And yeah, there's definitely a property of that here. 01:32:09.500 |
that the industry has done where there's companies 01:32:18.100 |
but it's an interesting kind of merger of worlds 01:32:31.460 |
between how to use autonomy and how to use humans 01:32:56.860 |
So you have transportation of goods for short distances 01:33:06.020 |
and the difference between a passenger vehicle and a truck is just size, 01:33:13.580 |
and that doesn't matter as much, as long as you have the same sensor suite, 01:33:18.500 |
these do kind of converge where in a lot of ways, 01:33:21.420 |
a lot of the challenges we're solving are freeway driving, 01:33:30.820 |
like you have very different dynamics in your vehicle 01:33:30.820 |
in order to have the proper like response time 01:33:37.900 |
because you have an 80,000 pound fully loaded truck. 01:33:41.140 |
That's a very, very different type of braking profile 01:33:44.740 |
You have a really interesting set of dynamic limits, 01:33:44.740 |
it's very, very hard to physically flip a car, 01:33:51.220 |
like most risk in a car is from just collisions. 01:33:55.740 |
It's very hard, in any normal operation, 01:33:59.060 |
unless you hit something, to actually flip it. 01:34:02.980 |
On a truck, you actually have to drive much closer to those physical limits, 01:34:05.740 |
because you could have really interesting interactions 01:34:13.660 |
If you turn too quickly, you have roll risks and so forth. 01:34:28.100 |
and those boundaries change based on the load that you have, 01:34:34.900 |
so that you're leveraging your dynamic range, 01:34:39.820 |
but understanding what those safety bounds are. 01:34:41.220 |
And so we have this like really cool test facility 01:34:43.620 |
where we like take it to the max and actually, 01:34:46.500 |
imagine a truck with these giant training wheels 01:34:53.220 |
in order to like try to actually see where it rolls. 01:34:55.860 |
And so you define this high dimensional boundary, 01:34:59.020 |
which then gets captured in software to stay safe 01:35:03.660 |
the sort of kind of challenges you have there. 01:35:07.580 |
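To make that concrete, here's a minimal sketch of the textbook static rollover model; the numbers and helper names are illustrative assumptions for the sketch, not Waymo's measured boundary, which as described above is empirical and far higher dimensional:

```python
import math

G = 9.81  # m/s^2

def lateral_accel_limit(track_width_m: float, cg_height_m: float) -> float:
    """Static rollover threshold: lateral acceleration at which the
    vehicle would start to tip, a_lat = g * (track/2) / h_cg."""
    return G * (track_width_m / 2) / cg_height_m

def max_curve_speed(radius_m: float, a_lat_limit: float, margin: float = 0.5) -> float:
    """Max speed through a curve of a given radius, keeping a safety
    margin below the rollover limit (v = sqrt(a * R))."""
    return math.sqrt(margin * a_lat_limit * radius_m)

# Illustrative numbers: a passenger car vs. a fully loaded tractor-trailer.
car_limit = lateral_accel_limit(track_width_m=1.6, cg_height_m=0.55)
truck_limit = lateral_accel_limit(track_width_m=2.0, cg_height_m=2.2)
print(f"car rollover limit   ~{car_limit / G:.1f} g")    # ~1.5 g: collisions dominate
print(f"truck rollover limit ~{truck_limit / G:.2f} g")  # ~0.45 g: roll risk is real
print(f"max speed on a 200 m ramp: {max_curve_speed(200, truck_limit):.1f} m/s")
```

A loaded trailer raises the center of gravity, which is why the boundary shifts with the load, as Boris notes.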
These dynamics drive really interesting challenges from perception 01:35:12.500 |
And obviously in Planner where you have to think about 01:35:15.940 |
merging and creating gaps with a 53 foot trailer 01:35:20.140 |
And then obviously the platform itself is very different. 01:35:25.180 |
and you also have unique blind spots that you have 01:35:27.020 |
because of the trailer, which you have to think about. 01:35:32.300 |
you try to capture these special cases in a way 01:35:35.700 |
that is cleanly augmentations of the existing tech stack, 01:35:46.980 |
And over time, they all start to kind of merge ideally 01:36:01.220 |
and have been nowhere near 2X the cost or investment or size. 01:36:10.420 |
- So what kind of sensor suite, can you speak to it? 01:36:10.420 |
LiDAR, vision, how many, what are we talking about here? 01:36:27.740 |
And so we have like dozens of cameras, radar, 01:36:35.500 |
have a central main sensor pod on the roof in the middle, 01:36:38.620 |
and then some kind of hood sensors for blind spots. 01:36:42.060 |
The truck moves to two main sensor pods on the outsides 01:36:54.500 |
- Kind of on the cabin, not all the way in the front, 01:36:56.900 |
but like kind of where the mirrors for the driver would be. 01:37:05.580 |
and you would be occluded with this like awkward wedge. 01:37:09.620 |
And so then you would add a lot of complexity 01:37:14.900 |
- There's so many probably fascinating design choices. 01:37:18.500 |
- 'Cause you can probably bring up a LiDAR higher 01:37:25.020 |
that ultimately probably will define the industry. 01:37:30.500 |
So one is like you're just beyond the trailer, 01:37:43.340 |
And the same perception system you use on the car side, 01:37:47.940 |
and you can retrain some models and so forth, 01:37:52.780 |
where there's a really nice built-in redundancy 01:37:57.620 |
where you can afford to have any one of them fail 01:38:04.860 |
And you will be able to detect when one of them fails 01:38:10.300 |
they're giving you the data that's inconsistent 01:38:14.500 |
And it's not just like they no longer give data. 01:38:23.740 |
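As a rough illustration of that kind of cross-checking, a toy consistency monitor might look like this; the structure, window, and thresholds are invented for the sketch, not the actual onboard logic:

```python
from collections import deque

class SensorConsistencyMonitor:
    """Toy health check: two sensors with overlapping fields of view
    should report similar object counts; a persistent mismatch flags
    a degraded sensor even when it is still producing data."""

    def __init__(self, window: int = 50, max_disagreement: float = 0.3):
        self.history = deque(maxlen=window)
        self.max_disagreement = max_disagreement

    def update(self, count_a: int, count_b: int) -> bool:
        denom = max(count_a, count_b, 1)
        self.history.append(abs(count_a - count_b) / denom)
        mean = sum(self.history) / len(self.history)
        return mean > self.max_disagreement  # True => flag inconsistency

monitor = SensorConsistencyMonitor()
for frame in range(100):
    left_pod_count = 8                         # healthy sensor
    right_pod_count = 8 if frame < 40 else 2   # silently degrades mid-run
    if monitor.update(left_pod_count, right_pod_count):
        print(f"frame {frame}: pods disagree, route around suspect sensor")
        break
```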
So what's neat is that like you have way more sensors, 01:39:22.460 |
in this beautiful problem of autonomous trucking? 01:39:55.940 |
Which version of the hardware stack is it currently? 01:40:07.740 |
which includes sensors and compute is fifth generation. 01:40:11.100 |
- I can't wait until there's like iPhone style 01:40:21.140 |
it takes a lot to like retrain the models and everything. 01:40:33.820 |
And so Waymo has leaned into that as a strength. 01:40:36.780 |
And so a lot of the near range perception system 01:40:39.180 |
that obviously kind of carries over a lot from the car side 01:40:43.460 |
uses LiDAR as a very prominent kind of like primary sensor, 01:40:46.620 |
but then obviously everything has its strengths 01:40:49.740 |
And so in the near range, LiDAR is a gigantic advantage 01:40:56.220 |
when it comes to occlusions in certain areas, 01:41:17.060 |
the position of the sun, the time of the day, 01:41:20.020 |
various of the properties can have a big impact, 01:41:23.620 |
whether there's glare, the field of view, things like that. 01:41:29.140 |
- Even in the face of a changing external environment, 01:41:29.140 |
that physically bounce off of something and come back. 01:41:44.900 |
And so whatever the conditions are, 01:41:44.900 |
like the shape of a human, a sensor reading from a human 01:41:48.220 |
for kind of like the long tail of challenges. 01:42:02.140 |
in terms of range and ours has a really good range, 01:42:08.420 |
on top of the general redundancy that you want 01:42:10.380 |
for near range and complements through cameras and radar 01:42:13.260 |
for occlusions and for complementary information 01:42:13.260 |
because your LiDAR data will fundamentally drop off 01:42:24.300 |
and you have to be able to see kind of objects further out. 01:42:32.340 |
where you get a high density, high resolution camera, 01:42:37.380 |
and it's like really potentially a huge value. 01:42:40.420 |
Now, the signal drops off, the noise is higher, 01:42:45.780 |
And one that you may not think about: localizing is harder, 01:42:45.780 |
knowing where something's located a kilometer away. 01:42:52.020 |
And that's the difference between being on the shoulder or in a lane. 01:42:54.460 |
And so you have like interesting challenges there 01:42:59.660 |
which have a bunch of approaches that come into it. 01:43:01.660 |
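A quick back-of-the-envelope on why that's hard: at a kilometer out, a tiny bearing error becomes a lane-sized position error. The error figures below are just illustrative:

```python
import math

def lateral_error_m(range_m: float, angular_error_deg: float) -> float:
    """Lateral position error produced by a small bearing error at long range."""
    return range_m * math.tan(math.radians(angular_error_deg))

for err_deg in (0.05, 0.1, 0.2):
    print(f"{err_deg:>4} deg error at 1 km -> "
          f"{lateral_error_m(1000, err_deg):.2f} m laterally")
# 0.2 degrees is already ~3.5 m, about the width of a lane:
# the difference between "on the shoulder" and "in your lane".
```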
Radar is interesting because it also has longer range 01:43:06.660 |
than LiDAR and it gives you speed information. 01:43:12.020 |
So it becomes very, very useful for dynamic information 01:43:15.540 |
of traffic flow, vehicle motions, animals, pedestrians, 01:43:20.260 |
like just things that might be useful signals. 01:43:27.140 |
where radar actually penetrates weather conditions 01:43:34.700 |
towards not thinking about a problem as a LiDAR problem 01:43:38.700 |
but it's a fusion problem where these are all 01:43:46.340 |
And in many cases, you just look for the signals 01:43:49.980 |
that might be present in the union of all of these 01:43:52.060 |
and leave it to the system as much as possible 01:43:54.860 |
to start to really identify how to extract that. 01:44:09.380 |
There's a question that's probably still an open question: 01:44:24.780 |
do you fuse the sensors early, or do you do some kind of heterogeneous late fusion? 01:44:33.140 |
Or at least an inkling of intuitions you can come up with? 01:44:45.020 |
And like when it gets to like final semantics 01:44:58.820 |
And that is because late fusion does not allow you to recover signal you've already thrown away. 01:45:08.020 |
Weather's a great example where, if you do early fusion, 01:45:12.940 |
it's really hard for any single sensor in rain to solve that problem, 01:45:19.660 |
you have weird kind of noise from the camera, 01:45:24.420 |
But the combination of all of them can help you filter 01:45:32.460 |
And be much more fluid about the strengths and weaknesses 01:45:45.380 |
Whereas like you might be a little bit more resilient 01:45:57.700 |
just maximizes your ability to pull out the best signal 01:46:01.300 |
before you start making constraining decisions 01:46:03.820 |
that end up being hard to unwind late in the stack. 01:46:06.140 |
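Here's a toy numerical version of that argument, with made-up evidence values and thresholds: each sensor is individually below its detection threshold, so late fusion of per-sensor decisions misses the object, while fusing the raw evidence first recovers it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-sensor evidence for one object in heavy rain: each channel is
# weak and noisy on its own (values invented for illustration).
camera = rng.normal(0.4, 0.1)
lidar = rng.normal(0.5, 0.1)
radar = rng.normal(0.45, 0.1)

PER_SENSOR_THRESHOLD = 0.7   # what each stack would need alone
FUSED_THRESHOLD = 1.2        # joint evidence needed after early fusion

# Late fusion: threshold each sensor first, then combine the decisions.
late_decision = any(s > PER_SENSOR_THRESHOLD for s in (camera, lidar, radar))

# Early fusion: combine the raw evidence, then make one decision.
early_decision = (camera + lidar + radar) > FUSED_THRESHOLD

print(f"late fusion sees object:  {late_decision}")   # each weak cue discarded
print(f"early fusion sees object: {early_decision}")  # the weak cues add up
```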
- So how much of this is a machine learning problem? 01:46:11.780 |
play in this whole problem of autonomous driving, 01:46:16.620 |
- It's massive and it's increasing over time. 01:46:37.740 |
and search-based approaches and planning and so forth. 01:46:42.060 |
with these types of approaches kind of across the board, 01:46:49.820 |
the further you get in some parts of the stack. 01:47:19.540 |
There's a spectrum between kind of like end-to-end ML, 01:47:22.300 |
which, you know, is a little bit too far, 01:47:24.780 |
and how you architect it to where you have modules, 01:47:27.540 |
but enough ability to think about long tail problems 01:47:39.700 |
including behavior where even when it's not like 01:47:44.700 |
a gigantic ML problem that covers like a giant swath 01:47:48.500 |
end to end, more and more parts of the system 01:47:58.820 |
and try to solve new expansions of domains, 01:48:04.620 |
it becomes intractable for a human to approach that 01:48:09.940 |
the way one has kind of approached some elements of the tech stack. 01:48:12.380 |
- So are you trying to create a data pipeline 01:48:33.460 |
So labeling workflows, ML workflows, everything. 01:48:40.540 |
where almost every model that we have on a truck, 01:48:51.220 |
despite the different domain and different numbers of sensors, 01:48:54.700 |
there's a lot of signals that carry over across driving, 01:49:00.340 |
where you can reduce the amount of data you need by a lot. 01:49:04.060 |
And so we're increasingly thinking about our data 01:49:22.180 |
we're constantly kind of like adding more data 01:49:37.620 |
And a lot of that 85% comes from surface streets 01:49:41.660 |
because we just had so much of it and it was really valuable. 01:49:46.340 |
particularly in the areas where we need more data, 01:49:50.700 |
- Just all different visual characteristics of roads, 01:49:53.260 |
lane markings, pedestrians, all that, that's still relevant. 01:49:57.460 |
And then just the fundamentals of how, you know, 01:49:59.620 |
you detect the car, does it really change that much? 01:50:02.740 |
Whether you're detecting it from a car or a truck, 01:50:02.740 |
the geometry around your vehicle will change a little bit, 01:50:07.780 |
but the basics, like there's a lot of signal in there. 01:50:10.340 |
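A minimal sketch of that kind of transfer, using a generic warm-start fine-tuning recipe; the model, data shapes, and domain-shift numbers are stand-ins, not the actual pipeline:

```python
import numpy as np

def train_logistic(X, y, w=None, lr=0.1, steps=200):
    """Tiny logistic-regression trainer; passing `w` warm-starts
    from weights learned in another domain."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0, 0.5])

def make_domain(n, shift):
    """Same underlying task, slightly shifted inputs = domain gap."""
    X = rng.normal(shift, 1.0, size=(n, 3))
    y = (X @ true_w + rng.normal(0, 0.5, n) > 0).astype(float)
    return X, y

X_car, y_car = make_domain(5000, shift=0.0)    # plentiful surface-street data
X_truck, y_truck = make_domain(50, shift=0.3)  # scarce freeway/truck data

w_pre = train_logistic(X_car, y_car)                               # pretrain
w_ft = train_logistic(X_truck, y_truck, w=w_pre.copy(), steps=50)  # fine-tune
w_scratch = train_logistic(X_truck, y_truck, steps=50)             # no transfer

X_test, y_test = make_domain(2000, shift=0.3)
acc = lambda w: np.mean(((X_test @ w) > 0) == y_test)
print(f"trained on 50 truck samples only: {acc(w_scratch):.3f}")
print(f"pretrained on car, fine-tuned:    {acc(w_ft):.3f}")
# With scarce target data, the warm start usually matches or beats
# training from scratch -- the signals carry over across domains.
```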
And then there's the sparsity of events on a freeway. 01:50:18.020 |
The frequency of events happening on a freeway, 01:50:22.780 |
whether it's, you know, interesting events, 01:50:22.780 |
on a freeway is far more sparse than on a surface street. 01:50:35.540 |
And so that leads to really interesting data problems 01:50:50.820 |
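In practice that tends to mean mining the logs for the rare interesting moments; here's a toy filter over driving frames, with invented field names and thresholds:

```python
HARD_BRAKE_MPS2 = -3.0   # decel threshold (illustrative)
MIN_TTC_S = 2.0          # time-to-collision threshold (illustrative)

def is_interesting(frame: dict) -> bool:
    """Keep only frames worth labeling: hard brakes, close calls,
    cut-ins. On a freeway, almost everything else is empty cruising."""
    return (frame["accel_mps2"] < HARD_BRAKE_MPS2
            or frame["min_ttc_s"] < MIN_TTC_S
            or frame["cut_in_detected"])

log = [
    {"accel_mps2": 0.1, "min_ttc_s": 9.0, "cut_in_detected": False},
    {"accel_mps2": -4.2, "min_ttc_s": 1.4, "cut_in_detected": False},
    {"accel_mps2": 0.0, "min_ttc_s": 8.5, "cut_in_detected": True},
]
interesting = [f for f in log if is_interesting(f)]
print(f"kept {len(interesting)} of {len(log)} frames for labeling")
```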
And so there's really interesting kind of technical 01:51:13.340 |
well, if we've driven, you know, over 20 million miles, 01:51:25.740 |
So you want to do, make sure you have regression prevention 01:51:28.100 |
and protection of everything you're doing, right? 01:51:31.620 |
When you encounter something interesting in the world, 01:51:34.660 |
let's say there was an issue with how the vehicle behaved 01:51:42.660 |
and seeing how you would have reacted to that scenario 01:51:48.460 |
of your regression set after that point, right? 01:51:51.420 |
Then you start getting into like really, really interesting territory. 01:51:51.420 |
I have these metrics that are really correlated 01:52:09.060 |
So grabbing a large scale batch of historical data 01:52:14.060 |
and simulating it to get a signal of over these last, 01:52:17.980 |
or just random sample of a hundred thousand miles, 01:52:20.220 |
how has this metric changed versus where we are today? 01:52:22.980 |
You can do that far more efficiently in simulation 01:52:24.740 |
than just driving with that new system on board, right? 01:52:28.220 |
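Structurally, that workflow looks something like the toy harness below; `simulate` is a stand-in for the real simulator, and the scenarios and metric values are invented:

```python
def simulate(software_build, scenario):
    """Stand-in for the real simulator: replay a logged scenario
    through a software build and return a metric (higher = better)."""
    return software_build["quality"].get(scenario, 0.9)

def regression_check(new_build, baseline_build, regression_set, tol=0.01):
    """Flag any scenario where the candidate build got meaningfully worse."""
    regressions = []
    for scenario in regression_set:
        new = simulate(new_build, scenario)
        old = simulate(baseline_build, scenario)
        if new < old - tol:
            regressions.append((scenario, old, new))
    return regressions

regression_set = ["cut_in_heavy_rain", "debris_on_freeway", "stalled_vehicle"]
baseline = {"quality": {s: 0.95 for s in regression_set}}
candidate = {"quality": {"cut_in_heavy_rain": 0.96,
                         "debris_on_freeway": 0.88,   # got worse
                         "stalled_vehicle": 0.95}}

for scenario, old, new in regression_check(candidate, baseline, regression_set):
    print(f"REGRESSION in {scenario}: {old:.2f} -> {new:.2f}")
```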
And then you go all the way to the validation phase 01:52:30.540 |
where to actually see your human relative safety 01:52:34.260 |
of like how well you're performing on the car side 01:52:42.380 |
by taking all of the physical operational driving, 01:52:46.340 |
which probably includes a lot of interventions 01:53:01.420 |
And you can even start to do really interesting things 01:53:03.180 |
where you add virtual agents to create harder environments. 01:53:07.260 |
You can fuzz the locations of physical agents. 01:53:10.260 |
You can muck with the scene and stress test the scenario 01:53:14.900 |
And effectively you're trying to like more efficiently 01:53:20.260 |
but try to encounter the problems as fast as possible. 01:53:27.860 |
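Fuzzing agent placement might look like this toy loop: jitter the logged initial positions and re-run the scenario many times, hunting for the variant that breaks. Everything here, including the failure rule, is a stand-in:

```python
import random

def run_scenario(agent_positions):
    """Stand-in for a full simulation run; returns True if the planner
    handled the scene safely. Here we pretend the planner fails when
    another agent starts within 8 m of the ego vehicle."""
    return all(abs(p) >= 8.0 for p in agent_positions)

logged_positions = [25.0, -40.0, 12.0]  # meters ahead/behind ego (invented)

random.seed(0)
failures = 0
for trial in range(1000):
    fuzzed = [p + random.gauss(0, 3.0) for p in logged_positions]
    if not run_scenario(fuzzed):
        failures += 1
print(f"{failures} failing variants found in 1000 fuzzed runs")
# The logged drive alone was fine; the perturbations expose the edge cases.
```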
A lot of the bottleneck is actually the evaluation problem in many ways, 01:53:27.860 |
And so if you could in theory evaluate perfectly 01:53:34.460 |
and instantaneously, you can solve that problem 01:53:46.740 |
on how well you're doing as quickly as possible 01:53:49.460 |
in a way that correlates to physical driving. 01:53:58.020 |
What are the performance metrics that we're talking about? 01:54:04.260 |
like that's what's deceptive, where there's a lot of companies 01:54:12.540 |
that can do a demo, but the gap from a demo to being able to go driverless can be deceptively long, 01:54:16.220 |
even when that demo looks like it's driverless quality. 01:54:18.340 |
And the difference is that the thing that keeps you 01:54:20.860 |
from going driverless is not the stuff you encounter 01:54:24.820 |
once in a hundred thousand miles or 500,000 miles. 01:54:27.660 |
And so that is at the root of what is most challenging 01:54:32.020 |
about going driverless because any issue you encounter 01:54:40.020 |
So those learnings, like those were painful learnings 01:54:45.380 |
and led to us then finally being able to go driverless 01:54:48.540 |
in Phoenix and now are at the heart of how we develop. 01:54:52.140 |
Evaluation is simultaneously evaluating final kind 01:54:57.020 |
of end safety of how ready are you to go driverless, 01:54:59.900 |
which may be as direct as what is your collision rate, 01:55:08.740 |
for all these types of scenarios and severities 01:55:13.140 |
to make sure that you're better than a human bar 01:55:16.460 |
But that's not actually the most useful for development. 01:55:19.460 |
For development, it's much more kind of analog metrics 01:55:33.740 |
that can correlate to the quality you care about 01:55:37.300 |
and push the feedback loop to all of your development. 01:55:41.140 |
comparisons to human drivers, like manual drivers. 01:55:46.180 |
in various dimensions or various circumstances? 01:55:51.060 |
So if I brought you a truck, how would you test it? 01:55:58.180 |
- This one, you can't tell if it's a human driver 01:55:58.180 |
How do you actually know you're ready, basically? 01:56:13.420 |
Waymo released a safety framework for the car side 01:56:15.460 |
because one, it sets the bar so nobody cuts below it, 01:56:22.260 |
Two, it's to start the conversation on framing 01:56:26.820 |
Same thing we'll end up doing for the trucking side. 01:56:29.420 |
It ends up being different portfolio of approaches. 01:56:37.460 |
with all these fundamental rules of the road? 01:56:42.580 |
You can fundamentally prove that it's either impossible, or 01:56:42.580 |
you can itemize the scenarios where that comes up 01:56:51.180 |
and you can do a test and show that you passed that test 01:56:57.900 |
And so those are like traditional structure testing 01:57:02.900 |
where you can just, like fault rates is another example 01:57:06.940 |
where when something fails, how do you deal with it? 01:57:09.420 |
You're not gonna drive and randomly wait for it to fail. 01:57:17.380 |
And run through all the permutations of failures 01:57:20.300 |
which you can oftentimes, for some parts of the system, itemize. 01:57:20.300 |
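Itemizing failure permutations is the kind of thing you can enumerate mechanically; a toy version with invented subsystems and an assumed safety rule:

```python
from itertools import combinations

SUBSYSTEMS = ["left_lidar", "right_lidar", "front_radar",
              "main_compute", "backup_compute"]

def safe_stop_possible(failed: set) -> bool:
    """Stand-in safety rule: we assume one compute unit and at least
    one lidar must survive for a minimal-risk pullover."""
    has_compute = {"main_compute", "backup_compute"} - failed
    has_lidar = {"left_lidar", "right_lidar"} - failed
    return bool(has_compute) and bool(has_lidar)

# Inject every single and double fault and check the system's response.
for k in (1, 2):
    for failed in combinations(SUBSYSTEMS, k):
        if not safe_stop_possible(set(failed)):
            print(f"UNSAFE combination: {failed}")
```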
And you wanna figure out the combinations of approaches 01:57:39.020 |
You can probably pass the Turing test pretty quickly 01:57:41.540 |
even if you're not like completely ready for driverless 01:57:43.740 |
because the events that are really kind of like hard 01:57:56.420 |
a serious event happens once every 1.3 million miles 01:57:56.420 |
And so you could have a driver that looks like it's flawless for a very long time. 01:58:05.260 |
And so that's where you start to get creative 01:58:13.060 |
on combinations of sampling and statistical arguments, 01:58:17.820 |
focused structured arguments where you can kind of 01:58:20.580 |
simulate those scenarios and show that you can handle them 01:58:24.980 |
and metrics that are correlated with what you care about 01:58:33.540 |
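The statistics behind that are brutal: with events that rare, driving alone takes millions of miles to say anything. A quick calculation under a simple Poisson assumption:

```python
import math

HUMAN_RATE = 1 / 1.3e6  # serious events per mile (figure from above)

def miles_for_confidence(rate: float, confidence: float = 0.95) -> float:
    """Miles of zero-event driving needed before claiming, at the given
    confidence, that our event rate is below `rate`, using
    P(0 events in m miles) = exp(-rate * m)."""
    return -math.log(1 - confidence) / rate

m = miles_for_confidence(HUMAN_RATE)
print(f"~{m / 1e6:.1f} million flawless miles just to match the human bar")
# Roughly 3.9 million miles, and every software change resets the clock,
# which is why sampling, simulation, and structured arguments matter.
```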
And in the end, you end up borrowing a lot of properties 01:58:36.980 |
from aerospace and like space shuttles and so forth 01:58:40.900 |
where you don't get the chance to launch it a million times 01:58:50.100 |
of kind of structured approaches in order to validate it. 01:58:53.260 |
And then by thoroughness, you can make a strong argument 01:58:58.180 |
This is actually a harder problem in a lot of ways, though, 01:58:58.180 |
than getting to a fixed point, like with a shuttle 01:59:01.780 |
or an airplane, where you freeze the software 01:59:04.700 |
and then you prove it and you're good to go. 01:59:06.500 |
Here you have to get to a driverless quality bar 01:59:11.100 |
but then continue to aggressively change the software 01:59:17.900 |
that there's an external environment. With a shuttle, 01:59:17.900 |
you're basically testing the systems themselves, 01:59:20.900 |
- And you have a lot of control in the external stuff. 01:59:30.460 |
How do you make sure you didn't get worse in something that you just changed? 01:59:30.460 |
like the Turing test starts to fail pretty quickly 01:59:49.220 |
maybe you'll sit there for 30 minutes, right? 01:59:51.580 |
- So you've driven 15 miles or something like that. 01:59:54.300 |
To go driverless, like what's the sort of rate of issues 01:59:59.340 |
that you need to have? You won't even encounter them in a short drive. 02:00:02.900 |
Let's try a different version of the Turing test 02:00:11.700 |
They're designed, you don't know them ahead of time. 02:00:18.620 |
- And so is it possible to in the future orchestrate 02:00:31.900 |
they don't necessarily represent the full spectrum. 02:00:35.900 |
but you can at least get a really quick read and filter. 02:00:55.860 |
animal foreign objects on a road that pop out 02:01:05.300 |
like a hard brake, something reckless that they've done, 02:01:12.820 |
But yeah, like you have these kind of itemized scenarios. 02:01:12.820 |
We itemize everything that could possibly happen 02:01:30.340 |
to give you a starting point to how to think about 02:01:38.060 |
like you know how there's like a validation set 02:01:42.140 |
real world driving is the ultimate validation set. 02:01:44.260 |
That's the, in the end, like the cleanest signal. 02:01:49.940 |
And you're absolutely right, like at the end, 02:02:00.740 |
A really reckless pedestrian that's jaywalking, 02:02:07.540 |
That's actually what keeps you from going driverless. 02:02:11.380 |
- Yeah, and it's interesting to think about that. 02:02:17.220 |
the Turing test is exactly this validation set 02:02:27.180 |
He thinks about like how to design a test 02:02:27.180 |
And the validation set for him is handcrafted. 02:02:39.380 |
And that it requires like human genius or ingenuity 02:02:47.260 |
It's an interesting perspective on the validation set, 02:02:50.380 |
which is like, make that as hard as possible. 02:03:02.900 |
like all the world states that you'll expand. 02:03:05.060 |
And so you have to come up with different approaches. 02:03:07.220 |
And this is where you start hitting the struggles of ML, 02:03:09.300 |
where ML is fantastic at optimizing the average case. 02:03:16.780 |
which is what we care about in the AV space, 02:03:16.780 |
So like, you don't care about the worst case really on ads, 02:03:26.940 |
because if you miss a few, it's not a big deal, 02:03:29.380 |
but you do care about it on the driving side. 02:03:36.140 |
And so you have to take a step back and abstract away, 02:03:53.460 |
you feel good about your ability to generalize 02:03:57.260 |
- All right, let me ask you a tricky question. 02:04:01.340 |
So to me, the two companies that are building at scale 02:04:06.340 |
some of the most incredible robots ever built 02:04:19.220 |
technically, philosophically in these two systems. 02:04:23.540 |
Let me ask you to play sort of devil's advocate 02:04:27.460 |
and then the devil's advocate to the devil's advocate. 02:04:30.540 |
It's a bit of a race, of course, everyone can win. 02:04:52.080 |
And which aspect of their approach would be the reason? 02:05:15.860 |
Tesla now removed radar, they do vision only. 02:05:19.560 |
Tesla has a larger fleet of vehicles operated by humans, 02:05:26.460 |
and a larger, what do you call it, operational domain. 02:05:26.460 |
And then Waymo is more focused on a specific domain 02:05:41.060 |
both are, I think, there's a lot of brilliant ideas, 02:05:44.600 |
So I'd love to get your comments on this lay of the land. 02:05:57.080 |
and how they pushed the field forward as well. 02:06:00.000 |
So on the Waymo side, there is a fundamental advantage 02:06:05.940 |
and geared towards L4 from the very beginning. 02:06:10.500 |
the hardware, the compute, the infrastructure, 02:06:13.740 |
the tech stack, and all of the investment inside the company. 02:06:18.660 |
because there's like a giant spectrum of problems 02:06:29.460 |
because there's a reason why it's the fifth generation 02:06:40.600 |
and you optimize for the best information you have, 02:06:51.860 |
and the gap from really kind of like decent progress 02:06:55.480 |
for L2 and so forth to what it takes to actually go L4. 02:07:06.060 |
but there's a lot of advantages in all of these buckets 02:07:14.920 |
and you can use it and it's at a decently sizable scale. 02:07:23.380 |
- And you see this process you've deployed in Chandler, 02:07:33.720 |
- It's become more engineering than totally blind R&D. 02:07:37.680 |
and then you move to another place and you grow it this way. 02:07:48.420 |
to what is the current generation of the system 02:07:51.180 |
on both sides because the things that got us to driverless, 02:08:08.140 |
And so those learnings, you just can't shortcut. 02:08:13.940 |
to kind of get through technical organizational, 02:08:20.360 |
But there's a few in that, okay, like balls in our court, 02:08:23.640 |
there's a headstart there, now we gotta go and solve it. 02:08:37.940 |
far less complexity and provide value very quickly, 02:08:45.060 |
But you would take a huge risk in having a gap 02:09:02.400 |
So what you just laid out makes perfect sense 02:09:25.900 |
and machine learning can do a lot of the work. 02:09:33.800 |
how much of driving can be end to end learned? 02:09:41.380 |
and the vision only machine learning approach 02:09:51.840 |
like if I were to make the opposite argument, 02:09:53.100 |
like what puts Tesla in the strongest position, it's data. 02:10:05.640 |
And they found a way to like get paid by safety drivers 02:10:16.060 |
like one, it is incredible that they've built a business 02:10:26.360 |
that's always like an incredible kind of advantage. 02:10:31.940 |
If you can use it the right way to then solve the problem, 02:10:33.900 |
but the ability to collect and filter through the things 02:10:38.020 |
to the things that matter at real world scale, 02:10:50.780 |
and hardware systems in order to solve the problem? 02:10:57.400 |
that pure camera systems can't solve the problem 02:11:00.100 |
that humans obviously are solving with vision systems. 02:11:06.940 |
So there's no argument that it's not a risk, right? 02:11:27.600 |
you're now making a really, really hard problem 02:11:29.980 |
which on its own is still almost insurmountably hard, 02:11:36.760 |
And this is where you can easily get into a little bit 02:11:40.100 |
of a kind of a trap where similar to how you, 02:11:43.520 |
how do you evaluate how good an AV company's product is? 02:11:50.140 |
which they've kind of optimized like crazy and so forth. 02:11:55.680 |
You know that that gap is kind of like pretty large still. 02:12:04.980 |
And there's a lot of ways that that can come up. 02:12:08.100 |
And even if it doesn't happen that often at all, 02:12:12.100 |
and what it takes to actually go full driverless, 02:12:16.440 |
but full driverless, that bar gets crazy high. 02:12:20.760 |
And not only do you have to solve it on the behavioral side, 02:12:28.720 |
And so you now on top of the broader AV challenge, 02:12:30.980 |
you have a really hard perception challenge as well. 02:12:36.300 |
To me, what's fascinating about what Tesla is doing 02:12:52.020 |
It's fascinating for humans to be interacting with robots. 02:12:56.400 |
And they're actually helping kind of push it forward. 02:12:59.640 |
where even for us, a decent percentage of our data 02:13:04.360 |
We intentionally have humans drive higher percentage 02:13:08.160 |
because that creates some of the best signals 02:13:14.480 |
- So together we're kind of learning about this problem 02:13:17.860 |
in an applied sense, just like you had with Cosmo. 02:13:24.060 |
that people are going to use, robot-based product 02:13:37.040 |
that interacts with other humans in the world. 02:13:39.000 |
And that's like, to me, one of the most interesting problems 02:13:43.760 |
because you're in trying to create an intelligent agent 02:13:49.760 |
you're also understanding the nature of intelligence itself. 02:14:06.480 |
where you look at each other and just go, okay, go. 02:14:09.200 |
That's hard to do without a human driver, right? 02:14:18.900 |
Can you imagine that when autonomous driving is solved, 02:14:23.120 |
how much of the technology foundation of that space 02:14:26.800 |
can go and have tremendous, just transformative impacts 02:14:40.320 |
With autonomous driving is so safety critical. 02:14:53.480 |
But it's also the con of that is it's so hard to solve. 02:15:02.280 |
who write long articles about the failure of your company 02:15:07.280 |
if there's one accident that's based on a robot. 02:15:33.240 |
because your fear of regression forces so much more rigor 02:15:38.240 |
that obviously you have to find a compromise on like, 02:15:43.000 |
okay, well, how often do we release driverless builds? 02:15:45.360 |
Because every time you release a driverless build, 02:15:46.760 |
you have to go through this validation process, 02:15:56.920 |
otherwise you would release the products way, way quicker 02:15:56.920 |
Like we've gotten there where you think of like surgery, 02:16:09.440 |
Like you have surgery, there's always a risk, 02:16:14.960 |
when you go out and drive your car today, right? 02:16:20.160 |
We're not banning driving because there was a car accident, 02:16:26.440 |
where you have to not only be better than a human, 02:16:38.640 |
that we validate that becomes very comfortable 02:16:42.680 |
because there's a bunch of jargon that we use internally, 02:16:42.680 |
but it's far above a human, you know, relative safety bar. 02:16:54.120 |
- See, here's the thing to push back a little bit 02:17:05.360 |
I think probably applies for autonomous driving, 02:17:14.680 |
But if you create a product that's really compelling 02:17:26.640 |
then I think people may be able to be willing 02:17:29.480 |
to put up with the thing that might be even riskier 02:17:38.520 |
Humans understand the value of going over the speed limit. 02:17:41.480 |
Humans understand the value of like going fast 02:17:48.640 |
To take risks, and when you're in Manhattan streets, 02:17:48.640 |
I mean, this is a much more tense topic of discussion, 02:17:59.320 |
So in Cosmo's case, there was something about the way 02:18:05.360 |
the energy it brought, the intent it was able 02:18:22.740 |
If you want a car that never has an accident, 02:18:30.640 |
But that's tricky, because that's not a robotics problem. 02:18:37.040 |
- And many accidents are not even due to you, right? 02:18:40.120 |
Obviously it's, so there's a big difference though. 02:18:46.800 |
You're also impacting obviously kind of the rest 02:18:49.240 |
of the road, and we're facilitating it, right? 02:18:52.240 |
And so there's a higher kind of ethical and moral bar, 02:18:56.720 |
which obviously then translates into, as a society, 02:19:02.960 |
where it's hard for us to ever see this even being a debate 02:19:07.960 |
in the sense that like, you have to be beyond reproach 02:19:15.160 |
you could set the entire field back a decade, right? 02:19:19.760 |
I think if we look into the future, 02:19:28.100 |
there will be less and less focus on safety. 02:19:40.400 |
because I think for innovation, just like you were saying, 02:19:55.900 |
to understand the nature of risk, the value of risk. 02:20:00.500 |
It's very difficult, you're right, of course, with driving, 02:20:13.300 |
So you have to figure out what do we value about this world? 02:20:18.900 |
how deeply do we want to avoid hurting other humans? 02:20:26.580 |
you can imagine a scenario where Waymo has a system 02:20:33.540 |
human relative safety and provably statistically 02:20:38.500 |
will save lives, there is a thoughtful navigation 02:20:43.300 |
of that fact versus just kind of society readiness 02:20:48.300 |
and perception and education of society and regulators 02:20:56.380 |
and everything else where like, it's multi-dimensional 02:21:08.320 |
And just like any technology, there's early adopters 02:21:14.740 |
- But eventually celebrities, you get The Rock 02:21:16.900 |
in a Waymo vehicle and then everybody just comes. 02:21:39.900 |
from a thoughtful kind of movement and tiptoeing 02:21:44.860 |
and like kind of like a push to society realizes 02:21:47.520 |
how wonderful of an enabler this could become 02:21:51.600 |
and hard to know exactly how that'll play out, 02:22:00.420 |
There's a lot of open questions and challenges to navigate 02:22:02.980 |
and there's obviously the technical problems to solve 02:22:07.380 |
but they have such an opportunity that is on a scale 02:22:12.380 |
that very few industries in the last 20, 30 years 02:22:34.980 |
It's like from taxis to ride sharing services, 02:22:40.180 |
I mean, there's just shift after shift after shift 02:22:55.420 |
of way more trucks and maybe just broadly speaking, 02:22:59.000 |
way more vehicles, just like ants running around 02:23:15.220 |
and then it becomes almost like this kind of interesting 02:23:25.560 |
Very different challenges appear at every stage. 02:23:28.100 |
But over time, like this is one of the most enabling 02:23:31.520 |
technologies that we have in the world today. 02:23:34.940 |
It'll feel like, how was the world before the internet? 02:24:11.720 |
Do you ever think about how it might change cities? 02:24:16.080 |
- You can imagine city where people live versus work 02:24:19.080 |
becoming more distributed because the pain of commuting 02:24:28.080 |
and how you think about car storage and parking 02:24:31.680 |
obviously just enables a completely different type 02:24:39.160 |
I think there was like a statistic that something like 02:24:43.080 |
30% of the traffic in cities during rush hour is just looking for parking. 02:24:50.960 |
So those obviously kind of open up a lot of options. 02:24:53.920 |
Flexibility on goods will enable new industries 02:25:00.200 |
because now the efficiency becomes more palatable. 02:25:07.720 |
The way we distribute the logistics network will change. 02:25:11.000 |
The way we then can integrate with warehousing, 02:25:16.200 |
You can start to think about greater automation 02:25:21.200 |
and how that supply chain, the ripples become much more agile 02:25:25.880 |
versus like very grindy the way they are today 02:25:32.800 |
and there's like a lot of constraints that we have. 02:25:36.320 |
It'll be great for safety where like probably 02:25:42.960 |
the majority of accidents are due to just attention or things that are preventable 02:25:53.320 |
but the net creation is gonna be massively positive. 02:25:57.600 |
about the negative implications that will happen 02:26:03.080 |
But I'm an optimist in general for the technology 02:26:05.200 |
where you could argue a negative on any new technology, 02:26:10.520 |
if there is a big demand for something like this, 02:26:16.040 |
that's gonna kind of propagate through society. 02:26:19.880 |
And particularly as life expectancies get longer 02:26:22.720 |
and so forth, like there's just a lot more need 02:26:28.840 |
to kind of just be serviced with a high level of efficiency 02:26:32.280 |
because otherwise we're gonna have a really hard time 02:26:33.760 |
kind of scaling to what's ahead in the next 50 years. 02:26:42.680 |
And we tend to focus on the negative a little bit too much. 02:26:45.640 |
In fact, autonomous trucks are often brought up 02:27:05.720 |
as something that will take away certain jobs; it'll create other jobs. 02:27:08.800 |
So there's temporary pain, hopefully temporary, 02:27:28.640 |
- Yeah, there's even more positive properties 02:27:30.280 |
about trucking where not only is there just a huge shortage of drivers, 02:27:34.440 |
the average age of truck drivers is getting closer to 50 02:27:36.880 |
because the younger people aren't wanting to come into it. 02:27:39.000 |
They're trying to like incentivize, lower the age limit, 02:27:46.240 |
And the least favorable routes, like, I mean, it depends, 02:27:53.120 |
are the ones where you're on the road away from your family 02:27:56.080 |
- Steve's talked about the pain of those kinds of routes 02:28:13.760 |
And that's also where like the biggest kind of safety risk 02:28:17.920 |
And so when you think of the gradual evolution 02:28:21.360 |
of how trucking comes in, first of all, it's not overnight. 02:28:23.840 |
It's gonna take decades to kind of phase in all the, 02:28:26.360 |
like there's just a long, long, long road ahead, 02:28:35.440 |
and benefit the most from humans are the short haul 02:28:38.120 |
and most complicated kind of more urban routes, 02:28:46.240 |
more flexibility on like geography and location. 02:28:51.120 |
And you get to kind of sleep at your own home. 02:28:54.680 |
- And very importantly, if you optimize the logistics, 02:29:11.240 |
So they really feel the pain of inefficient logistics. 02:29:15.160 |
Because like if they're just sitting around for hours, 02:29:27.040 |
- And a high percentage of trucks are like empty 02:29:33.960 |
and the other thing is when you increase the efficiency 02:29:40.480 |
Like the entire market cap of trucking is going to go up 02:29:50.760 |
And so that on its own just creates more and more demand, 02:29:57.040 |
and starts to really kind of reshape an industry, 02:30:03.200 |
there's just a lot of positives that for at least any time 02:30:06.240 |
in the foreseeable future seem really lined up 02:30:08.120 |
in a good way to kind of come in and help with the shortage 02:30:23.440 |
automation and AI does technology broadly, I would say, 02:30:27.080 |
but automation is a thing that has a potential 02:30:36.640 |
And so that results in, like I said, human suffering 02:30:41.080 |
because people lose their jobs, there's economic pain there. 02:30:46.640 |
So for a lot of people, work is a source of meaning, 02:31:08.280 |
And is that something you think about as a sort of a roboticist, 02:31:18.600 |
to find activity and work that's a source of identity, 02:31:24.920 |
- I do think about it because you want to make sure 02:31:29.960 |
like not just like the part of how it plays in it, 02:31:33.000 |
but what are the ripple effects of it down the road. 02:31:37.520 |
there's a lot of opportunity to put in the right policies, 02:31:39.960 |
the right opportunities to kind of reshape and retrain 02:31:45.520 |
both trucking and cars, we have remote assistance facilities 02:31:53.880 |
and monitor vehicles and provide like very focused 02:31:57.880 |
kind of assistance on kind of areas where the vehicle 02:32:01.120 |
may want to request help in understanding an environment. 02:32:04.120 |
So those are jobs that kind of get created and supported. 02:32:07.280 |
I remember like taking a tour of one of the Amazon facilities 02:32:10.200 |
where you've probably seen the Kiva Systems robots, 02:32:13.320 |
where you have these orange robots that have automated 02:32:16.120 |
the warehouse, like kind of picking and collecting of items. 02:32:19.760 |
And it's like a really elegant and beautiful way to do it. 02:32:22.560 |
It's actually one of my favorite applications 02:32:26.800 |
Like I think it kind of came across that company 02:32:31.840 |
- Warehouse robots that transport little things. 02:32:33.600 |
- So basically instead of a person going and walking around 02:32:38.920 |
these robots go and pick up a shelf and move it over 02:32:42.560 |
in a row where like the seven shelves that contain 02:32:44.600 |
the seven items are lined up, and a laser or whatever points to the item. 02:32:48.840 |
And you go and pick it and you place it to fill the order. 02:32:50.880 |
And so the people are fulfilling the final orders. 02:32:55.600 |
when I was asking them about like kind of the impact 02:32:57.240 |
on labor, when they transitioned that warehouse, 02:32:59.320 |
the throughput increased so much that the jobs shifted 02:33:07.280 |
the search of the items themselves and the labor, 02:33:12.760 |
like there was actually the same amount of jobs 02:33:19.640 |
Like, so you have these situations that are not zero sum 02:33:24.200 |
And the optimist to me thinks that there's these types 02:33:26.320 |
of solutions in almost any industry where the growth 02:33:32.360 |
but you gotta be intentional about finding those 02:33:36.480 |
Because even if you make the argument that like, 02:33:45.400 |
You have to have an understanding of that link 02:33:50.240 |
whether training is acquired or just mental transition 02:33:58.040 |
The uncertainty of it, there's families involved. 02:34:09.800 |
You can't just look at economic metrics always, 02:34:13.920 |
And you can't even just take it as like, okay, 02:34:17.760 |
because like there is an element of just personal pride 02:34:20.480 |
where majority of people don't wanna just be okay, 02:34:24.760 |
but like they wanna actually like have a craft, 02:34:28.280 |
and feel like they're having a really positive impact. 02:34:33.440 |
there's a lot of transferability and skillset 02:34:36.760 |
that is possible, especially if you create a bridge 02:34:43.320 |
And to some degree, that's our responsibility as well, 02:35:01.520 |
that they have, the screen looks an awful lot like Cosmo, 02:35:01.520 |
What are your thoughts about like home robotics 02:35:27.120 |
What are your thoughts about Amazon getting into this space? 02:35:30.120 |
- Yeah, we had some signs that they're getting into it 02:35:34.040 |
Maybe they were a little bit too interested in Cosmo 02:35:38.840 |
but they're also very good partners actually for us 02:35:41.320 |
as we kind of just integrated a lot of shared technology. 02:35:47.320 |
you could think of Alexa as a robot as well, Echo. 02:36:05.920 |
I have my doubts that this one's gonna hit the mark 02:36:08.880 |
because I think for the price point that it's at 02:36:11.480 |
and the kind of functionality and value propositions 02:36:15.360 |
it's still searching for the killer application 02:36:15.360 |
that justifies, I think it was like a $1,500 price point 02:36:28.520 |
but you have to really, really hit a high mark 02:36:31.600 |
at that price point, which we always tried to, 02:36:33.640 |
we were always very cautious about jumping too quickly 02:36:35.760 |
to the more advanced systems that we really wanted to make, 02:36:45.520 |
The mobility is an angle that hasn't been utilized, 02:36:58.600 |
like think like our neighbors, our friends, parents, 02:37:38.400 |
So there's a problem of trust to solve there. 02:37:42.680 |
It's the thing that is the quote unquote problem 02:37:51.480 |
with the intent that's communicated by the device 02:37:57.560 |
And so, and I think they also have to retrace 02:37:57.560 |
some of the learnings on the character side 02:38:00.520 |
a lot of companies are great at the hardware side of it 02:38:16.320 |
The character side of it for technology companies 02:38:25.600 |
I hope there's continued progress in this space 02:38:27.240 |
and that thread doesn't kind of go dormant for too long. 02:39:02.160 |
it's like there's a few familiar kind of elements there 02:39:19.840 |
a robotics company that lives for a long time? 02:39:25.040 |
so I thought Cosmo for sure would live for a very long time. 02:39:29.280 |
That to me was an exceptionally successful vision 02:39:29.280 |
that has pivoted in all the right ways to survive 02:39:46.400 |
having a driver that constantly provides profit, 02:39:46.400 |
what they're doing is they're almost like taking risks 02:39:59.200 |
because they have other sources of revenue, right? 02:40:14.760 |
a really, really great fit of where the technology 02:40:18.240 |
could satisfy a really clear use case and need. 02:40:28.000 |
Robotics is hard because it like tends to be more expensive. 02:40:46.320 |
or they try to bite off a kind of an offering 02:40:51.320 |
that has a mismatch in kind of price to function. 02:41:01.920 |
It's just, I mean, after all the years in it, 02:41:04.200 |
like definitely kind of feel a lot of the battle scars 02:41:09.360 |
not only do you have to hit the function, 02:41:10.920 |
but you have to educate and explain, get awareness up, 02:41:15.400 |
Like, you know, there's a reason why a lot of technologies 02:41:20.720 |
and then kind of continue forward in the consumer space. 02:41:22.960 |
Even like, you know, you see AR like starting 02:41:29.360 |
consumers and price points that they're willing 02:41:31.440 |
to kind of be attracted in a mass market way. 02:41:36.920 |
but I mean like, you know, 2 million, 10 million, 02:41:40.720 |
50 million, like mass market kind of interest, 02:41:57.920 |
or just the function that was picked just doesn't line up. 02:42:00.600 |
And so that product market fit is very important. 02:42:05.000 |
or rather super compelling apps is much smaller 02:42:08.840 |
because it's easy to get outside of the price range. 02:42:14.400 |
Like, that's why like we picked off entertainment 02:42:23.480 |
and still create a really compelling offering 02:42:29.040 |
And over time, that same opportunity opens up in healthcare, 02:42:34.560 |
in home applications, in commercial applications 02:42:38.920 |
and kind of broader, more generalized interface. 02:42:41.520 |
But there's missing pieces in order for that to happen. 02:42:44.640 |
And all of those have to be present for it to line up. 02:42:47.200 |
And we see these sort of trends in technology where, 02:42:50.240 |
you know, kind of technologies that start in one place 02:43:01.120 |
and then kind of move into the consumer market. 02:43:03.520 |
And sometimes it's just a timing thing, right? 02:43:05.240 |
Where how many stabs at what became the iPhone 02:43:11.080 |
that just weren't quite ready in the function 02:43:13.840 |
relative to the kind of price point and complexity. 02:43:16.800 |
- And sometimes it's a small detail of the implementation 02:43:23.880 |
- Something, yeah, like the new generation UX, right? 02:43:34.200 |
And, but yeah, history repeats itself in a lot of ways 02:43:37.320 |
in a lot of these trends, which is pretty fascinating. 02:43:39.960 |
- Well, let me ask you about the humanoid form. 02:43:49.360 |
Waymo and the other companies working in the space, 02:43:55.120 |
in potential revolutionary application of robotics, 02:44:08.000 |
Do you think it's interesting, full of mystery? 02:44:12.080 |
- Yeah, I think today humanoid form robotics is research. 02:44:27.880 |
just add a lot of complexity and cost, right? 02:44:31.520 |
oftentimes it's in the pursuit of a humanoid robot, 02:44:33.440 |
not in the pursuit of an application for the time being. 02:44:36.960 |
Especially when you have like kind of the gaps in interface 02:44:39.200 |
and, you know, kind of AI that we kind of talk about today. 02:44:42.240 |
So anything Elon does, I'm interested in following. 02:44:51.240 |
So it's like, you can't ever, you know, ignore it. 02:45:01.960 |
is I've disagreed with Elon a lot about this, 02:45:06.160 |
is to me, the compelling aspect of the humanoid form 02:45:12.040 |
and a lot of kind of robots, Cosmo, for example, 02:45:31.660 |
But to me, the reason you might want to argue 02:45:34.080 |
for the humanoid form is because, you know, at a party, 02:45:41.400 |
The humanoid form has a compelling notion to it 02:45:45.960 |
I would argue, if we were arguing about this, 02:45:50.280 |
that it's cheaper to build a Cosmo, like that form. 02:45:57.000 |
which I have with Jim Keller about, you know, 02:46:05.160 |
if you're using an application where it can be flawed, 02:46:24.060 |
we're drawn to legs and limbs and body language 02:46:28.580 |
And even a face, even if you don't have the facial features, 02:46:32.980 |
to reduce the creepiness factor, all that kind of stuff. 02:46:38.140 |
But yeah, that, to me, the humanoid form is compelling. 02:46:44.140 |
for the factory environment, I'm not so sure. 02:46:48.000 |
like right off the bat, what are you optimizing for? 02:46:53.480 |
Like that changes completely the look and feel 02:46:56.720 |
You know, and almost certainly the human form 02:47:07.800 |
And how do you customize it and make it customizable 02:47:11.840 |
for the different needs, if that was the optimization, right? 02:47:18.380 |
You know, I still feel that the closer you try to get 02:47:20.900 |
to a human, the more you're subject to the biases 02:47:25.020 |
of what a human should be, and you lose flexibility 02:47:40.740 |
and natural interfaces for robotic kind of characters 02:47:56.420 |
but that actually creates way more flexibility 02:48:03.860 |
but I'm still confused by the magic I see in legged robots. 02:48:16.620 |
and like the magic that like the Boston Dynamics team 02:48:26.440 |
to try to find an application for that sort of technology, 02:48:29.820 |
but wow, that's incredible technology, right? 02:48:45.520 |
And that's where humanoid robots is kind of close to that 02:48:49.260 |
in that like it is a decision about a form factor 02:49:00.340 |
- I think the core fascinating idea with the Tesla bot 02:49:05.820 |
is when you're solving the general robotics problem 02:49:10.300 |
where there's the very clear applications of driving. 02:49:18.780 |
the whole world starts to kind of look 02:49:18.780 |
there's no reason transformer like this thing 02:49:42.700 |
couldn't take the goods up an elevator, you know? 02:49:47.060 |
Like slowly expand what it means to move goods 02:50:01.020 |
think of it as an end-to-end robotics problem 02:50:02.780 |
from like loading from, you know, from everything. 02:50:10.340 |
into today's understanding of what a vehicle is, right? 02:50:13.420 |
The Pacifica, Jaguar, the Freightliners from Daimler. 02:50:23.980 |
to like expand these partnerships to really rethink 02:50:26.740 |
what would the next generation of a truck look like 02:50:34.460 |
And maybe that means a very different type of trailer. 02:50:38.460 |
there's a lot of things you could rethink on that front, 02:50:50.820 |
So maybe by way of advice and maybe by way of story 02:51:17.940 |
and didn't know anything about robotics coming in 02:51:20.020 |
and was doing, you know, electrical computer engineering, 02:51:26.060 |
and then fell in love with autonomous driving. 02:51:27.980 |
And at that point, like that was just by a big margin, 02:51:36.660 |
And so what I would say is that like robotics, 02:51:46.700 |
has moved from being very research and academics driven 02:51:54.100 |
Now there's other areas that are much younger 02:51:56.060 |
and you see like kind of grasping and manipulation, 02:52:03.780 |
What I would say is the space moves very quickly. 02:52:08.580 |
The state of the art, like it is in most areas, will evolve and change 02:52:08.580 |
this is like a five, six year kind of endeavor. 02:52:47.100 |
And you have to love it enough to go super deep 02:52:57.340 |
And in robotics, that probably means more breadth 02:53:05.740 |
And it means being able to collaborate with teams 02:53:08.020 |
where like one of the coolest aspects of like 02:53:10.460 |
the experience that I like kind of cherish in our PhD 02:53:13.580 |
is that we actually had a pretty large AV project 02:53:16.900 |
that for that time was like a pretty serious initiative 02:53:20.140 |
where you got to like partner with a larger team 02:53:28.620 |
- So I was working on a project called UPI back then, 02:53:37.540 |
like a large off-road vehicle that you would like drop 02:53:40.340 |
and then give it a waypoint 10 kilometers away 02:53:42.420 |
and it would have to navigate a completely unstructured-- 02:53:45.260 |
- Yeah, so like forest, ditches, rocks, vegetation. 02:53:48.700 |
And so it was like a really, really interesting 02:53:58.940 |
And so what I think is like the beauty of robotics, 02:54:01.500 |
but also kind of like the expectation is that 02:54:08.940 |
Robotics, the necessity, but also the beauty of it 02:54:12.140 |
is that it forces you to be excited about that breadth 02:54:15.420 |
and that partnership across different disciplines 02:54:29.140 |
It's like the application of physical automation 02:54:33.940 |
And so you can do robotic surgery, you can do vehicles, 02:54:37.020 |
you can do factory automation, you can do healthcare, 02:54:39.620 |
you can do like leverage the AI around the sensing 02:54:44.180 |
to think about static sensors and scene understanding. 02:54:53.300 |
that are probably a little bit more collaborative 02:55:05.300 |
from CMU and MIT, they're really happy people. 02:55:10.980 |
- Because I think it's the collaborative thing. 02:55:16.260 |
You're not like sitting in like the fourth basement. 02:55:20.220 |
Which when you're doing machine learning purely software, 02:55:23.740 |
it's very tempting to just disappear into your own hole 02:55:29.140 |
And that breeds a little bit more of the silo mentality 02:55:36.220 |
It's almost like negative to talk to somebody else 02:55:39.300 |
But robotics folks are just very collaborative, 02:55:45.620 |
you get to confront the physics of reality often, 02:55:59.700 |
like robotics and AI is like just all the rage 02:56:13.940 |
thought it was just the coolest thing in the world. 02:56:15.580 |
They wanted to like make physical things intelligent 02:56:20.460 |
where they went into it for the right reasons and so forth. 02:56:23.740 |
And that organizational challenge, by the way, 02:56:25.900 |
like when you think about the challenges in AV, 02:56:28.580 |
we talk a lot about the technical challenges. 02:56:30.420 |
The organizational challenges through the roof, 02:56:33.180 |
where you think about what it takes to build an AV system 02:56:38.180 |
and you have companies that are now thousands of people. 02:56:42.260 |
And you look at other really hard technical problems, 02:56:45.620 |
like an operating system, it's pretty well established. 02:56:48.180 |
Like you kind of know that there's a file system, 02:56:51.540 |
there's virtual memory, there's this, there's that, 02:56:56.420 |
and there's like a really reasonably well-established 02:57:00.300 |
And so you can kind of like scale it in an efficient fashion 02:57:03.260 |
that doesn't exist anywhere near to that level of maturity 02:57:10.980 |
organizational structures are being reinvented. 02:57:15.620 |
They're part sensing, part behavior prediction, 02:57:20.340 |
And like one of the biggest challenges is actually 02:57:20.340 |
that it's starting to get strained on how do you organize it, 02:57:26.820 |
this multi-dimensional matrix that needs to all work together. 02:57:33.740 |
you know, like what does it take to actually scale this? 02:57:46.300 |
And then you look at like other gigantic challenges 02:57:48.060 |
that have, you know, that have been successful 02:57:50.860 |
and are way more mature, there's a stability to it. 02:57:53.420 |
And like, maybe the autonomous vehicle space will get there, 02:57:56.580 |
but right now, just as many technical challenges 02:57:59.820 |
as they are, they're like organizational challenges 02:58:13.660 |
- By way of advice, what advice would you give 02:58:17.420 |
to somebody thinking about doing a robotic startup? 02:58:22.060 |
You mentioned Cosmo, somebody that wanted to carry 02:58:24.780 |
the Cosmo flag forward, the Anki flag forward. 02:58:40.780 |
there are things you learn navigating a startup 02:58:44.060 |
that you'll never learn elsewhere, it was very hard to encounter that 02:58:44.060 |
It's not as like, you know, the glamor of a startup, 02:58:54.540 |
there's just like just brutal emotional swings up and down. 02:58:57.220 |
And so having co-founders actually helps a ton. 02:59:00.180 |
Like I would not, could not imagine doing it solo, 02:59:02.340 |
but having at least somebody where on your darkest days, 02:59:09.900 |
lean onto somebody that's in the thick of it with you, 02:59:20.900 |
Is it worry about whether any of your ideas will work? 02:59:20.900 |
where you feel that you put in a lot of effort 02:59:46.420 |
And, you know, you need to solve the problem set or do whatever, 02:59:46.420 |
but at the end of the day, you put in the effort, 02:59:53.980 |
you tend to like kind of come out with good enough results 02:59:53.980 |
and it doesn't matter how hard you kind of tried and pitched. 03:00:10.700 |
And if you don't fix it, you're out of business. 03:00:16.180 |
And there's a, you got to have this milestone 03:00:18.540 |
in order to like have a good pitch and you do it. 03:00:21.940 |
and you just don't have it inside the company. 03:00:28.060 |
or however many people kind of like along with you 03:00:36.860 |
it's like, there's no walking away from it, right? 03:00:44.660 |
that like things end up being harder than you expect. 03:01:06.140 |
but like you're all fighting for the same thing 03:01:07.820 |
and it's the most satisfying kind of journey ever. 03:01:13.380 |
is like be really, really thoughtful about the application. 03:01:19.100 |
kind of, you know, team and execution and market 03:01:21.780 |
and like kind of how important are each of those. 03:01:26.420 |
and you come at it thinking that if you're smart enough 03:01:29.860 |
and you're like have the right talent to team and so forth, 03:01:32.420 |
like you'll always kind of find a way through. 03:01:39.100 |
and the timing of you entering that industry. 03:01:44.540 |
There is, I don't know if there'll ever be another company 03:01:57.900 |
and productization, you know, from a P&L standpoint. 03:02:05.940 |
like by any measure of any industry that's ever existed, 03:02:09.140 |
except for maybe the US space program, right? 03:02:11.980 |
But it's like a multiple trillion dollar opportunities, 03:02:17.420 |
which is so unusual to find that size of a market 03:02:20.260 |
that just the progress that shows the de-risking of it, 03:02:26.900 |
and it still justifies the investment that is happening 03:02:33.860 |
Now, by the same consequence, like the size of the market, 03:02:40.740 |
how hard that's gonna be, who the incumbents are, 03:02:40.740 |
like that's probably one of the lessons I appreciate 03:02:46.140 |
where like those things really, really do matter. 03:02:56.740 |
or you run into like the institutional kind of headwinds 03:03:00.980 |
like let's say you have the greatest idea in the world, 03:03:03.940 |
but it takes 10 years to innovate in healthcare 03:03:12.100 |
And so the combination of like Anki and Waymo 03:03:16.820 |
where you can do a ton if you have the right market, 03:03:19.740 |
the right opportunity, the right way to explain it, 03:03:22.260 |
and you show the progress in the right sequence, 03:03:36.260 |
like the space of robotics is really interesting. 03:03:47.380 |
So how much is like truly revolutionary thinking 03:03:54.820 |
And then, yeah, but so like creating something that- 03:04:12.340 |
And I don't think people fully understand the value of that. 03:04:18.100 |
You have to create it and the product will communicate it. 03:04:31.060 |
I don't think they understood the value of that 03:04:41.580 |
and it has all these things that it did better, right? 03:04:46.060 |
And then like, you even remember the early commercials 03:04:48.780 |
where it's always like one application of what it could do 03:04:53.140 |
And so that was intentionally sending a message, 03:04:56.980 |
you can send a text message, you can listen to music, 03:04:59.900 |
And so, autonomous driving obviously anchors on that as well. 03:05:05.380 |
the functionality of an autonomous truck, right? 03:05:10.860 |
In the home, you have a fundamental disadvantage, 03:05:10.860 |
because it was so painful to explain to people 03:05:20.460 |
especially when something was so experiential. 03:05:34.060 |
and like actually found really great success, 03:05:34.060 |
because they anchored on reinventing existing categories 03:05:47.500 |
to take what's familiar, anchor that understanding, 03:06:01.020 |
We actually had far greater efficiency 03:06:01.020 |
And it became viral, where we had these kinds of videos 03:06:11.860 |
with thousands of views that would kind of get spread, 03:06:16.820 |
but then to grow into something that's not familiar is an advantage. 03:06:27.780 |
You could argue that that's very novel and very new. 03:06:39.860 |
And there's a lot of other examples that kind of created 03:06:49.780 |
enterprise is a little easier, 03:06:49.780 |
because if you can argue a clear value proposition 03:06:53.820 |
And we, by the way, went through that same ordeal. 03:07:17.020 |
We almost hit a wall coming out of 2013 03:07:25.020 |
and all the kind of like great technology in it 03:07:29.460 |
to having to make a super hard pivot on why is it fun 03:07:32.540 |
and why does the random kind of family of four need this. 03:07:36.780 |
Right, like, so it's learnings, but that's the challenge. 03:07:41.340 |
And I think robotics tends to sometimes fall into that trap. 03:08:02.860 |
that's the kind of marketing that Waymo's doing, 03:08:21.540 |
- They don't care how much tech is in your thing. 03:08:24.540 |
- It's like, they need to know why they want it, so. 03:08:43.420 |
And like, everybody understands what a vacuum cleaner is. 03:08:48.740 |
And now you have one that like kind of does it itself, 03:09:00.140 |
and I think they have like 15% of the vacuum cleaner market. 03:09:11.660 |
compute is cheaper, cloud is cheaper, and AI is better. 03:09:16.340 |
- If we zoom out from specifically startups and robotics, 03:09:20.020 |
what advice do you have to high school students, 03:09:23.580 |
college students about career and living a life 03:09:34.220 |
If you can convert that into a generalizable potion, 03:09:50.380 |
you know, to like maximize your potential in it. 03:09:54.580 |
And so there's always kind of like the saying: 03:09:54.580 |
Try to find where your passion overlaps 03:10:00.980 |
with like a growing opportunity and need in the world, 03:10:06.140 |
where it's not too different than the startup 03:10:18.980 |
but it also just opens up tons of opportunities. 03:10:22.180 |
And so, if you're interested in technology, 03:10:26.540 |
that might point to: go and study machine learning, 03:10:33.220 |
which is gonna be at the root of everything 03:10:38.820 |
in different industries, and become an absolute expert 03:10:46.900 |
And that doesn't apply to just technology, right? 03:10:49.380 |
It could be the exact same thing, you know, 03:10:52.500 |
the same thought process applies to design, 03:10:55.660 |
to marketing, to sales, to anything, 03:11:00.700 |
when you're in a space that's gonna continue to grow, 03:11:10.220 |
where the surface area is just gonna increase 03:11:16.860 |
and so you go into a career where you have that sort of 03:11:29.020 |
to find that killer overlap of timing and passion 03:11:29.020 |
and skill set and point in life where you can, 03:11:31.940 |
And then at the same time, find a balance. 03:11:41.460 |
I worked a little bit too obsessively, 03:11:45.180 |
and I think we kind of tried to correct it, 03:11:49.740 |
but, you know, I think I probably appreciate a lot more now 03:11:56.180 |
And I kind of have the personality where 03:12:00.100 |
I have so much desire to really try to optimize, 03:12:06.100 |
And now I'm trying to like kind of find that balance 03:12:09.060 |
and make sure that I have the friendships, the family, 03:12:11.780 |
like the relationship with the kids, everything like that. 03:12:17.580 |
And I think people can actually be happy on many kinds of paths, 03:12:24.340 |
but it's easy to kind of inadvertently make a choice 03:12:35.820 |
kind of all of those dimensions makes a lot of sense. 03:12:47.100 |
And hopefully one day, if your work pans out, Boris, 03:13:07.420 |
- Boris, you're one of my favorite human beings and roboticists. 03:13:20.340 |
And I can't wait to see what you do with Waymo. 03:13:33.820 |
So thank you so much for the work you've done 03:13:36.180 |
and thank you for spending your valuable time 03:13:45.220 |
please check out our sponsors in the description. 03:13:47.820 |
And now let me leave you with some words from Isaac Asimov. 03:14:01.060 |
Thank you for listening and hope to see you next time.