
Boris Sofman: Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics | Lex Fridman Podcast #241


Chapters

0:00 Introduction
1:08 Robots in science fiction
6:49 Cozmo
32:04 AI companions
38:59 Anki
64:33 Waymo Via
96:10 Sensor suites for long haul trucking
106:06 Machine learning
124:03 Waymo vs Tesla
134:38 Safety and risk management
143:42 Societal effects of automation
154:47 Amazon Astro
159:12 Challenges of the robotics industry
163:39 Humanoid robotics
170:42 Advice for getting a PhD in robotics
178:13 Advice for robotics startups
189:19 Advice for students


00:00:00.000 | The following is a conversation with Boris Sofman,
00:00:02.940 | who is the Senior Director of Engineering
00:00:04.840 | and Head of Trucking at Waymo,
00:00:07.120 | the autonomous vehicle company,
00:00:08.920 | formerly the Google Self-Driving Car Project.
00:00:12.420 | Before that, Boris was the co-founder and CEO of Anki,
00:00:17.000 | a robotics company that created Cosmo,
00:00:19.560 | which in my opinion,
00:00:21.120 | is one of the most incredible social robots ever built.
00:00:24.960 | It's a toy robot,
00:00:26.340 | but one with an emotional intelligence
00:00:28.240 | that creates a fun and engaging human-robot interaction.
00:00:32.080 | It was truly sad for me to see Anki shut down when it did.
00:00:36.840 | I had high hopes for those little robots.
00:00:39.660 | We talk about this story
00:00:41.440 | and the future of autonomous trucks,
00:00:43.940 | vehicles, and robotics in general.
00:00:47.220 | I spoke with Steve Viscelli recently on episode 237
00:00:51.320 | about the human side of trucking.
00:00:53.240 | This episode looks more at the robotics side.
00:00:57.040 | This is the Lex Fridman Podcast.
00:00:59.280 | To support it,
00:01:00.280 | please check out our sponsors in the description.
00:01:02.960 | And now, here's my conversation with Boris Sofman.
00:01:07.160 | Who is your favorite robot in science fiction,
00:01:11.160 | books or movies?
00:01:12.480 | - WALL-E and R2-D2,
00:01:13.640 | where they were able to convey such an incredible degree
00:01:16.520 | of intent, emotion, and kind of character attachment
00:01:21.520 | without having any language whatsoever.
00:01:25.360 | And just purely through the richness
00:01:27.800 | of emotional interaction.
00:01:28.960 | So those were fantastic.
00:01:30.000 | And then the Terminator series,
00:01:32.320 | just like really pretty wide range, right?
00:01:36.240 | But I kind of love this dynamic
00:01:38.560 | where you have this like incredible Terminator itself
00:01:41.200 | that Arnold played.
00:01:42.800 | And then he was kind of like the inferior,
00:01:45.480 | like previous generation version
00:01:47.280 | that was like totally outmatched,
00:01:49.440 | you know, in terms of kind of specs by the new one,
00:01:51.520 | but you know, still kind of like held his own.
00:01:53.360 | And so it was kind of interesting where you realize
00:01:55.920 | how many levels there are on the spectrum
00:01:58.680 | from human to kind of potentials in AI and robotics
00:02:01.360 | to futures.
00:02:02.760 | And so, yeah, that movie really,
00:02:05.440 | as much as it was like kind of a dark world in a way,
00:02:08.640 | was actually quite fascinating, gets the imagination going.
00:02:11.160 | - Well, from an engineering perspective,
00:02:12.680 | both the movies you mentioned,
00:02:14.320 | WALL-E and Terminator, the first one,
00:02:18.040 | is probably achievable.
00:02:19.680 | You know, humanoid robot,
00:02:21.480 | maybe not with like the realism in terms of skin and so on,
00:02:25.200 | but that humanoid form, we have that humanoid form.
00:02:28.520 | It seems like a compelling form.
00:02:30.400 | Maybe the challenge is it's super expensive to build,
00:02:33.600 | but you can imagine, maybe not a machine of war,
00:02:36.640 | but you can imagine Terminator type robots walking around.
00:02:40.680 | And then the same, obviously, with WALL-E,
00:02:42.520 | you've basically, so for people who don't know,
00:02:44.880 | you created the company Anki
00:02:46.800 | that created a small robot with a big personality
00:02:50.160 | called Cosmo that just does exactly what WALL-E does,
00:02:53.440 | which is somehow with very few basic visual tools
00:02:58.400 | is able to communicate a depth of emotion.
00:03:00.680 | And that's fascinating.
00:03:02.400 | But then again, the humanoid form is super compelling.
00:03:07.000 | So like Cosmo is very distant from a humanoid form.
00:03:10.920 | And then the Terminator has a humanoid form.
00:03:13.200 | And you can imagine both of those
00:03:14.680 | actually being in our society.
00:03:16.680 | - It's true.
00:03:17.520 | And it's interesting because it was very intentional
00:03:19.920 | to go really far away from human form
00:03:23.080 | when you think about a character like Cosmo or like WALL-E,
00:03:26.360 | where you can completely rethink
00:03:30.120 | the constraints you put on that character,
00:03:32.840 | what tools you leverage,
00:03:34.200 | and then how you actually create a personality
00:03:37.480 | and a level of intelligence interactivity
00:03:39.520 | that actually matches the constraints that you're under,
00:03:43.240 | whether it's mechanical or sensors or AI of the day.
00:03:47.440 | - This is why I was always really surprised
00:03:50.040 | by how much energy people put
00:03:51.560 | towards trying to replicate human form in a robot,
00:03:54.200 | because you actually take on some pretty significant
00:03:57.880 | kind of constraints and downsides when you do that.
00:04:01.000 | The first of which is obviously the cost,
00:04:02.800 | where it's just the articulation of a human body
00:04:05.600 | is just so magical in both the precision
00:04:09.560 | as well as the dimensionality,
00:04:11.320 | that to replicate that even in its reasonably close form
00:04:14.280 | takes like a giant amount of joints and actuators
00:04:16.280 | and motion and sensors and encoders and so forth.
00:04:20.280 | But then you're almost like setting an expectation
00:04:23.840 | that the closer you try to get to human form,
00:04:25.520 | the more you expect the strengths to match.
00:04:27.760 | And that's not the way AI works,
00:04:30.000 | is there's places where you're way stronger
00:04:31.600 | and there's places where you're weaker.
00:04:33.240 | And by moving away from human form,
00:04:34.800 | you can actually change the rules
00:04:37.440 | and embrace your strengths and bypass your weaknesses.
00:04:39.960 | - And at the same time,
00:04:41.360 | the human form has way too many degrees of freedom
00:04:45.240 | to play with.
00:04:46.280 | It's kind of counterintuitive, just as you're saying,
00:04:49.560 | but when you have fewer constraints,
00:04:52.640 | it's almost harder to master the communication of emotion.
00:04:57.320 | Like you see this with cartoons, like stick figures,
00:05:00.240 | you can communicate quite a lot with just very minimal,
00:05:03.640 | like two dots for eyes and a line for a smile.
00:05:07.200 | I think you can almost communicate arbitrary levels
00:05:10.200 | of emotion with just two dots and a line.
00:05:13.280 | And like, that's enough.
00:05:14.160 | And if you focus on just that,
00:05:16.240 | you can communicate the full range.
00:05:18.320 | And then you, like, if you do that,
00:05:20.240 | then you can focus on the actual magic
00:05:23.040 | of human and dot line interaction
00:05:28.040 | versus all the engineering mess.
00:05:30.560 | - That's right.
00:05:31.400 | Like dimensionality, voice, all these sorts of things,
00:05:33.160 | they actually become a crutch
00:05:34.520 | where you get lost in a search space almost.
00:05:37.160 | And so some of the best animators that we've worked with,
00:05:41.520 | they almost like study when they come up,
00:05:44.840 | kind of in building their expertise
00:05:46.400 | by forcing these projects where all you have is like a ball
00:05:51.400 | that can like kind of jump and manipulate itself
00:05:53.880 | or like really, really like aggressive constraints
00:05:56.840 | where you're forced to kind of extract
00:05:59.120 | the deepest level of emotion.
00:06:00.000 | And so in a lot of ways,
00:06:02.040 | you know, when we thought about Cosmo,
00:06:03.400 | I was like, you're right.
00:06:04.640 | If we had to like describe it in like one small phrase,
00:06:07.160 | it was bringing a Pixar character to life in the real world.
00:06:10.040 | And so it's what we were going for.
00:06:11.800 | And in a lot of ways, what was interesting is that
00:06:14.200 | with like WALL-E, which we studied incredibly deeply,
00:06:16.840 | and in fact, some of our team were, you know,
00:06:19.600 | kind of had worked previously at Pixar and on that project,
00:06:23.520 | they intentionally constrained WALL-E as well,
00:06:25.680 | even though in an animated film,
00:06:26.920 | you could do whatever you wanted to
00:06:29.000 | because it forced you to like really saturate
00:06:32.360 | the smaller amount of dimensions.
00:06:34.080 | But you sometimes end up getting a far more beautiful output
00:06:39.000 | because you're pushing at the extremes
00:06:41.720 | of this emotional space in a way that you just wouldn't
00:06:44.840 | because you get lost in a surface area
00:06:46.640 | if you have like something
00:06:47.640 | that is just infinitely articulable.
00:06:49.520 | - So if we backtrack a little bit,
00:06:51.680 | you thought of Cosmo in 2011,
00:06:54.000 | in 2013, actually designed and built it.
00:06:57.560 | What is Anki?
00:06:58.520 | What is Cosmo?
00:07:01.080 | I guess, who is Cosmo?
00:07:02.400 | - Who is Cosmo?
00:07:03.920 | - What was the vision behind this incredible little robot?
00:07:06.800 | - We started Anki back in,
00:07:09.480 | like while we were still in graduate school.
00:07:10.880 | So myself and my two co-founders,
00:07:12.200 | we were PhD students in the Robotics Institute
00:07:15.560 | at Carnegie Mellon.
00:07:16.800 | And so we were studying robotics, AI, machine learning,
00:07:20.560 | kind of different areas.
00:07:23.040 | One of my co-founders was working on walking robots,
00:07:25.280 | you know, for a period of time.
00:07:26.840 | And so we all had a bit of a really deep,
00:07:31.400 | kind of a deeper passion for applications of robotics and AI
00:07:34.720 | where there's like a spectrum
00:07:36.680 | where there's people that get like really fascinated
00:07:38.400 | by the theory of AI and machine learning robotics,
00:07:40.920 | where whether it gets applied in the near future or not
00:07:44.160 | is less of a kind of factor on them,
00:07:46.080 | but they love the pursuit of the like the challenge.
00:07:48.080 | And that's necessary.
00:07:48.920 | And there's a lot of incredible breakthroughs
00:07:50.080 | that happen there.
00:07:50.920 | We're probably closer to the other end of the spectrum
00:07:52.520 | where we love the technology and all the evolution of it,
00:07:56.280 | but we were really driven by applications.
00:07:58.720 | Like how can you really reinvent experiences
00:08:01.120 | and functionality and build value
00:08:03.880 | that wouldn't have been possible without these approaches?
00:08:06.960 | And that's what drove us.
00:08:08.200 | And we had a kind of some experiences
00:08:10.240 | through previous jobs and internships
00:08:11.720 | where we like got to see the applied side of robotics.
00:08:14.360 | And at that time,
00:08:15.600 | there was actually relatively few applications of robotics
00:08:18.840 | that were outside of, you know,
00:08:21.320 | pure research or industrial applications,
00:08:24.200 | military applications and so forth.
00:08:25.960 | There were very few outside of it.
00:08:27.720 | So maybe, you know, iRobot was like one exception
00:08:30.000 | and then maybe there were a few others,
00:08:31.000 | but for the most part, there weren't that many.
00:08:32.680 | And so we got excited about consumer applications
00:08:35.360 | of robotics where you could leverage
00:08:37.640 | way higher levels of intelligence through software
00:08:41.040 | to create value and experiences that were just not possible
00:08:44.320 | in those fields today.
00:08:46.760 | And we saw kind of a pretty wide range of applications
00:08:52.640 | that varied in the complexity of what it would take
00:08:54.400 | to actually solve those.
00:08:55.840 | And what we wanted to do
00:08:56.680 | was to commercialize this into a company,
00:08:58.440 | but actually do a bottoms up approach
00:09:01.000 | where we could have a huge impact in a space
00:09:03.200 | that was ripe to have an impact at that time
00:09:05.800 | and then build up off of that and move into other areas.
00:09:07.880 | And entertainment became the place to start
00:09:09.760 | because you had relatively little innovation
00:09:12.480 | in a toy space, an entertainment space.
00:09:14.840 | You had these really rich experiences
00:09:17.040 | in video games and movies,
00:09:19.360 | but there was like this chasm in between.
00:09:21.160 | And so we thought that we could
00:09:23.520 | really reinvent that experience.
00:09:25.160 | And there was a really fascinating transition technically
00:09:28.480 | that was happening at the time
00:09:29.400 | where the cost of components was plummeting
00:09:32.440 | because of the mobile phone industry
00:09:33.880 | and then the smartphone industry.
00:09:35.240 | And so the cost of a microcontroller,
00:09:37.040 | of a camera, of a motor, of memory,
00:09:38.880 | of microphones, cameras was dropping by orders of magnitude.
00:09:43.880 | And then on top of that, with the iPhone coming out in 2000,
00:09:48.000 | I think it was 2007, I believe.
00:09:50.240 | It started to become apparent within a couple of years
00:09:55.040 | that this could become a really incredible interface device
00:09:58.920 | and the brain with much more computation
00:10:01.160 | behind a physical world experience
00:10:03.080 | that wouldn't have been possible previously.
00:10:05.040 | And so we really got excited about that
00:10:08.720 | and how we push all the complexity
00:10:10.520 | from the physical world into software
00:10:12.960 | by using really inexpensive components,
00:10:14.920 | but putting huge amounts of complexity into the AI side.
00:10:17.560 | And so Cosmo became our second product
00:10:19.640 | and then the one that we're probably most proud of.
00:10:21.520 | The idea there was to create a physical character
00:10:24.160 | that had enough understanding and awareness
00:10:27.000 | of the physical world around it
00:10:28.480 | and the context that mattered to feel like he was alive.
00:10:32.960 | And to be able to have these emotional connections
00:10:36.560 | and experiences with people
00:10:38.600 | that you would typically only find inside of a movie.
00:10:41.320 | And the motivation very much was Pixar.
00:10:44.320 | We had an incredible respect and appreciation
00:10:47.240 | for what they were able to build
00:10:49.160 | in this really beautiful fashion and film.
00:10:51.680 | But it was always, one, it was virtual,
00:10:54.240 | and two, it was like a story on rails
00:10:56.560 | that had no interactivity to it.
00:10:57.880 | It was very fixed and it obviously had a magic to it,
00:11:01.360 | but where you really start to hit a different level
00:11:03.880 | of experiences when you're actually able
00:11:05.160 | to physically interact with that robot.
00:11:06.600 | - And then that was your idea with Anki,
00:11:08.200 | like the first product was the cars.
00:11:10.280 | So basically you take a toy, you add intelligence into it
00:11:15.280 | in the same way you would add intelligence
00:11:18.560 | into AI systems within a video game,
00:11:21.480 | but you're now bringing it into the physical space.
00:11:23.320 | So the idea is really brilliant,
00:11:25.200 | which is you're basically bringing video games to life.
00:11:29.880 | - Exactly, that's exactly right.
00:11:30.800 | We literally use that exact same phrase
00:11:32.640 | because in the case of "Drive,"
00:11:34.400 | this was a parallel of the racing genre.
00:11:37.200 | And the goal was to effectively
00:11:39.920 | have a physical racing experience,
00:11:42.040 | but have a virtual state at all times
00:11:45.000 | that matches what's happening in the physical world.
00:11:47.200 | And then you can have a video game off of that
00:11:48.800 | and you can have different characters,
00:11:51.440 | different traits for the cars, weapons and interactions
00:11:55.800 | and special abilities and all these sorts of things
00:11:57.440 | that you think of virtually,
00:11:59.160 | but then you can have it physically.
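
A minimal, hypothetical sketch of the "virtual state mirroring the physical world" idea described here: localization updates from the physical cars keep a game-side state in sync, game rules run on that virtual state, and the results are pushed back to the cars as commands. All names and numbers below are invented for illustration and are not Anki's actual engine.

```python
from dataclasses import dataclass

@dataclass
class CarState:
    # Virtual mirror of one physical car (all fields hypothetical).
    car_id: str
    track_position_m: float = 0.0   # estimated distance along the track
    lane_offset_m: float = 0.0      # lateral offset from the centerline
    speed_mps: float = 0.0
    energy: int = 100               # purely virtual game attribute

class PhysicalRacingEngine:
    """Toy sketch of a game engine whose state mirrors physical cars."""

    def __init__(self):
        self.cars = {}

    def on_localization_update(self, car_id, position_m, lane_m, speed_mps):
        # Called whenever a physical car reports where it is on the track.
        car = self.cars.setdefault(car_id, CarState(car_id))
        car.track_position_m = position_m
        car.lane_offset_m = lane_m
        car.speed_mps = speed_mps

    def apply_virtual_weapon(self, attacker_id, target_id):
        # A purely virtual interaction that can have a physical consequence.
        target = self.cars[target_id]
        target.energy -= 25
        if target.energy <= 0:
            # Game rule: a "disabled" car gets physically slowed down.
            return {"car_id": target_id, "max_speed_mps": 0.3}
        return None

    def tick(self):
        # One game-loop step: derive physical commands from virtual state.
        return [{"car_id": c.car_id,
                 "target_speed_mps": min(c.speed_mps + 0.1, 1.5)}
                for c in self.cars.values()]
```
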
00:12:00.440 | And one of the things that we were really surprised by
00:12:03.240 | that really stood out and immediately led us
00:12:06.000 | to really accelerate the path towards Cosmo
00:12:09.440 | is that things that feel like they're really constrained
00:12:12.200 | and simple in the physical world,
00:12:13.360 | they have an amplified impact on people
00:12:14.880 | where the exact same experience virtually
00:12:16.800 | would not have anywhere near the impact,
00:12:18.120 | but seeing it physically really stood out.
00:12:20.560 | And so effectively with "Drive,"
00:12:23.680 | we were creating a video game engine for the physical world.
00:12:26.360 | And then with Cosmo, we expanded that video game engine
00:12:29.480 | to create a character and kind of an animation
00:12:34.440 | and interaction engine on top of it
00:12:36.920 | that allowed us to start to create
00:12:39.000 | these much more rich experiences.
00:12:41.000 | And a lot of those elements were almost like a proving ground
00:12:44.240 | for what would human-robot interaction feel like
00:12:47.040 | in a domain that's much more forgiving,
00:12:49.280 | where you can make mistakes in a game.
00:12:51.280 | It's okay if a car goes off the track
00:12:54.800 | or if Cosmo makes a mistake.
00:12:57.320 | And what's funny is actually,
00:12:58.520 | we were so worried about that.
00:13:00.240 | In reality, we realized very quickly
00:13:02.040 | that those mistakes can be endearing.
00:13:03.800 | And if you make a mistake,
00:13:04.720 | as long as you realize you make a mistake
00:13:06.360 | and have the right emotional reaction to it,
00:13:07.760 | it builds even more empathy with the character.
00:13:09.840 | - That's brilliant.
00:13:10.680 | Exactly.
00:13:11.520 | So the thing you're optimizing for is fun.
00:13:14.320 | You have so much more freedom to fail, to explore.
00:13:17.800 | And also in the toy space,
00:13:20.120 | all of this is really brilliant.
00:13:21.760 | I got to ask you, backtrack.
00:13:24.160 | It seems for a roboticist to take a jump
00:13:28.120 | into the direction of fun is a brilliant move.
00:13:32.980 | Because one, you have the freedom to explore
00:13:34.800 | and to design all those kinds of things.
00:13:36.920 | And you can also build cheap robots.
00:13:39.520 | If you're not chasing perfection, and in toys,
00:13:45.300 | it's understood that you can go cheaper,
00:13:47.320 | which means, in robot terms, it's still expensive,
00:13:50.440 | but it's actually affordable by a large number of people.
00:13:53.560 | That's a really brilliant space to explore.
00:13:55.800 | - Yeah, that's right.
00:13:57.160 | And in fact, we realized pretty quickly
00:13:58.480 | that perfection is actually not fun.
00:14:00.600 | Because in a traditional roboticist sense,
00:14:03.680 | the first path planner,
00:14:05.200 | and this is the part that I worked on out of the gate,
00:14:08.120 | was a lot of the AI systems where you have these vehicles
00:14:12.120 | and cars racing, making optimal maneuvers
00:14:15.360 | to try to get ahead.
00:14:16.480 | And you realize very quickly that that's actually not fun
00:14:19.280 | because you want the chaos from mistakes.
00:14:23.040 | And so you start to intentionally almost add noise
00:14:25.560 | to the system in order to create more of a realism
00:14:28.600 | in the exact same way that a human player
00:14:30.340 | might start really ineffective and inefficient
00:14:33.160 | and then start to increase their quality bar
00:14:35.400 | as they progress.
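
A minimal sketch of this kind of intentional imperfection, assuming a planner that outputs an optimal steering command: noise is added whose magnitude shrinks as the AI opponent's skill ramps up. This is illustrative only, not the actual planner being described.

```python
import random

def fun_steering_command(optimal_steering: float,
                         skill: float,
                         max_noise: float = 0.4) -> float:
    """Perturb an optimal control output so an AI opponent feels human.

    skill in [0, 1]: 0 = sloppy beginner, 1 = near-optimal driver.
    The noise shrinks as skill grows, so the opponent "improves" over the
    course of a race instead of driving perfectly from the start.
    (Hypothetical sketch, not Anki's actual system.)
    """
    noise_scale = max_noise * (1.0 - skill)
    return optimal_steering + random.gauss(0.0, noise_scale)

# The same optimal command feels different at low vs. high skill:
print(fun_steering_command(0.1, skill=0.2))   # noisy, beginner-like
print(fun_steering_command(0.1, skill=0.95))  # close to optimal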
00:14:37.400 | And there is a really, really aggressive constraint
00:14:40.160 | that's forced on you by being a consumer product
00:14:42.600 | where the price point matters a ton,
00:14:44.300 | particularly in an entertainment
00:14:46.360 | where you can't make a thousand dollar product
00:14:50.360 | unless you're gonna meet the expectations
00:14:52.480 | of a thousand dollar product.
00:14:53.480 | And so in order to make this work,
00:14:56.360 | your cost of goods had to be well under a hundred dollars.
00:15:00.280 | In the case of Cosmo, we got it under $50,
00:15:03.080 | end to end fully packaged and delivered.
00:15:04.600 | - And it was under $200 at retail.
00:15:07.920 | Okay, if we sit down at the early stages,
00:15:13.200 | if we go back to that,
00:15:15.000 | and you're sitting down and thinking about
00:15:16.440 | what Cosmo looks like from a design perspective
00:15:18.960 | and from a cost perspective,
00:15:20.200 | I imagine that was part of the conversation.
00:15:22.400 | First of all, what came first?
00:15:25.960 | Did you have a cost in mind?
00:15:27.920 | Is there a target you're trying to chase?
00:15:30.260 | Did you have a vision in mind, like size?
00:15:32.740 | Did you have, 'cause there's a lot of unique qualities
00:15:35.200 | to Cosmo, so for people who don't know,
00:15:36.640 | they should definitely check it out.
00:15:38.480 | There's a display, there's eyes on the display
00:15:41.080 | and those eyes can, it's pretty low resolution eyes, right?
00:15:44.640 | But they still able to convey a lot of emotion.
00:15:47.640 | And there's this arm, like that--
00:15:51.360 | - Lift sort of--
00:15:52.200 | - Lift stuff, but there's something about arm movement
00:15:55.880 | that adds even more kind of depth.
00:15:59.640 | It's like the face communicates emotion and sadness
00:16:03.680 | and disappointment and happiness.
00:16:05.920 | And then the arms kind of communicates, I'm trying here.
00:16:10.440 | - Yeah.
00:16:11.280 | - I'm doing my best in this complicated world.
00:16:13.800 | - Exactly, so it's interesting because like,
00:16:17.640 | all of Cosmo is only four degrees of freedom
00:16:20.440 | and two of them are the two treads,
00:16:22.040 | which is for basic movement.
00:16:23.400 | And so you literally have only a head that goes up and down,
00:16:27.160 | a lift that goes up and down, and then your two wheels.
00:16:30.080 | And you have sound and a screen,
00:16:32.960 | and a low resolution screen.
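
To make concrete how small that action space is, here is a hypothetical sketch of a complete command for a robot with Cosmo's degrees of freedom; the field names are invented for illustration and are not the actual Anki SDK.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FourDofRobotCommand:
    # The entire actuated space: two tread speeds, head pitch, lift height.
    left_tread_mmps: float = 0.0
    right_tread_mmps: float = 0.0
    head_angle_deg: float = 0.0      # head only moves up and down
    lift_height_mm: float = 0.0      # lift only moves up and down
    # Everything else is rendered rather than actuated:
    eye_expression: str = "neutral"  # drawn on the low-resolution screen
    sound_clip: Optional[str] = None
```
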
00:16:34.500 | And with that, it's actually pretty incredible
00:16:36.160 | what you can come up with, where, like you said,
00:16:39.480 | it's a really interesting give and take
00:16:42.120 | because there's a lot of ideas far beyond that, obviously,
00:16:44.200 | as you can imagine, where, like you said, how big is it?
00:16:46.320 | How much degrees of freedom?
00:16:47.440 | What does he look like?
00:16:48.600 | What does he sound like?
00:16:50.520 | How does he communicate?
00:16:51.800 | It's a formula that actually scales
00:16:53.420 | way beyond entertainment.
00:16:54.380 | This is the formula for human kind of robot interface
00:16:57.480 | more generally, is you almost have this triangle
00:17:00.280 | between the physical aspects of it,
00:17:02.960 | the mechanics, the industrial design,
00:17:04.440 | what's mass producible, the cost constraints and so forth.
00:17:07.520 | You have the AI side of how do you understand
00:17:10.800 | the world around you, interact intelligently with it,
00:17:13.240 | execute what you want to execute,
00:17:14.560 | so perceive the environment, make intelligent decisions,
00:17:17.960 | and move forward.
00:17:19.760 | And then you have the character side of it.
00:17:22.640 | Most companies that have done anything
00:17:24.640 | in human-robot interaction really miss the mark
00:17:27.920 | or under-invest in the character side of it.
00:17:30.280 | They over-invest in the mechanical side of it,
00:17:32.540 | and then varied results on the AI side of it.
00:17:36.280 | And so the thinking is that you put
00:17:37.600 | more mechanical flexibility into it,
00:17:39.120 | you're gonna do better.
00:17:41.080 | You don't necessarily, you actually create
00:17:42.600 | a much higher bar for a higher ROI
00:17:45.160 | because now your price point goes up,
00:17:46.600 | your expectations go up, and if the AI can't meet it
00:17:49.560 | or the overall experience isn't there, you miss the mark.
00:17:53.360 | - So who, like how did you, through those conversations,
00:17:56.040 | get the cost down so much and make it so simple?
00:17:59.440 | Like, there's a big theme here
00:18:02.320 | because you come from the mecca of robotics,
00:18:04.680 | which is Carnegie Mellon University, robotics.
00:18:08.160 | For all the people I've interacted with
00:18:10.080 | that come from there or just from the world experts
00:18:14.080 | at robotics, they don't, they would never
00:18:17.320 | build something like Cosmo.
00:18:19.240 | And so where did that come from, the simplicity?
00:18:22.840 | - It came from this combination of a team that we had.
00:18:25.360 | It was quite cool because, and by the way,
00:18:27.840 | you ask anybody that's experienced
00:18:29.560 | in the toy entertainment space,
00:18:32.040 | you'll never sell a product over $99.
00:18:34.200 | That was fundamentally false,
00:18:35.560 | and we believed it to be false.
00:18:37.040 | It was because the experience had to kind of meet the mark.
00:18:40.040 | And so we pushed past that amount,
00:18:41.440 | but there was a pressure where the higher you go,
00:18:43.800 | the more seasonal you become and the tougher it becomes.
00:18:46.240 | And so on the cost side, we very quickly partnered up
00:18:50.160 | with some previous contacts that we worked with
00:18:52.160 | where, just as an example,
00:18:53.840 | our head of mechanical engineering
00:18:55.560 | was one of the earliest heads of engineering at Logitech
00:18:59.720 | and has a billion units of consumer products
00:19:02.160 | in circulation that he's worked on.
00:19:03.760 | So like crazy low cost,
00:19:05.960 | high volume consumer product experience.
00:19:07.840 | We had a really great mechanical engineering team
00:19:09.800 | and just a very practical mindset
00:19:11.680 | where we were not gonna compromise on feasibility
00:19:14.360 | in the market in order to chase something
00:19:16.320 | that would be an enabler.
00:19:17.400 | And we pushed a huge amount of expectations
00:19:19.520 | onto the software team where, yes,
00:19:20.840 | we're gonna use cheap, noisy motors and sensors,
00:19:24.440 | but we're gonna fix it on the software side.
00:19:28.040 | Then we found on the design and character side,
00:19:30.920 | there was a faction that was more
00:19:31.960 | from like a game design background
00:19:33.720 | that thought that it should be very games-driven Cosmo,
00:19:36.600 | where you create a whole bunch of games experiences
00:19:38.800 | and it's all about like game mechanics.
00:19:41.000 | And then there was a faction, which my co-founder
00:19:44.400 | and I were the most involved in and
00:19:45.520 | like really believed in, which was character-driven.
00:19:47.760 | And the argument is that you will never compete
00:19:49.840 | with what you can do virtually from a game standpoint,
00:19:52.360 | but you actually, on the character side,
00:19:54.080 | put this into your wheelhouse
00:19:55.760 | and put it more towards your advantage
00:19:58.000 | because a physical character
00:19:59.080 | has a massively higher impact physically than virtually.
00:20:04.080 | - Okay, can I just pause on that?
00:20:05.280 | 'Cause this is so brilliant.
00:20:07.200 | For people who don't know, Cosmo plays games with you,
00:20:10.440 | but there's also a depth of character.
00:20:13.200 | And I actually, when I was playing with it,
00:20:16.440 | I wondered exactly what is the compelling aspect of this?
00:20:22.440 | Because to me, obviously I'm biased,
00:20:25.000 | but to me, the character, what I enjoyed most, honestly,
00:20:29.040 | or what got me to return to it is the character.
00:20:32.760 | - That's right.
00:20:33.720 | - That's a fascinating discussion of, you're right.
00:20:37.040 | Ultimately, you cannot compete
00:20:40.200 | on the quality of the gaming experience.
00:20:42.640 | - It's too restrictive.
00:20:43.480 | The physical world is just too restrictive
00:20:44.880 | and you don't have a graphics engine, it's like all this.
00:20:47.600 | But on the character side,
00:20:48.920 | and clearly we moved in that direction
00:20:52.080 | as like kind of the winning path.
00:20:54.360 | And we partnered up with this really,
00:20:58.480 | we immediately went towards Pixar and Carlos Baena,
00:21:02.320 | he was one of, had been at Pixar for nine years.
00:21:05.400 | He'd worked on tons of the movies,
00:21:07.040 | including "Wall-E" and others,
00:21:08.880 | and just immediately kind of spoke the language
00:21:11.680 | and it just clicked on how you think about that
00:21:13.760 | like kind of magic and drive.
00:21:14.960 | And then we built out a team with him
00:21:18.120 | as like a really kind of prominent kind of driver of this
00:21:20.840 | with different types of backgrounds
00:21:22.280 | and animators and character developers
00:21:23.920 | where we put these constraints on the team,
00:21:27.800 | but then got them to really try to create magic
00:21:30.880 | despite that.
00:21:32.080 | And we converged on this system
00:21:34.440 | that was at the overlap of character and the character AI
00:21:38.840 | that were, if you imagine the dimensionality of emotions,
00:21:41.760 | happy, sad, angry, surprised, confused, scared,
00:21:46.040 | like you think of these extreme emotions,
00:21:49.400 | we almost like kind of put this challenge
00:21:52.200 | to kind of populate this library of responses
00:21:54.520 | on how do you show the extreme response
00:21:58.040 | that like goes to the extreme spectrum on angry
00:22:00.960 | or frustrated or whatever.
00:22:02.320 | And so that gave us a lot of intuition and learnings.
00:22:05.280 | And then we started parameterizing them
00:22:07.800 | where it wasn't just a fixed recording,
00:22:09.160 | but they were parameterized and had randomness to them
00:22:11.240 | where you could have infinite permutations of happy
00:22:14.200 | and surprised and so forth.
00:22:16.520 | And then we had a behavioral engine
00:22:18.040 | that took the context from the real world
00:22:20.840 | and would interpret it
00:22:22.320 | and then create kind of probability mappings
00:22:24.400 | on what sort of responses you would have
00:22:26.600 | that actually made sense.
00:22:27.480 | And so if Cosmo saw you for the first time in a day,
00:22:31.360 | he'd be really surprised and happy
00:22:33.120 | in the same way that the first time you walk in
00:22:34.680 | and like your toddler sees you, they're so happy,
00:22:37.600 | but they're not gonna be that happy
00:22:38.640 | for the entirety of your next two hours.
00:22:40.640 | But like you have this like spike in response,
00:22:42.800 | or if you leave him alone for too long,
00:22:44.560 | he gets bored and starts causing trouble
00:22:46.160 | and like nudging things off the table.
00:22:48.520 | Or if you beat him in a game,
00:22:50.480 | the most enjoyable emotions are him getting frustrated
00:22:52.880 | and grumpy to a point where our testers
00:22:55.480 | and our customers would be like,
00:22:57.160 | I had to let him win because I don't want him to be upset.
00:22:59.440 | And so you start to like create this feedback loop
00:23:03.320 | where you see how powerful those emotions are.
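
A toy sketch of the behavioral-engine idea described above, assuming a simple table from context events to weighted, parameterized emotional responses; the events, weights, and parameters are hypothetical, not Anki's actual system.

```python
import random

# Real-world context maps to a weighted set of parameterized emotional
# responses, so the same trigger never plays back as an identical recording.
RESPONSE_TABLE = {
    "first_sighting_of_day": [("surprised_happy", 0.8), ("happy", 0.2)],
    "left_alone_too_long":   [("bored", 0.6), ("mischievous", 0.4)],
    "lost_game":             [("frustrated", 0.7), ("grumpy", 0.3)],
    "won_game":              [("proud", 0.6), ("happy", 0.4)],
}

def pick_response(context_event: str) -> dict:
    """Sample an emotion for a context event, then parameterize it.

    Randomized intensity and timing make every playback slightly different,
    which keeps repeated reactions from feeling canned.
    """
    options = RESPONSE_TABLE.get(context_event, [("neutral", 1.0)])
    emotions, weights = zip(*options)
    emotion = random.choices(emotions, weights=weights, k=1)[0]
    return {
        "emotion": emotion,
        "intensity": random.uniform(0.6, 1.0),    # how extreme the animation is
        "timing_jitter_s": random.uniform(0.0, 0.3),
    }

# Two identical triggers produce similar but non-identical reactions:
print(pick_response("lost_game"))
print(pick_response("lost_game"))
```
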
00:23:05.760 | And just to give you an example,
00:23:06.840 | something as simple as eye contact.
00:23:08.880 | You don't think about it in a movie,
00:23:10.040 | just like it kind of happens like,
00:23:11.880 | camera angles and so forth,
00:23:13.720 | but that's not really a prominent source of interaction.
00:23:17.160 | What happens when a physical character like Cosmo,
00:23:20.400 | when he makes eye contact with you,
00:23:22.200 | it built universal kind of connection,
00:23:26.160 | kids all the way through adults.
00:23:27.760 | And it was truly universal.
00:23:29.560 | It was not like people stopped caring
00:23:31.320 | after 10, 12 years old.
00:23:32.920 | And so we started doing experiments
00:23:36.840 | and we found something as simple
00:23:38.040 | as increasing the amount of eye contact,
00:23:40.880 | like the amount of times in a minute
00:23:42.640 | that he'll look over for your approval
00:23:44.120 | to like kind of make eye contact,
00:23:46.080 | just by I think doubling it.
00:23:48.160 | We increased the play time engagement by 40%.
00:23:50.920 | Like you see these sort of like kind of interactions
00:23:52.880 | where you build that empathy.
00:23:53.720 | And so we studied pets, we studied virtual characters.
00:23:57.440 | There's like a lot of times actually dogs
00:23:59.480 | are one of the perfect,
00:24:01.080 | most perfect influencers behind these sort of interactions.
00:24:05.040 | And what we realized is that the games
00:24:06.960 | were not there to entertain you.
00:24:08.120 | The games were to create context
00:24:10.160 | to bring out the character.
00:24:11.520 | And if you think about the types of games that you played,
00:24:14.320 | they were relatively simple,
00:24:15.640 | but they were always ones to create scenarios
00:24:17.880 | of either tension or winning or losing or surprise
00:24:20.560 | or whatever the case might be.
00:24:22.520 | And they were purely there to just like create context
00:24:25.600 | to where an emotion could feel intelligent and not random.
00:24:28.800 | And in the end, it was all about the character.
00:24:30.880 | - So yeah, there's so many elements to play with here.
00:24:33.880 | So you said dogs.
00:24:35.480 | What lessons do we draw from cats
00:24:37.520 | who don't seem to give a damn about you?
00:24:40.200 | Is that just another character?
00:24:41.680 | Is it another?
00:24:42.520 | - It's just another character.
00:24:43.720 | And so you could almost like in the early explorations,
00:24:46.240 | we thought that it would be really incredible
00:24:48.480 | if you had a diversity of characters
00:24:50.440 | where you almost help encourage which direction it goes,
00:24:52.920 | just like in a role-playing game.
00:24:54.960 | And you had, like, think of like the seven dwarves sort of.
00:24:59.600 | And initially we even thought that it would be amazing
00:25:02.680 | if like their characters actually help them
00:25:07.680 | have strengths and weaknesses in some,
00:25:10.120 | like whatever they end up doing.
00:25:11.360 | Like some are scared, some are arrogant,
00:25:15.240 | some are super warm and like kind of friendly.
00:25:18.800 | And in the end we focused on one
00:25:20.320 | because it made it very clear that,
00:25:21.760 | hey, we got to build out enough depth here
00:25:23.120 | because you're kind of trying to expand.
00:25:26.160 | It's almost like how long can you maintain a fiction
00:25:28.320 | that this character is alive
00:25:30.280 | to where the person's explorations don't hit a boundary,
00:25:33.560 | which happens almost immediately with typical toys.
00:25:36.840 | And even with video games,
00:25:39.480 | how long can we create that immersive experience
00:25:42.240 | to where you expand the boundary?
00:25:43.640 | And one of the things we realized
00:25:44.800 | is that you're just way more forgiving
00:25:47.520 | when something has a personality and it's physical.
00:25:51.080 | That is the key that unlocks robotics interacting
00:25:56.080 | in the physical world more generally
00:25:57.600 | is that when you don't have a personality
00:26:02.360 | and you make a mistake as a robot,
00:26:04.040 | the stupid robot made a mistake.
00:26:05.560 | Why is it not perfect?
00:26:06.800 | When you have a character and you make a mistake,
00:26:08.800 | you have empathy and it becomes endearing
00:26:10.440 | and you're way more forgiving.
00:26:11.520 | And that was the key that was like,
00:26:13.160 | I think goes far, far beyond entertainment.
00:26:14.920 | - It actually builds the depth of the personality,
00:26:17.400 | the mistakes.
00:26:18.240 | So let me ask the movie "Her" question then.
00:26:20.920 | How, so "Cosmos" feels like the early days
00:26:26.600 | of something that will obviously be prevalent
00:26:29.040 | throughout society at a scale that we cannot even imagine.
00:26:34.040 | My sense is it seems obvious
00:26:38.520 | that these kinds of characters will permeate society
00:26:41.880 | and we'll be friends with them.
00:26:43.920 | We'll be interacting with them in different ways.
00:26:45.640 | In a way we, I mean, you don't think of it this way,
00:26:48.080 | but when you play video games,
00:26:50.540 | they're often cold and impersonal,
00:26:55.400 | but even then you think about role-playing games,
00:26:59.560 | you become friends with certain characters in that game.
00:27:02.840 | They don't remember much about you.
00:27:04.960 | They're just telling a story.
00:27:07.960 | It's exactly what you're saying.
00:27:09.360 | They exist in that virtual world.
00:27:11.380 | But if they acknowledge that you exist
00:27:13.560 | in this physical world,
00:27:14.400 | if the characters in the game remember that you exist,
00:27:18.440 | that you, like for me, like Lex,
00:27:20.820 | they understand that I'm a human being
00:27:23.040 | who has like hopes and dreams and so on.
00:27:26.000 | It seems like there's going to be like billions,
00:27:31.000 | if not trillions of Cosmos in the world.
00:27:34.080 | So if we look at that future,
00:27:36.660 | there's several questions to ask.
00:27:38.080 | How intelligent does that future Cosmo
00:27:40.760 | need to be to create fulfilling relationships
00:27:45.760 | like friendships?
00:27:48.300 | - Yeah, it's a great question.
00:27:50.120 | And part of it was the recognition
00:27:51.520 | that it's going to take time to get there,
00:27:52.580 | because it has to be a lot more intelligent,
00:27:54.840 | because what's good enough to be a magical experience
00:27:58.000 | for an eight-year-old,
00:28:00.380 | it's a higher bar to do that,
00:28:03.000 | be like a pet in the home,
00:28:04.920 | or to help with functional interface
00:28:07.540 | in an office environment or in a home or and so forth.
00:28:10.520 | And so, and the idea was that you build on that
00:28:12.840 | and you kind of get there.
00:28:13.760 | And as technology becomes more prevalent
00:28:16.040 | and less expensive and so forth,
00:28:17.360 | you can start to kind of work up to it.
00:28:19.400 | But you're absolutely right at the end of the day,
00:28:22.200 | we almost equated it to how the touchscreen
00:28:24.920 | created like this really novel interface
00:28:26.680 | to physical kind of devices like this.
00:28:29.080 | This is the extension of it
00:28:30.480 | where you have much richer physical interaction
00:28:33.680 | in the real world.
00:28:34.520 | This is the enabler for it.
00:28:36.360 | And it shows itself in a few kind of really obvious places.
00:28:39.520 | So just take something as simple as a voice assistant.
00:28:42.240 | You will never, most people will never tolerate
00:28:45.400 | an Alexa or a Google Home,
00:28:47.080 | just starting a conversation proactively
00:28:50.560 | when you weren't kind of expecting it,
00:28:51.960 | because it feels weird.
00:28:53.360 | It's like you were listening and like,
00:28:54.640 | and then now you're kind of, it feels intrusive.
00:28:57.200 | But if you had a character,
00:28:59.680 | like a cat that touches and gets your attention,
00:29:01.800 | or a toddler, like you never think twice about it.
00:29:03.520 | And what we found really kind of immediately
00:29:05.380 | is that these types of characters like Cosmo
00:29:07.840 | and they would like roam around
00:29:08.720 | and kind of get your attention.
00:29:09.960 | And we had a future version that was always on
00:29:12.040 | kind of called Vector.
00:29:13.260 | People were way more forgiving.
00:29:14.880 | And so you could initiate interaction
00:29:17.080 | in a way that is not acceptable for machines.
00:29:21.160 | And in general, you know,
00:29:23.080 | there's a lot of ways to customize it,
00:29:24.800 | but it makes people who are skeptical of technology
00:29:27.820 | much more comfortable with it.
00:29:29.240 | There was like, there were a couple of really,
00:29:30.800 | really prominent examples of this.
00:29:32.880 | So when we launched in Europe,
00:29:34.720 | and so we were in, I think like a dozen countries,
00:29:38.360 | if I remember correctly,
00:29:39.200 | but like we went pretty aggressively in launching
00:29:41.880 | in Germany and France and UK.
00:29:45.320 | And we were very worried in Europe
00:29:46.760 | because there's obviously like a really,
00:29:48.520 | a socially higher bar for privacy and security
00:29:51.760 | where you've heard about how many companies
00:29:53.920 | have had troubles on things that might've been okay
00:29:57.440 | in the US, but like are just not okay
00:29:59.120 | in Germany and France in particular.
00:30:01.000 | And so we were worried about this
00:30:02.080 | because you have, you know, Cosmo,
00:30:04.280 | who's, you know, in a future product Vector,
00:30:08.080 | like where you have cameras, you have microphones,
00:30:10.040 | it's kind of connected and like you're playing with kids
00:30:12.520 | and like in these experiences,
00:30:14.720 | and you're like, this is like ripe to be like a nightmare
00:30:17.560 | if you're not careful.
00:30:19.440 | And the journalists are like notoriously
00:30:22.440 | like really, really tough on these sorts of things.
00:30:25.000 | We were shocked and we prepared so much
00:30:27.840 | for what we would have to encounter.
00:30:30.120 | We were shocked in that not once
00:30:32.520 | from any journalists or customer
00:30:34.840 | did we have any complaints beyond like a really casual
00:30:38.640 | kind of question.
00:30:40.120 | And it was because of the character
00:30:42.560 | where when the conversation came up,
00:30:46.000 | it was almost like, well, of course he has to see and hear
00:30:48.400 | how else is he gonna be alive and interacting with you?
00:30:50.920 | And it completely disarmed this like fear of technology
00:30:54.800 | that enabled this interaction to be much more fluid.
00:30:57.680 | And again, like entertainment was a proving ground,
00:30:59.520 | but that is like, you know, there's like ingredients there
00:31:02.560 | that carry over to a lot of other elements down the road.
00:31:06.200 | - That's hilarious that we're a lot less concerned
00:31:08.440 | about privacy if the thing is value and charisma.
00:31:12.160 | I mean, that's true for all of human to human interactions.
00:31:16.040 | - It's an understanding of intent where like,
00:31:18.120 | well, he's looking at me, he can see me.
00:31:19.560 | If he's not looking at me, he can't see me.
00:31:20.920 | Right, so it's almost like you're communicating intent
00:31:24.520 | and with that intent, people were like kind of
00:31:28.160 | a more understanding and calmer.
00:31:29.360 | And it's interesting, it was just the earliest
00:31:32.080 | kind of version of starting to experiment with this,
00:31:33.880 | but it was an enabler.
00:31:35.360 | And then you have like completely different dimensions
00:31:38.080 | where like, you know, kids with autism
00:31:39.320 | had like an incredible connection with Cosmo
00:31:41.360 | that just went beyond anything we'd ever seen.
00:31:43.160 | And we have like these just letters
00:31:45.240 | that we would receive from parents.
00:31:46.520 | And we had some research projects kind of going on
00:31:48.760 | with some universities on studying this.
00:31:50.920 | But there's an interesting dimension there
00:31:54.200 | that got unlocked that just hadn't existed before
00:31:56.880 | that has these really interesting kind of links
00:32:00.360 | into society and a potential building block
00:32:03.920 | of future experience.
00:32:05.040 | - So if you look out into the future,
00:32:07.840 | do you think we will have beyond a particular game,
00:32:12.840 | you know, a companion like Her, like the movie Her,
00:32:18.080 | or like a Cosmo that's kind of asks you
00:32:21.560 | how your day went too, right?
00:32:24.920 | You know, like a friend.
00:32:26.160 | How many years away from that do you think we are?
00:32:29.440 | What's your intuition?
00:32:30.880 | - Good question.
00:32:31.720 | So I think the idea of a different type of character,
00:32:34.720 | like more closer to like kind of a pet style companionship,
00:32:37.360 | it will come way faster.
00:32:38.560 | And there's a few reasons.
00:32:41.240 | One is like to do something like in Her,
00:32:44.440 | that's like effectively almost general AI.
00:32:47.640 | And the bar is so high that if you miss it by a bit,
00:32:49.880 | you hit the uncanny valley where it just becomes creepy
00:32:51.920 | and like, and not appealing.
00:32:55.480 | Because the closer you try to get to a human
00:32:57.200 | in form and interface and voice, the harder it becomes.
00:33:00.720 | Whereas you have way more flexibility
00:33:03.840 | on still landing a really great experience
00:33:06.680 | if you embrace the idea of a character.
00:33:08.160 | And that's why, one of the other reasons
00:33:10.680 | why we didn't have a voice,
00:33:12.400 | and also why like a lot of video game characters,
00:33:15.280 | like Sims, for example, does not have a voice
00:33:17.800 | when you think about it.
00:33:19.160 | It wasn't just a cost savings like for them.
00:33:22.000 | It was actually for all of these purposes,
00:33:24.320 | it was because when you have a voice,
00:33:26.480 | you immediately narrow down the appeal
00:33:28.080 | to some particular demographic or age range
00:33:30.360 | or kind of style or gender.
00:33:33.680 | If you don't have a voice,
00:33:34.920 | people interpret what they want to interpret.
00:33:37.440 | And an eight year old might get
00:33:38.640 | a very different interpretation than a 40 year old,
00:33:41.240 | but you create a dynamic range.
00:33:42.480 | And so you just, you can lean into these advantages
00:33:44.840 | much more in something that doesn't resemble a human.
00:33:48.120 | And so that'll come faster.
00:33:49.920 | I don't know when a human like,
00:33:52.080 | that's just still like Matt,
00:33:54.360 | just complete R&D at this point.
00:33:56.120 | The chat interfaces are getting way more interesting
00:33:59.120 | and richer, but it's still a long way to go
00:34:01.400 | to kind of pass the test of, you know.
00:34:04.600 | - Well, let me, like, let's consider like,
00:34:07.680 | let me play devil's advocate.
00:34:09.800 | So Google is a very large company that's servicing,
00:34:13.520 | it's creating a very compelling product
00:34:15.320 | that wants to provide a service to a lot of people.
00:34:17.680 | But let's go outside of that.
00:34:20.040 | You said characters.
00:34:21.720 | - Yeah.
00:34:22.560 | - It feels like, and you also said that
00:34:24.560 | it requires general intelligence to be a successful
00:34:27.720 | participant in a relationship,
00:34:29.440 | which could explain why I'm single.
00:34:30.880 | This is very, but the, I honestly want to push back
00:34:34.480 | on that a little bit because I feel like,
00:34:37.400 | is it possible that if you're just good
00:34:40.280 | at playing a character, in a movie,
00:34:43.040 | there's a bunch of characters.
00:34:44.240 | If you just understand what creates compelling characters,
00:34:47.760 | and then you just are that character,
00:34:50.080 | and you exist in the world, and other people find you,
00:34:53.360 | and they connect with you, just like you do
00:34:54.960 | when you talk to somebody at a bar.
00:34:56.600 | I like this character, this character's kind of shady,
00:34:58.840 | I don't like them.
00:34:59.800 | You pick the ones that you like,
00:35:01.600 | and, you know, maybe it's somebody that's,
00:35:04.480 | reminds you of your father or mother,
00:35:06.520 | I don't know what it is, but the Freudian thing,
00:35:08.920 | but there's some kind of connection that happens,
00:35:11.600 | and that's the cosmo you connect to.
00:35:14.360 | That's the future cosmo you connect,
00:35:15.920 | and that's, so I guess the statement I'm trying to make,
00:35:19.040 | is it possible to achieve a depth of friendship
00:35:22.200 | without solving general intelligence?
00:35:24.560 | - I think so, and it's about intelligent
00:35:26.200 | kind of constraints, right?
00:35:27.320 | And just, you set expectations and constraints,
00:35:31.160 | such that in the space that's left, you can be successful.
00:35:33.440 | And so, you can do that by having a very focused domain
00:35:37.200 | that you can operate in.
00:35:38.040 | For example, you're a customer support agent
00:35:39.280 | for a particular product, and you create intelligence
00:35:41.440 | and a good interface around that.
00:35:43.040 | Or, you know, kind of in the personal companionship side,
00:35:46.360 | you can't be everything to, across the board,
00:35:50.040 | you kind of solve those constraints.
00:35:51.720 | And I think it's possible.
00:35:53.720 | My worry is, right now, I don't see anybody
00:35:58.520 | that has picked up on where kind of cosmo left off,
00:36:02.520 | and is pushing on it in the same way.
00:36:04.160 | And so, I don't know if it's the sort of thing
00:36:05.680 | where, similar to like how, you know, in dot-com,
00:36:08.360 | there were all these concepts that we considered,
00:36:10.240 | like, you know, that didn't work out, or like, failed,
00:36:13.000 | or like, were too early or whatnot,
00:36:14.520 | and then 20 years later, you have these, like,
00:36:16.160 | incredible successes on almost the same concept.
00:36:18.560 | Like, it might be that sort of thing
00:36:20.040 | where, like, there's another pass at it
00:36:21.680 | that happens in five years or in 10 years.
00:36:24.000 | But it does feel like that appreciation of that,
00:36:28.040 | like, the three-legged stool, if you will,
00:36:31.040 | between, like, you know, the hardware,
00:36:32.360 | the AI, and the character, that balance,
00:36:35.240 | it's hard to, I'm not aware of anywhere right now
00:36:39.080 | where, like, that same kind of aggressive drive
00:36:42.000 | with the value on the character is happening.
00:36:44.360 | And so- - To me,
00:36:45.720 | just a prediction, exactly as you said,
00:36:48.440 | something that looks awfully a lot like cosmo,
00:36:50.800 | not in the actual physical form,
00:36:52.640 | but in the three-legged stool, something like that,
00:36:55.680 | in some number of years, will be a trillion-dollar company.
00:36:58.560 | I don't understand, like, it's obvious to me
00:37:01.600 | that, like, character, not just as robotic companions,
00:37:06.600 | but in all our computers, they'll be there.
00:37:10.160 | It's like, Clippy was, like, two legs of that stool
00:37:15.160 | or something like that.
00:37:17.320 | I mean, those are all different attempts.
00:37:19.680 | What's really confusing to me is they're born,
00:37:24.680 | these attempts, and everybody gets excited,
00:37:27.880 | and for some reason, they die,
00:37:29.400 | and then nobody else tries to pick it up.
00:37:31.680 | And then maybe a few years later,
00:37:33.600 | a crazy guy like you comes around
00:37:36.040 | with just enough brilliance and vision to create this thing
00:37:40.760 | and is born, a lot of people love it.
00:37:43.480 | A lot of people get excited,
00:37:45.000 | but maybe the timing is not right yet.
00:37:47.440 | And then when the timing is right, it just blows up.
00:37:51.160 | It just keeps blowing up more and more
00:37:53.480 | until it just blows up, and I guess everything
00:37:56.040 | in the full span of human civilization
00:37:58.040 | collapses eventually.
00:37:59.800 | - And that wouldn't surprise me at all.
00:38:01.160 | And, like, what's gonna be different
00:38:02.240 | in another five years or 10 years or whatnot?
00:38:04.680 | Physical component costs will continue
00:38:06.840 | to come down in price, and mobile devices
00:38:10.680 | and computation's gonna become more and more prevalent,
00:38:12.600 | as well as cloud as a big tool to offload cost.
00:38:16.760 | AI is gonna be a massive transformation
00:38:19.120 | compared to what we dealt with,
00:38:20.840 | where everything from voice understanding
00:38:23.360 | to just kind of a broader contextual understanding
00:38:28.360 | and mapping of semantics and understanding scenes
00:38:34.560 | and so forth.
00:38:35.400 | And then the character side will continue
00:38:37.200 | to kind of progress as well,
00:38:38.520 | 'cause that magic does exist,
00:38:39.600 | it just exists in different forms.
00:38:41.400 | And you have just the brilliance
00:38:43.120 | of the tapping and animation and these other areas
00:38:46.920 | where that was a big unlock in film, obviously.
00:38:51.920 | And so I think, yeah, the pieces can reconnect
00:38:54.360 | and the building blocks are actually gonna be
00:38:55.640 | way more impressive than they were five years ago.
00:38:58.160 | - So in 2019, Anki, the company that created Cosmo,
00:39:04.600 | the company that you started, had to shut down.
00:39:08.520 | How did you feel at that time?
00:39:11.000 | - Yeah, it was tough.
00:39:13.320 | That was a really emotional stretch
00:39:15.520 | and it was a really tough year.
00:39:17.880 | Like about a year ahead of that
00:39:19.560 | was actually a pretty brutal stretch
00:39:21.600 | because we were kind of life or death on many, many moments,
00:39:26.600 | just navigating these insane kind of just ups and downs
00:39:31.040 | and barriers.
00:39:32.280 | And the thing that made it,
00:39:33.760 | just rewinding a tiny bit,
00:39:37.640 | like what ended up being really challenging
00:39:40.320 | about it as a business was that,
00:39:42.200 | from a commercial standpoint
00:39:44.400 | and customer reception standpoint,
00:39:46.040 | there was a lot of things you could point to
00:39:47.200 | that were like pretty big successes.
00:39:48.840 | So millions of units, got to like pretty serious revenue,
00:39:53.320 | like kind of close to a hundred million annual revenue,
00:39:56.760 | number one kind of product in kind of various categories.
00:40:00.280 | But it was pretty expensive,
00:40:02.040 | ended up being very seasonal where something like 85%
00:40:04.960 | of our volume was in Q4 because it was a present
00:40:09.000 | and it was expensive to market it
00:40:11.080 | and explain it and so forth.
00:40:13.280 | And even though the volume was like really sizable
00:40:16.120 | and like the reviews are really fantastic,
00:40:18.960 | forecasting and planning for it
00:40:21.040 | and managing the cash operations was just brutal.
00:40:24.040 | Like it was absolutely brutal.
00:40:25.840 | You don't think about this when you're starting a company
00:40:27.440 | or when you have a few million in revenue
00:40:30.320 | because it's just your biggest costs
00:40:32.520 | are kind of just your head count and operations
00:40:34.280 | and everything's ahead of you.
00:40:35.800 | But we got to a point where,
00:40:37.680 | if you look at the entire year,
00:40:41.280 | you have to operate your company,
00:40:43.760 | pay all the people and so forth.
00:40:45.360 | You have to pay for the manufacturing,
00:40:46.600 | the marketing and everything else
00:40:48.600 | to do your sales in mostly November, December
00:40:50.480 | and then get paid in December, January by retailers.
00:40:53.520 | And those swings were really rough
00:40:57.040 | and just made it like so difficult
00:40:59.160 | because the more successful we became,
00:41:00.440 | the more wild those swings became
00:41:02.520 | because you'd have to like spend,
00:41:04.920 | tens of millions of dollars on inventory,
00:41:06.760 | tens of millions of dollars on marketing
00:41:08.280 | and tens of millions of dollars on payroll
00:41:10.440 | and everything else.
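To make the working-capital swing described here concrete, the sketch below runs a toy monthly cash model with entirely hypothetical figures: payroll and operations all year, inventory and marketing paid ahead of Q4, and retailer payments arriving mostly in December and January. It only illustrates the shape of the problem, not Anki's actual numbers.

```python
# Back-of-the-envelope sketch of a seasonal hardware business's cash position.
# All figures are hypothetical; the point is the deep pre-holiday trough.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

payroll_and_ops = {m: -4.0 for m in MONTHS}                   # $M per month, year-round
inventory_build = {"Aug": -15.0, "Sep": -15.0, "Oct": -10.0}  # paid before the season
marketing = {"Oct": -10.0, "Nov": -15.0}                      # spent ahead of Q4 sales
retailer_payments = {"Jan": 40.0, "Dec": 60.0}                # collections lag the sales

cash = 20.0  # hypothetical starting cash, $M
for m in MONTHS:
    cash += (payroll_and_ops[m]
             + inventory_build.get(m, 0)
             + marketing.get(m, 0)
             + retailer_payments.get(m, 0))
    print(f"{m}: cash on hand = {cash:6.1f} $M")

# The minimum lands right before collections; negative stretches are where credit
# lines or bridge financing have to cover the gap, and scaling the season up only
# makes the same shape deeper -- the swing described above.
```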
00:41:11.440 | - And then the bigger the dip
00:41:12.720 | and then you're waiting for the Q4.
00:41:14.600 | - Yeah, and it's not a business that like is recurring
00:41:17.080 | kind of month to month and predictable.
00:41:18.400 | And it's just, and then you're locking in your forecast
00:41:20.560 | in July, maybe August, if you're lucky.
00:41:24.320 | And it's also like very hit driven and seasonal
00:41:28.080 | where like you don't have the sort of continued
00:41:30.440 | kind of slow growth like you do
00:41:31.840 | in some other consumer electronics industries.
00:41:34.200 | And so before then like hardware
00:41:36.160 | kind of like went out of favor too.
00:41:37.480 | And so you had Fitbit and GoPro drop
00:41:39.800 | from 10 billion revenue to 1 billion revenue
00:41:41.440 | and hardware companies are getting valued
00:41:42.720 | at like 1X revenue oftentimes, which is tough, right?
00:41:46.440 | And so we effectively kind of got caught in the middle
00:41:49.720 | where we were trying to quickly evolve out of entertainment
00:41:53.200 | and move into some other categories,
00:41:55.480 | but you can't let go of that business
00:41:56.920 | because like that's what you're valued on,
00:41:58.400 | that's what you're raising money on.
00:42:00.320 | But there was no path to kind of pure profitability
00:42:02.520 | just there because it was such a
00:42:04.880 | specific type of price point and so forth.
00:42:07.840 | And so we tried really hard to make that transition.
00:42:12.080 | And yeah, we had a financing round
00:42:14.760 | that fell apart at the last second.
00:42:16.040 | And effectively there was just no path
00:42:17.880 | to kind of get through that
00:42:19.800 | and get to the next kind of like holiday season.
00:42:22.080 | And so we ended up selling some of the assets
00:42:25.280 | and kind of winding down the company.
00:42:26.320 | It was brutal.
00:42:27.880 | Like I was very transparent with the company
00:42:30.280 | like in the team while we were going through it,
00:42:32.880 | where actually despite how challenging that period was,
00:42:35.880 | very few people left.
00:42:36.840 | I mean, like people love the vision, the team,
00:42:38.960 | the culture of the like kind of chemistry
00:42:40.880 | and kind of what we were doing.
00:42:41.720 | There was just a huge amount of pride there.
00:42:43.520 | And then we wanted to see it through.
00:42:44.640 | And we felt like we had a shot
00:42:46.440 | to kind of get through these checkpoints.
00:42:48.640 | We ended up, and I mean by brutal,
00:42:52.240 | I mean like literally days of cash runway,
00:42:54.040 | like three, four different times
00:42:56.720 | in the year, you know, kind of before it,
00:42:59.800 | where you're like playing games of chicken
00:43:02.640 | on negotiating credit line timelines
00:43:05.680 | and like repayment terms
00:43:08.400 | and how to get like a bridge loan from an investor.
00:43:10.400 | It's just like level of stress
00:43:12.080 | that like as hard as things might be anywhere else,
00:43:14.160 | like you'll never come close to that
00:43:16.360 | where you feel that like responsibility
00:43:18.120 | for 200 plus people, right?
00:43:20.960 | And so we were very transparent during our fundraise
00:43:23.160 | on who we're talking to, the challenges that we have,
00:43:26.400 | how it's going and when things are going well,
00:43:28.560 | when things were tough.
00:43:29.880 | And so it wasn't a complete shock when it happened,
00:43:32.520 | but it was just very emotional where like,
00:43:34.720 | you know, when we announced it finally,
00:43:37.160 | that like, you know, we basically were just like watching
00:43:40.880 | kind of like, you know, the runway
00:43:42.000 | and trying to kind of time it.
00:43:42.920 | And when we realized that like,
00:43:43.800 | we didn't have any more outs,
00:43:45.040 | we wanted to like kind of wind it down,
00:43:47.040 | make sure that it was like clean
00:43:48.640 | and, you know, we could like kind of take care of people
00:43:50.520 | the best we could, but I like broke down crying
00:43:53.160 | at the all-hands and somebody else had to step in for a bit.
00:43:56.080 | And like, it was just very, very emotional,
00:43:57.880 | but the beautiful part is like afterwards,
00:43:59.840 | like everybody stayed at the office
00:44:01.000 | till like two, three in the morning,
00:44:02.720 | just like drinking and hanging out
00:44:04.600 | and telling stories and celebrating.
00:44:06.040 | And it was just like one of the best,
00:44:08.600 | for many people it was like the best kind of work experience
00:44:11.160 | that they had and there was a lot of pride in what we did.
00:44:14.040 | And it wasn't anything obvious we could point to that like,
00:44:16.200 | hey, if only we had done that differently,
00:44:17.800 | things would have been completely different.
00:44:19.000 | It was just like the physics didn't line up.
00:44:23.080 | And, but the experience was pretty incredible,
00:44:27.760 | but it was hard.
00:44:28.600 | Like it was, it had this feeling
00:44:30.720 | that there was just like incredible beauty
00:44:32.400 | in both the technology and products and the team that,
00:44:36.360 | you know, there's a lot there that like in the,
00:44:41.040 | you know, right context could have been pretty incredible,
00:44:44.760 | but it was emotional just.
00:44:47.000 | - Yeah, just thinking, I mean,
00:44:48.480 | just looking at this company, like you said,
00:44:51.160 | the product and technology, but the vision,
00:44:54.240 | the implementation, you got the cost down very low.
00:45:01.720 | And the compelling nature of the product was great.
00:45:04.960 | So many robotics companies failed at this:
00:45:07.480 | the robot was too expensive.
00:45:07.480 | It didn't have the personality.
00:45:09.560 | It didn't really provide any value,
00:45:11.880 | like a sufficient value to justify the price.
00:45:14.280 | So like you succeeded where basically
00:45:17.320 | every single other robotics company, or most of them,
00:45:20.360 | that go in the category of social robotics
00:45:23.120 | have kind of failed.
00:45:25.440 | And I mean, it's quite tragic.
00:45:29.280 | I remember reading that,
00:45:31.120 | I'm not sure if I talked to you before that happened or not,
00:45:34.680 | but I remember, you know, I'm distant from this.
00:45:37.680 | I remember being heartbroken reading that
00:45:40.640 | because like if Cosmo is not gonna succeed,
00:45:45.640 | what is going to succeed?
00:45:48.960 | 'Cause that to me was incredible.
00:45:51.520 | Like it was an incredible idea.
00:45:54.880 | Cost is down, it's just like the most minimal design
00:45:59.880 | in physical form that you could do.
00:46:03.340 | It's really compelling, the balance of games.
00:46:06.320 | So it's a fun toy.
00:46:08.080 | It's a great gift for all kinds of age groups, right?
00:46:12.200 | It's just, it's compelling in every single way.
00:46:14.920 | And it seemed like it was a huge success
00:46:17.880 | and it failing was, I don't know,
00:46:22.200 | there was heartbreak on many levels for me
00:46:24.400 | just as an external observer is I was thinking,
00:46:28.240 | how hard is it to run a business?
00:46:31.280 | That's what I was thinking.
00:46:32.120 | Like if this failed, this must have failed
00:46:34.040 | because it's obviously not the product, it's the business.
00:46:39.040 | Maybe it's some aspect of the manufacturing and so on,
00:46:41.920 | but I'm now realizing it's also not just that,
00:46:44.000 | it's sales, marketing, all those.
00:46:47.400 | - It's everything, right?
00:46:48.240 | Like how do you explain something
00:46:49.400 | that's like a new category to people
00:46:51.040 | that like have all these predispositions?
00:46:52.400 | And so like, it had some of the hardest elements of,
00:46:57.400 | if you were to pick a business,
00:46:58.460 | it had some of the hardest customer dynamics
00:47:01.660 | because like to sell a $150 product,
00:47:04.120 | you got to convince both the child that wanted
00:47:06.760 | and the parents to agree that it's valuable.
00:47:09.120 | So you're having like this dual prong marketing challenge.
00:47:11.380 | You have manufacturing,
00:47:12.320 | you have like really high precision on the components
00:47:14.840 | that you need, you have the AI challenges.
00:47:16.140 | So there were a lot of tough elements,
00:47:17.920 | but it was this feeling where like,
00:47:19.320 | it was just really great alignment of unique strength
00:47:22.720 | across kind of like all these different areas,
00:47:24.600 | just incredible like, you know,
00:47:26.600 | kind of character and animation team between this,
00:47:28.520 | like Carlos and it was like a character director,
00:47:30.480 | Dave that came on board and like,
00:47:31.920 | really great people there, the AI side,
00:47:34.160 | the manufacturing, the, you know,
00:47:38.320 | where like never missing a launch, right?
00:47:41.080 | And actually, you know, we kind of hit that quality.
00:47:42.840 | It was, yeah, it was heartbreaking,
00:47:45.240 | but here's one neat thing is like,
00:47:48.140 | we had so much like fan mail from kind of kids,
00:47:51.220 | parents, like I actually like,
00:47:52.600 | there was a bunch that collected in the end
00:47:54.900 | that I actually saved and like, I never,
00:47:57.940 | it was too emotional to open it
00:47:59.060 | and I still haven't opened it.
00:48:00.780 | And so I actually have this giant envelope
00:48:02.280 | of like a stack this much of like letters from,
00:48:04.820 | you know, kids and families,
00:48:06.420 | just like every permutation you can imagine.
00:48:09.100 | And so planning to kind of, I don't know,
00:48:10.860 | maybe like a five year, you know,
00:48:12.300 | five year to some year reunion,
00:48:14.180 | just inviting everybody over
00:48:15.460 | and we'll just like kind of dig into it
00:48:17.240 | and kind of bring back some memories,
00:48:18.860 | but you know, good impact.
00:48:21.100 | - And well, I think there will be companies,
00:48:24.820 | maybe Waymo and Google will be somehow involved
00:48:28.220 | that will carry this flag forward and will make you proud,
00:48:32.980 | whether you're involved or not.
00:48:34.740 | I think this is one of the greatest robotics companies
00:48:37.540 | in the history of robotics.
00:48:39.660 | So you should be proud.
00:48:41.680 | It's still tragic to know that, you know,
00:48:45.240 | 'cause you read all the stories of Apple
00:48:47.520 | and let's see, SpaceX and like companies
00:48:52.520 | that were just on the verge of failure several times
00:48:56.600 | through that story.
00:48:57.440 | And they just, it's almost like a roll of the dice,
00:48:59.960 | they succeeded.
00:49:00.960 | And here's a roll of the dice that just happened to go the other way.
00:49:04.200 | - And that's the appreciation that like,
00:49:05.560 | when you really like talk to a lot of the founders,
00:49:08.280 | like everybody goes through those moments
00:49:09.980 | and sometimes it really is a matter of like, you know,
00:49:13.080 | timing, a little bit of luck,
00:49:14.560 | like some things are just out of your control
00:49:16.400 | and you get a much deeper appreciation
00:49:20.560 | for just the dimensionality of that challenge.
00:49:24.120 | But the great thing is, is that like a lot of the team
00:49:26.680 | actually like stayed together.
00:49:27.840 | And so there were actually, you know,
00:49:30.120 | a couple of companies that we kind of kept big chunks
00:49:32.760 | of the team together and we actually kind of helped align
00:49:34.760 | this, you know, to help people out as well.
00:49:38.020 | And one of them was Waymo where a majority of the AI
00:49:42.960 | and robotics team actually had the exact background
00:49:46.920 | that you would look for in like kind of AV space.
00:49:48.560 | And it was a space that a lot of us like, you know,
00:49:51.640 | worked on in grad school,
00:49:52.800 | we're always passionate about and ended up,
00:49:55.400 | you know, maybe the time, you know,
00:49:57.280 | serendipitous timings from another perspective
00:49:59.280 | where like kind of landed in a really unique circumstance
00:50:03.720 | that's actually been quite exciting too.
00:50:05.880 | - So it's interesting to ask you just your thoughts.
00:50:09.100 | Cosmo still lives on under Digital Dream Labs, I think.
00:50:13.660 | Is that, are you tracking the progress there
00:50:17.060 | or is it too much pain?
00:50:18.300 | Is that something that you're excited to see
00:50:22.540 | where that goes?
00:50:24.060 | - So keeping an eye on it, of course,
00:50:26.020 | just out of curiosity and obviously just kind of care
00:50:28.340 | for product line, I think it's deceptive how complex it is
00:50:31.580 | to manufacture and evolve that product line.
00:50:35.380 | And the amount of experiences that are required
00:50:40.340 | to complete the picture and be able to move that forward.
00:50:43.380 | And I think that's gonna make it pretty hard
00:50:45.500 | to do something really substantial with it.
00:50:48.540 | It would be cool if like even the product
00:50:50.080 | in the way it was, was able to be manufactured.
00:50:51.980 | - Yes. - Again, that would--
00:50:52.820 | - Which is the current goal, I suppose.
00:50:54.420 | - Yeah, which would be neat.
00:50:55.760 | But I think it's deceptive how tricky that is
00:50:59.800 | on like everything from the quality control, the details,
00:51:03.100 | and then like technology changes that force you to
00:51:07.060 | reinvent and update certain things.
00:51:09.620 | So I haven't been super close to it,
00:51:11.860 | but just kind of keeping an eye on it.
00:51:13.620 | - Yeah, it's really interesting how,
00:51:16.140 | it's deceptively difficult, just as you're saying.
00:51:18.340 | For example, those same folks, and I've spoken with them,
00:51:23.340 | they're, they partnered up with Rick and Morty creators
00:51:27.420 | to do the Butter Robot.
00:51:30.460 | - Yeah. - I love the idea.
00:51:31.780 | I just recently, I'd kind of half-assed
00:51:35.220 | watched Rick and Morty previously,
00:51:36.940 | but now I just watched like the first season.
00:51:39.280 | It's such a brilliant show.
00:51:41.180 | I like, I did not understand how brilliant that show is.
00:51:45.140 | And obviously I think in season one
00:51:47.060 | is where the Butter Robot comes along
00:51:48.980 | for just a few minutes or whatever.
00:51:51.900 | But I just fell in love with the Butter Robot.
00:51:53.860 | The sort of the, that particular character,
00:51:56.440 | just like you said, there's characters,
00:51:58.220 | you can create personalities, you can create,
00:52:00.100 | and that particular robot who's doing a particular task
00:52:05.100 | realizes, you know, this, like realizes,
00:52:10.020 | asks the existential question,
00:52:12.300 | the myth of Sisyphus question that Camus writes about.
00:52:15.940 | It's like, is this all there is?
00:52:17.780 | He just moves butter?
00:52:19.700 | But you know, that realization,
00:52:23.140 | that's a beautiful little realization for a robot
00:52:25.380 | that my purpose is very limited to this particular task.
00:52:29.580 | It's humor, of course, it's darkness,
00:52:32.140 | it's a beautiful mix.
00:52:33.260 | But so they wanna release that Butter Robot,
00:52:37.140 | but something tells me
00:52:38.780 | that to do the same depth of personality as Cosmo had,
00:52:44.420 | the same richness, it would be on the manufacturing,
00:52:47.940 | on the AI, on the storytelling, on the design,
00:52:51.460 | it's going to be very, very difficult.
00:52:53.300 | It could be a cool sort of toy for Rick and Morty fans,
00:52:58.580 | but to create the same depth of existential angst
00:53:03.580 | that the Butter Robot symbolizes is really,
00:53:08.500 | that's the brave effort you've succeeded at with Cosmo,
00:53:12.100 | but it's not easy, it's really difficult.
00:53:14.900 | - You can fail in almost any one of the kind of dimensions
00:53:17.540 | and like, and yeah, it takes, you know,
00:53:20.980 | yeah, you need convergence of a lot of different skill sets
00:53:23.460 | to try to pull that off, yeah.
00:53:25.860 | - On this topic, let me ask you for some advice,
00:53:28.700 | because as I've been watching Rick and Morty,
00:53:31.620 | as I told myself, I have to build the Butter Robot,
00:53:34.500 | just as a hobby project.
00:53:36.100 | And so I got a nice platform for it with treads
00:53:38.860 | and there's a camera that moves up and down and so on.
00:53:41.900 | I'll probably paint it, but the question I'd like to ask,
00:53:48.020 | there's obvious technical questions I'm fine with,
00:53:50.620 | communication, the personality, storytelling,
00:53:53.140 | all those kinds of things.
00:53:55.500 | I think I understand the process of that,
00:53:57.260 | but how do you know when you got it right?
00:54:02.260 | So with Cosmo, how did you know this is great?
00:54:06.780 | Like, or something is off, like,
00:54:09.940 | is this brainstorming with the team?
00:54:12.300 | Do you know it when you see it?
00:54:13.660 | Is it like love at first sight?
00:54:15.820 | It's like, this is right.
00:54:17.260 | Or like, I guess if we think of it as an optimization space,
00:54:22.020 | is there uncanny valley where you're like,
00:54:24.300 | this is not right, or this is right,
00:54:26.260 | or are a lot of characters right?
00:54:28.060 | - Yeah, we stayed away from uncanny valley
00:54:30.340 | just by having such a different mapping
00:54:33.420 | where it didn't try to look like a dog or a human
00:54:36.180 | or anything like that.
00:54:37.220 | And so you avoided having like a weird pseudo similarity,
00:54:42.220 | but not quite hitting the mark.
00:54:44.180 | But you could like just fall flat
00:54:45.540 | where just like a personality or a character emotion
00:54:48.300 | just didn't feel right.
00:54:49.260 | And so it actually mirrored very closely
00:54:51.060 | to kind of the iterations that a character director
00:54:52.740 | at Pixar would have, where you're running through it
00:54:56.060 | and you can virtually kind of like see what it'll look like.
00:54:59.740 | We created a plugin to where,
00:55:01.940 | we actually used like Maya, the simulation,
00:55:04.220 | you know, the animation tools.
00:55:05.420 | And then we created a plugin that perfectly matched it
00:55:09.420 | to the physical one.
00:55:10.260 | And so you could like test it out virtually
00:55:11.900 | and then push a button and see it physically play out.
00:55:14.740 | And there's like subtle differences.
00:55:15.900 | And so you want to like make sure
00:55:17.020 | that that feedback loop is super easy
00:55:19.100 | to be able to test it live.
00:55:21.140 | And then sometimes like you would just feel it
00:55:24.660 | that it's right and intuitively know.
00:55:26.740 | And then you'd also do, we did user testing,
00:55:29.220 | but it was very, very often that like the,
00:55:33.060 | like if we found it magical,
00:55:34.900 | it would scale and be magical more broadly.
00:55:37.420 | There were not too many cases where like,
00:55:39.900 | like we were pretty decent about not, like,
00:55:42.860 | you know, geeking out or getting too attached
00:55:44.620 | to something that was super unique to us,
00:55:47.100 | but trying to kind of like, you know,
00:55:49.140 | put a customer hat on and does it truly kind of feel magical?
00:55:52.060 | And so in a lot of ways,
00:55:53.380 | we just give a lot of autonomy to the character team
00:55:57.420 | to really think about the, you know,
00:55:59.900 | character board and mood boards and storyboards
00:56:02.060 | and like what's the background of this character
00:56:04.020 | and how would they react?
00:56:05.340 | And they went through a process
00:56:07.140 | that's actually pretty familiar,
00:56:08.260 | but now had to operate under these unique constraints.
00:56:11.220 | But the moment where it felt right
00:56:13.380 | kind of took a fairly similar journey than like
00:56:17.060 | as a character in an animated film, actually.
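The preview-then-deploy loop described above can be sketched roughly as follows. The keyframe format, class names, and transport callback are hypothetical placeholders, not Anki's actual Maya plugin interface; the point is only the tight loop from virtual preview to physical playback.

```python
# Rough sketch: author an animation clip in a tool like Maya, preview it virtually,
# then push the same keyframes to the physical robot so differences show up at once.
from dataclasses import dataclass
from typing import List

@dataclass
class Keyframe:
    t: float              # seconds from clip start
    head_angle: float     # degrees (hypothetical joint)
    lift_height: float    # millimeters (hypothetical joint)
    face_expression: str  # e.g. "happy_squint" -- hypothetical expression id

def preview_in_sim(clip: List[Keyframe]) -> None:
    """Stand-in for rendering the clip in the animation tool's viewport."""
    for kf in clip:
        print(f"[sim] t={kf.t:.2f}s head={kf.head_angle} lift={kf.lift_height} face={kf.face_expression}")

def play_on_robot(clip: List[Keyframe], send) -> None:
    """Stream the same keyframes to the physical robot via a caller-supplied transport."""
    for kf in clip:
        send({"t": kf.t, "head": kf.head_angle, "lift": kf.lift_height, "face": kf.face_expression})

# One clip, two targets: the tight loop lets animators spot where the physical
# version reads differently from the virtual one.
clip = [Keyframe(0.0, 0, 0, "neutral"),
        Keyframe(0.4, 20, 30, "happy_squint"),
        Keyframe(0.8, 5, 0, "neutral")]
preview_in_sim(clip)
play_on_robot(clip, send=lambda msg: print("[robot]", msg))
```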
00:56:18.660 | - That's quite cool.
00:56:19.500 | - Well, the thing that's really important to me,
00:56:21.460 | and I wonder if it's possible,
00:56:23.140 | well, I hope it's possible, pretty sure it's possible,
00:56:25.340 | is for me, even though I know how it works,
00:56:29.620 | to make sure there's sufficient randomness in the process,
00:56:33.740 | probably because it would be machine learning based,
00:56:37.900 | that I'm surprised that I don't,
00:56:40.340 | I'm surprised by certain reactions,
00:56:42.380 | I'm surprised by certain communication.
00:56:44.180 | Maybe that's in a form of a question,
00:56:47.700 | were you surprised by certain things Cosmo did,
00:56:50.300 | like certain interactions?
00:56:52.340 | - Yeah, we made it intentionally,
00:56:54.380 | so that there would be some surprise
00:56:57.260 | and like a decent amount of variability
00:56:59.540 | in how he'd respond in certain circumstances.
00:57:02.220 | And so in the end, like it's,
00:57:04.060 | this isn't general AI,
00:57:06.340 | this is a giant like spectrum and library
00:57:10.380 | of like parameterized kind of emotional responses
00:57:12.700 | and an emotional engine that would like kind of map
00:57:15.340 | your current state of the game, your emotions, the world,
00:57:19.060 | the people playing with you, and so forth,
00:57:20.820 | to what's happening.
00:57:22.500 | But we could make it feel spontaneous
00:57:24.260 | by creating enough diversity and randomness,
00:57:28.900 | but still within the bounds of what felt like very realistic
00:57:32.740 | to make that work.
00:57:33.580 | And then what was really neat is that we could get statistics
00:57:35.620 | on how much of that space we were saturating,
00:57:38.380 | and then add more animations and more diversity
00:57:40.260 | in the places that would get hit more often,
00:57:42.060 | so that you stay ahead of the curve
00:57:45.020 | and maximize the chance that it stays feeling alive.
00:57:49.060 | And so, but then when you like combine it,
00:57:51.060 | like the permutations and kind of like the combinations
00:57:55.100 | of emotions stitched together,
00:57:56.420 | sometimes surprised us because you see them in isolation,
00:57:59.340 | but when you actually see them and you see them live,
00:58:01.900 | relative to some event that happened in the game
00:58:03.820 | or whatnot, like it was kind of cool
00:58:05.460 | to see the combination of the two.
00:58:06.700 | And not too different than other robotics applications
00:58:10.060 | where like you get so used to thinking about like
00:58:12.780 | the modules of a system and how things progress
00:58:14.940 | through a tech stack,
00:58:16.220 | that the real magic is when all the pieces come together
00:58:19.100 | and you start getting the right emergent behavior
00:58:22.900 | in a way that's easy to lose
00:58:23.980 | when you just kind of go too deep into any one piece of it.
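A minimal sketch of the idea described here: a library of parameterized emotional responses, an engine that maps the current context to a response category, randomized selection within that category so reactions feel spontaneous, and usage counters showing which parts of the space get hit most and need more animations. All names and numbers are hypothetical, not Anki's actual emotion engine.

```python
import random
from collections import Counter
from dataclasses import dataclass

@dataclass
class Animation:
    name: str         # e.g. "win_big_01" -- names are hypothetical
    intensity: float  # 0.0 (subtle) .. 1.0 (exaggerated)

# Hypothetical response library, grouped by (event, mood) categories.
LIBRARY = {
    ("won_game", "excited"):  [Animation("win_big_01", 0.9), Animation("win_big_02", 1.0)],
    ("won_game", "calm"):     [Animation("win_small_01", 0.4)],
    ("lost_game", "excited"): [Animation("frustrated_01", 0.8), Animation("pout_01", 0.6)],
    ("lost_game", "calm"):    [Animation("sigh_01", 0.3)],
}

class EmotionEngine:
    def __init__(self, library):
        self.library = library
        self.usage = Counter()  # how often each category gets hit

    def react(self, event: str, mood: str) -> Animation:
        """Pick a response for the current context, with randomness within the category."""
        key = (event, mood)
        choices = self.library.get(key) or self.library[(event, "calm")]
        self.usage[key] += 1
        return random.choice(choices)

    def saturation_report(self):
        """Categories hit most often are where adding more animations pays off."""
        return self.usage.most_common()

# Example: the same event can produce different reactions on different runs,
# which is what keeps the character feeling spontaneous but still in bounds.
engine = EmotionEngine(LIBRARY)
for _ in range(3):
    print(engine.react("won_game", "excited").name)
print(engine.saturation_report())
```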
00:58:26.020 | - Yeah, when the system is sufficiently complex,
00:58:27.900 | there is something like emergent behavior
00:58:29.580 | and that's where the magic is.
00:58:30.780 | As a human being, you could still appreciate the beauty
00:58:32.780 | of that magic at the system level.
00:58:35.860 | First of all, thank you for humoring me on this.
00:58:38.140 | It's really, really fascinating.
00:58:40.500 | I think a lot of people would love this.
00:58:41.740 | I'd love to just, one last thing on the butter robot,
00:58:44.460 | I promise. (laughs)
00:58:46.220 | In terms of speech,
00:58:48.260 | Cosmo is able to communicate so much
00:58:53.340 | with just movement and face.
00:58:56.180 | Do you think speech is too much of a degree of freedom?
00:59:01.180 | Like is speech a feature or a bug of deep interaction?
00:59:08.100 | Or emotional interaction?
00:59:09.860 | - Yeah, for a product, it's too deep right now.
00:59:13.460 | It's just not real.
00:59:15.060 | You would immediately break the fiction
00:59:16.460 | because the state of the art is just not good enough.
00:59:19.340 | And that's on top of just narrowing down the demographic
00:59:23.340 | where like the way you speak to an adult
00:59:24.740 | versus the way you speak to a child is very different.
00:59:27.480 | Yet a dog is able to appeal to everybody.
00:59:30.500 | And so right now there is no speech system
00:59:34.060 | that is like rich enough and subtly realistic enough
00:59:38.940 | to feel appropriate.
00:59:40.380 | And so we very, very quickly kind of like
00:59:42.460 | moved away from it.
00:59:43.300 | Now, speech understanding is a different matter
00:59:45.780 | where understanding intent, that's a really valuable input.
00:59:49.300 | But giving it back requires like a way, way higher bar
00:59:55.300 | given kind of where today's world is.
00:59:57.620 | And so that realization that you can do surprisingly much
01:00:01.580 | with either no speech or kind of tonal
01:00:04.620 | like the way, you know, WALL-E, R2-D2,
01:00:06.660 | and kind of other characters are able to.
01:00:08.700 | It's quite powerful and it generalizes across cultures
01:00:12.940 | and across ages really, really well.
01:00:15.100 | I think we're gonna be in that world for a little while
01:00:17.940 | where it's still very much an unsolved problem
01:00:20.380 | on how to like make something.
01:00:22.340 | It touches on the Uncanny Valley thing.
01:00:23.780 | So if you have legs and you're a big humanoid looking thing,
01:00:26.740 | you have very different expectations
01:00:28.140 | and a much narrower degree of what's gonna be acceptable
01:00:30.540 | by society than if you're a robot like Cosmo or WALL-E
01:00:35.540 | and you can, or some other form where you can kind of like
01:00:38.020 | reinvent the character.
01:00:39.860 | Speech has that same property where speech
01:00:42.020 | is so well understood in terms of expectations by humans
01:00:45.900 | that you have far less flexibility on how to deviate
01:00:48.660 | from that and lean into your strengths and avoid weaknesses.
01:00:52.020 | - But I wonder if there is, obviously there's certain kinds
01:00:55.220 | of speech that activates the Uncanny Valley
01:00:59.580 | and breaks the illusion faster.
01:01:01.780 | So I guess my intuition is we will solve certain,
01:01:06.780 | we would be able to create some speech-based personalities
01:01:11.620 | sooner than others.
01:01:13.340 | So for example, I could think of a robot
01:01:16.940 | that doesn't know English and is learning English, right?
01:01:21.180 | Those kinds of personalities.
01:01:22.020 | - A fiction where you're like, you're intentionally
01:01:24.700 | kind of like getting a toddler level of speech.
01:01:27.660 | So that's exactly right.
01:01:28.500 | So you can, like, tie it into the experience
01:01:32.380 | where it is a more limited character
01:01:34.380 | or you embrace the lack of emotions as part,
01:01:36.580 | or the lack of, sorry, dynamic range in the speech
01:01:39.900 | kind of capabilities and emotions, as like part
01:01:41.660 | of the character itself.
01:01:42.500 | And you've seen that in like kind of fictional characters
01:01:44.660 | as well, but-
01:01:46.500 | - That's why this podcast works.
01:01:48.300 | - And yeah, and like, and you kind of had that with like,
01:01:51.980 | I don't know, I guess like Data and some of the others,
01:01:53.820 | you know, what's it like. - Yeah, exactly.
01:01:55.460 | - But yeah, so you have to, and that becomes a constraint
01:01:58.780 | that lets you meet the bar.
01:02:01.380 | - See, I honestly think like also if you add drunk
01:02:05.980 | and angry, that gives you more constraints
01:02:12.220 | that allow you to be dumber from an NLP perspective.
01:02:15.980 | Like there's certain aspects.
01:02:17.260 | So if you modify human behavior, like,
01:02:19.580 | so forget the sort of artificial thing
01:02:22.260 | where you don't know English, toddler thing.
01:02:25.300 | We, if you just look at the full range of humans,
01:02:28.980 | I think we, there's certain situations where we put up
01:02:33.980 | with like lower level of intelligence in our communication.
01:02:39.940 | Like if somebody is drunk, we understand the situation,
01:02:42.260 | that they're probably under the influence,
01:02:43.780 | like we understand that they're not going
01:02:45.860 | to be making any sense.
01:02:47.180 | Anger is another one like that.
01:02:48.540 | I'm sure there's a lot of other kind of situations.
01:02:51.860 | - Yeah.
01:02:52.700 | - So yeah, again, language, loss in translation,
01:02:55.580 | that kind of stuff that I think if you play with that,
01:03:00.340 | what is it, the Ukrainian boy that passed the Turing test,
01:03:03.820 | you know, play with those ideas.
01:03:05.220 | I think that's really interesting.
01:03:06.180 | And then you can create compelling characters,
01:03:08.380 | but you're right, that's a dangerous sort of road to walk
01:03:10.980 | because you're adding degrees of freedom
01:03:13.300 | that can get you in trouble.
01:03:14.260 | - Yeah, and that's why like you have these big pushes
01:03:17.140 | that like for most of the last decade plus,
01:03:19.860 | like where you'd have like full like human replicas
01:03:24.020 | of robots, where they'd be down to like skin
01:03:25.940 | and like kind of in some places.
01:03:27.540 | My personal feeling is like, man,
01:03:32.060 | like that's not the direction
01:03:34.860 | that's most fruitful right now.
01:03:37.180 | - Beautiful art.
01:03:38.020 | - Yeah.
01:03:38.860 | - It's not in terms of a rich, deep, fulfilling experience.
01:03:43.060 | Yeah, you're right.
01:03:43.900 | - Yeah, and creating a minefield of potential places
01:03:47.500 | to feel off and then you're sidestepping
01:03:51.380 | where like the biggest kind of functional AI challenges
01:03:53.820 | are to actually have, you know,
01:03:55.460 | kind of like really rich productivity
01:03:56.900 | that actually kind of justifies, you know,
01:03:58.780 | kind of the higher price points.
01:04:00.340 | And that's part of the challenges is like, yeah,
01:04:01.980 | like robots are gonna get to like thousands of dollars,
01:04:04.700 | tens of thousands of dollars and so forth,
01:04:06.140 | but you can imagine what sort of expectation of value
01:04:08.420 | that comes with it.
01:04:10.020 | And so that's where you wanna be able to invest
01:04:12.740 | the time and depth.
01:04:15.740 | And so going down the full human replica route
01:04:19.700 | creates a gigantic distraction
01:04:21.980 | and really, really high bar that can end up
01:04:28.380 | sucking up so much of your resources.
01:04:30.860 | - So it's weird to say, but you happen to be
01:04:34.300 | one of the greatest roboticists ever at this point
01:04:37.540 | because you created this little guy,
01:04:41.540 | you were part obviously of a great team
01:04:43.620 | that created the little guy with a deep personality
01:04:47.260 | and are now switching to an entirely,
01:04:51.620 | well, maybe not entirely, but a different,
01:04:53.980 | fascinating, impactful robotics problem,
01:04:58.020 | which is autonomous driving and more specifically,
01:05:00.860 | the biggest version of autonomous driving,
01:05:02.740 | which is autonomous trucking.
01:05:04.780 | So you are at Waymo now.
01:05:07.300 | Can you give us a big picture overview?
01:05:10.500 | What is Waymo?
01:05:12.020 | What is Waymo driver?
01:05:13.540 | What is Waymo One?
01:05:14.900 | What is Waymo Via?
01:05:17.100 | Can you give an overview of the company
01:05:19.060 | and the vision behind the company?
01:05:20.540 | - For sure.
01:05:21.380 | Waymo, by the way, it's just,
01:05:23.180 | it's been eyeopening on just how incredible
01:05:24.700 | that the people and the talent is
01:05:26.460 | and how in one company you almost have to create,
01:05:29.900 | I don't know, 30 companies worth of like
01:05:31.740 | technology and capability to like kind of
01:05:33.300 | solve the full spectrum of it.
01:05:34.700 | So yeah, so I've been at Waymo since 2019,
01:05:39.220 | so about two and a half years.
01:05:40.660 | So Waymo is focused on building what we call a driver,
01:05:45.580 | which is creating the ability to have autonomous driving
01:05:49.500 | across different environments, vehicle platforms,
01:05:52.780 | domains and use cases.
01:05:54.180 | You know, as you know, it got started in 2009.
01:05:58.220 | It was a lot, almost like an immediate successor
01:06:00.700 | to the Grand Challenge and Urban Challenges
01:06:02.420 | that were like incredible kind of catalysts
01:06:05.900 | for this whole space.
01:06:07.140 | And so Google started this project
01:06:09.340 | and then eventually Waymo spun out
01:06:10.580 | and so what Waymo is doing is creating the systems,
01:06:15.380 | both hardware, software, infrastructure,
01:06:18.180 | everything that goes into it to enable
01:06:20.340 | and to commercialize autonomous driving.
01:06:22.500 | This hits on consumer transportation and ride sharing
01:06:25.180 | and kind of vehicles and urban environments.
01:06:28.620 | And as you mentioned, it hits on autonomous trucking
01:06:31.940 | to transport goods.
01:06:34.060 | So in a lot of ways, it's transporting people
01:06:35.540 | and transporting goods, but at the end of the day,
01:06:38.380 | the underlying capabilities that are required to do that
01:06:40.860 | are surprisingly better aligned than one might expect,
01:06:45.260 | where it's the fundamentals of being able to understand
01:06:48.700 | the world around you, process it,
01:06:50.140 | make intelligent decisions and prove that we are
01:06:53.140 | at a level of safety that enables large-scale autonomy.
01:06:56.620 | - So from a branding perspective,
01:06:58.860 | sort of Waymo driver is the system
01:07:01.940 | that's irrespective of a particular vehicle
01:07:06.780 | it's operating in.
01:07:07.860 | You have a set of sensors that perceive the world,
01:07:10.540 | can act in that world and move this,
01:07:13.300 | whatever the vehicle is through the world.
01:07:15.580 | - That's right.
01:07:16.420 | And so in the same way that you have a driver's license
01:07:17.620 | and like your ability to drive isn't tied
01:07:19.540 | to a particular make and model of a car.
01:07:21.580 | And of course there's special licenses
01:07:22.980 | for other types of vehicles,
01:07:23.980 | but the fundamentals of a human driver
01:07:27.140 | very, very largely carry over.
01:07:28.460 | And then there's uniquenesses related
01:07:29.940 | to a particular environment or domain
01:07:31.860 | or a particular vehicle type
01:07:34.180 | that kind of add some extra additive challenges.
01:07:37.340 | But that's exactly right.
01:07:38.580 | It's the underlying systems that enable a physical vehicle
01:07:43.580 | without a human driver to very successfully
01:07:47.420 | accomplish the tasks that previously weren't possible
01:07:51.420 | without a hundred percent human driving.
01:07:54.740 | - And then there's Waymo One,
01:07:57.140 | which is the transporting people from a brand perspective.
01:08:01.420 | And just in case we refer to it so people know,
01:08:04.180 | and then there's Waymo Via,
01:08:05.660 | which is the trucking component.
01:08:07.700 | Why Via, by the way?
01:08:08.740 | What is that?
01:08:09.580 | What is that?
01:08:10.420 | Is it just like a cool sounding name that just,
01:08:13.380 | like, is there an interesting story there?
01:08:16.940 | Just, it is a pretty cool sounding name.
01:08:18.620 | - It's a cool sounding name.
01:08:19.460 | I mean, when you think about it, it's just like,
01:08:21.140 | well, we're gonna transport it via this and that.
01:08:24.180 | So it's just kind of like an allusion
01:08:25.460 | to the mechanics of transporting something.
01:08:29.100 | And it is a pretty good grouping.
01:08:31.060 | And the interesting thing is that even the groupings
01:08:32.660 | kind of blur where Waymo One is like human transportation
01:08:35.660 | and there's a fully autonomous service in the Phoenix area
01:08:38.860 | that like every day is transporting people.
01:08:40.940 | And it's pretty incredible to like, just, you know,
01:08:43.380 | see that operate at reasonably large scale
01:08:45.500 | and just kind of happen.
01:08:46.580 | And then on the Via side, it doesn't even have to be,
01:08:50.380 | like long haul trucking is a, like a major focus of ours,
01:08:54.900 | but down the road, you can stitch together
01:08:56.980 | the vehicle transportation as well for local delivery.
01:09:00.620 | Also, a lot of the requirements for local delivery
01:09:03.100 | overlap very heavily with consumer transportation.
01:09:06.660 | Obviously, you know, given that you're operating
01:09:09.060 | on a lot of the same roads
01:09:10.660 | and navigating the same safety challenges.
01:09:14.300 | And so, yeah, and Waymo very much is a, you know,
01:09:18.940 | multi-product company that has ambitions in both.
01:09:23.020 | They have different challenges
01:09:24.020 | and both are tremendous opportunities.
01:09:26.260 | But the cool thing is, is that there's a huge amount
01:09:29.220 | of leverage and this kind of core technology stack
01:09:31.620 | now gets pushed on by both sides.
01:09:34.180 | And that adds its own unique challenges,
01:09:36.540 | but the success case is that the challenges that you push on,
01:09:41.140 | they get leveraged across all platforms and all domains.
01:09:44.100 | - From an engineer perspective, the teams are integrated.
01:09:47.140 | - It's a mix.
01:09:47.980 | So there's a huge amount of centralized kind of core teams
01:09:50.820 | that support all applications.
01:09:52.260 | And so you think of something like the hardware team
01:09:54.380 | that develops the lasers, the compute,
01:09:56.020 | integrates into vehicle platforms.
01:09:57.780 | This is an experience that carries over across, you know,
01:10:00.660 | any application that we'd have,
01:10:01.780 | and they have their hands full with both.
01:10:03.580 | Then there's like really unique perception challenges,
01:10:06.820 | planning challenges, like other types of challenges
01:10:10.060 | where there's a huge amount of leverage on a core tech stack,
01:10:12.540 | but then there's like dedicated teams that think of
01:10:14.820 | how do you deal with a unique challenge?
01:10:16.060 | For example, an articulated trailer with varying loads
01:10:19.860 | that completely changes the physical dynamics of a vehicle
01:10:22.620 | that doesn't exist on a car,
01:10:24.220 | but it becomes one of the most important
01:10:26.220 | kind of unique new challenges on a truck.
01:10:28.380 | - So what's the long-term dream of Waymo
01:10:33.180 | via the autonomous trucking effort that Waymo's doing?
01:10:37.260 | - Yeah, so we're starting with developing
01:10:39.460 | L4 autonomy for Class 8 trucks.
01:10:43.420 | These are 53 foot trailers that capture like a big,
01:10:47.060 | a pretty sizable percentage of the goods transportation
01:10:49.140 | in the country.
01:10:49.980 | Long-term, the opportunity is obviously to expand
01:10:53.220 | to much more diverse types of vehicles,
01:10:56.260 | types of goods transportation,
01:10:58.500 | and start to really expand in both the volume
01:11:00.740 | and the route feasibility that's possible.
01:11:03.300 | And so just like we did on the car side,
01:11:05.580 | you start with a single route
01:11:08.180 | with a very specific operating kind of domain
01:11:11.260 | and constraints that allow you to solve the problem.
01:11:14.060 | But then over time, you start to really try to push
01:11:17.940 | against those boundaries and open up deeper feasibility
01:11:21.140 | across routes, across surface streets,
01:11:23.540 | across environmental conditions,
01:11:25.540 | across the type of goods that you carry,
01:11:27.140 | the versatility of those goods,
01:11:28.580 | and how little supervision is necessary
01:11:31.340 | to just start to scale this network.
01:11:33.500 | And long-term, there's actually,
01:11:35.940 | it's a pretty incredible enabler where,
01:11:37.940 | today you have already a giant shortage of truck drivers.
01:11:42.620 | It's over 80,000 truck driver shortage
01:11:45.180 | that's expected to grow to hundreds of thousands
01:11:47.500 | in the years ahead.
01:11:48.700 | You have really, really quickly increasing demand
01:11:52.140 | from e-commerce and just distribution
01:11:54.540 | of where people are located.
01:11:56.700 | You have one of the deepest safety challenges
01:12:00.540 | of any profession in the US
01:12:03.340 | where there's a huge, huge, huge kind of challenge
01:12:07.900 | around fatigue and around kind of the long routes
01:12:10.580 | that are driven.
01:12:11.500 | And even beyond kind of the cost and necessity of it,
01:12:14.820 | there are fundamental constraints built
01:12:16.140 | into our logistics network that are tied
01:12:18.340 | to the type of human constraints
01:12:21.820 | and regulatory constraints that are tied to trucking today.
01:12:24.900 | For example, there are limits on how long a driver
01:12:27.500 | can be driving in a single day
01:12:30.340 | before they're not allowed to drive anymore,
01:12:32.500 | which is a very important safety constraint.
01:12:35.380 | What that does is it enforces limitations
01:12:37.500 | on how far jumps with a single driver could be
01:12:40.540 | and makes you very subject to availability of drivers,
01:12:43.860 | which influences where warehouses are built,
01:12:45.900 | which influences how goods are transported,
01:12:47.620 | which influences costs.
01:12:48.980 | And so you start to have an opportunity on everything
01:12:53.020 | from plugging into existing fleets and brokerages
01:12:56.380 | and the existing logistics network
01:12:58.260 | and just immediately start to have a huge opportunity
01:13:01.740 | to add value from cost and driving fuel insurance
01:13:06.740 | and safety standpoint,
01:13:09.220 | all the way to completely reinventing the logistics network
01:13:12.820 | across the United States
01:13:14.140 | and enabling something completely different
01:13:15.540 | than what it looks like today.
01:13:16.500 | - Yeah, I had, it'll be published before this,
01:13:18.900 | had a great conversation with Steve Vigeli,
01:13:20.780 | who we talked about the manual driving.
01:13:23.340 | He echoed many of the same things
01:13:24.700 | that you were talking about,
01:13:26.020 | but we talked about much of the fascinating human stories
01:13:30.180 | of truck drivers.
01:13:31.460 | He was also was a truck driver for a bit as a grad student
01:13:35.580 | to try to understand the depth of the problem.
01:13:37.460 | He's a fascinating- - Fascinating. Yeah, it's,
01:13:39.140 | we have some drivers that have 4 million miles
01:13:40.980 | of lifetime driving experience.
01:13:42.380 | It's pretty incredible.
01:13:43.620 | And yeah, it's, you know, learning from them,
01:13:47.540 | like some of them are on the road for 300 days a year.
01:13:49.740 | It's a very unique type of lifestyle.
01:13:51.700 | - So there's fascinating stuff there.
01:13:53.140 | Just like you said, there's a shortage of actually people,
01:13:56.780 | truck drivers taking the job counter to what is,
01:14:00.340 | I think is publicly believed.
01:14:01.900 | So there's an excess of jobs
01:14:06.060 | and a shortage of people to take up those jobs.
01:14:08.540 | And just like you said, it's such a difficult problem.
01:14:12.220 | And these are experts at driving,
01:14:14.140 | at solving this particular problem.
01:14:16.060 | And it's fascinating to learn from them,
01:14:17.660 | to understand, you know, how hard is this problem?
01:14:20.860 | And that's the question I wanna ask you from a perception,
01:14:23.740 | from a robotics perspective.
01:14:25.700 | What's your sense of how difficult is autonomous trucking?
01:14:29.600 | Maybe you can comment on which scenarios
01:14:32.420 | are super difficult, which are more manageable.
01:14:35.180 | Is there a way to kind of convert into words
01:14:39.100 | how difficult the problem is?
01:14:40.820 | - Yeah, it's a good question.
01:14:42.380 | So there's, and as you can expect, it's a mix.
01:14:45.620 | Some things become a lot easier,
01:14:50.620 | or at least more flexible.
01:14:52.700 | Some things are harder.
01:14:53.860 | And so, you know, on the things that are like,
01:14:56.900 | the tailwinds, the benefits.
01:14:58.300 | A big focus of automating trucking,
01:15:01.820 | especially initially, is really focusing
01:15:03.620 | on the long haul freeway stretch of it,
01:15:05.780 | where that's where a majority of the value is captured.
01:15:07.980 | On a freeway, you have a lot more structure
01:15:09.820 | and a lot more consistency across freeways across the US,
01:15:14.100 | compared to surface streets,
01:15:15.340 | where you have a way higher dimensionality
01:15:18.540 | of what can happen, lack of structure,
01:15:20.620 | lack of consistency and variability across cities.
01:15:23.160 | So you can leverage that consistency to tackle,
01:15:27.140 | at least in that respect, a more constrained AI problem,
01:15:30.660 | which has some benefits to it.
01:15:32.720 | You can itemize much more of the sort of things
01:15:34.420 | you might encounter and so forth.
01:15:35.460 | And so those are benefits.
01:15:37.060 | - Is there a canonical freeway and city
01:15:40.620 | we should be thinking about?
01:15:41.700 | Like, is there a standard thing
01:15:44.460 | that's brought up in conversation often?
01:15:46.060 | Like, here's a stretch of road.
01:15:48.080 | What is it?
01:15:50.780 | Like when people talk about traveling across country,
01:15:52.740 | they'll talk about New York, San Francisco.
01:15:57.700 | Is that the route?
01:15:58.820 | Like, is there a stretch of road that's like nice and clean?
01:16:03.300 | And then there's like cities with difficulties in them
01:16:06.260 | that you kind of think of as the canonical problem
01:16:08.380 | to solve here.
01:16:09.340 | - Right.
01:16:10.300 | So starting with the car side,
01:16:11.800 | Waymo very intentionally picked the Phoenix area
01:16:16.140 | and the San Francisco area as a follow-up
01:16:18.420 | once we hit driverless,
01:16:19.400 | where when you think of consumer transportation
01:16:22.260 | and ride sharing kind of economy,
01:16:25.000 | a big percentage of that market is captured
01:16:26.820 | in the densest cities in the United States.
01:16:28.540 | And so really pushing at and solving San Francisco
01:16:31.300 | becomes a really huge opportunity and importance.
01:16:34.420 | And places one dot on kind of like the spectrum
01:16:38.420 | of like kind of complexity.
01:16:40.280 | The Phoenix area, starting with Chandler
01:16:42.220 | and then like kind of expanding more broadly
01:16:43.500 | in the Phoenix metropolitan area,
01:16:45.820 | it's I believe the fastest growing city in the US.
01:16:48.500 | It's kind of a larger, medium-sized city,
01:16:51.500 | but growing quickly and still captures
01:16:53.940 | a really wide range of kind of like complexities.
01:16:56.460 | And so getting to driverless there
01:16:58.380 | actually exposes you to a lot of the building blocks
01:17:00.180 | you need for the more complicated environments.
01:17:03.040 | And so in a lot of ways, there's a thesis
01:17:05.320 | that if you start to kind of place a few of these
01:17:07.300 | kind of dots where San Francisco has these types
01:17:09.900 | of unique challenges, dense pedestrians,
01:17:11.540 | all this like complexity,
01:17:12.980 | especially when you get into the downtown areas
01:17:14.660 | and so forth, and Phoenix has like a really interesting
01:17:17.560 | kind of spectrum of challenges,
01:17:18.880 | maybe other ones like LA kind of add freeway focus
01:17:22.220 | and so forth.
01:17:23.180 | You start to kind of cover the full set of features
01:17:25.820 | that you might expect, and it becomes faster and faster
01:17:28.860 | if you have the right systems and the right organization
01:17:31.460 | to then open up the fifth city and the 10th city
01:17:33.660 | and the 20th city.
01:17:34.880 | On trucking, there's similar properties
01:17:37.380 | where obviously there's uniquenesses in freeways
01:17:40.500 | when you get into really dense environments,
01:17:42.120 | and then the real opportunity to then get even more value
01:17:47.120 | is to think about how you expand
01:17:48.780 | with like some of the surface challenges.
01:17:50.260 | But for example, right now, we're looking,
01:17:52.920 | we have a big facility that we're finishing building
01:17:55.660 | in Q1 in Dallas area that'll allow us to do testing
01:17:59.820 | from the Dallas area on routes like Dallas to Houston,
01:18:02.380 | Dallas to Phoenix, going out East and-
01:18:05.660 | - Dallas to Austin.
01:18:07.020 | - Austin, so that triangle-
01:18:08.740 | - Waymo should come to Austin.
01:18:10.460 | - Well, Waymo, the car side was in Austin for a while.
01:18:13.860 | - Yes, I know, come back.
01:18:15.420 | - Yeah, but trucking is actually,
01:18:17.180 | Texas is one of the best places to start
01:18:19.260 | because of both volume, regulatory,
01:18:20.940 | where there's a lot of benefits.
01:18:23.300 | On trucking, a huge opportunity is Port of LA going East.
01:18:27.100 | So in a lot of ways, a lot of the work
01:18:29.820 | is to start to stitch together a network
01:18:32.060 | and converge to Port of LA
01:18:34.180 | where you have the biggest port in the United States.
01:18:37.540 | And the amount of goods going East from there
01:18:39.620 | is pretty tremendous.
01:18:40.940 | And then obviously there's kind of channels everywhere,
01:18:44.380 | and then you have extra complexities
01:18:45.660 | as you get into like snow and inclement weather and so forth.
01:18:48.340 | But what's interesting about trucking
01:18:50.620 | is every single route segment that you add
01:18:52.840 | increases the value of the whole network.
01:18:54.260 | And so it has this kind of network effect
01:18:56.220 | and cumulative effect that's very unique.
01:18:57.700 | And so there's all these dimensions that we think about.
01:19:00.020 | And so in a lot of ways,
01:19:01.220 | Dallas has a really unique hub
01:19:02.980 | that opens up a lot of options
01:19:04.100 | and has become a really valuable lever.
01:19:05.940 | - So the million questions I could ask you,
01:19:07.740 | first of all, you mentioned level four.
01:19:10.260 | For people who totally don't know,
01:19:13.540 | there's these levels of automation
01:19:16.300 | that level four refers to kind of the first step
01:19:20.980 | that you could recognize as fully autonomous driving.
01:19:24.300 | Level five is really fully autonomous driving.
01:19:27.260 | And level four is kind of fully autonomous driving.
01:19:30.020 | And then there are specific definitions
01:19:32.420 | depending on who you ask what that actually means.
01:19:34.580 | But for you, what does the level four mean?
01:19:37.980 | And you mentioned freeway.
01:19:40.140 | Let's say like there's three parts of long haul trucking.
01:19:43.060 | Maybe I'm wrong in this, but there's freeway driving,
01:19:46.740 | there's like truck stop,
01:19:49.780 | and then there's more urban-y type of area.
01:19:53.900 | So which of those do you want to tackle?
01:19:57.580 | Which of them do you include under level four?
01:20:00.180 | Like, how do you think about this problem?
01:20:01.660 | What do you focus on?
01:20:03.060 | Where's the biggest impact to be had in the short term?
01:20:05.780 | - So the goal is to,
01:20:07.420 | we got to get to market as fast as we can,
01:20:09.060 | because the moment you get to market,
01:20:10.340 | you just learn so much
01:20:11.460 | and it influences everything that you do.
01:20:13.500 | And it is,
01:20:14.340 | I mean, it's one of the experiences
01:20:16.660 | I carried over from before,
01:20:18.220 | is that you add constraints,
01:20:20.220 | you figure out the right compromises,
01:20:21.660 | you do whatever it takes,
01:20:22.660 | because getting to market is so critical.
01:20:25.740 | - But here, with autonomous driving,
01:20:27.220 | you can get to market in so many different ways.
01:20:28.860 | - That's right.
01:20:29.700 | And so one of the simplifications
01:20:32.020 | that we intentionally have put on
01:20:33.300 | is using what we call transfer hubs,
01:20:35.340 | where you can imagine depots
01:20:38.300 | that are at the entry points to metropolitan areas,
01:20:42.180 | like let's say Dallas,
01:20:43.420 | like the hub that we're building,
01:20:44.780 | which does a few things that are very valuable.
01:20:47.620 | So from a first product standpoint,
01:20:49.860 | you can automate transfer hub to transfer hub,
01:20:52.380 | and that path from the transfer hub
01:20:54.380 | to the full freeway route
01:20:57.500 | can be a very intentional single route
01:21:00.100 | that you can select for the features
01:21:01.940 | that you feel you wanna handle at that point in time.
01:21:04.220 | - Now you build a hub specifically designed
01:21:07.220 | for autonomous trucking.
01:21:08.500 | - And that's what's gonna happen actually,
01:21:09.580 | like you need to come out in January and check it out,
01:21:11.980 | 'cause it's gonna be really cool.
01:21:12.900 | It's not only is it our main operating headquarters
01:21:16.980 | for our fleet there,
01:21:18.700 | but it will be the first fully ground up
01:21:21.380 | designed driverless hub for autonomous trucks
01:21:24.580 | in terms of where do they enter, where do they depart?
01:21:27.180 | How do you think about the flow of people, goods, everything?
01:21:29.260 | It's quite cool,
01:21:30.540 | and it's really beautiful on how it's thought through.
01:21:32.740 | And so early on,
01:21:34.260 | it is totally reasonable to do the last five miles manually
01:21:38.580 | to get to the final kind of depot
01:21:40.340 | to avoid having to solve the general surface street problem,
01:21:42.540 | which is obviously very complex.
01:21:43.940 | Now, when the time comes,
01:21:46.020 | and we are increasingly,
01:21:48.100 | well, already we're pushing on some of this,
01:21:49.460 | but we will increasingly be pushing
01:21:50.580 | on surface street capabilities
01:21:52.140 | to build out the value chain
01:21:53.780 | to go all the way depot to depot
01:21:55.300 | instead of transfer hub to transfer hub.
01:21:57.180 | And we have probably the best advantages in the world
01:21:59.380 | because of all the Waymo experience on surface streets,
01:22:01.860 | but that's not the highest ROI right now
01:22:03.740 | where the highest ROI is--
01:22:04.580 | - Hub to hub.
01:22:05.420 | - Hub to hub and get the routes going.
01:22:06.940 | And so when you ask what's L4,
01:22:09.460 | L4 can be applied to any domain, operating domain or scope,
01:22:13.220 | but it's effectively for the places where we say
01:22:15.100 | we're ready for autonomous operation.
01:22:17.620 | We are 100% operating through as a self-driving truck
01:22:22.620 | with no human behind the wheel.
01:22:27.420 | That is L4 autonomy.
01:22:28.540 | And it doesn't mean that you operate in every condition.
01:22:30.740 | It doesn't mean you operate on every road,
01:22:32.620 | but for a particularly well-defined area,
01:22:36.700 | operating conditions, routes, kind of domain,
01:22:39.540 | you are fully autonomous.
01:22:40.500 | And that's the difference between L4 and L5.
01:22:42.180 | And most people would agree that
01:22:43.620 | at least anytime in the foreseeable future,
01:22:45.180 | L5 is just not even really worth thinking about
01:22:47.540 | because there's always gonna be these extremes.
01:22:50.220 | And so it's a race and almost like a game
01:22:53.540 | where you think of what is the sequence
01:22:55.860 | of expanded capabilities that create the most value
01:22:59.140 | and teach us the most and create this feedback loop
01:23:02.100 | where we're building out and unlocking
01:23:03.740 | more and more capability over time.
01:23:05.820 | - I gotta ask you, just curious.
01:23:07.660 | So first of all, I have to, when I'm allowed,
01:23:10.740 | visit the Dallas facility 'cause it's super cool.
01:23:13.180 | It's like robots on the giving and the receiving end.
01:23:17.340 | The truck is a robot and the hub is a robot.
01:23:20.700 | - Yeah, it's gotta be very robot friendly.
01:23:22.140 | - Yeah, that's great.
01:23:24.260 | I will feel at home.
01:23:25.460 | What's the sensor suite like on the hub
01:23:28.780 | if you can just high level mention it?
01:23:31.020 | Is, does the hub have like lidars and like,
01:23:34.540 | is the truck doing most of the intelligence
01:23:38.060 | or is the hub also intelligent?
01:23:39.980 | - Yeah, so most of it will be the truck
01:23:42.020 | and everything is like connected.
01:23:43.820 | Like, so we have our servers where we know exactly
01:23:47.300 | where every truck is.
01:23:48.260 | We know exactly what's happening at a hub.
01:23:50.220 | And so you can imagine like a large backend system
01:23:52.620 | that over time starts to manage timings, goods,
01:23:56.300 | delivery windows, all these sorts of things.
01:23:58.500 | And so you don't actually need to,
01:24:02.980 | there might be special cases where that is valuable
01:24:04.740 | to equip some sensors in the hub,
01:24:06.660 | but a majority of the intelligence is gonna be on the truck
01:24:08.780 | because whatever's relevant to the truck
01:24:12.100 | should be seen by the truck
01:24:13.940 | and can be relayed remotely for any sort of
01:24:17.420 | kind of cognizance or decision-making.
01:24:19.220 | But there's a distinct type of workflow where,
01:24:22.900 | where do you check trucks?
01:24:24.060 | Where do you want them to enter?
01:24:25.060 | What if there's many operating at once?
01:24:26.780 | Where's the staging area to depart?
01:24:28.740 | How do you set up the flow of humans
01:24:31.020 | and human cars and traffic
01:24:33.660 | so that you minimize the interaction between humans
01:24:36.100 | and kind of self-driving trucks?
01:24:38.580 | And then how do you even intelligently select
01:24:40.300 | the locations of these transfer hubs
01:24:42.300 | that are both really great service locations
01:24:44.700 | for a metropolitan area?
01:24:45.740 | And there could be over time,
01:24:47.180 | many of them for a metropolitan area,
01:24:49.580 | while at the same time,
01:24:50.580 | leaning into the path of least resistance
01:24:53.940 | to lean into your current capabilities and strengths
01:24:56.100 | so that you minimize the amount of work that's necessary
01:24:59.180 | to unlock the next kind of big bar.
01:25:01.140 | - I have a million questions.
01:25:02.380 | So first, is the goal to have no human in the truck?
01:25:06.180 | - The goal is to have no human in the truck.
01:25:08.140 | Now, of course, right now we're testing
01:25:09.780 | with expert operators and so forth,
01:25:11.780 | but the goal is to...
01:25:14.260 | Now, there might be circumstances
01:25:15.620 | where it makes sense to have a human
01:25:16.980 | and obviously these trucks can also be manually driven.
01:25:20.500 | So sometimes like we talk with our fleet partners
01:25:23.220 | about how you can buy a Waymo-equipped
01:25:27.460 | Daimler truck down the road
01:25:29.060 | and on the routes that are autonomous, it's autonomous.
01:25:31.620 | On the routes that are not, it's human driven.
01:25:34.340 | Maybe there's L2 functionality
01:25:35.740 | that adds safety systems and so forth.
01:25:37.740 | But as soon as they become,
01:25:39.540 | as soon as we expand in software,
01:25:41.260 | the availability of driverless routes,
01:25:43.300 | the hardware is forward compatible
01:25:44.660 | to just now start using them in real time.
01:25:47.780 | And so you can imagine this mixed use,
01:25:51.220 | but at the end of the day,
01:25:52.340 | the largest value proposition is where you're able
01:25:55.500 | to have no constraints on how you can operate this truck.
01:25:58.580 | And it's 100% autonomous with nobody inside.
01:26:01.500 | - That's amazing.
01:26:02.340 | So let me ask on a logistics front,
01:26:05.340 | 'cause you mentioned that also opportunity to revamp
01:26:09.020 | or for built from scratch,
01:26:10.260 | some of the ideas around logistics.
01:26:12.820 | I don't wanna throw too much shade,
01:26:14.500 | but from talking to Steve,
01:26:15.660 | my understanding is logistics is not perhaps as great
01:26:19.500 | as it could be in the current trucking environment.
01:26:23.660 | I'm not, maybe you can break down why,
01:26:25.820 | but there's probably competing companies.
01:26:28.460 | There's just a mess.
01:26:29.420 | Maybe some of it is literally just it's old school.
01:26:32.660 | Like it's just like, it's not computerized.
01:26:36.740 | Like truckers are almost like contractors.
01:26:39.620 | There's an independence and there's not a nice interface
01:26:42.980 | where they can communicate where they're going,
01:26:44.540 | where they're at, all those kinds of things.
01:26:46.940 | And so it just feels like there's so much opportunity
01:26:49.700 | to digitize everything to where you could optimize
01:26:53.100 | the use of human time,
01:26:54.700 | optimize the use of all kinds of resources.
01:26:57.380 | How much are you thinking about that problem?
01:26:59.820 | How fascinating is that problem?
01:27:02.540 | How difficult does it,
01:27:03.940 | how much opportunity is there to revolutionize
01:27:06.220 | the space of logistics in autonomous trucking,
01:27:08.580 | in trucking period?
01:27:09.820 | - It's pretty fascinating.
01:27:10.660 | It's, this is one of the most motivating aspects
01:27:13.060 | of all this where like, yes,
01:27:14.740 | there's like a mountain of problems that are like,
01:27:16.580 | you want to, you have to solve to get to like
01:27:18.100 | the first checkpoints and first driverless and so forth.
01:27:20.740 | And inevitably like in a space like this,
01:27:22.500 | you plug in initially into the existing kind of system
01:27:25.900 | and start to kind of learn and iterate.
01:27:27.620 | But that opportunity is massive.
01:27:29.700 | And so a couple of the factors that play into it.
01:27:32.060 | So first of all,
01:27:34.220 | there's obviously just the physical constraints
01:27:36.300 | of driving time, driver availability.
01:27:38.860 | Some fleets have a 95% attrition rate,
01:27:41.020 | you know, right now because of just this demands
01:27:44.620 | and like, you know,
01:27:45.900 | kind of gaps in competition and so forth.
01:27:48.020 | And then it's also incredibly fragmented
01:27:49.580 | where you would be shocked at like,
01:27:52.220 | when you look at industries,
01:27:53.420 | like, and you think of the top 10 players,
01:27:55.220 | like the biggest fleets,
01:27:56.060 | like the Walmarts and FedEx's and so forth,
01:27:58.340 | the percentage of the overall trucking market
01:28:00.700 | that's captured by the top 10 or 50 fleets
01:28:02.620 | is surprisingly small.
01:28:04.740 | The average kind of truck operation
01:28:07.380 | is like a one to five truck, you know, family business.
01:28:11.340 | And so there's just like a huge amount of like fragmentation,
01:28:15.140 | which makes for really interesting challenges
01:28:18.300 | in kind of stitching together through like bulletin boards
01:28:21.700 | and brokerages and some people run their own fleets.
01:28:24.340 | And this world's kind of like evolving,
01:28:27.460 | but it is one of the less digitized and optimized worlds
01:28:32.860 | that there is.
01:28:33.700 | And the part that is optimized
01:28:36.020 | is optimized to the constraints of today.
01:28:38.620 | And even within the constraints of today,
01:28:40.420 | this is a $900 billion industry in the US
01:28:42.900 | and it's continuing to grow.
01:28:44.620 | - It feels like from a business perspective,
01:28:47.060 | if I were to predict that whilst trying to solve
01:28:50.420 | the autonomous trucking problem,
01:28:51.940 | Waymo might solve first the logistics problem.
01:28:55.500 | Like, 'cause that would already be a huge impact.
01:28:59.220 | So on the way to solving autonomous trucking,
01:29:02.020 | the human driven, like there's so much opportunity
01:29:05.260 | to significantly improve the human driven trucking,
01:29:10.260 | the timing, the logistics.
01:29:12.340 | So you use humans optimally.
01:29:13.860 | - The handoffs, the like, you know.
01:29:15.660 | Well, even the, I mean, you get really ambitious,
01:29:18.380 | you start to expand this beyond like,
01:29:19.820 | how does the fulfillment center work?
01:29:22.140 | And like, how does the transfer hub work?
01:29:23.700 | How does the warehouse work to,
01:29:25.700 | I mean, there's a lot of opportunities
01:29:26.940 | to start to automate these chains.
01:29:28.460 | And a lot of the inefficiency today is because like,
01:29:31.980 | you have a delay, like Port of LA has a bunch of ships
01:29:35.740 | right now waiting outside of it because they can't dock
01:29:37.980 | because there's not enough labor inside of the Port of LA.
01:29:41.540 | That means there's a big backlog of trucks,
01:29:43.100 | which means there's a big backlog of deliveries,
01:29:44.940 | which means the drivers aren't where they need to be.
01:29:46.660 | And so you have this like huge chain reaction
01:29:49.100 | and your feasibility of readjusting in this network is low
01:29:52.580 | because everything's tied to humans
01:29:54.580 | and manual kind of processes or distributed processes
01:29:58.660 | across a whole bunch of players.
01:30:00.220 | And so one of the biggest enablers is,
01:30:03.540 | yes, we have to solve autonomous trucking first.
01:30:05.460 | And that, by the way, that's not like an overnight thing.
01:30:07.380 | That's decades of continued kind of expansion and work.
01:30:10.940 | But the first checkpoint in the first route is like,
01:30:14.260 | is not that far off.
01:30:16.100 | But once you start enabling and you start to learn
01:30:17.940 | about how the constraints of autonomous trucking,
01:30:22.220 | which are very, very different
01:30:23.700 | than the constraints of human trucking,
01:30:25.020 | and again, strengths and weaknesses,
01:30:27.540 | how do you then start to leverage that
01:30:30.460 | and rethink a flow of goods more broadly?
01:30:34.740 | And this is where like the learnings
01:30:36.340 | of like really partnering
01:30:37.580 | with some of the largest fleets in the US
01:30:40.420 | and the sort of learnings that they have about the industry
01:30:43.300 | and the sort of needs that they have
01:30:44.500 | and what would change if you just like
01:30:47.460 | really broke this one constraint
01:30:49.060 | that like holds up the whole network?
01:30:50.820 | Or what if you enabled this other constraint?
01:30:53.060 | That actually drives the roadmap in a lot of ways
01:30:54.860 | because this is not like an all or nothing problem.
01:30:57.820 | It's, you know, you start to kind of unlock
01:30:59.780 | more and more functionality over time,
01:31:02.060 | which functionality most enables this optimization
01:31:05.340 | ends up being kind of part of the discussion.
01:31:07.020 | But you're totally right.
01:31:08.220 | Like you fast forward to like, you know,
01:31:10.340 | five years, 10 years, 15 years,
01:31:12.980 | and you think about like very generalized capability
01:31:17.500 | of automation and logistics,
01:31:19.820 | as well as the ability to like poke
01:31:21.340 | into how those handoffs work.
01:31:23.420 | The efficiency goes far beyond just direct cost
01:31:26.180 | of today's like unit economics of a truck.
01:31:28.460 | They go towards reinventing the entire system
01:31:31.260 | in the same way that, you know,
01:31:33.180 | you see, you know, these other industries that,
01:31:35.820 | like when you get to enough scale,
01:31:36.860 | you can really rethink how you build
01:31:39.500 | around your new set of capabilities,
01:31:41.020 | not the old set of capabilities.
01:31:43.140 | - Yeah, use the analogy metaphor or whatever
01:31:45.740 | that autonomous trucking is like email versus mail.
01:31:48.700 | And then with email, you're still doing the communication,
01:31:51.140 | but it opens up all kinds of,
01:31:53.900 | varieties of communication that you didn't anticipate.
01:31:57.420 | - That's right, constraints are just completely different.
01:31:59.540 | And yeah, there's definitely a property of that here.
01:32:02.900 | And we're also still learning about it
01:32:04.780 | because there is a lot of really fascinating
01:32:08.260 | and sometimes really elegant things
01:32:09.500 | that the industry has done where there's companies
01:32:11.300 | whose entire existence is around,
01:32:13.220 | despite the constraints,
01:32:14.140 | optimizing as much as they can out of it.
01:32:16.220 | And those lessons do carry over,
01:32:18.100 | but it's an interesting kind of merger of worlds
01:32:20.580 | to think about like, well, what if
01:32:22.740 | this was completely different?
01:32:23.980 | How would we approach it?
01:32:25.660 | And the interesting thing is that
01:32:28.100 | for a really, really, really long time,
01:32:30.260 | it's actually gonna be the merger
01:32:31.460 | between how to use autonomy and how to use humans
01:32:33.940 | that leans into each of their strengths.
01:32:36.380 | - Yeah, and then we're back to Cosmo,
01:32:38.980 | human robot interaction.
01:32:40.460 | So, and the interesting thing about Waymo
01:32:42.060 | is because there's the passenger vehicle,
01:32:43.700 | the human, the transportation of humans
01:32:46.180 | and transportation of goods,
01:32:48.140 | you could see over time,
01:32:49.420 | they may kind of meld together more
01:32:52.900 | because you'll probably have like
01:32:54.900 | zero occupancy vehicles moving around.
01:32:56.860 | So you have transportation of goods for short distances
01:32:59.700 | and then for slightly longer distances
01:33:02.540 | and then slightly longer,
01:33:03.580 | and then there'll be this,
01:33:04.860 | then you just see the difference
01:33:06.020 | between a passenger vehicle and a truck is just size
01:33:09.540 | and you can have different sizes
01:33:10.740 | and all that kind of stuff.
01:33:11.900 | And at the core, you can have a Waymo driver
01:33:13.580 | that doesn't, as long as you have the same sensor suite,
01:33:15.980 | you can just think of it as one problem.
01:33:17.580 | - And that's why over time,
01:33:18.500 | these do kind of converge where in a lot of ways,
01:33:21.420 | a lot of the challenges we're solving are freeway driving,
01:33:23.940 | which are going to carry over very well
01:33:25.460 | to the vehicles, to the car side.
01:33:27.340 | But there are like then unique challenges,
01:33:30.820 | like you have a very different dynamics in your vehicle
01:33:33.740 | where you have to see much further out
01:33:35.620 | in order to have the proper like response time
01:33:37.900 | because you have an 80,000 pound fully loaded truck.
01:33:41.140 | That's a very, very different type of braking profile
01:33:43.380 | than a car.
01:33:44.740 | You have a really interesting kind of dynamic limits
01:33:49.100 | because of the trailer where you actually,
01:33:51.220 | it's very, very hard to like physically like flip a car
01:33:54.060 | or do something like physically,
01:33:55.740 | like most risk in a car is from just collisions.
01:33:59.060 | It's very hard to like in any normal operation
01:34:01.380 | to do something other than like,
01:34:02.980 | unless you hit something to actually kind of like
01:34:04.740 | roll over something.
01:34:05.740 | On a truck, you actually have to drive much closer
01:34:08.340 | to the physical bounds of the safety limits,
01:34:11.300 | but you actually have like real constraints
01:34:13.660 | because you could have a really interesting interactions
01:34:18.420 | between the cabin and the trailer.
01:34:20.100 | There's something called jackknifing.
01:34:21.340 | If you turn too quickly, you have roll risks and so forth.
01:34:25.420 | And so we spent a huge amount of time
01:34:26.660 | understanding those boundaries
01:34:28.100 | and those boundaries change based on the load that you have,
01:34:30.780 | which is also an interesting difference.
01:34:32.500 | And you have to propagate through that,
01:34:34.060 | through the algorithm
01:34:34.900 | so that you're leveraging your dynamic range,
01:34:38.020 | but always staying within a safety balance,
01:34:39.820 | but understanding what those safety bounds are.
01:34:41.220 | And so we have this like really cool test facility
01:34:43.620 | where we like take it to the max and actually,
01:34:46.500 | imagine a truck with these giant training wheels
01:34:48.660 | on the back of the trailer
01:34:50.020 | and you're pushing it past the safety limits
01:34:53.220 | in order to like try to actually see where it rolls.
01:34:55.860 | And so you define this high dimensional boundary,
01:34:59.020 | which then gets captured in software to stay safe
01:35:01.100 | and actually do the right thing.
01:35:01.980 | But it's kind of fascinating,
01:35:03.660 | the sort of kind of challenges you have there.
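As a rough illustration of the load-dependent dynamic limits described above, here is a minimal sketch using a simple static rollover model, where the lateral acceleration at which wheel lift begins scales with track width over twice the center-of-gravity height. The margin, the dimensions, and the idea of converting that limit into a curve speed cap are illustrative assumptions, not Waymo's actual safety model.

```python
import math

G = 9.81  # m/s^2

def rollover_accel_limit(track_width_m, cg_height_m, margin=0.7):
    """Static rollover threshold: lateral acceleration at which wheel lift
    begins is roughly g * track / (2 * cg_height). A safety margin keeps the
    planner well inside that bound. The CG height rises with a tall or heavy
    load, which is why the limit changes with what the trailer is carrying."""
    return margin * G * track_width_m / (2.0 * cg_height_m)

def max_speed_for_curve(radius_m, track_width_m, cg_height_m):
    """Largest speed such that v^2 / R stays under the lateral-accel limit."""
    a_lat = rollover_accel_limit(track_width_m, cg_height_m)
    return math.sqrt(a_lat * radius_m)

# Same curve, two loads: a low-CG empty trailer vs. a tall, fully loaded one.
# The allowed speed drops with the loaded trailer (all numbers illustrative).
print(max_speed_for_curve(radius_m=300, track_width_m=2.4, cg_height_m=1.2))
print(max_speed_for_curve(radius_m=300, track_width_m=2.4, cg_height_m=2.2))
```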
01:35:06.540 | But then all of these things
01:35:07.580 | drive really interesting challenges from perception
01:35:09.500 | to unique behavior prediction challenges.
01:35:12.500 | And obviously in Planner where you have to think about
01:35:15.940 | merging and creating gaps with a 53 foot trailer
01:35:19.300 | and so forth.
01:35:20.140 | And then obviously the platform itself is very different.
01:35:22.060 | We have different numbers of sensors,
01:35:23.940 | sometimes types of sensors,
01:35:25.180 | and you also have unique blind spots that you have
01:35:27.020 | because of the trailer, which you have to think about.
01:35:28.700 | And so it's a really interesting spectrum.
01:35:30.740 | And in the end,
01:35:32.300 | you try to capture these special cases in a way
01:35:35.700 | that is cleanly augmentations of the existing tech stack,
01:35:39.660 | because a majority of what we're solving
01:35:42.300 | is actually generalizable to freeway driving
01:35:45.420 | and different platforms.
01:35:46.980 | And over time, they all start to kind of merge ideally
01:35:50.780 | where the things that are unique
01:35:52.100 | are as minimal as possible.
01:35:54.780 | And that's where you get the most leverage.
01:35:56.220 | And that's why Waymo can do,
01:35:58.700 | take on $2 trillion opportunities
01:36:01.220 | and have it be nowhere near 2X the cost or investment or size.
01:36:05.500 | In fact, it's much, much smaller than that
01:36:07.580 | because of the high degree of leverage.
01:36:10.420 | - So what kind of sensor suite they can speak to
01:36:13.460 | that a long haul truck needs to have?
01:36:16.900 | LiDAR, vision, how many, what are we talking about here?
01:36:21.260 | - Yeah, so it's more than the car.
01:36:23.140 | So very loose, you can think of as like 2X,
01:36:25.580 | but it varies depending on the sensor.
01:36:27.740 | And so we have like dozens of cameras, radar,
01:36:30.940 | and then multiple LiDAR as well.
01:36:33.140 | You'll see one difference where the cars
01:36:35.500 | have a central main sensor pod on the roof in the middle,
01:36:38.620 | and then some kind of hood sensors for blind spots.
01:36:42.060 | The truck moves to two main sensor pods on the outsides
01:36:45.100 | where you would typically have the mirrors
01:36:46.900 | next to the driver.
01:36:47.940 | They effectively go as far out as possible,
01:36:51.540 | kind of up to the- - Up front.
01:36:54.500 | - Kind of on the cabin, not all the way in the front,
01:36:56.900 | but like kind of where the mirrors for the driver would be.
01:36:59.500 | And so those are the main sensor pods.
01:37:01.020 | And the reason they're there
01:37:02.140 | is because if you had one in the middle,
01:37:04.060 | the trailer's higher than the cabin
01:37:05.580 | and you would be occluded with this like awkward wedge.
01:37:08.060 | - Too much occlusion. - Too much occlusion.
01:37:09.620 | And so then you would add a lot of complexity
01:37:11.140 | to the software to make up for that
01:37:12.980 | and just unnecessary complexity.
01:37:14.900 | - There's so many probably fascinating design choices.
01:37:17.660 | - Really cool.
01:37:18.500 | - 'Cause you can probably bring up a LiDAR higher
01:37:20.140 | and have it in the center or something.
01:37:21.300 | You could have all kinds of choices
01:37:23.260 | to make the decisions here
01:37:25.020 | that ultimately probably will define the industry.
01:37:27.860 | - Right, but by having two on the side,
01:37:29.420 | there's actually multiple benefits.
01:37:30.500 | So one is like you're just beyond the trailer,
01:37:34.140 | so you can see fully flush with the trailer.
01:37:36.220 | And so you eliminate most of your blind spot
01:37:37.980 | except for right behind the trailer,
01:37:39.820 | which is great because now the software
01:37:41.980 | carries over really well.
01:37:43.340 | And the same perception system you use on the car side,
01:37:45.780 | largely that architecture can carry over
01:37:47.940 | and you can retrain some models and so forth,
01:37:49.980 | but you leverage it a lot.
01:37:51.220 | It also actually helps with redundancy
01:37:52.780 | where there's a really nice built-in redundancy
01:37:55.980 | for all the LiDAR cameras and radar
01:37:57.620 | where you can afford to have any one of them fail
01:38:00.380 | and you're still okay.
01:38:01.740 | And at scale, every one of them will fail.
01:38:04.860 | And you will be able to detect when one of them fails
01:38:07.260 | because they don't, because the redundancy,
01:38:10.300 | they're giving you the data that's inconsistent
01:38:12.820 | with the rest of the-
01:38:13.660 | - That's right.
01:38:14.500 | And it's not just like they no longer give data.
01:38:15.780 | It could be like they're fouled
01:38:17.020 | or they stop giving data
01:38:18.740 | where some electrical thing gets cut
01:38:21.100 | or part of your compute goes down.
01:38:23.740 | So what's neat is that like you have way more sensors,
01:38:25.620 | part of it is field of view and occlusions,
01:38:27.620 | part of it's redundancy,
01:38:28.580 | and then part of it is new use cases.
01:38:30.020 | So there's new types of sensors
01:38:33.740 | to optimize for long range
01:38:35.180 | and kind of the sensing horizon
01:38:38.060 | that we look for on our vehicles
01:38:40.580 | that is unique to trucks
01:38:42.220 | because it actually is like kind of much
01:38:44.060 | like further out than a car.
01:38:47.020 | But a majority are actually used
01:38:48.300 | across both cars and trucks.
01:38:49.420 | And so we use the same compute,
01:38:50.540 | the same fundamental baseline sensors,
01:38:53.260 | cameras, radar, IMUs.
01:38:57.060 | And so you get a great leverage
01:38:58.860 | from all of the infrastructure
01:39:00.180 | and the hardware development as a result.
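A toy sketch of the cross-checking idea behind catching a failed or fouled sensor, as described above: each modality reports what it sees in a shared region, and a sensor whose reports persistently disagree with the consensus gets flagged. The voting scheme, field names, and threshold are invented for illustration.

```python
from statistics import median

def flag_degraded_sensors(reports, tolerance=2):
    """reports: dict of sensor name -> object count seen in a shared region
    this frame. A sensor that disagrees with the consensus (median) by more
    than `tolerance` is flagged. A fouled or dead sensor keeps producing data
    that is inconsistent with the rest, which is how it gets caught."""
    consensus = median(reports.values())
    return [name for name, count in reports.items()
            if abs(count - consensus) > tolerance]

frame = {"lidar_left": 7, "lidar_right": 8, "front_camera": 7, "radar": 1}
print(flag_degraded_sensors(frame))  # ['radar'] -> schedule a health check
```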
01:39:01.620 | - So what about cameras?
01:39:03.220 | What role does,
01:39:04.060 | so LIDAR is this rich set of information
01:39:06.980 | that has its strengths,
01:39:09.060 | has some weaknesses.
01:39:10.620 | Camera is this rich source of information
01:39:13.740 | that has some strengths,
01:39:14.780 | has its weaknesses.
01:39:16.220 | What role does LIDAR play?
01:39:17.860 | What role does vision cameras play
01:39:22.460 | in this beautiful problem of autonomous trucking?
01:39:25.980 | - It is beautiful.
01:39:26.820 | There's like so much that comes together.
01:39:28.820 | - And how much,
01:39:30.100 | at which point do they come together?
01:39:31.980 | - Yeah.
01:39:32.820 | So I'll start with LIDAR.
01:39:33.940 | So LiDAR has been like,
01:39:36.820 | one of Waymo's big strengths and advantages
01:39:38.540 | where we developed our own LIDAR in-house
01:39:42.260 | where many generations in,
01:39:43.940 | both in cost and functionality,
01:39:45.500 | it is the best in the space.
01:39:49.660 | - Which generation?
01:39:50.740 | 'Cause I know there's this cool,
01:39:53.580 | I love versions that are increasing.
01:39:55.940 | Which version of the hardware stack is it currently?
01:39:59.540 | Officially, publicly?
01:40:01.660 | - So some parts iterate more than others.
01:40:04.340 | I'm trying to remember on the sensor side.
01:40:05.780 | So the entire self-driving system,
01:40:07.740 | which includes sensors and compute is fifth generation.
01:40:11.100 | - I can't wait until there's like iPhone style
01:40:14.220 | like announcements for like new versions
01:40:17.100 | of the Waymo hardware stack.
01:40:19.100 | - Well, we try to be careful
01:40:19.940 | 'cause man, when you change the hardware,
01:40:21.140 | it takes a lot to like retrain the models and everything.
01:40:24.180 | So we just went through that
01:40:25.100 | and going from the Pacificas to the Jaguars.
01:40:27.380 | And so the Jaguars and the trucks
01:40:29.220 | have the same generation now.
01:40:31.580 | But yeah, the LiDAR is, it's incredible.
01:40:33.820 | And so Waymo has leaned into that as a strength.
01:40:36.780 | And so a lot of the near range perception system
01:40:39.180 | that obviously kind of carries over a lot from the car side
01:40:43.460 | uses LiDAR as a very prominent kind of like primary sensor,
01:40:46.620 | but then obviously everything has its strengths
01:40:48.900 | and weaknesses.
01:40:49.740 | And so in the near range, LiDAR is a gigantic advantage
01:40:53.820 | and it has its weaknesses on,
01:40:56.220 | when it comes to occlusions in certain areas,
01:40:58.780 | rain and weather, like things like that.
01:41:01.300 | But it's an incredible sensor
01:41:02.620 | and it gives you incredible density,
01:41:04.820 | perfect location, precision and consistency,
01:41:07.980 | which is a very valuable property
01:41:10.060 | to be able to kind of apply ML approach.
01:41:13.500 | - Can you elaborate consistency?
01:41:15.220 | - Yeah, when you have a camera,
01:41:17.060 | the position of the sun, the time of the day,
01:41:20.020 | various of the properties can have a big impact,
01:41:23.620 | whether there's glare, the field of view, things like that.
01:41:26.940 | - So consistent.
01:41:28.300 | - The signal.
01:41:29.140 | - With in the face of a changing external environment,
01:41:33.500 | the signal.
01:41:34.340 | - Yeah, daytime, nighttime,
01:41:36.180 | it's about 3D physical existence.
01:41:39.900 | In effect, like you're seeing beams of light
01:41:42.460 | that physically bounce off of something and come back.
01:41:44.900 | And so whatever the conditional conditions are,
01:41:48.220 | like the shape of a human sensor reading from a human
01:41:52.140 | or from a car or from an animal,
01:41:54.060 | like you have a reliability there,
01:41:56.660 | which ends up being valuable
01:41:57.780 | for kind of like the long tail of challenges.
01:41:59.740 | Now, LiDAR is the first sensor to drop off
01:42:02.140 | in terms of range and ours has a really good range,
01:42:03.820 | but at the end of the day, it drops off.
01:42:05.820 | And so particularly for trucks,
01:42:08.420 | on top of the general redundancy that you want
01:42:10.380 | for near range and complements through cameras and radar
01:42:13.260 | for occlusions and for complimentary information
01:42:15.740 | and so forth, when you get the long range,
01:42:17.740 | you have to be radar and camera primary
01:42:20.180 | because your LiDAR data will fundamentally drop off
01:42:22.940 | after a period of time,
01:42:24.300 | and you have to be able to see kind of objects further out.
01:42:27.340 | Now, cameras have the incredible range
01:42:32.340 | where you get a high density, high resolution camera,
01:42:35.220 | you can get data well past a kilometer
01:42:37.380 | and it's like really potentially a huge value.
01:42:40.420 | Now, the signal drops off, the noise is higher,
01:42:42.460 | detecting is harder, classifying is harder.
01:42:45.780 | And one that you may not think about localizing is harder
01:42:48.740 | because you can be off by like two meters
01:42:52.020 | on where something's located a kilometer away.
01:42:54.460 | And that's the difference between being on the shoulder
01:42:56.140 | and being in your lane.
01:42:56.980 | And so you have like interesting challenges there
01:42:58.820 | that you have to solve,
01:42:59.660 | which have a bunch of approaches that come into it.
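A quick back-of-the-envelope for the localization-at-range point above: a tiny bearing error in a camera detection turns into meters of lateral error a kilometer out. The specific numbers are illustrative.

```python
import math

def lateral_error_m(range_m, bearing_error_deg):
    """Lateral position error produced by a small bearing error at a given range."""
    return range_m * math.tan(math.radians(bearing_error_deg))

# A 0.1 degree bearing error is tiny, but at 1 km it is about 1.75 m of
# lateral error, roughly half a lane, enough to confuse shoulder vs. in-lane.
print(lateral_error_m(1000, 0.1))
```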
01:43:01.660 | Radar is interesting because it also has longer range
01:43:06.660 | than LiDAR and it gives you speed information.
01:43:12.020 | So it becomes very, very useful for dynamic information
01:43:15.540 | of traffic flow, vehicle motions, animals, pedestrians,
01:43:20.260 | like just things that might be useful signals.
01:43:24.860 | And it helps with weather conditions
01:43:27.140 | where radar actually penetrates weather conditions
01:43:28.980 | in a better way than other sensors.
01:43:30.660 | And so it's just, it's kind of interesting
01:43:32.820 | where we've kind of started to converge
01:43:34.700 | towards not thinking about a problem as a LiDAR problem
01:43:37.260 | or a camera problem or radar problem,
01:43:38.700 | but it's a fusion problem where these are all
01:43:42.140 | like large scale ML problems
01:43:44.180 | where you put data into the system.
01:43:46.340 | And in many cases, you just look for the signals
01:43:49.980 | that might be present in the union of all of these
01:43:52.060 | and leave it to the system as much as possible
01:43:54.860 | to start to really identify how to extract that.
01:43:58.380 | And then there's places we have to intervene
01:43:59.660 | and actually include more,
01:44:01.340 | but no single sensor is in a great position
01:44:04.060 | to like really solve this problem
01:44:05.420 | end to end without a huge extra challenge.
01:44:08.340 | - That's fascinating.
01:44:09.380 | There's a question that's probably still an open question
01:44:12.740 | is at which point do you fuse them?
01:44:15.420 | Do you solve the perception problem
01:44:19.540 | for each sensor suite individually,
01:44:22.220 | the LiDAR suite and the camera suite,
01:44:24.780 | or do you do some kind of heterogeneous fusion
01:44:28.180 | or do you fuse at the very beginning?
01:44:30.180 | Is there a good answer
01:44:33.140 | or at least an inkling of intuitions you can come up with?
01:44:35.420 | - Yeah, so people refer to this
01:44:36.540 | as like early fusion or late fusion.
01:44:38.860 | So late fusion might be that you have like
01:44:41.420 | the camera pipeline, the LiDAR pipeline,
01:44:43.580 | and then you like fuse them.
01:44:45.020 | And like when it gets to like final semantics
01:44:48.460 | and classification and tracking,
01:44:49.900 | you like kind of fuse them together
01:44:51.180 | and figure out which one's best.
01:44:52.940 | There's more and more evidence
01:44:55.260 | that early fusion is important.
01:44:58.820 | And that is because late fusion does not allow you
01:45:03.340 | to pick up on the complementary strengths
01:45:06.100 | and weaknesses of the sensors.
01:45:08.020 | Weather's a great example where if you do early fusion,
01:45:11.420 | you have an incredibly hard problem
01:45:12.940 | for any single sensor in rain to solve that problem
01:45:16.420 | because you have reflections from the LiDAR,
01:45:19.660 | you have weird kind of noise from the camera,
01:45:23.460 | blah, blah, blah, right?
01:45:24.420 | But the combination of all of them can help you filter
01:45:27.300 | and help you get to the real signal
01:45:29.060 | that then gets you as close as possible
01:45:30.860 | to the original stack.
01:45:32.460 | And be much more fluid about the strengths and weaknesses
01:45:35.420 | where your camera is much more susceptible
01:45:38.740 | to like kind of fouling on the actual lens
01:45:42.340 | from like rain or random stuff.
01:45:45.380 | Whereas like you might be a little bit more resilient
01:45:47.100 | in other sensors.
01:45:47.940 | And so there's an element of logic
01:45:50.580 | that always happens late in the game,
01:45:51.860 | but that fusion early on, actually,
01:45:54.140 | especially as you move towards ML
01:45:55.780 | and large-scale data-driven approaches,
01:45:57.700 | just maximizes your ability to pull out the best signal
01:45:59.980 | you can out of each modality
01:46:01.300 | before you start making constraining decisions
01:46:03.820 | that end up being hard to unwind late in the stack.
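For intuition, here is a minimal PyTorch sketch of the two patterns being contrasted: late fusion keeps a separate pipeline per sensor and merges only at the decision level, while early fusion concatenates per-sensor features before a shared trunk so one modality can compensate for another. Layer sizes, feature shapes, and the merging rule are all assumptions for illustration, not a description of Waymo's stack.

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Each sensor gets its own full pipeline; outputs are merged only at the
    end, so one modality cannot help another recover a weak signal (e.g. rain)."""
    def __init__(self, lidar_dim=64, cam_dim=64, radar_dim=16, n_classes=4):
        super().__init__()
        self.lidar_head = nn.Sequential(nn.Linear(lidar_dim, 32), nn.ReLU(), nn.Linear(32, n_classes))
        self.cam_head = nn.Sequential(nn.Linear(cam_dim, 32), nn.ReLU(), nn.Linear(32, n_classes))
        self.radar_head = nn.Sequential(nn.Linear(radar_dim, 32), nn.ReLU(), nn.Linear(32, n_classes))

    def forward(self, lidar, cam, radar):
        # Merge at the decision level, here by averaging per-sensor logits.
        return (self.lidar_head(lidar) + self.cam_head(cam) + self.radar_head(radar)) / 3

class EarlyFusion(nn.Module):
    """Per-sensor encoders produce features that are concatenated early, so a
    shared trunk can learn complementary cues across modalities."""
    def __init__(self, lidar_dim=64, cam_dim=64, radar_dim=16, n_classes=4):
        super().__init__()
        self.lidar_enc = nn.Linear(lidar_dim, 32)
        self.cam_enc = nn.Linear(cam_dim, 32)
        self.radar_enc = nn.Linear(radar_dim, 16)
        self.trunk = nn.Sequential(nn.Linear(32 + 32 + 16, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, lidar, cam, radar):
        fused = torch.cat([self.lidar_enc(lidar), self.cam_enc(cam), self.radar_enc(radar)], dim=-1)
        return self.trunk(fused)

lidar, cam, radar = torch.randn(1, 64), torch.randn(1, 64), torch.randn(1, 16)
print(LateFusion()(lidar, cam, radar).shape, EarlyFusion()(lidar, cam, radar).shape)
```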
01:46:06.140 | - So how much of this is a machine learning problem?
01:46:09.380 | What role does ML, machine learning,
01:46:11.780 | play in this whole problem of autonomous driving,
01:46:14.740 | autonomous trucking?
01:46:16.620 | - It's massive and it's increasing over time.
01:46:20.300 | If you go back to the grand challenge days
01:46:24.180 | in the early days of kind of AV development,
01:46:26.580 | there was ML, but it was not in like
01:46:30.060 | kind of the mass scale data style of ML.
01:46:32.020 | It was like learning models,
01:46:34.140 | but in a more structured kind of way.
01:46:36.100 | And it was a lot of heuristic
01:46:37.740 | and search-based approaches and planning and so forth.
01:46:40.260 | You can make a lot of progress
01:46:42.060 | with these types of approaches kind of across the board,
01:46:45.020 | an almost deceptive amount of progress.
01:46:46.420 | We can get pretty far,
01:46:47.780 | but then you start to really grind
01:46:49.820 | the further you get in some parts of the stack
01:46:52.220 | if you don't have an ability
01:46:53.460 | to absorb a massive amount of experience
01:46:55.660 | in a way that scales very sub-linearly
01:46:57.660 | in terms of human labor and human attention.
01:46:59.700 | And so when you look at the stack,
01:47:01.580 | the perception side is probably the first
01:47:03.260 | to get really revolutionized by ML.
01:47:04.980 | And it goes back many years
01:47:06.780 | because ML for like computer vision
01:47:09.740 | and these types of approaches
01:47:10.780 | kind of took off with a lot of the like early
01:47:14.780 | kind of push in deep learning.
01:47:16.860 | And so there's always a debate on, you know,
01:47:19.540 | the spectrum between kind of like end to end ML,
01:47:22.300 | which, you know, is a little bit kind of like too far
01:47:24.780 | to how you architect it to where you have modules,
01:47:27.540 | but enough ability to think about long tail problems
01:47:30.100 | and so forth.
01:47:30.940 | But at the end of the day,
01:47:32.860 | you have big parts of the system
01:47:34.780 | that are very ML and data driven.
01:47:36.620 | And we're increasingly moving that direction
01:47:38.580 | all the way across the board,
01:47:39.700 | including behavior where even when it's not like
01:47:44.700 | a gigantic ML problem that covers like a giant swath
01:47:48.500 | end to end, more and more parts of the system
01:47:50.740 | have this property where you want to be able
01:47:52.300 | to put more data into it and it gets better.
01:47:55.420 | And that has been one of the realizations
01:47:57.140 | as you drive tens of millions of miles
01:47:58.820 | and try to like solve new expansions of domains
01:48:02.780 | without regressing in your old ones,
01:48:04.620 | it becomes intractable for a human to approach that
01:48:07.700 | in the way that traditionally robotics
01:48:09.940 | has kind of approached some elements of the tech stack.
01:48:12.380 | - So are you trying to create a data pipeline
01:48:16.780 | specifically for the trucking problem?
01:48:18.940 | This is it, like how much leveraging
01:48:21.620 | of the autonomous driving is there
01:48:23.140 | in terms of data collection?
01:48:24.540 | - Yeah.
01:48:25.380 | - And how unique is the data required
01:48:29.060 | for the trucking problem?
01:48:30.460 | - So we use all the same infrastructure.
01:48:33.460 | So labeling workflows, ML workflows, everything.
01:48:35.820 | So that actually carries over quite well.
01:48:38.500 | We heavily reuse the data even,
01:48:40.540 | where almost every model that we have on a truck,
01:48:43.620 | we started with the latest car model.
01:48:45.740 | - Cool.
01:48:46.580 | - And-
01:48:47.420 | - So it's almost like a good back car model.
01:48:49.100 | - Yeah, it's like you can think of like,
01:48:51.220 | despite the different domain and different numbers
01:48:53.060 | of sensors and position of sensors,
01:48:54.700 | there's a lot of signals that carry over across driving.
01:48:57.100 | And so it's almost like pre-training
01:48:59.020 | and getting a big boost out of the gate
01:49:00.340 | where you can reduce the amount of data you need by a lot.
01:49:02.740 | And it goes both ways actually.
01:49:04.060 | And so we're increasingly thinking about our data
01:49:05.780 | strategy on how we leverage both of these.
01:49:09.180 | So you think about, you know,
01:49:11.260 | how other agents react to a truck.
01:49:12.980 | Yeah, it's a little bit different,
01:49:14.020 | but the fundamentals are actually like,
01:49:15.860 | what will other vehicles in the road do?
01:49:18.100 | There's a lot of carry over that's possible.
01:49:19.660 | And in fact, just to give you an example,
01:49:22.180 | we're constantly kind of like adding more data
01:49:24.940 | from the trucking side.
01:49:25.900 | But as of right now, when we think of our,
01:49:28.580 | like one of our models, behavior prediction
01:49:30.500 | for other agents on the road, like vehicles.
01:49:34.500 | 85% of that data comes from cars.
01:49:37.620 | And a lot of that 85% comes from surface streets
01:49:41.660 | because we just had so much of it and it was really valuable.
01:49:44.500 | And so we're adding in more and more,
01:49:46.340 | particularly in the areas where we need more data,
01:49:48.620 | but you get a huge boost out of the gate.
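A tiny sketch of what this kind of cross-domain data reuse can look like in practice, assuming a simple weighted batch sampler on top of a model initialized from the car side; the 85/15 split echoes the figure mentioned above, and the function and dataset names are hypothetical.

```python
import random

def sample_training_batch(car_examples, truck_examples, batch_size=32, car_fraction=0.85):
    """Draw a batch that mixes surface-street car data with freeway truck data.
    Starting from a model trained mostly on car data acts like pre-training;
    truck-specific examples are weighted up where the domains differ most."""
    n_car = round(batch_size * car_fraction)
    batch = random.sample(car_examples, n_car) + random.sample(truck_examples, batch_size - n_car)
    random.shuffle(batch)
    return batch

# Hypothetical example: 1000 logged car snippets, 200 logged truck snippets.
batch = sample_training_batch(list(range(1000)), list(range(1000, 1200)))
print(len(batch))
```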
01:49:50.700 | - Just all different visual characteristics of roads,
01:49:53.260 | lane markings, pedestrians, all that, that's still relevant.
01:49:56.620 | - It's all still relevant.
01:49:57.460 | And then just the fundamentals of how, you know,
01:49:59.620 | you detect the car, does it really change that much?
01:50:02.740 | Whether you're detecting it from a car or a truck,
01:50:05.140 | the fundamentals of how a person will walk
01:50:07.780 | around your vehicle is it, it'll change a little bit,
01:50:10.340 | but the basics, like there's a lot of signal in there
01:50:12.860 | that as a starting point to a network
01:50:14.660 | can actually be very valuable.
01:50:16.180 | Now we do have some very unique challenges
01:50:18.020 | where there's a sparsity of events on a freeway.
01:50:20.540 | The frequency of events happening on a freeway,
01:50:22.780 | whether it's, you know, interesting, you know,
01:50:25.740 | objects in the road or incidents,
01:50:27.460 | or even like from a human benchmark,
01:50:29.780 | like how often does a human have a accident
01:50:31.980 | on a freeway is far more sparse than on a surface street.
01:50:35.540 | And so that leads to really interesting data problems
01:50:37.900 | where you can't just drive infinitely
01:50:40.580 | to encounter all the different permutations
01:50:42.540 | of things you might encounter.
01:50:43.740 | And so there you get into interesting tools
01:50:46.060 | like structure testing and data collection,
01:50:48.660 | data augmentation, and so forth.
01:50:50.820 | And so there's really interesting kind of technical
01:50:53.620 | challenges that push some of the research
01:50:57.060 | that enables these new suites of approaches.
01:50:59.900 | - What role does simulation play?
01:51:02.140 | - Really good question.
01:51:02.980 | So Waymo simulates about a thousand miles
01:51:05.500 | for every mile it drives.
01:51:07.020 | So you think of--
01:51:08.340 | - In both, so across the board.
01:51:10.140 | - Across the board, yeah.
01:51:11.700 | So you think of, for example,
01:51:13.340 | well, if we've driven, you know, over 20 million miles,
01:51:16.460 | that's over 20 billion miles in simulation.
01:51:18.260 | Now, how do you use simulation?
01:51:20.620 | It's a multi-purpose.
01:51:22.260 | So you use it for basic development.
01:51:25.740 | So you want to do, make sure you have regression prevention
01:51:28.100 | and protection of everything you're doing, right?
01:51:30.420 | That's an easy one.
01:51:31.620 | When you encounter something interesting in the world,
01:51:34.660 | let's say there was an issue with how the vehicle behaved
01:51:36.820 | versus an ideal human.
01:51:38.620 | You can play that back in simulation
01:51:40.220 | and start augmenting your system
01:51:42.660 | and seeing how you would have reacted to that scenario
01:51:44.700 | with this improvement or this new area.
01:51:46.900 | You can create scenarios that become part
01:51:48.460 | of your regression set after that point, right?
01:51:51.420 | Then you start getting into like really, really
01:51:53.660 | like kind of hill climbing where you say,
01:51:56.500 | "Hey, I need to improve this system.
01:51:57.980 | I have these metrics that are really correlated
01:51:59.460 | with final performance.
01:52:00.900 | How do I know how well I'm doing?"
01:52:02.620 | The actual physical driving
01:52:05.220 | is the least efficient form of testing.
01:52:07.260 | And it's expensive, it's time consuming.
01:52:09.060 | So grabbing a large scale batch of historical data
01:52:14.060 | and simulating it to get a signal of over these last,
01:52:17.980 | or just random sample of a hundred thousand miles,
01:52:20.220 | how has this metric changed versus where we are today?
01:52:22.980 | You can do that far more efficiently in simulation
01:52:24.740 | than just driving with that new system on board, right?
01:52:28.220 | And then you go all the way to the validation phase
01:52:30.540 | where to actually see your human relative safety
01:52:34.260 | of like how well you're performing on the car side
01:52:36.740 | or the trucking side relative to a human.
01:52:38.820 | A lot of that safety case is actually driven
01:52:42.380 | by taking all of the physical operational driving,
01:52:46.340 | which probably includes a lot of interventions
01:52:48.540 | where the driver took over just in case.
01:52:53.020 | And then you simulate those forward
01:52:56.260 | and see if would anything have happened?
01:52:58.140 | And in most cases, the answer is no,
01:52:59.740 | but you can simulate it forward.
01:53:01.420 | And you can even start to do really interesting things
01:53:03.180 | where you add virtual agents to create harder environments.
01:53:07.260 | You can fuzz the locations of physical agents.
01:53:10.260 | You can muck with the scene and stress test the scenario
01:53:13.260 | from a whole bunch of different dimensions.
01:53:14.900 | And effectively you're trying to like more efficiently
01:53:17.620 | sample this like infinite dimensional space,
01:53:20.260 | but try to encounter the problems as fast as possible.
01:53:22.940 | Because what most people don't realize
01:53:24.860 | is the hardest problem in autonomous driving
01:53:27.860 | is actually the evaluation problem in many ways,
01:53:29.980 | not the actual autonomy problem.
01:53:31.740 | And so if you could in theory evaluate perfectly
01:53:34.460 | and instantaneously, you can solve that problem
01:53:37.020 | in a really fast feedback loop quite well.
01:53:39.940 | But the hardest part is being really smart
01:53:42.260 | about this suite of approaches
01:53:43.420 | on how can you get an accurate signal
01:53:46.740 | on how well you're doing as quickly as possible
01:53:49.460 | in a way that correlates to physical driving.
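A minimal sketch of the scenario-fuzzing idea just described: take a logged scene, jitter the positions of the other agents, occasionally inject a virtual agent, and replay each variant in simulation against the candidate software. The jitter range, injection rule, and data layout are illustrative assumptions.

```python
import copy
import random

def fuzz_scenario(scenario, n_variants=100, position_jitter_m=2.0, seed=0):
    """Generate stress-test variants of a logged scenario by jittering the
    positions of other agents and sometimes adding a virtual agent."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        v = copy.deepcopy(scenario)
        for agent in v["agents"]:
            agent["x"] += rng.uniform(-position_jitter_m, position_jitter_m)
            agent["y"] += rng.uniform(-position_jitter_m, position_jitter_m)
        if rng.random() < 0.3:  # occasionally inject a hard-braking virtual lead vehicle
            v["agents"].append({"type": "virtual_car", "x": 40.0, "y": 0.0, "hard_brake": True})
        variants.append(v)
    return variants

logged = {"agents": [{"type": "car", "x": 30.0, "y": 3.5}]}
variants = fuzz_scenario(logged)
# Each variant is then replayed in simulation against the candidate software.
```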
01:53:51.220 | - Can you explain the evaluation problem?
01:53:53.580 | Which metric are you evaluating towards?
01:53:56.180 | Are we talking about safety?
01:53:58.020 | What are the performance metrics that we're talking about?
01:54:00.220 | - So in the end you care about end safety.
01:54:02.260 | Like that's in the end what keeps you,
01:54:04.260 | like that's what's deceptive where there's a lot of companies
01:54:08.140 | that have like a great demo.
01:54:09.540 | The path from like a really great demo
01:54:12.540 | to being able to go driverless can be deceptively long,
01:54:16.220 | even when that demo looks like it's driverless quality.
01:54:18.340 | And the difference is that the thing that keeps you
01:54:20.860 | from going driverless is not the stuff you encounter
01:54:23.300 | in a demo, it's the stuff that you encounter
01:54:24.820 | once in a hundred thousand miles or 500,000 miles.
01:54:27.660 | And so that is at the root of what is most challenging
01:54:32.020 | about going driverless because any issue you encounter
01:54:35.580 | you can go and fix it, but how do you know
01:54:36.860 | you didn't create five other issues
01:54:38.180 | that you haven't encountered yet?
01:54:40.020 | So those learnings, like those were painful learnings
01:54:43.180 | in Waymo's history that Waymo went through
01:54:45.380 | and led to us then finally being able to go driverless
01:54:48.540 | in Phoenix and now are at the heart of how we develop.
01:54:52.140 | Evaluation is simultaneously evaluating final kind
01:54:57.020 | of end safety of how ready are you to go driverless,
01:54:59.900 | which may be as direct as what is your collision,
01:55:05.900 | human relative kind of collision rate
01:55:08.740 | for all these types of scenarios and severities
01:55:13.140 | to make sure that you're better than a human bar
01:55:15.340 | by a good amount.
01:55:16.460 | But that's not actually the most useful for development.
01:55:19.460 | For development, it's much more kind of analog metrics
01:55:23.260 | that are part of the art of finding how,
01:55:28.100 | what are the properties of driving
01:55:30.460 | that give you a way quicker signal
01:55:31.980 | that's more sensitive than a collision
01:55:33.740 | that can correlate to the quality you care about
01:55:37.300 | and push the feedback loop to all of your development.
01:55:40.020 | A lot of these are, for example,
01:55:41.140 | comparisons to human drivers, like manual drivers.
01:55:43.820 | How do you do relative to a human driver
01:55:46.180 | in various dimensions or various circumstances?
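A minimal sketch of a human-relative rate metric of the kind described, assuming event counts and mileage are available for both the autonomous system and a matched human baseline; the numbers in the example are hypothetical.

```python
def events_per_million_miles(event_count, miles_driven):
    return 1e6 * event_count / miles_driven

def human_relative_rate(av_events, av_miles, human_events, human_miles):
    """Ratio < 1.0 means the autonomous system triggers this event type less
    often per mile than the human benchmark over comparable driving."""
    return (events_per_million_miles(av_events, av_miles)
            / events_per_million_miles(human_events, human_miles))

# Hypothetical numbers: 3 events in 5M autonomous miles vs. a human baseline
# of 40 events in 50M miles gives a ratio of 0.75, i.e. 25% below the human rate.
print(human_relative_rate(3, 5e6, 40, 50e6))
```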
01:55:49.220 | - Can I ask you a tricky question?
01:55:51.060 | So if I brought you a truck, how would you test it?
01:55:54.900 | Okay, Alan Turing came along and you said-
01:55:58.180 | - This one's, can't tell if it's a human driver
01:56:00.100 | or autonomous driver. - Yeah, exactly.
01:56:01.500 | But not the human because humans are flawed.
01:56:05.660 | - It's different, but yeah.
01:56:06.740 | How do you actually know you're ready, basically?
01:56:08.300 | Like, how do you know it's good enough?
01:56:10.740 | Yeah, and by the way, this is the reason why
01:56:13.420 | Waymo released a safety framework for the car side
01:56:15.460 | because one, it sets the bar so nobody cuts below it,
01:56:18.900 | and does something bad for the field
01:56:20.580 | that causes an accident.
01:56:22.260 | Two, it's to start the conversation on framing
01:56:25.500 | what does this need to look like?
01:56:26.820 | Same thing we'll end up doing for the trucking side.
01:56:29.420 | It ends up being a different portfolio of approaches.
01:56:35.180 | There's easy things like, are you compliant
01:56:37.460 | with all these fundamental rules of the road?
01:56:39.540 | Like, you never drive above the speed limit.
01:56:41.100 | That's actually pretty easy.
01:56:42.580 | You can fundamentally prove that it's either impossible
01:56:44.660 | to violate that rule or that in these,
01:56:47.820 | you can itemize the scenarios where that comes up
01:56:51.180 | and you can do a test and show that you passed that test
01:56:55.100 | and therefore you can handle that scenario.
01:56:57.900 | And so those are like traditional structure testing
01:57:01.420 | kind of system engineering approaches
01:57:02.900 | where you can just, like fault rates is another example
01:57:06.940 | where when something fails, how do you deal with it?
01:57:09.420 | You're not gonna drive and randomly wait for it to fail.
01:57:11.180 | You're gonna force a failure and make sure
01:57:12.540 | that you can handle it in close courses
01:57:14.860 | and simulation or on the road.
01:57:17.380 | And run through all the permutations of failures
01:57:20.300 | which you can oftentimes for some parts of system itemize
01:57:23.020 | like hardware.
01:57:23.860 | The hardest part is behavioral
01:57:26.700 | where you have just infinite situations
01:57:31.500 | that could in theory happen.
01:57:33.300 | And you wanna figure out the combinations of approaches
01:57:37.060 | that can work there.
01:57:39.020 | You can probably pass the Turing test pretty quickly
01:57:41.540 | even if you're not like completely ready for driverless
01:57:43.740 | because the events that are really kind of like hard
01:57:47.820 | will not happen that often.
01:57:48.900 | Just to give you a perspective,
01:57:50.500 | a human has a serious accident on a freeway
01:57:54.620 | like a truck driver on a freeway has,
01:57:56.420 | a serious event happens once every 1.3 million miles,
01:58:00.740 | and something that actually has like
01:58:01.820 | really serious injuries, once every 28 million miles.
01:58:03.780 | And so those are really rare.
01:58:05.260 | And so you could have a driver that looks like
01:58:07.380 | it's ready to go, but you have no signal
01:58:09.660 | on what happens there.
01:58:11.100 | And so that's where you start to get creative
01:58:13.060 | on combinations of sampling and statistical arguments,
01:58:17.820 | focused structured arguments where you can kind of
01:58:20.580 | simulate those scenarios and show that you can handle them
01:58:24.980 | and metrics that are correlated with what you care about
01:58:28.340 | but you can measure much more quickly
01:58:29.980 | and get to a right answer.
01:58:32.060 | And that's what makes it pretty hard.
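To see why sparse events make a purely empirical safety argument so expensive, here is a small sketch based on the statistical rule of three, which says that zero observed events over N miles bounds the true rate below roughly 3/N at about 95% confidence. The rate plugged in comes from the figure quoted above; the single-formula framing is a simplification of the real portfolio of arguments.

```python
def miles_needed_to_bound_rate(human_miles_per_event, confidence_factor=3.0):
    """'Rule of three': observing zero events over N miles bounds the event
    rate below ~3/N at ~95% confidence. To argue the rate is at or below the
    human rate (one event per `human_miles_per_event` miles) from incident-free
    driving alone, you need roughly 3x that many miles."""
    return confidence_factor * human_miles_per_event

# Using the serious-event rate quoted above (~1 per 1.3M freeway miles), a
# purely empirical argument would need ~3.9M incident-free miles, and far more
# for rarer, higher-severity events, hence the structured arguments,
# simulation, and correlated proxy metrics.
print(miles_needed_to_bound_rate(1.3e6))
```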
01:58:33.540 | And in the end, you end up borrowing a lot of properties
01:58:36.980 | from aerospace and like space shuttles and so forth
01:58:40.900 | where you don't get the chance to launch it a million times
01:58:43.300 | just to say you're ready
01:58:44.500 | because it's too expensive to fail.
01:58:46.700 | And so you go through a huge amount
01:58:50.100 | of kind of structured approaches in order to validate it.
01:58:53.260 | And then by thoroughness, you can make a strong argument
01:58:56.580 | that you're ready to go.
01:58:58.180 | This is actually a harder problem in a lot of ways though
01:59:00.340 | because you can think of a space shuttle
01:59:01.780 | as getting to a fixed point and then you kind of like
01:59:04.700 | or an airplane and you like freeze the software
01:59:06.500 | and then you like prove it and you're good to go.
01:59:08.620 | Here you have to get to a driverless quality bar
01:59:11.100 | but then continue to aggressively change the software
01:59:14.100 | even while you're driverless.
01:59:15.260 | And so-
01:59:16.100 | - But in also the full range of environment
01:59:17.900 | that there's an external environment with a shuttle
01:59:20.900 | is you're basically testing the like the systems,
01:59:24.340 | the internal stuff.
01:59:25.380 | - Yeah.
01:59:26.380 | - And you have a lot of control in the external stuff.
01:59:28.820 | - Yeah, and the hard part is how do you know
01:59:30.460 | you didn't get worse in something that you just changed?
01:59:32.820 | - Yes, sure.
01:59:33.660 | - And so in a lot of ways
01:59:35.580 | like the Turing test starts to fail pretty quickly
01:59:38.740 | because you start to feel driverless quality
01:59:41.140 | pretty early in that curve.
01:59:43.260 | If you think about it, right?
01:59:44.220 | Like in most kind of really good AV demos
01:59:49.220 | maybe you'll sit there for 30 minutes, right?
01:59:50.740 | - Yeah.
01:59:51.580 | - So you've driven 15 miles or something like that.
01:59:54.300 | To go driverless, like what's the sort of rate of issues
01:59:59.340 | that you need to have? You won't even encounter them.
02:00:01.300 | - So let's try something different then.
02:00:02.900 | Let's try a different version of the Turing test
02:00:05.780 | which is like an IQ test.
02:00:07.980 | So there's these difficult questions
02:00:10.380 | of increasing difficulty.
02:00:11.700 | They're designed, you don't know them ahead of time.
02:00:16.100 | Nobody knows the answer to them.
02:00:17.780 | - Right.
02:00:18.620 | - And so is it possible to in the future orchestrate
02:00:22.460 | basically really difficult--
02:00:23.300 | - Obstacle course almost of like--
02:00:24.780 | - Yeah, that maybe change every year
02:00:29.140 | and that represent, if you can pass these,
02:00:31.900 | they don't necessarily represent the full spectrum.
02:00:34.100 | - That's it, yeah.
02:00:34.940 | They won't be conclusive,
02:00:35.900 | but you can at least get a really quick read and filter.
02:00:38.100 | - Yeah, like you're able to,
02:00:39.700 | 'cause you didn't know them ahead of time.
02:00:41.500 | Like, I don't know, probably--
02:00:43.980 | - Like construction zones, failures--
02:00:46.260 | - Or driving anywhere in Russia.
02:00:47.660 | - Yeah, yeah, exactly.
02:00:48.940 | - Snow.
02:00:50.300 | - Weather, cut-ins, dense traffic,
02:00:53.540 | kind of merging lane closures,
02:00:55.860 | animal foreign objects on a road that pop out
02:00:57.980 | on short notice, mechanical failures,
02:01:00.340 | sensor breaking, tire popped,
02:01:02.940 | weird behaviors by other vehicles,
02:01:05.300 | like a hard brake, something reckless that they've done,
02:01:08.140 | fouling of sensors, like bugs or birds,
02:01:11.100 | you know, poop or something.
02:01:12.820 | But yeah, like you have these like kind of like
02:01:14.380 | extreme conditions where like
02:01:17.100 | you have a nasty construction zone
02:01:18.540 | where everything shuts down
02:01:19.900 | and you have to like, you know,
02:01:21.260 | get pulled to the other side of the freeway
02:01:22.820 | with a temporary lane like that, right?
02:01:24.980 | Those are sort of conditions where
02:01:26.900 | we do that to ourselves, right?
02:01:28.060 | We itemize everything that could possibly happen
02:01:30.340 | to give you a starting point to how to think about
02:01:32.900 | what you need to develop.
02:01:33.860 | And at the end of the day,
02:01:34.700 | there's no substitute for real miles.
02:01:36.380 | Like if you think of traditional ML,
02:01:38.060 | like you know how there's like a validation set
02:01:39.620 | where you hold out some data and like
02:01:42.140 | real world driving is the ultimate validation set.
02:01:44.260 | That's the, in the end, like the cleanest signal.
02:01:47.500 | But you can do a really good job
02:01:48.860 | on creating an obstacle course.
02:01:49.940 | And you're absolutely right, like at the end,
02:01:52.100 | if there was such a thing as automating
02:01:54.700 | and kind of a readiness,
02:01:56.900 | it would be these extreme conditions
02:01:59.180 | like a red light runner, right?
02:02:00.740 | A really reckless pedestrian that's jaywalking,
02:02:04.180 | a cyclist that, you know,
02:02:05.740 | makes like a really awkward maneuver.
02:02:07.540 | That's actually what keeps you from going driverless.
02:02:09.220 | Like in the end, that is the long tail.
02:02:11.380 | - Yeah, and it's interesting to think about that.
02:02:13.380 | That to me is the Turing test.
02:02:14.620 | Turing test means a lot of things,
02:02:15.900 | but to me in driving,
02:02:17.220 | the Turing test is exactly this validation set
02:02:22.020 | that is handcrafted.
02:02:23.140 | There's a, I don't know if you know him.
02:02:25.300 | There's a guy named Francois Chollet.
02:02:27.180 | He designed, he thinks about like how to design a test
02:02:31.980 | for general intelligence.
02:02:32.900 | He designs these IQ tests for machines.
02:02:35.220 | And the validation set for him is handcrafted.
02:02:39.380 | And that it requires like human genius or ingenuity
02:02:43.340 | to create a really good test.
02:02:45.460 | And you hold, you truly hold it out.
02:02:47.260 | It's an interesting perspective on the validation set,
02:02:50.380 | which is like, make that as hard as possible.
02:02:54.700 | Not a generic representation of the data,
02:02:57.660 | but this is the hardest.
02:02:59.540 | - The hardest stuff.
02:03:00.380 | Yeah, you know, it's like, go.
02:03:01.540 | Like you'll never fully itemize
02:03:02.900 | like all the world states that you'll expand.
02:03:05.060 | And so you have to come up with different approaches.
02:03:07.220 | And this is where you start hitting the struggles of ML,
02:03:09.300 | where ML is fantastic at optimizing the average case.
02:03:12.980 | It's a really unique craft to think about
02:03:15.300 | how you deal with the worst case,
02:03:16.780 | which is what we care about in the Navy space,
02:03:19.220 | when using an ML system on something
02:03:22.060 | that occurs like super infrequently.
02:03:24.340 | So like, you don't care about the worst case really on ads,
02:03:26.940 | because if you miss a few, it's not a big deal,
02:03:29.380 | but you do care about it on the driving side.
02:03:30.980 | And so typically like,
02:03:33.540 | you'll never fully enumerate the world.
02:03:36.140 | And so you have to take a step back and abstract away,
02:03:38.340 | what are the signals that you care about
02:03:40.460 | and the properties of a driver
02:03:42.540 | that correlate to defensive driving
02:03:45.820 | and avoiding nasty situations
02:03:49.340 | that even though you'll always be surprised
02:03:52.140 | by things you'll encounter,
02:03:53.460 | you feel good about your ability to generalize
02:03:56.020 | from what you've learned.
02:03:57.260 | - All right, let me ask you a tricky question.
02:04:01.340 | So to me, the two companies that are building at scale
02:04:06.340 | some of the most incredible robots ever built
02:04:11.500 | is Waymo and Tesla.
02:04:13.500 | So there's very distinct approaches,
02:04:19.220 | technically, philosophically in these two systems.
02:04:23.540 | Let me ask you to play sort of devil's advocate
02:04:27.460 | and then the devil's advocate to the devil's advocate.
02:04:30.540 | It's a bit of a race, of course, everyone can win.
02:04:34.600 | But if Waymo wins this race to level four,
02:04:39.860 | why would they win?
02:04:43.780 | What aspect of the approach
02:04:45.180 | do you think would be the winning aspect?
02:04:47.700 | And if Tesla wins, why would they win?
02:04:52.080 | And which aspect of their approach would be the reason?
02:04:55.780 | Just building some intuition,
02:04:57.500 | almost not from a business perspective,
02:04:59.540 | for me, that's just technically.
02:05:01.540 | - Yeah.
02:05:02.380 | - Yeah, and we could summarize,
02:05:03.660 | I think maybe you can correct me,
02:05:05.940 | one of the more distinct aspects
02:05:09.420 | is Waymo has a richer suite of sensors,
02:05:14.020 | there's LiDAR and vision.
02:05:15.860 | Tesla now removed radar, they do vision only.
02:05:19.560 | Tesla has a larger fleet of vehicles operated by humans,
02:05:24.300 | so it's already deployed on the field
02:05:26.460 | and it's larger, what do you call it, operational domain.
02:05:31.460 | And then Waymo is more focused on a specific domain
02:05:35.900 | and growing it with fewer vehicles.
02:05:38.500 | So both are fascinating approaches,
02:05:41.060 | both are, I think, there's a lot of brilliant ideas,
02:05:43.540 | nobody knows the answer.
02:05:44.600 | So I'd love to get your comments on this lay of the land.
02:05:48.540 | - Yeah, for sure.
02:05:49.380 | So maybe I'll start with Waymo.
02:05:51.740 | And you're right, both incredible companies
02:05:53.900 | and just a gigantic respect
02:05:55.180 | to everything Tesla's accomplished
02:05:57.080 | and how they pushed the field forward as well.
02:06:00.000 | So on the Waymo side, there is a fundamental advantage
02:06:03.300 | in the fact that it is focused
02:06:05.940 | and geared towards L4 from the very beginning.
02:06:08.640 | We've customized the sensor suite for it,
02:06:10.500 | the hardware, the compute, the infrastructure,
02:06:13.740 | the tech stack, and all of the investment inside the company.
02:06:17.540 | That's deceptively important
02:06:18.660 | because there's like a giant spectrum of problems
02:06:21.260 | you have to solve in order to really do this
02:06:23.040 | from infrastructure to hardware,
02:06:25.260 | to autonomy stack, to the safety framework.
02:06:28.380 | And that's an advantage
02:06:29.460 | because there's a reason why it's the fifth generation
02:06:31.460 | hardware and why all of those learnings
02:06:34.400 | went into the Dymor program.
02:06:35.820 | It becomes such an advantage
02:06:38.340 | because you learn a lot as you drive
02:06:40.600 | and you optimize for the best information you have,
02:06:43.520 | but fundamentally, there's a big, big jump,
02:06:46.300 | like every order of magnitude that you drive
02:06:50.020 | in numbers of miles and what you learn
02:06:51.860 | and the gap from really kind of like decent progress
02:06:55.480 | for L2 and so forth to what it takes to actually go L4.
02:06:57.860 | And at the end of the day,
02:07:00.020 | there's a feeling that Waymo has,
02:07:02.940 | there's a long way to go, nobody's won,
02:07:06.060 | but there's a lot of advantages in all of these buckets
02:07:11.060 | where it's the only company that has shipped
02:07:13.380 | a fully driverless service where you can go
02:07:14.920 | and you can use it and it's at a decently sizable scale.
02:07:19.920 | And those learnings can feed forward
02:07:21.380 | into how to solve the more general problem.
02:07:23.380 | - And you see this process you've deployed in Chandler,
02:07:26.720 | you don't know the timeline exactly,
02:07:28.420 | but you could see the steps,
02:07:30.520 | they seem almost incremental, the steps.
02:07:33.720 | - It's become more engineering than totally blind R&D.
02:07:36.640 | - 'Cause it works in one place
02:07:37.680 | and then you move to another place and you grow it this way.
02:07:40.300 | - And just to give you an example,
02:07:41.580 | we fundamentally changed our hardware
02:07:43.760 | and our software stack almost entirely
02:07:46.340 | from what went driverless in Phoenix
02:07:48.420 | to what is the current generation of the system
02:07:51.180 | on both sides because the things that got us to driverless,
02:07:55.220 | even though it got to driverless
02:07:57.380 | way beyond human relative safety,
02:07:59.220 | it is fundamentally not well set up to scale
02:08:03.880 | in an exponential fashion without getting
02:08:06.340 | into huge kind of scaling pains.
02:08:08.140 | And so those learnings, you just can't shortcut.
02:08:10.520 | And so that's an advantage.
02:08:11.520 | And so there's a lot of open challenges
02:08:13.940 | to kind of get through technical organizational,
02:08:15.900 | like how do you solve problems
02:08:17.220 | that are increasingly broad and complex
02:08:19.060 | like this work on multiple products.
02:08:20.360 | But there's a few in that, okay, like balls in our court,
02:08:23.640 | there's a headstart there, now we gotta go and solve it.
02:08:26.440 | And I think that focus on L4,
02:08:27.680 | it's a fundamentally different problem.
02:08:28.940 | If you think about it, like let's say
02:08:31.180 | we were designing an L2 truck that was meant
02:08:33.640 | to be safer and help a human,
02:08:35.520 | you could do that with far less sensors,
02:08:37.940 | far less complexity and provide value very quickly,
02:08:41.400 | arguably with what we already have today,
02:08:43.020 | just packaged up in a good product.
02:08:45.060 | But you would take a huge risk in having a gap
02:08:48.860 | from even the like compute and sensors,
02:08:51.320 | not to mention the software to then jump
02:08:53.440 | from that system to an L4 system.
02:08:54.860 | So it's a huge risk basically.
02:08:56.560 | - So again, allow me to be the person
02:08:59.100 | that plays the devil's advocate
02:09:00.400 | and let's argue for the Tesla approach.
02:09:02.400 | So what you just laid out makes perfect sense
02:09:06.100 | and is exactly right.
02:09:07.600 | There are some open questions here,
02:09:09.140 | which is it's possible that investing more
02:09:14.140 | in faster data collection,
02:09:16.440 | which is essentially what Tesla's doing,
02:09:18.900 | will get us there faster
02:09:20.900 | if the sensor suite doesn't matter as much
02:09:25.900 | and machine learning can do a lot of the work.
02:09:29.060 | This is the open question is,
02:09:30.940 | how much is the thing you mentioned before,
02:09:33.800 | how much of driving can be end to end learned?
02:09:37.560 | That's the open question.
02:09:38.720 | Obviously the Waymo
02:09:41.380 | and the vision only machine learning approach
02:09:44.480 | will solve driving eventually both.
02:09:47.940 | The question is of timeline, what's faster.
02:09:50.180 | - That's right.
02:09:51.020 | And what you mentioned,
02:09:51.840 | like if I were to make the opposite argument,
02:09:53.100 | like what puts Tesla in the strongest position, it's data.
02:09:57.140 | That is their like superpower
02:09:58.560 | where they have an access to real world data
02:10:02.560 | effectively with like a safety driver.
02:10:05.640 | And they found a way to like get paid by safety drivers
02:10:10.140 | versus safer safety drivers.
02:10:12.020 | It's brilliant, right?
02:10:14.760 | But all joking aside,
02:10:16.060 | like one, it is incredible that they've built a business
02:10:18.860 | that's incredibly successful
02:10:20.300 | that can now be a foundation and bootstrap
02:10:22.300 | kind of like really aggressive investment
02:10:23.760 | in autonomy space.
02:10:25.520 | If you can do it,
02:10:26.360 | that's always like an incredible kind of advantage.
02:10:28.400 | And then the data aspect of it,
02:10:30.640 | it is a giant amount of data.
02:10:31.940 | If you can use it the right way to then solve the problem,
02:10:33.900 | but the ability to collect and filter through the things
02:10:38.020 | to the things that matter at real world scale,
02:10:40.220 | at like a large distribution, that is huge.
02:10:43.400 | Like it's a big advantage.
02:10:45.480 | And so then the question becomes,
02:10:47.060 | can you use it in our right way?
02:10:48.480 | And do you have the right software systems
02:10:50.780 | and hardware systems in order to solve the problem?
02:10:52.860 | And you're right that in the longterm,
02:10:55.980 | there's no reason to believe
02:10:57.400 | that pure camera systems can't solve the problem
02:11:00.100 | that humans obviously are solving with vision systems.
02:11:04.260 | - Question is when?
02:11:05.100 | - It's a risk.
02:11:06.100 | It's a big risk.
02:11:06.940 | So there's no argument that it's not a risk, right?
02:11:09.560 | And it's already such a hard problem.
02:11:12.100 | And so much of that problem, by the way,
02:11:13.720 | is even beyond the perception side,
02:11:17.060 | some of the hardest elements of the problem
02:11:18.440 | in our behavioral side and decision-making
02:11:20.440 | and the long tail safety case,
02:11:22.680 | if you are adding risk and complexity
02:11:25.640 | on the input side from perception,
02:11:27.600 | you're now making a really, really hard problem
02:11:29.980 | which on its own is still almost insurmountably hard,
02:11:33.860 | even harder.
02:11:34.680 | And so the question is just how much.
02:11:36.760 | And this is where you can easily get into a little bit
02:11:40.100 | of a kind of a trap where similar to how you,
02:11:43.520 | how do you evaluate how good an AV company's product is?
02:11:46.140 | Like you go and you do a trial,
02:11:48.600 | kind of a test run with them, a demo run,
02:11:50.140 | which they've kind of optimized like crazy and so forth.
02:11:52.600 | And like, and it feels good.
02:11:53.520 | Do you put any weight in that, right?
02:11:55.680 | You know that that gap is kind of like pretty large still.
02:11:59.520 | Same thing on the like perception case,
02:12:01.600 | like the long tail of computer vision
02:12:03.440 | is really, really hard.
02:12:04.980 | And there's a lot of ways that that can come up.
02:12:08.100 | And even if it doesn't happen that often at all,
02:12:10.400 | when you think about the safety bar
02:12:12.100 | and what it takes to actually go full driverless,
02:12:14.560 | not like incredible assistance driverless,
02:12:16.440 | but full driverless, that bar gets crazy high.
02:12:20.760 | And not only do you have to solve it on the behavioral side,
02:12:23.520 | but now you have to push computer vision
02:12:26.400 | beyond arguably where it's ever been pushed.
02:12:28.720 | And so you now on top of the broader AV challenge,
02:12:30.980 | you have a really hard perception challenge as well.
02:12:32.940 | - So there's perception, there's planning,
02:12:34.760 | there's human robot interaction.
02:12:36.300 | To me, what's fascinating about what Tesla is doing
02:12:40.100 | is in this march towards level four,
02:12:43.140 | because it's in the hands of so many humans,
02:12:45.720 | you get to see video, you get to see humans.
02:12:48.560 | I mean, forget companies, forget businesses.
02:12:52.020 | It's fascinating for humans to be interacting with robots.
02:12:55.560 | - That's incredible.
02:12:56.400 | And they're actually helping kind of push it forward.
02:12:57.940 | And that is valuable by the way,
02:12:59.640 | where even for us, a decent percentage of our data
02:13:02.680 | is human driving.
02:13:04.360 | We intentionally have humans drive higher percentage
02:13:07.320 | than you might expect,
02:13:08.160 | because that creates some of the best signals
02:13:09.940 | to train the autonomy.
02:13:11.240 | And so that is on its own value.
02:13:14.480 | - So together we're kind of learning about this problem
02:13:17.860 | in an applied sense, just like you had with Cosmo.
02:13:20.360 | When you're chasing an actual product
02:13:24.060 | that people are going to use, robot-based product
02:13:27.600 | that people are going to use,
02:13:29.160 | you have to contend with the reality
02:13:30.880 | of what it takes to build a robot
02:13:33.280 | that successfully perceives the world
02:13:34.720 | and operates in the world,
02:13:35.680 | and what it takes to have a robot
02:13:37.040 | that interacts with other humans in the world.
02:13:39.000 | And that's like, to me, one of the most interesting problems
02:13:42.080 | humans have ever undertaken,
02:13:43.760 | because you're in trying to create an intelligent agent
02:13:47.840 | that operates in a human world,
02:13:49.760 | you're also understanding the nature of intelligence itself.
02:13:54.080 | Like how hard is driving?
02:13:56.360 | It's still not answered to me.
02:13:59.680 | I still don't understand.
02:14:00.920 | - All the subtle cues, even little things
02:14:04.040 | like your interaction with a pedestrian
02:14:06.480 | where you look at each other and just go, okay, go.
02:14:09.200 | That's hard to do without a human driver, right?
02:14:12.000 | And you're missing that dimension.
02:14:13.440 | How do you communicate that?
02:14:14.560 | So there's really, really interesting
02:14:16.320 | kind of elements here.
02:14:18.080 | Now here's what's beautiful.
02:14:18.900 | Can you imagine that when autonomous driving is solved,
02:14:23.120 | how much of the technology foundation of that space
02:14:26.800 | can go and have tremendous, just transformative impacts
02:14:30.680 | on other problem areas and other spaces
02:14:33.440 | that have subsets of these same problems?
02:14:36.840 | It's just incredible to think about that.
02:14:38.360 | - It's both a pro and a con.
02:14:40.320 | With autonomous driving is so safety critical.
02:14:46.480 | Once you solve it, it's beautiful
02:14:49.680 | because there's so many applications
02:14:51.360 | that are a lot less safety critical.
02:14:53.480 | But it's also the con of that is it's so hard to solve.
02:14:58.120 | And the same journalists that you mentioned
02:14:59.960 | that get excited for a demo are the ones
02:15:02.280 | who write long articles about the failure of your company
02:15:07.280 | if there's one accident that's based on a robot.
02:15:12.600 | And it's just society's so tense
02:15:15.880 | and waiting for failure of robots.
02:15:17.880 | You're in such a high stake environment.
02:15:20.280 | Failure has such a high cost.
02:15:22.360 | - And it slows down development.
02:15:23.480 | - It slows down development.
02:15:24.880 | - Yeah, the team definitely noticed
02:15:26.520 | that once you go driverless,
02:15:28.080 | like we're driverless in Phoenix
02:15:29.480 | and you continue to iterate,
02:15:31.040 | your iteration pace slows down
02:15:33.240 | because your fear of regression forces so much more rigor
02:15:38.240 | that obviously you have to find a compromise on like,
02:15:43.000 | okay, well, how often do we release driverless builds?
02:15:45.360 | Because every time you release a driverless build,
02:15:46.760 | you have to go through this validation process,
02:15:48.840 | which is very expensive and so forth.
02:15:50.000 | So it is interesting.
02:15:51.880 | It's like, it is one of the hardest things.
02:15:53.960 | There's no other industry where like,
02:15:55.800 | you would not like,
02:15:56.920 | you wouldn't release the products way, way quicker
02:15:59.000 | when you start to kind of provide
02:16:00.560 | even portions of the value that you provide.
02:16:03.120 | Healthcare maybe is the other one.
02:16:04.440 | - Health, that's right.
02:16:05.280 | - But at the same time, right?
02:16:06.360 | Like we've gotten there where you think of like surgery,
02:16:08.600 | right?
02:16:09.440 | Like you have surgery, there's always a risk,
02:16:11.280 | but like, it's really, really bounded.
02:16:13.720 | You know that there's an accident rate
02:16:14.960 | when you go out and drive your car today, right?
02:16:16.480 | Like, and you know what the fatality rate
02:16:18.800 | in the US is per year.
02:16:20.160 | We're not banning driving because there was a car accident,
02:16:23.240 | but the bar for us is way higher.
02:16:24.800 | And we hold ourselves very serious to it,
02:16:26.440 | where you have to not only be better than a human,
02:16:29.440 | but you probably have to like at scale,
02:16:31.440 | be far better than a human by a big margin.
02:16:33.720 | And you have to be able to like really,
02:16:35.560 | really thoughtfully explain all of the ways
02:16:38.640 | that we validate that becomes very comfortable
02:16:41.160 | for humans to understand,
02:16:42.680 | because a bunch of jargon that we use internally
02:16:44.480 | just doesn't compute at the end of the day.
02:16:46.960 | We have to be able to explain to society,
02:16:49.120 | how do we quantify the risk and acknowledge
02:16:51.880 | that there is some non-zero risk,
02:16:54.120 | but it's far above a human, you know, relative safety.
02:16:57.120 | - See, here's the thing to push back a little bit
02:17:00.440 | and bring Cosmo back in the conversation.
02:17:03.120 | You said something quite brilliant
02:17:04.280 | at the beginning of this conversation,
02:17:05.360 | I think probably applies for autonomous driving,
02:17:08.240 | which is, you know, there's this desire
02:17:10.800 | to make autonomous cars more safer
02:17:12.720 | than human driven cars.
02:17:14.680 | But if you create a product that's really compelling
02:17:18.360 | and is able to explain both the leadership
02:17:20.880 | and the engineers and the product itself
02:17:23.960 | can communicate intent,
02:17:26.640 | then I think people may be able to be willing
02:17:29.480 | to put up with the thing that might be even riskier
02:17:32.040 | than humans, because they understand
02:17:35.560 | the value of taking risks.
02:17:36.880 | You mentioned the speed limit.
02:17:38.520 | Humans understand the value of going over the speed limit.
02:17:41.480 | Humans understand the value of like going fast
02:17:46.200 | through a yellow light.
02:17:48.640 | To take, and when you're in Manhattan streets,
02:17:50.880 | pushing through, crossing pedestrians,
02:17:53.960 | they understand that.
02:17:55.000 | I mean, this is a much more tense topic of discussion,
02:17:57.640 | so this is just me talking.
02:17:59.320 | So in Cosmo's case, there was something about the way
02:18:03.520 | this particular robot communicated,
02:18:05.360 | the energy it brought, the intent it was able
02:18:07.260 | to communicate to the humans,
02:18:09.120 | that you understood that of course
02:18:11.600 | it needs to have a camera.
02:18:13.360 | Of course it needs to have this information.
02:18:15.280 | And in that same way, to me,
02:18:17.640 | of course a car needs to take risks.
02:18:20.040 | Of course there's going to be accidents.
02:18:22.740 | If you want a car that never has an accident,
02:18:28.600 | have a car that just doesn't go anywhere.
02:18:30.640 | But that's tricky, because that's not a robotics problem.
02:18:37.040 | - And many accidents are not even due to you, right?
02:18:40.120 | Obviously it's, so there's a big difference though.
02:18:43.140 | That's not a personal decision.
02:18:46.800 | You're also impacting obviously kind of the rest
02:18:49.240 | of the road, and we're facilitating it, right?
02:18:52.240 | And so there's a higher kind of ethical and moral bar,
02:18:56.720 | which obviously then translates into, as a society,
02:19:00.360 | and from a regulatory standpoint,
02:19:01.480 | kind of like what comes out of it,
02:19:02.960 | where it's hard for us to ever see this even being a debate
02:19:07.960 | in the sense that like, you have to be beyond reproach
02:19:12.960 | from a safety standpoint,
02:19:14.040 | because if you're wrong about this,
02:19:15.160 | you could set the entire field back a decade, right?
02:19:17.200 | - See, this is me speaking.
02:19:19.760 | I think if we look into the future, there will be,
02:19:23.880 | I personally believe, this is me speaking,
02:19:28.100 | that there will be less and less focus on safety.
02:19:30.760 | That's still very, very high.
02:19:32.720 | - Yeah, meaning like after autonomy
02:19:34.480 | is very common and accepted.
02:19:36.040 | - But not so common that it's everywhere,
02:19:38.760 | but there has to be a transition,
02:19:40.400 | because I think for innovation, just like you were saying,
02:19:44.480 | to explore ideas, you have to take risks.
02:19:46.820 | And I think if autonomy in the near term
02:19:50.760 | is to become prevalent in society,
02:19:53.440 | I think people need to be more willing
02:19:55.900 | to understand the nature of risk, the value of risk.
02:20:00.500 | It's very difficult, you're right, of course, with driving,
02:20:04.260 | but that's the fascinating nature of it.
02:20:06.500 | It's a life and death situation
02:20:10.840 | that brings value to millions of people.
02:20:13.300 | So you have to figure out what do we value about this world?
02:20:16.940 | How much do we value,
02:20:18.900 | how deeply do we want to avoid hurting other humans?
02:20:23.620 | - That's right.
02:20:24.460 | And there is a point where like,
02:20:26.580 | you can imagine a scenario where Waymo has a system
02:20:29.580 | that is even when it's like kind of beyond
02:20:33.540 | human relative safety and provably statistically
02:20:38.500 | will save lives, there is a thoughtful navigation
02:20:43.300 | of that fact versus just kind of society readiness
02:20:48.300 | and perception and education of society and regulators
02:20:56.380 | and everything else where like, it's multi-dimensional
02:21:01.020 | and it's not a purely logical argument,
02:21:04.180 | but ironically, the logic can actually help
02:21:07.100 | with the emotions.
02:21:08.320 | And just like any technology, there's early adopters
02:21:11.860 | and then there's kind of like a curve
02:21:13.060 | that happens after it.
02:21:14.740 | - But eventually celebrities, you get the rock
02:21:16.900 | in a Waymo vehicle and then everybody just comes.
02:21:19.220 | - And then everybody's comes down
02:21:20.180 | 'cause the rock likes it, yeah.
02:21:21.580 | (laughing)
02:21:23.260 | - If you post the-
02:21:24.620 | - Yeah, and it's like, it's an open question
02:21:26.540 | on how this plays out.
02:21:27.380 | I mean, maybe we're pleasantly surprised
02:21:28.620 | and it just like, people just realize
02:21:30.100 | that this is such a enabler of life
02:21:32.580 | and like efficiency and cost and everything
02:21:35.220 | that there's a pull, like at some point
02:21:38.100 | I should fully believe that this will go
02:21:39.900 | from a thoughtful kind of movement and tiptoeing
02:21:44.860 | and like kind of like a push to society realizes
02:21:47.520 | how wonderful of an enabler this could become
02:21:50.460 | and it becomes more of a pull
02:21:51.600 | and hard to know exactly how that'll play out,
02:21:53.820 | but at the end of the day,
02:21:54.660 | like both the goods transportation
02:21:56.980 | and the people transportation side of it
02:21:58.340 | has that property where it's not easy.
02:22:00.420 | There's a lot of open questions and challenges to navigate
02:22:02.980 | and there's obviously the technical problems to solve
02:22:05.820 | as a kind of prerequisite,
02:22:07.380 | but they have such an opportunity that is on a scale
02:22:12.380 | that very few industries in the last 20, 30 years
02:22:15.460 | have even had a chance to tackle
02:22:17.220 | that I maybe were pleasantly surprised
02:22:20.500 | by how much that tipping point,
02:22:22.780 | like in a really short amount of time
02:22:24.220 | actually turns into a societal pull
02:22:26.480 | to kind of embrace the benefits of this.
02:22:27.860 | - Yeah, I hope so.
02:22:29.100 | It seems like in the recent few decades,
02:22:31.180 | there's been tipping points for technologies
02:22:32.920 | where like overnight things change.
02:22:34.980 | It's like from taxis to ride sharing services,
02:22:39.060 | all that shift.
02:22:40.180 | I mean, there's just shift after shift after shift
02:22:42.900 | that requires digitization and technology.
02:22:45.580 | I hope we're pleasantly surprised in this.
02:22:47.380 | So there's millions of long haul trucks
02:22:49.820 | now in the United States.
02:22:51.720 | Do you see a future where there's millions
02:22:55.420 | of way more trucks and maybe just broadly speaking,
02:22:59.000 | way more vehicles, just like ants running around
02:23:03.380 | the United States, freeways and local roads?
02:23:07.780 | - Yeah, and other countries too.
02:23:09.860 | You look back decades from now
02:23:11.660 | and it might be one of those things
02:23:13.940 | that just feels so natural
02:23:15.220 | and then it becomes almost like this kind of interesting
02:23:17.900 | kind of oddity that we had none of it,
02:23:19.500 | like kind of decades earlier.
02:23:22.400 | And it'll take a long time to grow in scale.
02:23:25.560 | Very different challenges appear at every stage.
02:23:28.100 | But over time, like this is one of the most enabling
02:23:31.520 | technologies that we have in the world today.
02:23:34.940 | It'll feel like, how was the world before the internet?
02:23:39.040 | How was the world before mobile phones?
02:23:40.280 | Like it's gonna have that sort of a feeling
02:23:41.480 | to it on both sides.
02:23:42.400 | - It's hard to predict the future,
02:23:43.640 | but do you sometimes think about weird ways
02:23:48.320 | it might change the world?
02:23:49.200 | Like surprising ways.
02:23:50.680 | So obviously there's more direct ways
02:23:53.320 | where like there's increases efficiency.
02:23:55.760 | It will enable a lot of kind of logistics,
02:23:58.120 | optimizations kind of things.
02:23:59.740 | It will change probably our roadways
02:24:05.800 | and all that kind of stuff.
02:24:06.920 | But it could also change society
02:24:09.400 | in some kind of interesting ways.
02:24:11.720 | Do you ever think about how it might change cities?
02:24:13.400 | How it might change our lives?
02:24:14.680 | All that kind of stuff.
02:24:16.080 | - You can imagine city where people live versus work
02:24:19.080 | becoming more distributed because the pain of commuting
02:24:21.520 | becomes different, just easier.
02:24:23.480 | There's a lot of options that open up.
02:24:26.300 | The layout of cities themselves
02:24:28.080 | and how you think about car storage and parking
02:24:31.680 | obviously just enables a completely different type
02:24:35.480 | of experience in urban environments.
02:24:39.160 | I think there was like a statistic that something like
02:24:43.080 | 30% of the traffic in cities during rush hour
02:24:46.760 | is caused by pursuit of parking
02:24:49.320 | or something like some really high stats.
02:24:50.960 | So those obviously kind of open up a lot of options.
02:24:53.920 | Flexibility on goods will enable new industries
02:24:58.640 | and businesses that never existed before
02:25:00.200 | because now the efficiency becomes more palatable.
02:25:03.840 | Good delivery, timing, consistency
02:25:05.640 | and flexibility is gonna change.
02:25:07.720 | The way we distribute the logistics network will change.
02:25:11.000 | The way we then can integrate with warehousing,
02:25:13.480 | with shipping ports.
02:25:16.200 | You can start to think about greater automation
02:25:18.920 | through the whole kind of stack
02:25:21.200 | and how that supply chain, the ripples become much more agile
02:25:25.880 | versus like very grindy the way they are today
02:25:30.520 | where just the adaptation is like very tough
02:25:32.800 | and there's like a lot of constraints that we have.
02:25:34.960 | I think it'll be great for the environment.
02:25:36.320 | It'll be great for safety where like probably
02:25:38.960 | about 95% of accidents today statistically
02:25:42.960 | are due to just attention or things that are preventable
02:25:46.760 | with the strengths of automation.
02:25:49.800 | Yeah, and it'll be one of those things
02:25:51.080 | where like industries will shift,
02:25:53.320 | but the net creation is gonna be massively positive.
02:25:56.240 | And then we just have to be thoughtful
02:25:57.600 | about the negative implications that will happen
02:25:59.960 | in local places and adjust for those.
02:26:03.080 | But I'm an optimist in general for the technology
02:26:05.200 | where you could argue a negative on any new technology,
02:26:07.280 | but you start to kind of see that
02:26:10.520 | if there is a big demand for something like this,
02:26:12.760 | that in almost all cases,
02:26:14.320 | that like it's an enabling factor
02:26:16.040 | that's gonna kind of propagate through society.
02:26:19.880 | And particularly as life expectancies get longer
02:26:22.720 | and so forth, like there's just a lot more need
02:26:26.080 | for a greater percentage of the population
02:26:28.840 | to kind of just be serviced with a high level of efficiency
02:26:32.280 | because otherwise we're gonna have a really hard time
02:26:33.760 | kind of scaling to what's ahead in the next 50 years.
02:26:36.720 | - Yeah, and you're absolutely right.
02:26:37.960 | Every technology has negative consequences
02:26:41.120 | and positive consequences.
02:26:42.680 | And we tend to focus on the negative a little bit too much.
02:26:45.640 | In fact, autonomous trucks are often brought up
02:26:49.840 | as an example of artificial intelligence
02:26:54.080 | and robots in general taking our jobs.
02:26:57.320 | And as we've talked about briefly here,
02:26:59.160 | we talk a lot with Steve, you know,
02:27:01.120 | that it is a concern that automation
02:27:05.720 | will take away certain jobs, it'll create other jobs.
02:27:08.800 | So there's temporary pain, hopefully temporary,
02:27:12.160 | but pain is pain and people suffer.
02:27:14.880 | And that human suffering is really important
02:27:16.640 | to think about how, but trucking is,
02:27:21.080 | I mean, there's a lot written on this,
02:27:22.840 | is I would say far from the thing
02:27:25.840 | that will cause the most pain.
02:27:28.640 | - Yeah, there's even more positive properties
02:27:30.280 | about trucking where not only is there just a huge shortage
02:27:33.120 | which is gonna increase,
02:27:34.440 | the average age of truck drivers is getting closer to 50
02:27:36.880 | because the younger people aren't wanting to come into it.
02:27:39.000 | They're trying to like incentivize, lower the age limit,
02:27:42.560 | like all these sorts of things.
02:27:44.600 | And the demand is just gonna increase.
02:27:46.240 | And the least favorable, like, I mean, it depends
02:27:48.520 | on the person, but in most cases,
02:27:49.720 | the least favorable types of routes
02:27:51.480 | are the massive long haul routes
02:27:53.120 | where you're on the road away from your family
02:27:54.680 | 300 plus days a year.
02:27:56.080 | - Steve's talked about the pain of those kinds of routes
02:27:58.760 | from a family perspective.
02:28:00.200 | You're basically away from family.
02:28:03.200 | It's not just hours, you work insane hours,
02:28:06.040 | but it's also just time away from family.
02:28:08.840 | - Obesity rate is through the roof
02:28:10.360 | because you're just sitting all day.
02:28:11.600 | Like it's really, really tough.
02:28:13.760 | And that's also where like the biggest kind of safety risk
02:28:17.000 | is because of fatigue.
02:28:17.920 | And so when you think of the gradual evolution
02:28:21.360 | of how trucking comes in, first of all, it's not overnight.
02:28:23.840 | It's gonna take decades to kind of phase in all the,
02:28:26.360 | like there's just a long, long, long road ahead,
02:28:29.200 | but the routes and the portions of trucking
02:28:33.000 | that are gonna require humans the longest
02:28:35.440 | and benefit the most from humans are the short haul
02:28:38.120 | and most complicated kind of more urban routes,
02:28:40.640 | which are also the more pleasant ones,
02:28:42.520 | which are less continual driving time,
02:28:46.240 | more flexibility on like geography and location.
02:28:51.120 | And you get to kind of sleep at your own home.
02:28:54.680 | - And very importantly, if you optimize the logistics,
02:28:56.960 | you're going to use humans much better
02:29:01.960 | and thereby pay them much better.
02:29:04.960 | Because like one of the biggest problems
02:29:06.720 | is truck drivers currently are paid
02:29:09.240 | by like how much they drive.
02:29:11.240 | So they really feel the pain of inefficient logistics.
02:29:15.160 | Because like if they're just sitting around for hours,
02:29:17.640 | which they often do not driving, waiting,
02:29:20.280 | they're not getting paid for that time.
02:29:21.800 | - That's right.
02:29:23.000 | - So like logistics has a significant impact
02:29:25.400 | on the quality of life of a truck driver.
02:29:27.040 | - And a high percentage of trucks are like empty
02:29:29.360 | because of inefficiencies in the system.
02:29:31.720 | Yeah, it's one of those things where like,
02:29:33.960 | and the other thing is when you increase the efficiency
02:29:35.400 | of a system like this,
02:29:36.720 | the overall net like volume of the system
02:29:39.560 | tends to increase, right?
02:29:40.480 | Like the entire market cap of trucking is going to go up
02:29:44.440 | when the efficiency improves
02:29:46.640 | and facilitates both growth in industries
02:29:48.960 | and better utilization of trucking.
02:29:50.760 | And so that on its own just creates more and more demand,
02:29:53.440 | which of all the places where AI comes in
02:29:57.040 | and starts to really kind of reshape an industry,
02:30:01.920 | this is one of those where like,
02:30:03.200 | there's just a lot of positives that for at least any time
02:30:06.240 | in the foreseeable future seem really lined up
02:30:08.120 | in a good way to kind of come in and help with the shortage
02:30:13.120 | and start to kind of optimize for the routes
02:30:16.280 | that are most dangerous and most painful.
02:30:18.640 | - Yeah, so this is true for trucking,
02:30:21.600 | but if we zoom out broader,
02:30:23.440 | automation and AI does technology broadly, I would say,
02:30:27.080 | but automation is a thing that has a potential
02:30:31.480 | in the next couple of decades to shift
02:30:33.600 | the kind of jobs available to humans.
02:30:36.640 | And so that results in, like I said, human suffering
02:30:41.080 | because people lose their jobs, there's economic pain there.
02:30:44.000 | And there's also a pain of meaning.
02:30:46.640 | So for a lot of people, work is a source of meaning,
02:30:51.640 | it's a source of identity, of pride,
02:30:56.560 | pride in getting good at the job,
02:31:00.720 | pride in craftsmanship and excellence,
02:31:03.320 | which is what truck drivers talk about.
02:31:05.920 | But this is true for a lot of jobs.
02:31:08.280 | And is that something you think about as a sort of a robotic
02:31:11.720 | is zooming out from the trucking thing,
02:31:13.680 | like where do you think it would be harder
02:31:18.600 | to find activity and work that's a source of identity,
02:31:22.560 | a source of meaning in the future?
02:31:24.920 | - I do think about it because you want to make sure
02:31:27.040 | that you worry about the entire system,
02:31:29.960 | like not just like the part of how it plays in it,
02:31:33.000 | but what are the ripple effects of it down the road.
02:31:35.120 | And on enough of a time window,
02:31:37.520 | there's a lot of opportunity to put in the right policies,
02:31:39.960 | the right opportunities to kind of reshape and retrain
02:31:42.680 | and find those openings.
02:31:44.080 | And so just to give you a few examples,
02:31:45.520 | both trucking and cars, we have remote assistance facilities
02:31:50.520 | that are there to interface with customers
02:31:53.880 | and monitor vehicles and provide like very focused
02:31:57.880 | kind of assistance on kind of areas where the vehicle
02:32:01.120 | may want to request help in understanding an environment.
02:32:04.120 | So those are jobs that kind of get created and supported.
02:32:07.280 | I remember like taking a tour of one of the Amazon facilities
02:32:10.200 | where you've probably seen the Kiva Systems robots,
02:32:13.320 | where you have these orange robots that have automated
02:32:16.120 | the warehouse, like kind of picking and collecting of items.
02:32:19.760 | And it's like really elegant and beautiful way.
02:32:22.560 | It's actually one of my favorite applications
02:32:23.840 | of robotics of all time.
02:32:26.800 | Like I think it kind of came across that company
02:32:28.400 | like 2006, it was just amazing.
02:32:30.360 | And what was the--
02:32:31.840 | - Warehouse robots that transport little things.
02:32:33.600 | - So basically instead of a person going and walking around
02:32:36.120 | and picking the seven items in your order,
02:32:38.920 | these robots go and pick up a shelf and move it over
02:32:42.560 | in a row where like the seven shelves that contain
02:32:44.600 | the seven items are lined up in a laser or whatever points
02:32:48.000 | to what you need to get.
02:32:48.840 | And you go and pick it and you place it to fill the order.
02:32:50.880 | And so the people are fulfilling the final orders.
02:32:53.400 | What was interesting about that is that
02:32:55.600 | when I was asking them about like kind of the impact
02:32:57.240 | on labor, when they transitioned that warehouse,
02:32:59.320 | the throughput increased so much that the jobs shifted
02:33:02.840 | towards the final fulfillment,
02:33:05.160 | even though the robots took over entirely
02:33:07.280 | the search of the items themselves and the labor,
02:33:10.920 | the job stayed like nobody,
02:33:12.760 | like there was actually the same amount of jobs
02:33:15.280 | roughly that were necessary,
02:33:16.480 | but the throughput increased by like,
02:33:17.680 | I think over two X or some amount, right?
02:33:19.640 | Like, so you have these situations that are not zero sum
02:33:22.920 | games in this really interesting way.
02:33:24.200 | And the optimist to me thinks that there's these types
02:33:26.320 | of solutions in almost any industry where the growth
02:33:29.200 | that's enabled creates opportunities
02:33:31.080 | that you can then leverage,
02:33:32.360 | but you gotta be intentional about finding those
02:33:34.520 | and really helping make those links.
02:33:36.480 | Because even if you make the argument that like,
02:33:39.440 | there's a net positive,
02:33:41.240 | locally there's always tough hits
02:33:43.280 | that you gotta be very careful about.
02:33:44.560 | - That's right.
02:33:45.400 | You have to have an understanding of that link
02:33:47.400 | because there's a short period of time,
02:33:50.240 | whether training is acquired or just mental transition
02:33:53.560 | or physical or whatever is acquired,
02:33:55.880 | that's still gonna be short-term pain.
02:33:58.040 | The uncertainty of it, there's families involved.
02:34:00.920 | I mean, it's exceptionally,
02:34:05.480 | it's difficult on a human level
02:34:07.280 | and you have to really think about that.
02:34:09.800 | You can't just look at economic metrics always,
02:34:12.000 | it's human beings.
02:34:13.080 | - That's right.
02:34:13.920 | And you can't even just take it as like, okay,
02:34:16.000 | well, we need to like subsidize or whatever,
02:34:17.760 | because like there is an element of just personal pride
02:34:20.480 | where majority of people don't wanna just be okay,
02:34:24.760 | but like they wanna actually like have a craft,
02:34:26.520 | like you said, and have a mission
02:34:28.280 | and feel like they're having a really positive impact.
02:34:30.800 | And so my personal belief is that
02:34:33.440 | there's a lot of transferability and skillset
02:34:36.760 | that is possible, especially if you create a bridge
02:34:39.880 | and an investment to enable it.
02:34:43.320 | And to some degree, that's our responsibility as well,
02:34:46.520 | this process.
02:34:48.240 | - You mentioned Kiva Robots, Amazon.
02:34:51.360 | Let me ask you about the Astro Robot,
02:34:54.760 | which is, I don't know if you've seen it.
02:34:56.520 | Amazon has announced that it's a home robot
02:35:01.520 | that they have, the screen looks awfully a lot like Cosmo,
02:35:06.000 | has I think different vision probably.
02:35:10.000 | What are your thoughts about like home robotics
02:35:12.120 | in this kind of space?
02:35:13.120 | There's been quite a bunch of home robots,
02:35:15.920 | social robots that very unfortunately
02:35:18.680 | have closed their doors for various reasons,
02:35:23.160 | perhaps they were too expensive,
02:35:24.320 | there's been manufacturing challenges,
02:35:25.880 | all that kind of stuff.
02:35:27.120 | What are your thoughts about Amazon getting into this space?
02:35:30.120 | - Yeah, we had some signs that they're getting into it
02:35:32.560 | like long, long, long ago.
02:35:34.040 | Maybe they were a little bit too interested in Cosmo
02:35:37.000 | and I drew in our conversations,
02:35:38.840 | but they're also very good partners actually for us
02:35:41.320 | as we kind of just integrated a lot of shared technology.
02:35:43.560 | - If I could also get your thoughts on,
02:35:47.320 | you could think of Alexa as a robot as well, Echo.
02:35:52.200 | Do you see those as fundamentally different
02:35:55.360 | just because you can move and look around,
02:35:57.760 | is that fundamentally different
02:35:58.840 | than the thing that just sits in place?
02:36:00.840 | - It opens up options,
02:36:02.600 | but my first reaction is I think like,
02:36:05.920 | I have my doubts that this one's gonna hit the mark
02:36:08.880 | because I think for the price point that it's at
02:36:11.480 | and the kind of functionality and value propositions
02:36:13.680 | that they're trying to put out,
02:36:15.360 | it's still searching for the kill application
02:36:18.480 | that justifies, I think it was like a $1,500 price point
02:36:21.640 | or kind of somewhere on there.
02:36:23.280 | That's a really high bar.
02:36:24.520 | So there's enthusiasts and early adopters
02:36:27.280 | will obviously kind of pursue it,
02:36:28.520 | but you have to really, really hit a high mark
02:36:31.600 | at that price point, which we always tried to,
02:36:33.640 | we were always very cautious about jumping too quickly
02:36:35.760 | to the more advanced systems that we really wanted to make,
02:36:38.680 | but would have raised the bar so much
02:36:41.400 | you have to be able to hit it
02:36:42.880 | in today's cost structures and technologies.
02:36:45.520 | The mobility is an angle that hasn't been utilized,
02:36:49.520 | but it has to be utilized in the right way.
02:36:52.800 | And so that's gonna be the biggest challenge
02:36:54.040 | is like, can you meet the bar
02:36:55.840 | of what the mass market consumer,
02:36:58.600 | like think like our neighbors, our friends, parents,
02:37:03.600 | like would they find a deep, deep value
02:37:05.760 | like in this at a mass scale
02:37:08.880 | that justifies the price point?
02:37:10.440 | I think that's in the end,
02:37:11.320 | one of the biggest challenges for robotics,
02:37:13.120 | especially consumer robotics,
02:37:15.120 | where you have to kind of meet that bar,
02:37:17.840 | it becomes very, very hard.
02:37:20.000 | - And there's also the higher bar,
02:37:22.200 | just like you were saying with Cosmo of,
02:37:24.760 | you know, a thing that can look one way
02:37:27.680 | and then turn around and look at you,
02:37:29.520 | that's either a super desirable quality
02:37:33.320 | or super undesirable quality,
02:37:35.760 | depending on how much you trust the thing.
02:37:37.560 | - That's right.
02:37:38.400 | So there's a problem of trust to solve there.
02:37:41.440 | There's a problem of personality.
02:37:42.680 | It's the thing that is the quote unquote problem
02:37:44.880 | that Cosmo solved so well,
02:37:46.720 | is that you trust the thing.
02:37:49.160 | And that has to do with the company,
02:37:50.520 | with the leadership,
02:37:51.480 | with the intent that's communicated by the device
02:37:54.160 | and the company and everything together.
02:37:56.200 | - Yeah, exactly right.
02:37:57.560 | And so, and I think they also have to retrace
02:38:00.520 | some of the like warnings on the character side
02:38:02.600 | where like, as usual,
02:38:04.000 | I think that's the place where it's a,
02:38:06.360 | a lot of companies are great at the hardware side of it
02:38:08.480 | and can, you know,
02:38:09.320 | think about those elements.
02:38:10.160 | And then there's like, you know,
02:38:11.240 | the thinking about the AI challenges,
02:38:12.880 | particularly with the advantage of Alexa
02:38:14.400 | is a pretty huge boost for them.
02:38:16.320 | The character side of it for technology companies
02:38:18.760 | is pretty new, novel territory.
02:38:20.600 | And so that will take some iterations,
02:38:23.280 | but yeah, I mean, I hope,
02:38:25.600 | I hope there's continued progress in this space
02:38:27.240 | and that thread doesn't kind of go dormant for too long.
02:38:30.400 | And it's not, you know,
02:38:31.560 | it's gonna take a while to kind of evolve
02:38:34.320 | into like the ideal applications,
02:38:36.040 | but you know, this is one of Amazon's,
02:38:39.600 | I guess like you could call it,
02:38:41.040 | it's definitely like part of their DNA,
02:38:42.920 | but in many cases is also strength
02:38:44.680 | where they're very willing to like iterate
02:38:46.960 | kind of aggressively and move quickly.
02:38:50.240 | - Not take risks.
02:38:51.200 | - And take risks.
02:38:52.040 | - You have deep pockets so you can kind of.
02:38:53.080 | - Yeah, and then maybe have more misfires
02:38:55.280 | than an apple would,
02:38:56.600 | but you know, it's different styles
02:38:58.760 | and different approaches.
02:38:59.600 | And you know, at the end of the day,
02:39:02.160 | it's like there's a few familiar kind of elements there
02:39:04.880 | for sure, which was, you know, kind of.
02:39:07.920 | - Homage.
02:39:08.760 | Is one way to put it.
02:39:12.120 | So why is it so hard at a high level
02:39:17.120 | to build a robotics company,
02:39:19.840 | a robotics company that lives for a long time?
02:39:23.080 | So if you look at,
02:39:25.040 | so I thought Cosmo for sure would live for a very long time.
02:39:29.280 | That to me was exceptionally successful vision
02:39:31.600 | and idea and implementation.
02:39:34.440 | iRobot is an example of a company
02:39:37.000 | that has pivoted in all the right ways to survive
02:39:41.600 | and arguably thrive by focusing on the,
02:39:46.400 | having like a, have a driver that constantly provides profit,
02:39:51.120 | which is the vacuum cleaner.
02:39:52.640 | And of course there's like Amazon,
02:39:54.920 | what they're doing is they're almost like taking risks
02:39:58.360 | so they can afford it
02:39:59.200 | because they have other sources of revenue, right?
02:40:02.560 | But outside of those examples,
02:40:05.480 | most robotics companies fail.
02:40:07.720 | - Yeah.
02:40:08.560 | - Why do they fail?
02:40:10.160 | Why is it so hard to run a robotics company?
02:40:12.600 | - iRobot's impressive because they found
02:40:14.760 | a really, really great fit of where the technology
02:40:18.240 | could satisfy a really clear use case and need.
02:40:21.280 | And they did it well,
02:40:23.320 | and they didn't try to overshoot
02:40:25.520 | from a cost to benefit standpoint.
02:40:28.000 | Robotics is hard because it like tends to be more expensive.
02:40:31.920 | It combines way more technologies
02:40:33.480 | than a lot of other types of companies do.
02:40:35.760 | If I were to like say one thing
02:40:37.120 | that is maybe the biggest risk
02:40:39.320 | in like a robotics company failing
02:40:41.000 | is that it can be either a technology
02:40:44.240 | in search of a application,
02:40:46.320 | or they try to bite off a kind of an offering
02:40:51.320 | that has a mismatch in kind of price to function.
02:40:56.040 | And just the mass market appeal isn't there.
02:40:59.280 | And consumer products are just hard.
02:41:01.920 | It's just, I mean, after all the years in it,
02:41:04.200 | like definitely kind of feel a lot of the battle scars
02:41:07.400 | because you have, you know,
02:41:09.360 | you not only do you have to like hit the function,
02:41:10.920 | but you have to educate and explain, get awareness up,
02:41:13.640 | deal with different types of consumers.
02:41:15.400 | Like, you know, there's a reason why a lot of technologies
02:41:18.880 | sometimes start in the enterprise space
02:41:20.720 | and then kind of continue forward in the consumer space.
02:41:22.960 | Even like, you know, you see AR like starting
02:41:25.920 | to kind of make that shift with HoloLens
02:41:27.600 | and so forth in some ways,
02:41:29.360 | consumers and price points that they're willing
02:41:31.440 | to kind of be attracted in a mass market way.
02:41:34.280 | And I don't mean like, you know,
02:41:35.640 | 10,000 enthusiasts bought it,
02:41:36.920 | but I mean like, you know, 2 million, 10 million,
02:41:40.720 | 50 million, like mass market kind of interest,
02:41:44.640 | you know, have bought it.
02:41:46.480 | That bar is very, very high.
02:41:47.920 | And typically robotics is novel enough
02:41:50.440 | and non-standardized enough
02:41:51.520 | to where it pushes on price points so much
02:41:53.920 | that you can easily get out of range
02:41:55.120 | where the capabilities in today's technology
02:41:57.920 | or just the function that was picked just doesn't line up.
02:42:00.600 | And so that product market fit is very important.
02:42:03.360 | - So the space of killer apps
02:42:05.000 | or rather super compelling apps is much smaller
02:42:08.840 | because it's easy to get outside of the price range.
02:42:11.840 | - Yeah. - For most consumers.
02:42:13.240 | - And it's not constant, right?
02:42:14.400 | Like, that's why like we picked off entertainment
02:42:17.080 | because the quality was just so low
02:42:19.720 | in physical entertainment
02:42:20.880 | that we felt we could leapfrog that
02:42:23.480 | and still create a really compelling offering
02:42:25.400 | at a price point that was defensible.
02:42:26.880 | And we, like that proved out to be true.
02:42:29.040 | And over time, that same opportunity opens up in healthcare,
02:42:34.560 | in home applications, in commercial applications
02:42:38.920 | and kind of broader, more generalized interface.
02:42:41.520 | But there's missing pieces in order for that to happen.
02:42:44.640 | And all of those have to be present for it to line up.
02:42:47.200 | And we see these sort of trends in technology where,
02:42:50.240 | you know, kind of technologies that start in one place
02:42:53.360 | evolve and kind of grow to another.
02:42:55.200 | Some things start in gaming,
02:42:56.640 | some things start in space or aerospace
02:43:01.120 | and then kind of move into the consumer market.
02:43:03.520 | And sometimes it's just a timing thing, right?
02:43:05.240 | Where how many stabs at what became the iPhone
02:43:09.200 | were there over the 20 years before
02:43:11.080 | that just weren't quite ready in the function
02:43:13.840 | relative to the kind of price point and complexity.
02:43:16.800 | - And sometimes it's a small detail of the implementation
02:43:19.600 | that makes all the difference,
02:43:20.640 | which is design is so important.
02:43:23.880 | - Something, yeah, like the new generation UX, right?
02:43:26.880 | - Yeah.
02:43:27.720 | - And that's, it's tough.
02:43:31.200 | And oftentimes all of them have to be there
02:43:32.720 | and it has to be like a perfect storm.
02:43:34.200 | And, but yeah, history repeats itself in a lot of ways
02:43:37.320 | in a lot of these trends, which is pretty fascinating.
02:43:39.960 | - Well, let me ask you about the humanoid form.
02:43:41.840 | What do you think about the Tesla bot
02:43:43.440 | and humanoid robotics in general?
02:43:45.320 | So obviously to me, autonomous driving,
02:43:49.360 | Waymo and the other companies working in the space,
02:43:52.080 | that seems to be a great place to invest
02:43:55.120 | in potential revolutionary application of robotics,
02:43:57.800 | application, folks, application.
02:44:00.380 | What's the role of humanoid robotics?
02:44:02.640 | Do you think Tesla bot is ridiculous?
02:44:05.800 | Do you think it's super promising?
02:44:08.000 | Do you think it's interesting, full of mystery?
02:44:10.080 | Nobody knows.
02:44:10.920 | What do you think about this thing?
02:44:12.080 | - Yeah, I think today humanoid form robotics is research.
02:44:16.360 | There's very few situations
02:44:17.600 | where you actually need a humanoid form
02:44:18.960 | to solve a problem.
02:44:20.440 | If you think about it, right,
02:44:21.640 | like wheels are more efficient than legs.
02:44:23.720 | There's joints and degrees of freedom
02:44:26.960 | beyond a certain point,
02:44:27.880 | just add a lot of complexity and cost, right?
02:44:29.680 | So if you're doing a humanoid robot,
02:44:31.520 | oftentimes it's in the pursuit of a humanoid robot,
02:44:33.440 | not in the pursuit of an application for the time being.
02:44:36.960 | Especially when you have like kind of the gaps in interface
02:44:39.200 | and, you know, kind of AI that we kind of talk about today.
02:44:42.240 | So anything Elon does, I'm interested in following.
02:44:45.280 | So there's an element of that world.
02:44:46.760 | - No matter how crazy.
02:44:47.800 | - How crazy it is.
02:44:48.640 | It's like, you know, I'll pay attention
02:44:49.880 | and I'm curious to see what comes out of it.
02:44:51.240 | So it's like, you can't ever, you know, ignore it.
02:44:54.360 | But, you know, it's definitely far afield
02:44:56.480 | from their kind of core business, obviously.
02:44:59.960 | - What was interesting to me,
02:45:01.960 | is I've disagreed with Elon a lot about this,
02:45:06.160 | is to me, the compelling aspect of the humanoid form
02:45:12.040 | and a lot of kind of robots, Cosmo, for example,
02:45:15.680 | is a human robot interaction part.
02:45:18.280 | From Elon Musk's perspective,
02:45:23.240 | Tesla bot has nothing to do with the human.
02:45:25.600 | It's a form that's effective for the factory
02:45:28.840 | because the factory is designed for humans.
02:45:31.660 | But to me, the reason you might want to argue
02:45:34.080 | for the humanoid form is because, you know, at a party,
02:45:38.200 | it's a nice way to fit into the party.
02:45:41.400 | The humanoid form has a compelling notion to it
02:45:43.860 | in the same way that Cosmo is compelling.
02:45:45.960 | I would argue, if we were arguing about this,
02:45:50.280 | that it's cheaper to build a Cosmo, like that form.
02:45:54.800 | But if you wanted to make an argument,
02:45:57.000 | which I have with Jim Keller about, you know,
02:45:58.800 | you could actually make a humanoid robot
02:46:00.200 | for pretty cheap, it's possible.
02:46:03.080 | And then the question is, all right,
02:46:05.160 | if you're using it in an application where it can be flawed,
02:46:08.720 | it can have a personality and be flawed
02:46:12.520 | in the same way that Cosmo is,
02:46:14.380 | then maybe it's interesting
02:46:15.380 | for integration to human society.
02:46:17.820 | That's, to me, is an interesting application
02:46:19.860 | of a humanoid form, 'cause humans are drawn,
02:46:22.540 | like I mentioned to you, legged robots,
02:46:24.060 | we're drawn to legs and limbs and body language
02:46:27.180 | and all that kind of stuff.
02:46:28.580 | And even a face, even if you don't have the facial features,
02:46:31.380 | which you might not want to have
02:46:32.980 | to reduce the creepiness factor, all that kind of stuff.
02:46:38.140 | But yeah, that, to me, the humanoid form is compelling.
02:46:40.460 | But in terms of that being the right form
02:46:44.140 | for the factory environment, I'm not so sure.
02:46:46.620 | - Yeah, for the factory environment,
02:46:48.000 | like right off the bat, what are you optimizing for?
02:46:50.980 | Is it strength?
02:46:51.820 | Is it mobility?
02:46:52.640 | Is it versatility, right?
02:46:53.480 | Like that changes completely the look and feel
02:46:55.080 | of the robot that you create.
02:46:56.720 | You know, and almost certainly the human form
02:46:59.720 | is over-designed for some dimensions
02:47:02.480 | and constrained for some dimensions.
02:47:03.620 | And so, like, what are you grasping?
02:47:06.140 | Is it big?
02:47:06.980 | Is it little, right?
02:47:07.800 | And how do you customize it and make it customizable
02:47:11.840 | for the different needs, if that was the optimization, right?
02:47:14.540 | And then, you know, for the other one,
02:47:17.020 | I could totally be wrong.
02:47:18.380 | You know, I still feel that the closer you try to get
02:47:20.900 | to a human, the more you're subject to the biases
02:47:25.020 | of what a human should be, and you lose flexibility
02:47:27.940 | to shift away from your weaknesses
02:47:31.100 | and towards your strengths.
02:47:32.740 | And that changes over time,
02:47:35.740 | but there's ways to make really approachable
02:47:40.740 | and natural interfaces for robotic kind of characters
02:47:45.940 | and, you know, and kind of deployments
02:47:51.020 | in these applications that do not at all
02:47:54.460 | look like a human directly,
02:47:56.420 | but that actually creates way more flexibility
02:47:58.900 | and capability and role and forgiveness
02:48:01.480 | and interface and everything else.
02:48:02.980 | - Yeah, it's interesting,
02:48:03.860 | but I'm still confused by the magic I see in legged robots.
02:48:08.860 | - Yeah, so there is a magic.
02:48:10.200 | So I'm absolutely amazed at it
02:48:13.340 | from a technical curiosity standpoint
02:48:16.620 | and like the magic that like the Boston Dynamics team
02:48:19.660 | can do from, you know, like from walking
02:48:22.300 | and jumping and so forth.
02:48:23.980 | Now, like there's been a long journey
02:48:26.440 | to try to find an application for that sort of technology,
02:48:29.820 | but wow, that's incredible technology, right?
02:48:31.700 | - Yes.
02:48:32.780 | - Then you kind of go towards, okay,
02:48:34.140 | are you working back from a goal
02:48:36.140 | of what you're trying to solve
02:48:37.140 | or are you working forward from a technology
02:48:38.660 | and then looking for a solution?
02:48:39.700 | And I think that's where it's a kind of
02:48:42.220 | a bidirectional search oftentimes,
02:48:43.580 | but you got to, the two have to meet.
02:48:45.520 | And that's where humanoid robotics is kind of close to that
02:48:49.260 | in that like it is a decision about a form factor
02:48:51.980 | and a technology that it forces
02:48:54.380 | that doesn't have a clear justification
02:48:57.820 | on why that's the killer app or, you know,
02:48:59.500 | from the other end.
02:49:00.340 | - I think the core fascinating idea with the Tesla bot
02:49:03.100 | is the one that's carried by Waymo as well
02:49:05.820 | is when you're solving the general robotics problem
02:49:08.860 | of perception control,
02:49:10.300 | where there's the very clear applications of driving.
02:49:14.060 | It's as you get better and better at it,
02:49:16.860 | when you have like Waymo driver,
02:49:18.780 | the whole world starts to kind of start to look
02:49:22.440 | like a robotics problem.
02:49:23.660 | So it's very interesting for now.
02:49:25.180 | - Detection, classification, segmentation,
02:49:27.980 | tracking, planning, like it's, yeah.
02:49:31.060 | - So there's no reason, I mean,
02:49:32.660 | I'm not speaking for Waymo here,
02:49:35.340 | but you know, moving goods,
02:49:39.620 | there's no reason a Transformer-like thing
02:49:42.700 | couldn't take the goods up an elevator, you know?
02:49:47.060 | Like slowly expand what it means to move goods
02:49:52.060 | and expand more and more of the world
02:49:57.460 | into a robotics problem.
02:49:59.140 | - Well, that's right.
02:49:59.980 | And you start to like,
02:50:01.020 | think of it as an end-to-end robotics problem
02:50:02.780 | from like loading from, you know, from everything.
02:50:05.460 | And even like the truck itself,
02:50:07.020 | you know, today's generation is integrating
02:50:10.340 | into today's understanding of what a vehicle is, right?
02:50:13.420 | The Pacifica, Jaguar, the Freightliners from Daimler.
02:50:17.640 | There's nothing that stops these,
02:50:19.860 | us from like down the road after like
02:50:22.580 | starting to get to scale,
02:50:23.980 | to like expand these partnerships to really rethink
02:50:26.740 | what would the next generation of a truck look like
02:50:30.380 | that is actually optimized for autonomy,
02:50:32.060 | not for today's world.
02:50:34.460 | And maybe that means a very different type of trailer.
02:50:37.620 | Maybe that like,
02:50:38.460 | there's a lot of things you could rethink on that front,
02:50:40.460 | which is on its own, very, very exciting.
02:50:42.740 | - Let me ask you, like I said,
02:50:44.340 | you went to the Mecca of robotics,
02:50:46.540 | which is CMU, Carnegie Mellon University.
02:50:48.860 | You got a PhD there.
02:50:50.820 | So maybe by way of advice and maybe by way of story
02:50:56.820 | and memories, what does it take to get a PhD
02:51:00.420 | in robotics at CMU?
02:51:03.660 | And maybe you can throw in there some advice
02:51:07.220 | for people who are thinking about doing work
02:51:10.580 | in artificial intelligence and robotics
02:51:12.500 | and are thinking about whether to get a PhD.
02:51:14.980 | - It's funny, I asked you what,
02:51:16.100 | I was at CMU for undergrad as well
02:51:17.940 | and didn't know anything about robotics coming in
02:51:20.020 | and was doing, you know, electrical computer engineering,
02:51:22.740 | computer science,
02:51:23.580 | and really got more and more into kind of AI
02:51:26.060 | and then fell in love with autonomous driving.
02:51:27.980 | And at that point, like that was just by a big margin,
02:51:30.420 | like such an incredible, like central spot
02:51:33.540 | of investment in that area.
02:51:36.660 | And so what I would say is that like robotics,
02:51:38.180 | like for all the progress that's happened
02:51:40.380 | is still a really young field.
02:51:41.860 | There's a huge amount of opportunity.
02:51:43.220 | Now that opportunity shifted
02:51:44.420 | where something like autonomous driving
02:51:46.700 | has moved from being very research and academics driven
02:51:49.580 | to being commercial driven
02:51:51.180 | where you see the investments happening
02:51:53.260 | in commercial.
02:51:54.100 | Now there's other areas that are much younger
02:51:56.060 | and you see like kind of grasping and manipulation,
02:51:59.060 | making kind of the same sort of journey
02:52:00.780 | that like autonomy made,
02:52:01.980 | and there's other areas as well.
02:52:03.780 | What I would say is the space moves very quickly.
02:52:07.020 | Anything you do a PhD in,
02:52:08.580 | like it is in most areas will evolve and change
02:52:11.180 | as technology changes and constraints change
02:52:13.260 | and hardware changes and the world changes.
02:52:15.660 | And so the beautiful thing about robotics
02:52:18.260 | is it's super broad.
02:52:19.140 | It's not a narrow space at all,
02:52:20.940 | and it could be a million different things
02:52:22.580 | in a million different industries.
02:52:23.940 | And so it's a great opportunity to come in
02:52:27.460 | and get a broad foundation on AI,
02:52:29.740 | machine learning, computer vision,
02:52:31.540 | systems, hardware, sensors,
02:52:33.220 | all these separate things.
02:52:34.900 | You do need to like go deep
02:52:36.620 | and find something that you're like
02:52:38.460 | really, really passionate about.
02:52:40.620 | Obviously, like just like any PhD,
02:52:42.460 | this is like a five, six year kind of endeavor.
02:52:47.100 | And you have to love it enough to go super deep
02:52:50.820 | to learn all the things necessary
02:52:52.620 | to be super deeply functioning in that area
02:52:54.620 | and then contribute to it in a way
02:52:55.620 | that hasn't been done before.
02:52:57.340 | And in robotics, that probably means more breadth
02:53:00.340 | because robotics is rarely kind of like
02:53:03.220 | one particular kind of narrow technology.
02:53:05.740 | And it means being able to collaborate with teams
02:53:08.020 | where like one of the coolest aspects of like
02:53:10.460 | the experience that I like kind of cherish in our PhD
02:53:13.580 | is that we actually had a pretty large AV project
02:53:16.900 | that for that time was like a pretty serious initiative
02:53:20.140 | where you got to like partner with a larger team
02:53:23.340 | and you had the experts in perception
02:53:24.980 | and the experts in planning and the staff
02:53:26.500 | and the mechanical engineers.
02:53:27.340 | - It was a DARPA challenge.
02:53:28.620 | - So I was working on a project called UPI back then,
02:53:32.620 | which was basically the off-road version
02:53:34.180 | of the DARPA challenge.
02:53:35.020 | It was a DARPA funded project for basically
02:53:37.540 | like a large off-road vehicle that you would like drop
02:53:40.340 | and then give it a waypoint 10 kilometers away
02:53:42.420 | and it would have to navigate a completely unstructured--
02:53:44.060 | - In an off-road environment.
02:53:45.260 | - Yeah, so like forest, ditches, rocks, vegetation.
02:53:48.700 | And so it was like a really, really interesting
02:53:50.060 | kind of a hard problem where like wheels
02:53:51.820 | would be up to my shoulders.
02:53:52.780 | It's like gigantic, right?
02:53:53.980 | - Yeah, by the way, AV for people
02:53:55.500 | stands for autonomous vehicles.
02:53:56.820 | - Autonomous vehicles, yeah, sorry.
02:53:58.940 | And so what I think is like the beauty of robotics,
02:54:01.500 | but also kind of like the expectation is that
02:54:03.700 | there's spaces in computer science
02:54:06.580 | where you can be very, very narrow and deep.
02:54:08.940 | Robotics, the necessity, but also the beauty of it
02:54:12.140 | is that it forces you to be excited about that breadth
02:54:15.420 | and that partnership across different disciplines
02:54:17.380 | that enable it.
02:54:18.420 | But that also opens up so many more doors
02:54:20.220 | where you can go and you can do robotics
02:54:22.500 | in almost any category where robotics
02:54:25.220 | isn't really an industry.
02:54:27.260 | It's like AI, right?
02:54:29.140 | It's like the application of physical automation
02:54:31.540 | to all these other worlds.
02:54:33.940 | And so you can do robotic surgery, you can do vehicles,
02:54:37.020 | you can do factory automation, you can do healthcare,
02:54:39.620 | you can do like leverage the AI around the sensing
02:54:44.180 | to think about static sensors and scene understanding.
02:54:47.340 | So I think that's gotta be the expectation
02:54:50.900 | and the excitement and it breeds people
02:54:53.300 | that are probably a little bit more collaborative
02:54:54.700 | and more excited about working in teams.
02:54:58.540 | - If I could briefly comment on the fact
02:55:01.500 | that the robotics people I've met in my life
02:55:05.300 | from CMU and MIT, they're really happy people.
02:55:10.140 | - Yeah.
02:55:10.980 | - Because I think it's the collaborative thing.
02:55:13.060 | I think you don't...
02:55:14.540 | (laughs)
02:55:16.260 | You're not like sitting in like the fourth basement.
02:55:19.380 | - Yes, exactly.
02:55:20.220 | Which when you're doing machine learning purely software,
02:55:23.740 | it's very tempting to just disappear into your own hole
02:55:27.700 | and never collaborate.
02:55:29.140 | And that breeds a little bit more of the silo mentality
02:55:34.140 | of like, I have a problem.
02:55:36.220 | It's almost like negative to talk to somebody else
02:55:38.340 | or something like that.
02:55:39.300 | But robotics folks are just very collaborative,
02:55:42.100 | very friendly and just...
02:55:43.660 | And there's also an energy of like,
02:55:45.620 | you get to confront the physics of reality often,
02:55:49.620 | which is humbling and also exciting.
02:55:53.460 | So it's humbling when it fails
02:55:55.540 | and exciting when it finally works.
02:55:56.900 | - It's like a purity of the passion.
02:55:57.980 | You got to remember that like right now,
02:55:59.700 | like robotics and AI is like just all the rage
02:56:03.380 | and autonomous vehicles and all this.
02:56:05.180 | Like 15 years ago and 20 years ago,
02:56:08.460 | like it wasn't that deeply lucrative.
02:56:11.460 | People that went into robotics,
02:56:12.420 | they did it because they were like,
02:56:13.940 | thought it was just the coolest thing in the world.
02:56:15.580 | They wanted to like make physical things intelligent
02:56:17.700 | in the real world.
02:56:18.540 | And so there's like a raw passion
02:56:20.460 | where they went into it for the right reasons and so forth.
02:56:22.460 | So it's really great space.
02:56:23.740 | And that organizational challenge, by the way,
02:56:25.900 | like when you think about the challenges in AV,
02:56:28.580 | we talk a lot about the technical challenges.
02:56:30.420 | The organizational challenges are through the roof,
02:56:33.180 | where you think about what it takes to build an AV system
02:56:38.180 | and you have companies that are now thousands of people.
02:56:42.260 | And you look at other really hard technical problems,
02:56:45.620 | like an operating system, it's pretty well established.
02:56:48.180 | Like you kind of know that there's a file system,
02:56:51.540 | there's virtual memory, there's this, there's that,
02:56:53.460 | there's like caching and like,
02:56:56.420 | and there's like a really reasonably well-established
02:56:58.660 | modularity and APIs and so forth.
02:57:00.300 | And so you can kind of like scale it in an efficient fashion
02:57:03.260 | that doesn't exist anywhere near to that level of maturity
02:57:06.660 | in autonomous driving right now.
02:57:08.660 | And tech stacks are being reinvented,
02:57:10.980 | organizational structures are being reinvented.
02:57:12.860 | You have problems like pedestrians
02:57:14.300 | that are not isolated problems.
02:57:15.620 | They're part sensing, part behavior prediction,
02:57:18.100 | part planning, part evaluation.
02:57:20.340 | And like one of the biggest challenges is actually
02:57:23.300 | how do you solve these problems
02:57:24.620 | where the mental capacity of a human
02:57:26.820 | is starting to get strained on how do you organize it
02:57:29.700 | and think about it, where you have this like
02:57:33.740 | multi-dimensional matrix that needs to all work together.
02:57:36.820 | And so that makes it kind of cool as well,
02:57:39.700 | because it's not like solved at all from,
02:57:43.260 | you know, like what does it take to actually scale this?
02:57:45.460 | Right?
02:57:46.300 | And then you look at like other gigantic challenges
02:57:48.060 | that have, you know, that have been successful
02:57:50.860 | and are way more mature, there's a stability to it.
02:57:53.420 | And like, maybe the autonomous vehicle space will get there,
02:57:56.580 | but right now, just as many technical challenges
02:57:59.820 | as they are, they're like organizational challenges
02:58:01.380 | on how do you like solve these problems
02:58:03.220 | that touch on so many different areas
02:58:05.300 | and efficiently tackle them
02:58:07.500 | while like maintaining progress
02:58:10.260 | among all these constraints while scaling.
02:58:13.660 | - By way of advice, what advice would you give
02:58:17.420 | to somebody thinking about doing a robotic startup?
02:58:22.060 | You mentioned Cosmo, somebody that wanted to carry
02:58:24.780 | the Cosmo flag forward, the Anki flag forward.
02:58:28.060 | Looking back at your experience,
02:58:31.000 | looking forward at a future
02:58:33.340 | that will obviously have such robots,
02:58:35.700 | what advice would you give to that person?
02:58:37.340 | - Yeah, it was the greatest experience ever.
02:58:39.460 | And it's like, there's something you,
02:58:40.780 | there are things you learn navigating a startup
02:58:44.060 | that you'll never like, it was very hard to encounter that
02:58:46.940 | in like a typical kind of work environment.
02:58:48.740 | And it's just, it's wonderful.
02:58:50.980 | You gotta be ready for it.
02:58:51.820 | It's not as like, you know, the glamor of a startup,
02:58:54.540 | there's just like just brutal emotional swings up and down.
02:58:57.220 | And so having co-founders actually helps a ton.
02:59:00.180 | Like I would not, could not imagine doing it solo,
02:59:02.340 | but having at least somebody where on your darkest days,
02:59:05.940 | you can kind of like really openly just like
02:59:08.140 | have that conversation and, you know,
02:59:09.900 | lean onto somebody that's in the thick of it with you,
02:59:13.340 | helps a lot.
02:59:14.260 | What I would say--
02:59:15.100 | - What was the nature of darkest days
02:59:17.580 | and the emotional swings?
02:59:19.100 | Is it worried about the funding?
02:59:20.900 | Is it worried about whether any of your ideas
02:59:24.260 | are any good or ever were good?
02:59:26.020 | Is it like the self-doubt?
02:59:27.580 | Is it like facing new challenges
02:59:30.900 | that have nothing to do with the technology,
02:59:32.580 | like organizational, human resources,
02:59:35.620 | that kind of stuff?
02:59:36.460 | - Yeah, you come from a world in school
02:59:38.820 | where you feel that you put in a lot of effort
02:59:42.180 | and you'll get the right result
02:59:43.380 | and input translates proportional to output.
02:59:46.420 | And, you know, you need to solve the problem set or do whatever
02:59:49.500 | and just kind of get it done.
02:59:50.740 | Now, PhD tests out a little bit,
02:59:52.180 | but at the end of the day, you put in the effort,
02:59:53.980 | you tend to like kind of come out with enough results
02:59:56.780 | to kind of get a PhD.
02:59:59.140 | In the startup space, like, you know,
03:00:01.820 | like you could talk to 50 investors
03:00:03.820 | and they just don't see your vision
03:00:04.860 | and it doesn't matter how hard you kind of tried and pitched.
03:00:07.140 | You could work incredibly hard
03:00:09.260 | and you have a manufacturing defect.
03:00:10.700 | And if you don't fix it, you're out of business.
03:00:14.020 | You need to raise money by a certain date.
03:00:16.180 | And there's a, you got to have this milestone
03:00:18.540 | in order to like have a good pitch and you do it.
03:00:20.900 | You have to have this talent
03:00:21.940 | and you just don't have it inside the company.
03:00:23.460 | Or, you know, you have to get 200 people
03:00:28.060 | or however many people kind of like along with you
03:00:30.700 | and kind of buy in the journey.
03:00:33.100 | You're like disagreeing with an investor
03:00:35.180 | and they're your investor.
03:00:36.020 | So it's just like, you know,
03:00:36.860 | it's like, there's no walking away from it, right?
03:00:38.940 | So, and it tends to be like those things
03:00:41.420 | where you just kind of get clobbered
03:00:43.580 | in so many different ways
03:00:44.660 | that like things end up being harder than you expect.
03:00:47.380 | And it's like such a gauntlet,
03:00:49.340 | but you learn so much in the process.
03:00:51.540 | And there's a lot of people
03:00:52.380 | that actually end up rooting for you
03:00:53.300 | and helping you like from the outside
03:00:55.380 | and you get good, great mentors
03:00:56.860 | and you like get find fantastic people
03:00:58.820 | that step up in the company.
03:01:00.660 | And you have this like magical period
03:01:02.540 | where everybody's like,
03:01:04.820 | it's life or death for the company,
03:01:06.140 | but like you're all fighting for the same thing
03:01:07.820 | and it's the most satisfying kind of journey ever.
03:01:10.780 | The things that make it easier
03:01:12.380 | and that I would recommend
03:01:13.380 | is like be really, really thoughtful about the application.
03:01:16.820 | Like there's a saying of like,
03:01:19.100 | kind of, you know, team and execution and market
03:01:21.780 | and like kind of how important are each of those.
03:01:24.340 | And oftentimes the market wins
03:01:26.420 | and you come at it thinking that if you're smart enough
03:01:29.020 | and you work hard enough
03:01:29.860 | and you like have the right talent and team and so forth,
03:01:32.420 | like you'll always kind of find a way through.
03:01:34.780 | And it's surprising how much dynamics
03:01:37.580 | are driven by the industry you're in
03:01:39.100 | and the timing of you entering that industry.
03:01:41.180 | And so just Waymo is a great example of it.
03:01:44.540 | There is, I don't know if there'll ever be another company
03:01:47.460 | or suite of companies that has raised
03:01:51.380 | and continues to spend so much money
03:01:54.220 | at such an early phase of revenue generation
03:01:57.900 | and productization, you know, from a P&L standpoint.
03:02:02.900 | Like it's an anomaly,
03:02:05.940 | like by any measure of any industry that's ever existed,
03:02:09.140 | except for maybe the US space program, right?
03:02:11.980 | But it's like a multi-trillion dollar opportunity,
03:02:17.420 | which is so unusual to find that size of a market
03:02:20.260 | that just the progress that shows the de-risking of it,
03:02:24.180 | you could apply whatever discounts you want
03:02:25.740 | off of that trillion dollar market
03:02:26.900 | and it still justifies the investment that is happening
03:02:29.140 | because like being successful in that space
03:02:31.820 | makes all the investments feel trivial.
03:02:33.860 | Now, by the same consequence, like the size of the market,
03:02:37.020 | the size of the target audience,
03:02:38.620 | the ability to capture that market share,
03:02:40.740 | how hard that's gonna be, who the incumbents are,
03:02:42.540 | like that's probably one of the lessons I appreciate
03:02:44.900 | like more than anything else,
03:02:46.140 | where like those things really, really do matter.
03:02:48.540 | And oftentimes can dominate the quality
03:02:52.220 | of the team or execution,
03:02:53.540 | because if you miss the timing
03:02:55.380 | or you do it in the wrong space,
03:02:56.740 | or you run into like the institutional kind of headwinds
03:02:59.420 | of a particular environment,
03:03:00.980 | like let's say you have the greatest idea in the world,
03:03:02.820 | but you barrel into healthcare,
03:03:03.940 | but it takes 10 years to innovate in healthcare
03:03:05.700 | because of a lot of challenges, right?
03:03:07.260 | Like there's fundamental laws of physics
03:03:10.740 | that you have to think about.
03:03:12.100 | And so the combination of like Anki and Waymo
03:03:15.140 | kind of drives that point home for me,
03:03:16.820 | where you can do a ton if you have the right market,
03:03:19.740 | the right opportunity, the right way to explain it,
03:03:22.260 | and you show the progress in the right sequence,
03:03:25.660 | it actually can really significantly change
03:03:27.900 | the course of your journey and startup.
03:03:30.300 | - How much of it is understanding the market
03:03:32.100 | and how much of it is creating a new market?
03:03:34.460 | So how do you think about,
03:03:36.260 | like the space of robotics is really interesting.
03:03:39.460 | You said exactly right,
03:03:41.060 | the space of applications is small,
03:03:43.020 | relative to the cost involved.
03:03:47.380 | So how much is like truly revolutionary thinking
03:03:51.780 | about like what is the application?
03:03:54.820 | And then, yeah, but so like creating something that-
03:03:59.420 | - Didn't exist.
03:04:00.260 | - Didn't really exist.
03:04:01.300 | Like, this is pretty obvious to me,
03:04:02.940 | the whole space of home robotics,
03:04:04.700 | just everything that Cosmo did,
03:04:07.140 | I guess you could talk to it as a toy
03:04:08.900 | and people will understand it,
03:04:10.580 | but Cosmo is much more than a toy.
03:04:12.340 | And I don't think people fully understand the value of that.
03:04:18.100 | You have to create it and the product will communicate it.
03:04:20.820 | Like just like the iPhone,
03:04:22.860 | nobody understood the value of no keyboard
03:04:26.980 | and a thing that can do web browsing.
03:04:31.060 | I don't think they understood the value of that
03:04:32.660 | until you create it.
03:04:34.140 | - Yeah, having a foot in the door
03:04:35.940 | and an entry point still helps
03:04:37.220 | because at the end of the day,
03:04:38.100 | like an iPhone replaced your phone.
03:04:40.380 | And so it had a fundamental purpose
03:04:41.580 | and it has all these things that it did better, right?
03:04:43.620 | And so then you could do ABC on top of it.
03:04:46.060 | And then like, you even remember the early commercials
03:04:48.780 | where it's always like one application of what it could do
03:04:51.540 | and then you get a phone call.
03:04:53.140 | And so that was intentionally sending a message,
03:04:55.260 | something familiar, but then like,
03:04:56.980 | you can send a text message, you can listen to music,
03:04:58.580 | you can surf the web, right?
03:04:59.900 | And so, autonomous driving obviously anchors on that as well.
03:05:03.780 | You don't have to explain to somebody
03:05:05.380 | the functionality of an autonomous truck, right?
03:05:07.260 | Like there's nuances around it,
03:05:08.300 | but the functionality makes sense.
03:05:10.860 | In the home, you have a fundamental advantage.
03:05:13.580 | Like we always thought about this
03:05:14.620 | because it was so painful to explain to people
03:05:16.700 | what our products did
03:05:17.660 | and how to communicate that super cleanly,
03:05:20.460 | especially when something was so experiential.
03:05:22.540 | And so you compare like Anki to Nest.
03:05:27.020 | Nest had some beautiful products
03:05:31.620 | where they started scaling
03:05:34.060 | and like actually found like really great success.
03:05:36.180 | And they had like really clean
03:05:37.460 | and beautiful marketing messaging
03:05:38.820 | because they anchored on reinventing existing categories
03:05:42.100 | where it was a smart thermostat, right?
03:05:44.060 | And like, and so you kind of are able
03:05:47.500 | to take what's familiar, anchor that understanding,
03:05:51.140 | and then explain what's better about it.
03:05:53.500 | - That's funny, you're right.
03:05:54.460 | Cosmo is like totally new thing.
03:05:56.420 | Like what is this thing?
03:05:57.980 | - It's we struggle.
03:05:58.820 | We spent like a lot of money on marketing.
03:06:01.020 | We had a hard, like we actually had far greater efficiency
03:06:04.420 | on Cosmo than anything else
03:06:06.460 | 'cause we found a way to capture the emotion
03:06:08.180 | in some little shorts to kind of wean
03:06:10.060 | into the personality in our marketing.
03:06:11.860 | And it became viral where like we had these kind of videos
03:06:15.220 | that would like go and get like hundreds
03:06:16.820 | of thousands of views and like kind of like get spread
03:06:18.860 | and sometimes millions of views.
03:06:20.620 | And so, but it was like really, really hard.
03:06:23.620 | And so finding a way to kind of like anchor
03:06:26.700 | on something that's familiar
03:06:27.780 | but then grow into something that's not is an advantage.
03:06:31.460 | But then again, like you don't have
03:06:32.900 | like their successes otherwise
03:06:34.220 | like Alexa never had a comp, right?
03:06:37.220 | You could argue that that's very novel and very new.
03:06:39.860 | And there's a lot of other examples that kind of created
03:06:43.740 | kind of a category out of like Kiva Systems.
03:06:46.940 | I mean, they like came in and they like,
03:06:49.780 | enterprise is a little easier because if you can,
03:06:52.380 | it's less susceptible to this
03:06:53.820 | because if you can argue a clear value proposition
03:06:56.740 | it's a more logical conversation
03:06:58.500 | that you can have with customers.
03:07:00.740 | It's not, it's a little bit less emotional
03:07:02.860 | and kind of subjective, but.
03:07:05.140 | - Yeah, in the home you have to.
03:07:06.980 | - Yeah, it's like a home robot.
03:07:08.020 | It's like, what does that mean?
03:07:09.460 | And so then you really have to be crisp
03:07:11.100 | about the value proposition
03:07:12.220 | and what like really makes it worth it.
03:07:14.940 | Like, and we, by the way, went through that same ordeal.
03:07:17.020 | We almost like, we almost hit a wall coming out of 2013
03:07:21.220 | where we were so big on explaining
03:07:23.740 | why our stuff was so high tech
03:07:25.020 | and all the kind of like great technology in it
03:07:27.300 | and how cool it is and so forth
03:07:29.460 | to having to make a super hard pivot on why is it fun
03:07:32.540 | and why does the random kind of family of four need this.
03:07:36.780 | Right, like, so it's learnings, but that's the challenge.
03:07:41.340 | And I think like robotics tends to sometimes fall
03:07:43.820 | into the new category problem,
03:07:45.980 | but then you gotta be really crisp
03:07:47.260 | about why it needs to exist.
03:07:49.460 | - Well, I think some of robotics,
03:07:51.460 | depending on the category,
03:07:52.740 | depending on the application
03:07:54.700 | is a little bit of a marketing challenge.
03:07:59.540 | And I don't mean, I mean,
03:08:02.860 | that's the kind of marketing that Waymo's doing,
03:08:06.340 | that Tesla is doing,
03:08:07.380 | is like showing off incredible engineering,
03:08:11.940 | incredible technology, but convincing,
03:08:15.020 | like you said, a family of four,
03:08:16.740 | that this is transformative for your life.
03:08:20.100 | This is fun.
03:08:21.540 | - They don't care how much tech is in your thing.
03:08:23.700 | - They don't, they really don't care.
03:08:24.540 | - It's like, they need to know why they want it, so.
03:08:26.340 | - And some of that is just marketing.
03:08:27.860 | - Yeah, and that's why like Roomba,
03:08:29.340 | like, yes, they didn't, you know,
03:08:31.900 | like go and have this like, you know,
03:08:35.220 | huge, huge ramp into like the entirety
03:08:38.060 | of like kind of AI robotics and so forth.
03:08:39.380 | But like, they built a really great business
03:08:41.100 | in the vacuum cleaner world.
03:08:43.420 | And like, everybody understands what a vacuum cleaner is.
03:08:46.380 | Most people are annoyed by doing it.
03:08:48.740 | And now you have one that like kind of does it itself,
03:08:51.420 | the various degrees of quality,
03:08:54.140 | but that is so compelling that like,
03:08:55.580 | it's easy to understand.
03:08:56.660 | And like, and they had a very kind of,
03:09:00.140 | and I think they have like 15% of the vacuum cleaner market.
03:09:02.380 | So it's like pretty successful, right?
03:09:04.460 | - I think we need more of those types
03:09:06.700 | of thoughtful stepping stones in robotics,
03:09:08.380 | but the opportunities are becoming bigger
03:09:09.860 | because hardware is cheaper,
03:09:11.660 | compute's cheaper, cloud's cheaper, and AI is better.
03:09:14.700 | So there's a lot of opportunity.
03:09:16.340 | - If we zoom out from specifically startups and robotics,
03:09:20.020 | what advice do you have to high school students,
03:09:23.580 | college students about career and living a life
03:09:28.540 | that you'd be proud of?
03:09:29.700 | You lived one heck of a life.
03:09:31.060 | You're very successful in several domains.
03:09:34.220 | If you can convert that into a generalizable potion,
03:09:39.060 | what advice would you give?
03:09:40.500 | - Yeah, it's a very good question.
03:09:41.740 | So it's very hard to go into a space
03:09:45.580 | that you're not passionate about
03:09:46.820 | and push hard enough to be,
03:09:50.380 | you know, to like maximize your potential in it.
03:09:54.580 | And so there's always kind of like the saying of like,
03:09:59.140 | okay, follow your passion, great.
03:10:00.980 | Try to find the overlap of where your passion overlaps
03:10:03.300 | with like a growing opportunity and need in the world,
03:10:06.140 | where it's not too different than the startup
03:10:07.620 | kind of argument that we talked about,
03:10:09.060 | where if you are-
03:10:11.260 | - Where your passion meets the market.
03:10:12.940 | - Right, you know what I mean?
03:10:13.940 | Like, 'cause it's like,
03:10:15.100 | that's a beautiful thing where like,
03:10:18.140 | you can do what you love,
03:10:18.980 | but it's also just opens up tons of opportunities
03:10:21.060 | 'cause the world's ready for it, right?
03:10:22.180 | Like, and so like, if you're interested in technology,
03:10:26.540 | that might point to like, go and study machine learning,
03:10:29.100 | because you don't have to decide
03:10:30.100 | what career you're gonna go into,
03:10:31.220 | but it's gonna be such a versatile space
03:10:33.220 | that's gonna be at the root of like everything
03:10:34.900 | that's gonna be in front of us,
03:10:36.060 | that you can have eight different careers
03:10:38.820 | in different industries and be an absolute expert
03:10:42.340 | in this like kind of tool set that you wield
03:10:44.620 | that can go and be applied.
03:10:46.900 | And that doesn't apply to just technology, right?
03:10:49.380 | It could be the exact same thing if you wanna,
03:10:52.500 | you know, same thought process applies to design,
03:10:55.660 | to marketing, to, you know, to sales, to anything,
03:10:58.260 | but that versatility where you like,
03:11:00.700 | when you're in a space that's gonna continue to grow,
03:11:04.820 | it's just like, what company do you join?
03:11:06.980 | One that just is gonna grow
03:11:08.260 | and the growth creates opportunities
03:11:10.220 | where the surface area is just gonna increase
03:11:12.380 | and the problems will never get stale.
03:11:14.100 | And you can have, you know, many like,
03:11:16.860 | and so you go into a career where you have that sort of
03:11:18.980 | growth in the world that you're in,
03:11:21.540 | you end up having so much more opportunity
03:11:25.100 | that organically just appears,
03:11:27.060 | and you can then have more shots on goal
03:11:29.020 | to find like that killer overlap of timing and passion
03:11:31.940 | and skill set and point in life where you can like,
03:11:34.300 | you know, just really be motivated
03:11:36.020 | and fall in love with something.
03:11:38.100 | And then at the same time, like find a balance.
03:11:40.380 | Like there's been times in my life
03:11:41.460 | where I worked like a little bit too obsessively
03:11:43.500 | and, you know, and crazy.
03:11:45.180 | And I, you know, think we kind of like tried to correct it,
03:11:48.340 | you know, kind of the right opportunities,
03:11:49.740 | but, you know, I think I probably appreciate a lot more now
03:11:52.900 | friendships to go way back, you know,
03:11:54.980 | family and things like that.
03:11:56.180 | And I kind of have the personality where I could use,
03:12:00.100 | like I have like so much desire to really try to optimize,
03:12:03.020 | like, you know, what I'm working on
03:12:04.100 | that I can easily go to kind of an extreme.
03:12:06.100 | And now I'm trying to like kind of find that balance
03:12:09.060 | and make sure that I have the friendships, the family,
03:12:11.780 | like relationship with the kids, everything that like,
03:12:13.700 | I don't, I push really, really hard,
03:12:16.260 | but it kind of find a balance.
03:12:17.580 | And I think people can be happy on actually many kind of
03:12:22.580 | extremes on that spectrum,
03:12:24.340 | but it's easy to kind of inadvertently make a choice
03:12:29.060 | by how you approach it,
03:12:30.780 | that then becomes really hard to unwind.
03:12:33.380 | And so being very thoughtful about
03:12:35.820 | kind of all of those dimensions makes a lot of sense.
03:12:37.860 | And so, I mean, those are all interrelated,
03:12:41.180 | but at the end of the day-
03:12:42.020 | - Love, passion and love.
03:12:43.820 | Love towards, you said-
03:12:45.340 | - Yeah, family, friends.
03:12:46.260 | - Family.
03:12:47.100 | And hopefully one day, if your work pans out, Boris,
03:12:52.820 | is love towards robots.
03:12:54.420 | - Love towards robots.
03:12:56.460 | Not the creepy kind, the good kind.
03:12:57.700 | - The good kind.
03:12:58.540 | Just friendship and fun.
03:13:02.740 | - Yeah, it's like another dimension
03:13:04.580 | to just how we interface with the world.
03:13:06.220 | Yeah.
03:13:07.420 | - Boris, you're one of my favorite human beings, roboticist.
03:13:10.340 | You've created some incredible robots
03:13:12.580 | and I think inspired countless people.
03:13:15.500 | And like I said, I hope Cosmo,
03:13:18.260 | I hope your work with Anki lives on.
03:13:20.340 | And I can't wait to see what you do with Waymo.
03:13:24.420 | I mean, if we're talking about
03:13:26.660 | artificial intelligence technology,
03:13:28.340 | there's the potential to revolutionize
03:13:31.140 | so much of our world.
03:13:32.980 | That's it right there.
03:13:33.820 | So thank you so much for the work you've done
03:13:36.180 | and thank you for spending your valuable time
03:13:38.700 | talking with me.
03:13:39.540 | - Thanks, Lex.
03:13:40.980 | - Thanks for listening to this conversation
03:13:42.540 | with Boris Sofman.
03:13:43.820 | To support this podcast,
03:13:45.220 | please check out our sponsors in the description.
03:13:47.820 | And now let me leave you with some words from Isaac Asimov.
03:13:51.340 | If you were to insist I was a robot,
03:13:54.780 | you might not consider me capable of love
03:13:58.380 | in some mystic human sense.
03:14:01.060 | Thank you for listening and hope to see you next time.
03:14:04.780 | (upbeat music)
03:14:07.380 | (upbeat music)