Sertac Karaman: Robots That Fly and Robots That Drive | Lex Fridman Podcast #97
Chapters
0:00 Introduction
1:44 Autonomous flying vs autonomous driving
6:37 Flying cars
10:27 Role of simulation in robotics
17:35 Game theory and robotics
24:30 Autonomous vehicle company strategies
29:46 Optimus Ride
47:08 Waymo, Tesla, Optimus Ride timelines
53:22 Achieving the impossible
53:50 Iterative learning
58:39 Is Lidar a crutch?
63:21 Fast autonomous flight
78:06 Most beautiful idea in robotics
00:00:00.000 |
The following is a conversation with Sertac Karaman, 00:00:04.680 |
co-founder of the autonomous vehicle company Optimus Ride, 00:00:07.960 |
and one of the top roboticists in the world, 00:00:10.880 |
working on robots that drive and robots that fly. 00:00:19.560 |
He's one of the smartest, most generous people I know. 00:00:22.600 |
So it was a pleasure and honor to finally sit down with him 00:00:34.600 |
support on Patreon, or simply connect with me on Twitter 00:01:09.040 |
let me mention a surprising fact about physical money. 00:01:12.460 |
It costs 2.4 cents to produce a single penny. 00:01:16.780 |
In fact, I think it costs $85 million annually 00:01:20.880 |
That's a crazy little fact about physical money. 00:01:34.080 |
an organization that is helping to advance robotics 00:01:36.600 |
and STEM education for young people around the world. 00:01:40.200 |
And now, here's my conversation with Sertac Karaman. 00:01:56.040 |
just kind of doing it for consumer drones and so on, 00:01:58.560 |
the kinds of applications that we're looking at right now 00:02:02.480 |
And so I think that that's maybe one of the reasons 00:02:10.520 |
I would think that the real benefits of autonomous flying, 00:02:14.480 |
unleashing them in like transportation, logistics, 00:02:16.600 |
and so on, I think it's a lot harder than autonomous driving. 00:02:30.520 |
large scale being deployed and flown and so on. 00:02:33.120 |
And I think that's gonna be after we kind of resolve 00:02:36.400 |
some of the large scale deployments of autonomous driving. 00:02:59.880 |
- We as academics, or we as business entrepreneurs? 00:03:09.400 |
And I think that, but when you think about it, 00:03:17.000 |
but they are either like in isolated environments 00:03:26.320 |
but they're really confined to a certain environment 00:03:28.480 |
where they don't interact so much with humans. 00:03:32.280 |
factory floors, warehouses, they work on Mars, 00:03:38.240 |
But I think that the real challenge of our time 00:03:43.680 |
and put them into places where humans are present. 00:03:47.080 |
So now I know that there's a lot of like human robot 00:03:49.400 |
interaction type of things that need to be done. 00:03:53.520 |
but even just from the fundamental algorithms and systems 00:03:57.360 |
and the business cases, or maybe the business models, 00:04:01.200 |
even like architecture, planning, societal issues, 00:04:03.720 |
legal issues, there's a whole pack of things 00:04:06.720 |
that are related to us putting robotic vehicles 00:04:12.360 |
And these humans, they will not potentially be 00:04:21.840 |
They may not even know that they're autonomous. 00:04:25.360 |
living in environments that are designed for humans, 00:04:28.800 |
And that I think is one of the biggest challenges, 00:04:46.600 |
of autonomous vehicles that are going around. 00:04:48.800 |
It is so dense to the point where if you see one of them, 00:05:04.960 |
because I think we can bend the environment a little bit. 00:05:08.520 |
Especially kind of making them safe is a lot easier 00:05:19.440 |
But I don't see that there's gonna be a big separation. 00:05:23.240 |
that we're gonna quickly see these things unfold. 00:05:27.560 |
where there's tens of thousands of delivery drones 00:05:31.960 |
- You know, I think it's possible, to be honest. 00:05:43.480 |
And you wanna do it from the top of this building 00:05:48.480 |
And you're gonna do it in one and a half hours. 00:05:53.440 |
- Personal transport, so like you and me and a friend. 00:05:56.000 |
Like almost like Uber. - Yeah, or almost like an Uber. 00:05:58.600 |
So like four people, six people, eight people. 00:06:01.680 |
In our work in autonomous vehicles, I see that. 00:06:05.240 |
for one person transport, but also like a few people. 00:06:20.440 |
And I think, you know, it's not like the typical airplane 00:06:24.360 |
and the airport would disappear very quickly. 00:06:41.320 |
It's like when people imagine the future for 50 plus years, 00:06:45.680 |
It's like all technologies, it's cheesy to think about now 00:06:50.840 |
because it seems so far away, but overnight it can change. 00:07:00.440 |
But just one thing is that I think, you know, 00:07:16.080 |
I think there's a 50/50 chance that, you know, 00:07:17.960 |
like you can build machines that can ionize the air 00:07:41.440 |
And there's good amount of opportunities in that airspace. 00:07:51.440 |
Because that's a tough thing when you think about it. 00:07:54.360 |
to take an aircraft down and then what happens? 00:07:58.360 |
But, you know, imagine the airspace that's high enough 00:08:05.480 |
but it is low enough that you're not interacting 00:08:10.200 |
that are, you know, flying several thousand feet above. 00:08:18.440 |
Or it's actually kind of not utilized at all. 00:08:21.120 |
- So there's, you know, there's like recreational people 00:08:23.120 |
kind of fly every now and then, but it's very few. 00:08:26.800 |
you may not see any of them at any given time. 00:08:31.640 |
kind of utilizing that space and you'll be surprised. 00:08:34.360 |
And the moment you're outside of an airport a little bit, 00:08:36.840 |
like it just kind of flies off and then it goes out. 00:08:51.640 |
Ultimately, I think it is going to be building 00:09:01.760 |
more complicated than what we have on aircraft today. 00:09:05.240 |
And at the same time, ensuring just like we ensure 00:09:13.080 |
of complicated hardware and software becomes a challenge. 00:09:16.240 |
Especially when, you know, you build that hardware, 00:09:29.840 |
but then, you know, there's a lot of training there. 00:09:47.440 |
and it's just, there's like right now no other way. 00:09:50.280 |
And I don't know how else they could be done. 00:09:53.520 |
And, you know, there's always this conundrum. 00:09:56.240 |
I mean, we could maybe gather billions of programmers, 00:10:19.200 |
in a simulation environment than a billion humans 00:10:22.960 |
put their brains together and try to program. 00:10:26.760 |
- So what's the role of simulations with drones? 00:10:32.200 |
How promising is the very thing you said just now: 00:10:38.760 |
developing a safe flying robot in simulation 00:10:43.760 |
and deploying it and having that work pretty well 00:10:57.520 |
I think simulation environments actually could be key 00:11:22.040 |
that other computer would simulate and so on. 00:11:24.560 |
And I think, you know, fast forward these things, 00:11:26.320 |
you can create pretty crazy simulation environments. 00:11:32.080 |
that has happened recently and that, you know, 00:11:34.840 |
we can do now is that we can simulate cameras 00:11:45.360 |
I would imagine that with improvements in hardware, 00:11:48.560 |
especially, and with improvements in machine learning, 00:11:54.160 |
where we can simulate cameras very, very well. 00:11:57.280 |
- Simulate cameras means simulate how a real camera 00:12:03.200 |
Therefore you can explore the limitations of that. 00:12:07.540 |
You can train perception algorithms on that in simulation, 00:12:20.520 |
So for example, inertial sensing has been easy to simulate. 00:12:44.960 |
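For reference, a minimal sketch of why inertial sensing is comparatively easy to simulate: the standard measurement model is just the true signal plus a bias plus Gaussian noise (the model is textbook-standard; the parameter values below are illustrative, not from any real sensor):

```python
import numpy as np

# A common MEMS gyro measurement model: truth + constant bias + white noise.
# Bias and noise magnitudes are illustrative assumptions, not datasheet values.
def simulate_gyro(true_rate, bias=0.01, noise_std=0.005, seed=0):
    """Corrupt a true angular-rate trace (rad/s) the way a simple simulated gyro would."""
    rng = np.random.default_rng(seed)
    return true_rate + bias + rng.normal(0.0, noise_std, size=np.shape(true_rate))

true_rate = np.zeros(5)          # vehicle holding a constant heading
print(simulate_gyro(true_rate))  # what the simulated sensor would report
```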
sensors that kind of like look out from the vehicle. 00:12:49.980 |
like laser range finders that are a little bit easier. 00:13:05.280 |
Even when you imagine like how a human driven car 00:13:14.320 |
a model of a human just doing a bunch of gestures 00:13:17.780 |
and so on, and you know, it's actually simulated. 00:13:30.900 |
And I have like this test that I give to my friends: 00:13:37.480 |
which one is rendered, which one is real? 00:13:41.600 |
except, I realized, when we put humans in there. 00:13:45.200 |
It's possible that our brains are trained in a way 00:13:50.720 |
But we don't so much recognize the built environments 00:13:53.120 |
because built environments sort of came after per se, 00:14:01.340 |
you look at like monkeys and you can't distinguish 00:14:06.860 |
And it's very possible that they look at humans, 00:14:08.600 |
it's kind of pretty hard to distinguish one from another, 00:14:11.920 |
And so our eyes are pretty well trained to look at humans 00:14:14.640 |
and understand if something is off, we will get it. 00:14:21.920 |
what would happen is that we'd put like a human walking 00:14:31.360 |
I don't know what, but I can tell you it's the human. 00:14:34.200 |
I can take the human and I can show you like inside 00:14:38.800 |
and it will look like if we had time to render it, 00:14:50.440 |
and like there's nothing going on action wise, 00:14:59.680 |
How do we get a human that would pass the mom/friend test, 00:15:10.080 |
So do you think that's something we can creep up to 00:15:17.220 |
where you have humans annotate what's more realistic 00:15:29.820 |
- It's hard because a lot of the other things 00:15:32.180 |
that I mentioned to you, including simulating cameras, 00:15:35.140 |
it is, the thing there is that we know the physics, 00:15:43.780 |
and we can write some rules and we can do that. 00:15:52.780 |
it's very similar to, it's not exactly the same, 00:15:54.920 |
but it's very similar to tracing photon by photon. 00:16:11.580 |
that would go through that, that's gonna be hard. 00:16:21.580 |
you can show the friend test, you can say this or that 00:16:28.740 |
ultimately we're limited by the number of humans 00:16:38.000 |
So that modeling human behavior part is, I think, 00:16:52.480 |
Like you wanna use, so you're building self-driving, 00:16:55.720 |
at the first time, like right after Urban Challenge, 00:17:03.560 |
SLAM algorithms came in, Google was just doing that. 00:17:08.580 |
basically that's about knowing where you are. 00:17:16.160 |
and that started telling us where everybody else is. 00:17:47.120 |
and having some control of how the future unrolls, 00:17:59.220 |
by being either aggressive or less aggressive 00:18:21.620 |
Like so if you see a vehicle that's completely empty 00:18:29.520 |
So you interact with it, like you interact with this table 00:18:39.740 |
there's all kinds of ways of interacting with a human. 00:18:42.100 |
So if, like you and I are face to face, we're very civil, 00:18:45.780 |
we talk and we understand each other for the most part. 00:18:54.740 |
you and I might interact through YouTube comments 00:18:57.580 |
and the conversation may go at a totally different angle. 00:19:01.060 |
And so I think people kind of abusing these autonomous 00:19:09.940 |
you're trying to coordinate your way, make your way, 00:19:13.120 |
it's actually kind of harder than being a human. 00:19:19.680 |
as kind of humans are, but you also, you're a thing. 00:19:23.880 |
So you need to make sure that you can get around 00:19:34.620 |
I've actually personally done quite a few papers, 00:19:37.740 |
both on that kind of game theory and also like this kind of 00:19:42.020 |
understanding people's social value orientation, 00:19:45.140 |
Some people are aggressive, some people not so much. 00:19:48.620 |
And a robot could understand that by just looking 00:19:56.280 |
you can actually understand, like if someone is gonna 00:20:03.100 |
- Well, in terms of predicting what they're going to do, 00:20:12.960 |
Right now, it seems like aggressive is a very dangerous 00:20:15.500 |
thing to do because it's costly from a societal perspective, 00:20:22.700 |
People are not very accepting of aggressive robots 00:20:30.960 |
And so I'm not entirely sure how to go about, 00:20:34.740 |
but I know for a fact that how these robots interact 00:20:40.340 |
and that interaction is always gonna be there. 00:20:42.420 |
I mean, you could be interacting with other vehicles 00:20:44.860 |
or other just people kind of like walking around. 00:20:48.020 |
And like I said, the moment there's nobody in the seat, 00:20:51.740 |
it's like an empty thing just rolling off the street. 00:20:54.220 |
It becomes like no different than any other thing 00:20:59.860 |
And so people, and maybe abuse is the wrong word, 00:21:03.080 |
but people, maybe rightfully even, they feel like, 00:21:13.000 |
And then the robots, they would need to understand it 00:21:16.000 |
and they would need to respond in a certain way. 00:21:21.000 |
quite a few interesting societal questions for us 00:21:23.440 |
as we deploy, like we talked about, robots at large scale. 00:21:26.880 |
So what would happen when we try to deploy robots 00:21:29.320 |
at large scale, I think is that we can design systems 00:21:34.460 |
or we can design them so that they're very sustainable. 00:21:37.120 |
But ultimately the sustainability efficiency trade-offs, 00:21:44.280 |
Like we're not gonna be able to just kind of put it aside. 00:21:54.760 |
or we can be a lot nicer and allow other people 00:21:57.920 |
to kind of quote unquote, own the environment 00:21:59.880 |
and live in a nice place, and then efficiency will drop. 00:22:05.240 |
I think sustainability gets attached to energy consumption 00:22:15.680 |
So you create an environment that people wanna live in. 00:22:19.240 |
And if robots are going around being aggressive, 00:22:22.080 |
you don't wanna live in that environment, maybe. 00:22:24.680 |
However, you should note that if you're not being aggressive 00:22:34.760 |
And I think this choice has always been there 00:22:37.120 |
in transportation, but I think the more autonomy comes in, 00:22:47.160 |
And then we'll get to ask the very difficult societal 00:22:49.980 |
questions of what do we value more, efficiency 00:22:58.080 |
I think that the interesting thing about like 00:23:01.840 |
I think is also kind of, I think a lot of times, 00:23:06.280 |
you know, we have focused on technology development, 00:23:12.380 |
the products somehow followed and then, you know, 00:23:14.560 |
we got to make these choices and things like that. 00:23:18.680 |
we even think about, you know, autonomous taxi type 00:23:21.800 |
of deployments and the systems that would evolve from there. 00:23:25.400 |
And you realize the business models are different, 00:23:28.280 |
the impact on architecture is different, urban planning, 00:23:34.160 |
and then you get into like these issues that you didn't 00:23:37.440 |
think about before, but like sustainability and ethics 00:23:43.800 |
like think about it, you're testing autonomous vehicles 00:23:47.120 |
I mean, the risk may be very small, but still, you know, 00:23:56.180 |
And so then you have that innovation, you know, 00:23:59.720 |
risk trade-off, and you're in there somewhere. 00:24:03.080 |
And we understand that pretty well now is that 00:24:07.580 |
if we don't test, at least the development will be slower. 00:24:12.360 |
I mean, it doesn't mean that we're not gonna be able 00:24:14.560 |
to develop, I think it's gonna be pretty hard actually, 00:24:16.840 |
maybe we can, I don't know, but the thing is that 00:24:20.240 |
those kinds of trade offs we already are making. 00:24:25.480 |
I think those trade offs will just really hit. 00:24:28.860 |
- So you are one of the founders of Optimus Ride, 00:24:33.040 |
an autonomous vehicle company, we'll talk about it. 00:24:35.040 |
But let me, on that point, ask about maybe good examples 00:24:49.760 |
on the spectrum of innovation and safety or caution. 00:24:59.040 |
Waymo represents maybe a more cautious approach. 00:25:24.120 |
and is more likely to succeed in the short term 00:25:33.120 |
But I do think that the thing that is the most important 00:25:43.400 |
like if I were in some place, I wouldn't mind so much, 00:25:51.980 |
And so I think the key is for people to be informed 00:25:55.720 |
and so that they can, ideally, they can make a choice. 00:26:01.920 |
making that unanimously is of course very hard. 00:26:06.360 |
But I don't think it's actually that hard to inform people. 00:26:36.360 |
the other one or whatever the objective for that is 00:26:45.960 |
there are actually two other orthogonal dimensions 00:26:50.220 |
On the one hand, for Waymo, I can see that they're, 00:26:53.160 |
I mean, I think they see it a little bit as research as well. 00:26:57.200 |
So they kind of, I'm not sure if they're like 00:26:59.100 |
really interested in like an immediate product. 00:27:06.160 |
Sometimes there's some pressure to talk about it. 00:27:22.800 |
And autonomous vehicles is a very interesting embodiment 00:27:31.760 |
where everything else is, what everything else is gonna do 00:27:35.520 |
How do you actually interact with humans the right way? 00:27:48.380 |
I mean, I think that they have a good product. 00:27:51.040 |
I think that, you know, like it's not for me, 00:27:59.260 |
but I was just referring to the automation itself. 00:28:02.300 |
I mean, you know, like it kind of drives itself. 00:28:07.500 |
you still have to pay attention to it, right? 00:28:14.060 |
And so people, I think people are willing to pay for it. 00:28:22.560 |
Maybe one of those reasons is that Elon Musk is the CEO. 00:28:25.180 |
And you know, he seems like a visionary person. 00:28:29.140 |
And so that adds like 5K to the value of the car. 00:28:45.460 |
They want to kind of put out a certain approach 00:28:49.360 |
But I think that there's also a primary benefit 00:28:53.100 |
of doing all these updates and rolling it out 00:28:57.540 |
And it's basic, you know, demand, supply, market 00:29:03.580 |
They're happy to pay another 5K, 10K for that novelty 00:29:10.580 |
It's not like they get it and they try it a couple of times. 00:29:12.740 |
It's a novelty, but they use it a lot of the time. 00:29:18.540 |
Like they are on pretty orthogonal dimensions 00:29:20.480 |
of what kind of things that they're building. 00:29:31.420 |
kind of using a similar, almost like an internal 00:29:38.900 |
And maybe one of them is building like a car. 00:29:41.180 |
The other one is building a truck or something. 00:29:42.940 |
So ultimately the use case is very different. 00:29:45.340 |
- So you, like I said, are one of the founders 00:29:58.420 |
What does it take to start an autonomous vehicle company? 00:30:02.300 |
How do you go from idea to deploying vehicles 00:30:04.600 |
like you are in a bunch of places, including New York? 00:30:16.580 |
in the autonomous vehicle industry back in like 2014, even, 00:30:27.220 |
I would hear things like fully autonomous vehicles 00:30:32.720 |
You know, I was a part of MIT's Urban Challenge Entry. 00:30:36.880 |
It kind of like, it has an interesting history. 00:30:44.100 |
sort of a lot of mathematically oriented work. 00:30:46.820 |
And I think I kind of, you know, at some point 00:30:52.660 |
And so I came to MIT's mechanical engineering program. 00:30:55.640 |
And I now realize, I think my advisor hired me 00:31:08.500 |
where we really learned, I mean, what the challenges are 00:31:11.540 |
and what kind of limitations are we up against? 00:31:14.460 |
You know, like having the limitations of computers 00:31:26.400 |
why don't we take more of a market-based approach? 00:31:36.280 |
of like an autonomous vehicle only, I would say. 00:31:41.960 |
But you know, the way we kind of see it is that 00:31:44.600 |
we think that the approach should actually involve humans 00:31:49.160 |
operating them, just not sitting in the vehicle. 00:32:10.940 |
- You're referring to a world of maybe perhaps 00:32:19.660 |
What does it mean for 10 people to control 50 vehicles? 00:32:28.000 |
'Cause what people think then is that a person 00:32:34.420 |
sees, like maybe puts on goggles or something, 00:32:43.300 |
humans are in control, except in certain places, 00:32:49.380 |
And so imagine like a room where people can see 00:32:53.260 |
what the other vehicles are doing and everything. 00:32:56.220 |
And there will be some people who are more like 00:33:00.220 |
air traffic controllers, call them like AV controllers. 00:33:04.300 |
And so these AV controllers would actually see 00:33:08.880 |
And they would understand why vehicles are really confident 00:33:12.420 |
and where they kind of need a little bit more help. 00:33:22.840 |
If you had zero people, they could be very safe, 00:33:27.660 |
And so if you want them to go around 25 miles an hour, 00:33:32.020 |
And for example, the vehicle come to an intersection 00:33:43.660 |
And right now it's clear, I can turn, I know that, 00:33:51.580 |
This doesn't mean necessarily we're doing that actually. 00:33:59.380 |
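A hypothetical sketch of the pattern he's describing, where a vehicle acts on its own when confident and defers to a remote AV controller otherwise. All names and thresholds here are made up for illustration, and, as he notes, this isn't necessarily what Optimus Ride actually does:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    action: str        # e.g. "proceed" or "wait" at an intersection
    confidence: float  # 0.0 .. 1.0, from the vehicle's own planner

class ConsoleController:
    """Stand-in for one human supervising many vehicles."""
    def review(self, vehicle_id, decision):
        print(f"{vehicle_id}: planner suggests '{decision.action}' "
              f"(confidence {decision.confidence:.2f})")
        return "wait"  # cautious default until the human confirms

def resolve(vehicle_id, decision, controller):
    """Act autonomously when confident; otherwise ask the remote AV controller."""
    if decision.confidence >= 0.95:
        return decision.action          # vehicle handles it alone
    return controller.review(vehicle_id, decision)

print(resolve("av-07", Decision("proceed", 0.62), ConsoleController()))
```

The point of the design is that ten such controllers can cover fifty vehicles, because most decisions never reach them.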
you're kind of expecting a person to press a button, 00:34:07.740 |
But I think you need people to be able to set 00:34:12.460 |
That's the other thing with autonomous vehicles. 00:34:14.060 |
I think a lot of people kind of think about it as follows. 00:34:23.300 |
So I think how this is gonna work out is that 00:34:31.660 |
And people kind of tend to think about it that way. 00:34:39.980 |
If asked, you might have said, I don't think I need that, 00:34:43.060 |
or I don't think it should be that and so on. 00:34:45.020 |
And then that becomes the next big thing, quote unquote. 00:34:49.340 |
And so I think that this kind of different ways 00:34:52.500 |
of humans operating vehicles could be really powerful. 00:34:58.260 |
we might open our eyes up to a world in which 00:35:15.380 |
you see a whole bunch of autonomous vehicles, Optimus Ride, 00:35:25.860 |
that we may almost see like a whole mushrooming 00:35:35.420 |
And then one day when your car actually drives itself, 00:35:39.060 |
it may not be all that much of a surprise at all 00:35:41.020 |
because you see it all the time, you interact with them, 00:35:43.220 |
you take the Optimus Ride, hopefully that's your choice. 00:35:52.060 |
I don't know, like you have a little delivery vehicle 00:35:54.260 |
that goes around the sidewalks and delivers you things 00:36:10.740 |
- So there's gonna be a bunch of applications 00:36:17.220 |
Some of which, maybe many of which we don't expect at all. 00:36:20.620 |
So if we look at Optimus Ride, what do you think, 00:36:27.100 |
the one that like really works for people in mobility, 00:36:31.060 |
what do you think Optimus Ride will connect with 00:36:36.260 |
- I think that the first places that I like to target, 00:36:39.140 |
honestly, is like these places where transportation 00:36:46.700 |
So you can imagine like roughly two mile by two mile, 00:36:49.860 |
could be bigger, could be smaller type of an environment. 00:36:53.220 |
And there's a lot of these kinds of environments 00:37:01.140 |
but that was the one that was last publicized. 00:37:06.180 |
So there's not a lot of transportation there. 00:37:13.660 |
ends up being sort of a little too expensive. 00:37:15.900 |
Or when you compare it with operating Uber elsewhere, 00:37:19.820 |
the elsewhere becomes the priority. 00:37:23.020 |
those places become totally transportation deprived. 00:37:29.060 |
and to go from point A to point B inside this place, 00:37:41.300 |
And I think that one of the things that can be done 00:37:56.500 |
in an affordable way, affordable, accessible, 00:38:03.300 |
But I think what also enables is that this kind of effort, 00:38:14.540 |
even for a small environment, like two mile by two mile, 00:38:17.540 |
it doesn't have to be smack in the middle of New York. 00:38:25.020 |
you're looking at billions of dollars of savings 00:38:32.420 |
I mean, the places that we live are like built for cars. 00:38:37.420 |
It didn't look like this just like a hundred years ago. 00:38:41.420 |
Like today, no one walks in the middle of the street. 00:38:49.620 |
And so sometimes they close the road, it happens here. 00:38:52.220 |
You know, like the celebration, they close the road, 00:38:54.420 |
still people don't walk in the middle of the road, 00:38:56.260 |
like just walk in the middle and people don't. 00:39:03.900 |
And I think we talked about sustainability, livability. 00:39:16.300 |
And so I think that's the first thing that we're targeting. 00:39:19.220 |
And I think that we're getting like a really good response, 00:39:21.980 |
both from an economic societal point of view, 00:39:24.660 |
especially places that are a little bit forward looking. 00:39:36.460 |
And so, you know, you get those kinds of people 00:39:41.140 |
sort of making that environment more livable. 00:39:44.020 |
And these kinds of solutions that Optimus Ride provides 00:39:50.460 |
And many of these places that are transportation deprived, 00:40:05.780 |
It's because, you know, like the driver is very expensive 00:40:11.020 |
So what makes sense is to attach 20, 30 seats to a driver. 00:40:20.300 |
We tell them we're going to give you like four-seaters, 00:40:25.140 |
I'm like, you know, you don't need 20-seaters. 00:40:35.580 |
not only you will get delays in transportation, 00:40:40.300 |
It will take a long time to speed up, slow down, and so on. 00:40:45.780 |
So it's kind of like really hard to interact with. 00:40:56.020 |
I mean, just the logistics of getting the vehicle to you 00:40:59.500 |
becomes easier when you have a giant shuttle. 00:41:11.780 |
versus, you know, you have a whole bunch of them, 00:41:14.860 |
you can imagine the route you can still have, 00:41:23.880 |
they could be like, you know, half a mile apart 00:41:31.220 |
when you go out, you won't wait for them for a long time. 00:41:47.740 |
We say, "Why don't you just walk into the vehicle? 00:41:53.540 |
and it gives you a bunch of options of places that you go, 00:41:57.520 |
I mean, people kind of also internalize the apps. 00:42:05.460 |
- But I think one of the things that, you know, 00:42:07.340 |
we really try to do is to take that shuttle experience 00:42:14.580 |
And so I think that's another important thing. 00:42:18.840 |
just like teleoperation, we don't do shuttles. 00:42:21.820 |
You know, we're really kind of thinking of this as a system 00:42:34.220 |
and we want to tilt it into something that people love. 00:42:52.640 |
there'll be two Optimus Ride vehicles within line of sight. 00:42:52.640 |
- Yeah, like for example, that's the density. 00:43:12.840 |
Like you just walk around and you look around, 00:43:21.600 |
Like there's a couple zip codes that, you know. 00:43:25.640 |
because you know, like maybe the couple zip codes. 00:43:31.480 |
but now like we're taking a lot of tangents today. 00:43:35.240 |
- And so I think that this is actually important. 00:43:38.260 |
People call this data density or data velocity. 00:43:40.960 |
So it's very good to collect data in a way that, 00:43:44.280 |
you know, you see the same place so many times. 00:43:47.120 |
Like you can drive 10,000 miles around the country, 00:43:50.720 |
or you drive 10,000 miles in a confined environment. 00:43:54.240 |
You'll see the same intersection hundreds of times. 00:43:56.600 |
And when it comes to predicting what people are gonna do 00:43:59.000 |
in that specific intersection, you become really good at it. 00:44:02.720 |
Versus if you drive like 10,000 miles around the country, 00:44:06.800 |
And so trying to predict what people do becomes hard. 00:44:09.440 |
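A quick back-of-the-envelope on that point (the route length and the spread-out comparison are my illustrative assumptions, not his numbers):

```python
total_miles = 10_000
loop_miles = 4   # a plausible route inside a two-mile-by-two-mile district

loops = total_miles / loop_miles   # how many times the same route is driven
print(f"{loops:.0f} passes through every intersection on the loop")
# versus on the order of one pass per intersection when the same
# 10,000 miles are spread across the whole country
```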
And I think that, you know, you said what is needed. 00:44:17.840 |
Like for example, in good times in Singapore, 00:44:23.280 |
And they are like, you know, 10%, 20% of traffic, 00:44:31.880 |
So that, you know, you get to a certain place 00:44:34.000 |
where you really, the benefits really kick off 00:44:40.760 |
But once you get there, you actually get the benefits. 00:44:45.880 |
People really don't like to wait for themselves. 00:44:50.880 |
But for example, they can wait a lot more for the goods 00:44:56.840 |
and you wanna wait half an hour, that sounds great. 00:45:00.000 |
You're gonna take a cab, you're waiting half an hour. 00:45:14.320 |
And then it's really, it's a good fraction of traffic 00:45:17.880 |
to the point where, you know, you go, you look around 00:45:24.160 |
And it's already waiting for you or something like that. 00:45:28.480 |
If you do it at that scale, like today, for instance, Uber, 00:45:39.360 |
Or drivers would argue that it's a large cut. 00:45:51.300 |
the driver will claim that most of it is their time. 00:46:10.640 |
a fraction of a person is kind of operating the car, 00:46:19.040 |
you realize that the internal combustion engine 00:46:25.100 |
they pass crash tests, they're like really heavy. 00:46:40.720 |
I think the economics really starts to check out, 00:46:43.320 |
like to the point where, I mean, I don't know, 00:46:47.160 |
and it may be less than a dollar to go from A to B. 00:46:50.300 |
As long as you don't change your destination, 00:46:55.700 |
If you share it, if you take another stop somewhere, 00:46:59.560 |
You know, these kinds of things, at least for models, 00:47:12.960 |
like it'll actually be here and have an impact. 00:47:20.000 |
we'll go back to our old friends, Waymo and Tesla. 00:47:23.600 |
So Waymo seems to have sort of technically similar approaches 00:47:36.140 |
they're not as interested in having impact today. 00:47:43.980 |
It's almost more of a research project still, 00:47:47.460 |
meaning they're trying to solve, as far as I understand, 00:47:52.740 |
but they seem to want to do more unrestricted movement, 00:47:57.740 |
meaning move from A to B, where A to B is all over the place, 00:48:01.460 |
versus Optimus Ride is really nicely geo-fenced 00:48:08.380 |
in a particular environment before you expand it. 00:48:11.460 |
And then Tesla is like the complete opposite, 00:48:14.220 |
which is the entirety of the world, actually, 00:48:21.100 |
Highway driving, urban driving, every kind of driving, 00:48:24.720 |
you kind of creep up to it by incrementally improving 00:48:44.580 |
loosely speaking, nobody can predict the future, 00:48:54.140 |
I've heard figures like at the end of this year, right? 00:49:06.040 |
- I mean, first thing to lay out, like everybody else, 00:49:11.660 |
I mean, I don't know where Tesla can look at, 00:49:32.700 |
what exactly is gonna go, especially for like 00:49:35.700 |
I mean, it's just kind of very hard to predict, 00:49:41.560 |
I think a lot of people, you know what they do 00:49:44.220 |
is that there's something that I called a couple times 00:49:47.420 |
time dilation in technology prediction happens. 00:49:53.100 |
There's a lot of things that are so far ahead, 00:49:57.700 |
And there's a lot of things that are actually close, 00:50:01.780 |
People try to kind of look at a whole landscape 00:50:08.260 |
Anything can happen in any order at any time. 00:50:10.660 |
And there's a whole bunch of things in there. 00:50:16.620 |
And so then what happens is that there's some things 00:51:05.700 |
And I think trying to predict like products ahead, 00:51:10.700 |
two, three years, it's hard to know in the following sense. 00:51:17.460 |
but sometimes really you're trying to build something 00:51:33.460 |
And they would just kind of extrapolate that out. 00:51:43.000 |
With AI that goes into the cars, we don't even have that. 00:51:47.500 |
Like we can't, I mean, what can you quantify? 00:51:53.480 |
But so I think when there's that technology gap, 00:52:08.180 |
I think you've actually argued that it's not useful, 00:52:10.100 |
even any answer you provide now is not that useful. 00:52:19.900 |
but this kind of like something like a startup 00:52:34.760 |
This kind of like iterated learning is very important. 00:52:47.380 |
the quote unquote Silicon Valley has done that 00:52:57.020 |
I mean, before, like, you know, you're trying to build, 00:53:00.780 |
I think these companies are building great technology 00:53:07.580 |
And that kind of didn't, wasn't there so much, 00:53:10.760 |
but at least like it was a kind of a technology 00:53:12.660 |
that you could predict to some degree and so on. 00:53:16.700 |
you know, things that it's kind of hard to quantify 00:53:24.740 |
as a leader of graduate students and at Optimus Ride, 00:53:28.820 |
a bunch of brilliant engineers, just curiosity, 00:53:33.060 |
psychologically, do you think it's good to think that, 00:53:37.640 |
you know, whatever technology gap we're talking about 00:53:49.980 |
that everything is going to improve exponentially 00:53:53.940 |
to yourself and to others around you as a leader? 00:53:57.340 |
Or do you want to be more sort of maybe not cynical, 00:54:19.300 |
And that doesn't mean sort of like, you know, 00:54:21.420 |
like you're Optimus Ride, you're kind of doing something, 00:54:27.420 |
I think is also kind of like this kind of notion. 00:54:30.260 |
And, you know, people can go around and say like, 00:54:32.740 |
you know, this year, next year, the other year and so on. 00:54:45.380 |
about what kind of technology that they're providing, 00:54:48.540 |
and not just sort of, you know, if it works very well. 00:54:58.500 |
or at the very least, YouTube videos come out 00:55:00.980 |
on how the summon function works every now and then, 00:55:16.460 |
And I think we're closing some similar technology gaps, 00:55:22.460 |
You know, I think like we talked about, you know, 00:55:26.900 |
or in the kind of environments that we're in, 00:55:48.340 |
that is really important, and that is really key. 00:55:57.360 |
I mean, it's, like I said, it's very hard to predict. 00:56:01.080 |
And I would imagine that it would be good to do 00:56:10.820 |
that, you know, the technology gaps you close, 00:56:13.020 |
and the kind of sort of product that would ensue. 00:56:20.620 |
or, you know, other companies that I get involved in, 00:56:22.900 |
I mean, at some point, you find yourself in a situation 00:56:28.700 |
and people are investing in that, you know, building effort. 00:56:37.500 |
as they compare the investments they wanna make, 00:56:57.340 |
- You know, I gotta sort of throw back right at you, 00:57:00.060 |
criticism, in terms of, you know, like Tesla, 00:57:17.060 |
showing off, you know, showing off some of the awesome stuff, 00:57:20.500 |
the stuff that works and stuff that doesn't work. 00:57:25.180 |
with the tracking of different objects and pedestrians, 00:57:31.500 |
but I think the world would love to see that kind of stuff. 00:57:36.180 |
I think, you know, I should say that it's not like, 00:57:50.460 |
in kind of quote unquote stealth mode for a bit. 00:57:50.460 |
And I think, you know, some of the deployments 00:58:08.900 |
that we kind of announced were some of the first bits 00:58:12.820 |
of information that we kind of put out into the world. 00:58:17.980 |
A lot of the things that we've been developing 00:58:20.740 |
And then, you know, we're gonna start putting that out. 00:58:28.540 |
And I think it's good to not just kind of show them 00:58:32.420 |
when they come to our office for an interview, 00:58:34.020 |
but just put it out there in terms of like, you know, 00:58:43.660 |
So Elon Musk famously said that LIDAR is a crutch. 00:58:47.340 |
So I've talked to a bunch of people about it, 00:58:51.780 |
You used that crutch quite a bit in the DARPA days. 00:58:51.780 |
sort of, you know, more provocative and fun, I think, 00:59:07.540 |
primarily camera-based systems is going to be 00:59:11.620 |
what defines the future of autonomous vehicles. 00:59:15.800 |
LIDAR is a crutch versus primarily camera-based systems? 00:59:24.340 |
in just camera-based autonomous vehicle systems. 00:59:30.380 |
a lot of autonomy and you can do great things. 00:59:33.380 |
And it's very possible that at the timescales, 00:59:36.740 |
like we said, we can't predict 20 years from now, 00:59:40.700 |
like you may be able to do things that we're doing today 00:59:56.660 |
when you can only use cameras and you'll be fine. 00:59:59.840 |
At that time though, it's very possible that, you know, 01:00:03.980 |
you find the LIDAR system as another robustifier 01:00:08.580 |
or it's so affordable that it's stupid not to, you know, 01:00:14.280 |
And I think we may be looking at a future like that. 01:00:19.940 |
- Do you think we're over-relying on LIDAR right now 01:00:23.500 |
because we understand it better, it's more reliable 01:00:26.260 |
in many ways from a safety perspective? 01:00:28.460 |
- It's easier to build with, that's the other thing. 01:00:33.620 |
I mean, you know, we've seen a lot of sort of 01:00:40.980 |
you slap a LIDAR on a car and it's kind of easy to build 01:00:46.820 |
just kind of code it up and you hit the button 01:00:50.740 |
So I think there's admittedly, there's a lot of people 01:00:54.300 |
they focus on the LIDAR 'cause it's easier to build with. 01:00:57.860 |
That doesn't mean that, you know, without the LIDAR, 01:01:00.380 |
just cameras, you cannot do what they're doing, 01:01:05.060 |
And so you need to have certain kind of expertise 01:01:14.220 |
We certainly work on computer vision at Optimus Ride 01:01:17.260 |
a lot, and we've been doing that from day one. 01:01:23.060 |
So, you know, we have a relatively minimal use of LIDARs, 01:01:47.540 |
especially in some isolated environments and cameras, 01:01:51.940 |
In the same future, it's very possible that, you know, 01:01:54.980 |
the LIDARs are so cheap and frankly make the software 01:01:57.980 |
maybe a little less compute intensive at the very least, 01:02:02.540 |
or maybe less complicated so that they can be certified 01:02:09.020 |
that it's kind of stupid not to put the LIDAR. 01:02:11.980 |
Like, imagine this, you either pay money for the LIDAR 01:02:26.180 |
I do think that a lot of the sort of initial deployments 01:02:29.620 |
of self-driving vehicles, I think they will involve LIDARs 01:02:39.180 |
are actually not that hard to build in solid state. 01:02:44.220 |
but like MEMS type of scanning LIDARs and things like that, 01:02:46.900 |
they're like, they're actually not that hard. 01:02:48.620 |
I think they will, maybe kind of playing with the spectrum 01:02:51.460 |
and the phased arrays, they're a little bit harder, 01:02:53.260 |
but I think like putting a MEMS mirror in there 01:02:57.580 |
that kind of scans the environment, it's not hard. 01:03:02.620 |
just like with a lot of the things that we do nowadays 01:03:10.820 |
in when you're trying to scan the environment. 01:03:18.180 |
it's something that you can put in there affordably. 01:03:38.620 |
So can you tell me about the very basics of the challenge 01:03:45.940 |
And there are sort of echoes of the early DARPA challenge 01:03:49.860 |
through the desert in what we're seeing now, 01:03:54.340 |
- Yeah, I mean, one interesting thing about it is that, 01:03:56.860 |
you know, people, drone racing exists as an e-sport. 01:04:03.300 |
but there's a real drone going in an environment. 01:04:06.140 |
- A human being is controlling it with goggles on. 01:04:08.780 |
So there's no, it is a robot, but there's no AI. 01:04:27.340 |
- Of aggressive flight, fully autonomous, aggressive flight. 01:04:36.980 |
I'd love to build autonomous vehicles like drones 01:04:41.020 |
that can go far faster than any human possibly can. 01:04:45.260 |
I think we should recognize that we as humans have, 01:05:00.780 |
but a lot of people kind of think about human level AI. 01:05:03.940 |
And they think that, you know, AI is not human level. 01:05:09.600 |
Versus I think that the situation really is that 01:05:17.820 |
and you know, it gets smarter and smarter and smarter. 01:05:26.060 |
and you know, you have to like react in milliseconds. 01:05:32.940 |
and then that information just flows through your brain, 01:05:41.940 |
You just, just a delay between your eye and your fingers 01:05:46.660 |
is a delay that a robot doesn't have to have. 01:05:58.340 |
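To put rough numbers on that delay (the latencies and speed here are illustrative assumptions, not figures from the conversation):

```python
def meters_before_reacting(speed_mph, latency_s):
    """Distance covered before a reaction even begins."""
    return speed_mph * 0.44704 * latency_s  # mph -> m/s, then times latency

for label, latency in [("human, ~250 ms eye-to-hand", 0.250),
                       ("fast sensing-to-actuation loop, ~5 ms", 0.005)]:
    print(f"{label}: {meters_before_reacting(65, latency):.2f} m at 65 mph")
```

That's roughly seven meters of travel before a human even starts to react, versus centimeters for a tight machine loop.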
like a human eye would barely hit a hundred hertz. 01:06:00.900 |
So imagine things that see stuff in slow motion, 01:06:16.780 |
just in the United States on traffic accidents. 01:06:19.460 |
And many of them are like known cases, you know, 01:06:25.460 |
going into a highway, you hit somebody and you're off, 01:06:35.820 |
And I think if you had enough compute in a car 01:06:38.780 |
and a very fast camera, right at the time of an accident, 01:06:44.860 |
like you could shut down the infotainment system 01:06:51.620 |
you use it for the kind of artificial intelligence 01:07:01.900 |
you can deliver what the human is trying to do. 01:07:07.900 |
not being able to do that with motor skills and the eyes, 01:07:15.100 |
with what I would call high throughput computing. 01:07:26.740 |
however fast you clock it, are typically not enough. 01:07:30.700 |
You need to build those computers from the ground up 01:07:55.020 |
I mean, that's a little bit what Nvidia does. 01:07:57.180 |
Sort of like they almost like build the hardware, 01:08:03.520 |
and it goes back and forth, but it's that co-design. 01:08:11.620 |
that could use a camera image to like track what's moving 01:08:23.540 |
that we're at the limit sometimes of, you know, 01:08:29.500 |
Because, you know, if you really want to track things, 01:08:31.640 |
you want the camera image to be 90% kind of like, 01:08:35.100 |
or somewhat the same from one frame to the next. 01:08:38.940 |
- And why are we at the limit of the camera frame rate? 01:08:48.660 |
or like there's something called camera serial interface 01:08:54.860 |
and copper wires can only transmit so much data. 01:08:58.260 |
And you hit the Shannon limit on copper wires. 01:09:01.140 |
And, you know, you hit yet another kind of universal limit 01:09:11.260 |
You can take compute and put it right next to the pixels. 01:09:23.660 |
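The bandwidth arithmetic behind that (the sensor resolution, bit depth, and frame rate below are assumed for illustration):

```python
def raw_gbps(width, height, bits_per_pixel, fps):
    """Raw sensor data rate in gigabits per second."""
    return width * height * bits_per_pixel * fps / 1e9

# Even a modest ~1-megapixel sensor at 1,000 fps produces ~10 Gbit/s,
# which is on the order of what a handful of camera-serial-interface
# lanes over copper can carry -- hence the push to compute near the pixels.
print(f"{raw_gbps(1280, 800, 10, 1000):.1f} Gbit/s off the sensor")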
- Yeah, you need to do a lot of parallel processing, 01:09:28.660 |
You know, like the data is transferred in parallel somehow. 01:09:40.540 |
How do we make drones see many more frames a second, 01:09:52.820 |
You clock them at, you know, several gigahertz. 01:09:59.860 |
we run into some heating issues and things like that. 01:10:01.660 |
But another thing is that at a three gigahertz clock, 01:10:04.940 |
light travels kind of like on the order of a few inches 01:10:14.420 |
and as the clock signal is going around in the chip, 01:10:21.380 |
the design complexity of the chip becomes so hard. 01:10:23.820 |
I mean, we have hit the fundamental limits of the universe 01:10:30.620 |
It's great, but like, we can't make transistors smaller 01:10:41.700 |
information doesn't travel faster in the universe. 01:10:52.140 |
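The arithmetic behind the "few inches" remark: in one cycle of a 3 GHz clock, light itself covers only

$$\frac{c}{f} = \frac{3 \times 10^8\ \mathrm{m/s}}{3 \times 10^9\ \mathrm{Hz}} = 0.1\ \mathrm{m} \approx 4\ \mathrm{inches},$$

and electrical signals in silicon propagate at only a fraction of that, so the distance a signal can cross per clock cycle is even shorter.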
the way you organize the chip into a CPU or even a GPU, 01:11:03.820 |
but you really almost need to take those transistors, 01:11:07.620 |
so that the information travels on those transistors 01:11:27.500 |
It's exciting for people, you know, students like it. 01:11:39.300 |
just like how impactful seatbelts were in driving. 01:11:46.900 |
next generation autonomous air taxis and things like that. 01:11:51.060 |
but one day we may need to perch these things. 01:11:53.740 |
If you really wanna go from Boston to New York 01:11:59.860 |
Most of these companies that are kind of doing 01:12:01.940 |
quote unquote flying cars, they're focusing on that. 01:12:04.060 |
But then how do you land it on top of a building? 01:12:10.100 |
like perch land, it's just gonna go perch into a building. 01:12:13.980 |
If you wanna do that, like you need these kinds of systems. 01:12:30.620 |
you might as well put some like rocket engines in the back 01:12:33.980 |
You go through the gate and a human looks at it 01:12:38.100 |
And they would say, "It's impossible for me to do that." 01:12:44.060 |
that would, you know, one day steer cars out of accidents. 01:12:48.860 |
- So, but then let's get back to the practical, 01:13:01.340 |
which the DARPA Challenge for the desert did. 01:13:03.500 |
You know, theoretically we had autonomous vehicles, 01:13:09.540 |
first of all, which nobody finished the first year. 01:13:12.300 |
And then the second year just to get, you know, 01:13:21.100 |
So that, let me ask about the Alpha Pilot Challenge. 01:13:29.540 |
But let me ask, reminiscent of the DARPA days, 01:13:39.820 |
I think that depends on how you set up the race course. 01:13:42.380 |
And so if the race course is a slalom course, 01:13:46.220 |
But can you set up some course, like literally some course, 01:13:50.540 |
you get to design it, as the algorithm developer, 01:14:03.700 |
If you let the human that you're competing with 01:14:08.820 |
- So how many in the space of all possible courses 01:14:13.860 |
are, would humans win and would machines win? 01:14:21.860 |
which is like, the DARPA challenge days, right? 01:14:25.660 |
I think we understand, we understood what we wanted to build 01:14:29.820 |
but still building things, that experimentation, 01:14:51.180 |
we'll render other images and beam it back to the drone. 01:14:58.180 |
because now you're trying to fit all that data 01:15:02.620 |
And so you need to do a few crazy things to make that happen 01:15:05.660 |
but once you do that, then at least you can try things. 01:15:09.260 |
If you crash into something, you didn't actually crash. 01:15:24.380 |
and they build a lot of drones and it's okay to crash. 01:15:34.540 |
- That potentially could be the most exciting part. 01:15:47.860 |
but I've seen them trying to do similar things 01:15:54.980 |
and a glows are going right behind the drone, 01:16:03.580 |
about AlphaPilot Challenge where we have these drones 01:16:16.940 |
I don't think you want to tell people crashing is okay. 01:16:20.980 |
but because we don't want people to crash a lot, 01:16:28.740 |
and they're really pushing it to their limits. 01:16:37.540 |
- So in terms of the space of possible courses, 01:16:54.060 |
and in certain courses, like in the middle somewhere, 01:17:02.660 |
the machine gets beaten pretty much consistently, but slightly. 01:17:02.660 |
But if you go through the course like 10 times, 01:17:10.140 |
humans get beaten very slightly, but consistently. 01:17:15.580 |
you get tired and things like that versus this machine 01:17:30.740 |
there's the classical things that everybody has realized. 01:17:35.860 |
If you put in some sort of like strategic thinking, 01:17:44.620 |
Precision is easy to do, so that's what they excel in. 01:17:48.620 |
And also sort of repeatability is easy to do, 01:17:55.020 |
You can build machines that excel in strategy as well 01:18:21.020 |
I have done a lot of work myself in decision-making, 01:18:31.540 |
like there's people who would work on like perception, 01:18:36.700 |
then how do you actually make like decisions? 01:18:38.820 |
And there's people also like how do you interact, 01:18:43.660 |
And I have admittedly worked a lot on the more control 01:18:54.780 |
that has always kind of baffled me is Bellman's equation. 01:18:58.980 |
And so it's this person who realized, like way back, 01:19:12.300 |
that you're kind of jointly trying to determine, 01:19:23.900 |
And it's baffling to me because it both kind of 01:19:30.780 |
'cause it's a single equation that anyone can write down. 01:19:33.900 |
You can teach it in the first course on decision-making. 01:19:37.340 |
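For reference, one standard way to write the equation he's describing, the discounted Bellman optimality equation (textbook notation, not from the conversation):

$$V^*(s) = \max_{a} \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \Big]$$

The value of a state is the best immediate reward plus the discounted expected value of wherever you land next; solving it exactly means sweeping every state, which is exactly the curse of dimensionality he gets to below.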
At the same time, it tells you how computation, 01:19:41.500 |
I feel like a lot of the things that I've done at MIT 01:19:44.260 |
for research has been kind of just this fight 01:19:50.820 |
where we now got to like, let's just redesign this chip. 01:19:56.820 |
But I think it talks about how computationally hard 01:20:07.620 |
And so as the number of variables kind of grow, 01:20:11.100 |
the number of decisions you can make grows rapidly. 01:20:21.300 |
all possible assignments is more than the number of atoms 01:20:33.500 |
theoretically optimal and somehow, practically speaking, 01:20:43.500 |
despite all those challenges, which is quite incredible. 01:20:49.100 |
So I would say that it's kind of like quite baffling, 01:20:52.540 |
actually, in a lot of fields that we think about 01:20:57.140 |
And so I think here too, we know that in the worst case, 01:21:07.300 |
So it's just kind of baffling in decision-making 01:21:12.660 |
Just like how little we know about the beginning of time, 01:21:18.600 |
Like if you actually go from Bellman's equation 01:21:22.260 |
all the way down, I mean, there's also how little we know 01:21:26.180 |
I mean, we don't even know if the axioms are like consistent 01:21:32.580 |
just like as you said, we tend to focus on the worst case 01:21:35.780 |
or the boundaries of everything we're studying. 01:21:38.340 |
And then the average case seems to somehow work out. 01:21:45.220 |
we freak out about a bunch of the traumatic stuff, 01:21:52.980 |
- So Sertac, thank you so much for being a friend, 01:22:04.540 |
and thank you to our presenting sponsor Cash App. 01:22:08.980 |
by downloading Cash App and using code LEXPODCAST. 01:22:12.940 |
If you enjoy this podcast, subscribe on YouTube, 01:22:19.140 |
or simply connect with me on Twitter @lexfridman. 01:22:23.140 |
And now let me leave you with some words from HAL 9000 01:22:31.100 |
"I'm putting myself to the fullest possible use, 01:22:34.220 |
which is all I think that any conscious entity can ever hope to do." 01:22:38.960 |
Thank you for listening and hope to see you next time.