Vijay Kumar: Flying Robots | Lex Fridman Podcast #37

00:00:00.000 | The following is a conversation with Vijay Kumar.

00:00:03.080 | He's one of the top roboticists in the world,

00:00:05.760 | a professor at the University of Pennsylvania,

00:00:08.760 | a Dean of Penn Engineering,

00:00:10.680 | former director of GRASP Lab,

00:00:12.880 | or the General Robotics, Automation, Sensing,

00:00:15.320 | and Perception Laboratory at Penn,

00:00:17.560 | that was established back in 1979, that's 40 years ago.

00:00:22.560 | Vijay is perhaps best known for his work

00:00:25.280 | in multi-robot systems, robot swarms,

00:00:28.520 | and micro aerial vehicles,

00:00:30.880 | robots that elegantly cooperate in flight

00:00:34.040 | under all the uncertainty and challenges

00:00:36.200 | that the real world conditions present.

00:00:38.780 | This is the Artificial Intelligence Podcast.

00:00:41.920 | If you enjoy it, subscribe on YouTube,

00:00:44.320 | give it five stars on iTunes, support it on Patreon,

00:00:47.560 | or simply connect with me on Twitter @lexfridman,

00:00:50.480 | spelled F-R-I-D-M-A-N.

00:00:53.280 | And now, here's my conversation with Vijay Kumar.

00:00:58.380 | What is the first robot you've ever built,

00:01:01.100 | or were a part of building?

00:01:02.820 | - Way back when I was in graduate school,

00:01:04.780 | I was part of a fairly big project

00:01:06.780 | that involved building a very large hexapod.

00:01:11.780 | This weighed close to 7,000 pounds,

00:01:17.020 | and it was powered by hydraulic actuation,

00:01:21.600 | or it was actuated by hydraulics,

00:01:25.260 | with 18 motors, hydraulic motors,

00:01:29.760 | each controlled by an Intel 8085 processor

00:01:34.180 | and an 8086 coprocessor.

00:01:36.660 | And so imagine this huge monster

00:01:42.460 | that had 18 joints,

00:01:44.800 | each controlled by an independent computer,

00:01:46.940 | and there was a 19th computer

00:01:48.500 | that actually did the coordination between these 18 joints.

00:01:52.340 | So I was part of this project,

00:01:53.740 | and my thesis work was,

00:01:57.900 | how do you coordinate the 18 legs?

00:02:01.060 | And in particular, the pressures in the hydraulic cylinders

00:02:06.320 | to get efficient locomotion.

00:02:09.180 | - It sounds like a giant mess.

00:02:11.620 | So how difficult is it to make all the motors communicate?

00:02:14.460 | Presumably, you have to send signals

00:02:16.860 | hundreds of times a second, or at least--

00:02:18.700 | - So this was not my work,

00:02:19.900 | but the folks who worked on this

00:02:22.780 | wrote what I believe to be

00:02:24.180 | the first multiprocessor operating system.

00:02:26.620 | This was in the '80s.

00:02:27.920 | And you had to make sure that,

00:02:31.100 | obviously, messages got across from one joint to another.

00:02:34.620 | You have to remember the clock speeds on those computers

00:02:37.940 | were about half a megahertz.

00:02:39.660 | - Right.

00:02:40.500 | The '80s.

00:02:42.180 | So not to romanticize the notion,

00:02:45.300 | but how did it make you feel to see that robot move?

00:02:49.660 | - It was amazing.

00:02:52.220 | In hindsight, it looks like, well, we built this thing

00:02:55.220 | which really should have been much smaller.

00:02:57.260 | And of course, today's robots are much smaller.

00:02:59.100 | You look at Boston Dynamics or Ghost Robotics,

00:03:03.060 | a spinoff from Penn.

00:03:04.740 | But back then, you were stuck with the substrate you had,

00:03:10.020 | the compute you had, so things were unnecessarily big.

00:03:13.660 | But at the same time, and this is just human psychology,

00:03:18.000 | somehow bigger means grander.

00:03:21.540 | People never had the same appreciation

00:03:23.580 | for nanotechnology or nanodevices

00:03:26.340 | as they do for the Space Shuttle or the Boeing 747.

00:03:30.100 | - Yeah, you've actually done quite a good job

00:03:32.700 | at illustrating that small is beautiful

00:03:35.980 | in terms of robotics.

00:03:37.740 | So what is, on that topic, is the most beautiful

00:03:42.540 | or elegant robot in motion that you've ever seen?

00:03:46.180 | Not to pick favorites or whatever,

00:03:47.840 | but something that just inspires you that you remember.

00:03:50.980 | - Well, I think the thing that I'm most proud of

00:03:53.940 | that my students have done is really think about

00:03:57.140 | small UAVs that can maneuver in constrained spaces

00:04:00.300 | and in particular, their ability to coordinate

00:04:03.580 | with each other and form three-dimensional patterns.

00:04:06.700 | So once you can do that,

00:04:08.880 | you can essentially create 3D objects in the sky

00:04:15.780 | and you can deform these objects on the fly.

00:04:19.780 | So in some sense, your toolbox of what you can create

00:04:23.540 | has suddenly got enhanced.

00:04:25.300 | And before that, we did the two-dimensional version of this.

00:04:29.900 | So we had ground robots forming patterns and so on.

00:04:33.740 | So that was not as impressive, that was not as beautiful.

00:04:37.060 | But if you do it in 3D, suspended in midair,

00:04:40.480 | and you've got to go back to 2011 when we did this.

00:04:43.660 | Now it's actually pretty standard to do these things

00:04:45.980 | eight years later.

00:04:47.780 | But back then it was a big accomplishment.

00:04:49.820 | - So the distributed cooperation

00:04:52.460 | is where beauty emerges in your eyes.

00:04:55.660 | - Well, I think beauty to an engineer is very different

00:04:57.980 | from beauty to someone who's looking at robots

00:05:01.540 | from the outside, if you will.

00:05:03.400 | But what I meant there, so before we said that grand

00:05:06.620 | is associated with size.

00:05:10.460 | And another way of thinking about this

00:05:13.660 | is just the physical shape

00:05:15.580 | and the idea that you can get physical shapes in midair

00:05:18.300 | and have them deform, that's beautiful.

00:05:21.500 | - But the individual components,

00:05:22.980 | the agility is beautiful too, right?

00:05:24.820 | - That is true too.

00:05:25.660 | So then how quickly can you actually manipulate

00:05:28.420 | these three-dimensional shapes

00:05:29.500 | and the individual components?

00:05:31.200 | Yes, you're right.

00:05:32.460 | - By the way, you said UAV, unmanned aerial vehicle.

00:05:36.760 | What's a good term for drones, UAVs, quadcopters?

00:05:41.760 | Is there a term that's being standardized?

00:05:44.580 | - I don't know if there is.

00:05:45.420 | Everybody wants to use the word drones.

00:05:47.900 | And I've often said this, drones to me is a pejorative word.

00:05:51.060 | It signifies something that's dumb,

00:05:53.940 | that's pre-programmed, that does one little thing,

00:05:56.340 | and robots are anything but drones.

00:05:58.620 | So I actually don't like that word,

00:06:00.660 | but that's what everybody uses.

00:06:02.980 | You could call it unpiloted.

00:06:04.860 | - Unpiloted.

00:06:05.780 | - But even unpiloted could be radio-controlled,

00:06:08.100 | could be remotely controlled in many different ways.

00:06:10.700 | And I think the right word is,

00:06:12.940 | thinking about it as an aerial robot.

00:06:15.060 | - You also say agile, autonomous aerial robot, right?

00:06:19.100 | - Yeah, so agility is an attribute,

00:06:20.620 | but they don't have to be.

00:06:22.180 | - So what biological system,

00:06:24.820 | 'cause you've also drawn a lot of inspiration with those.

00:06:27.180 | I've seen bees and ants that you've talked about.

00:06:30.340 | What living creatures have you found to be most inspiring

00:06:35.260 | as an engineer, instructive in your work in robotics?

00:06:38.580 | - To me, so ants are really quite incredible creatures.

00:06:43.580 | I mean, the individuals arguably are very simple

00:06:47.940 | in how they're built,

00:06:50.220 | and yet they're incredibly resilient as a population.

00:06:54.020 | And as individuals, they're incredibly robust.

00:06:56.820 | So if you take an ant, it's six legs,

00:07:00.660 | you remove one leg, it still works just fine.

00:07:04.180 | And it moves along,

00:07:05.820 | and I don't know that it even realizes it's lost a leg.

00:07:08.780 | So that's the robustness at the individual ant level.

00:07:12.500 | But then you look at this instinct

00:07:15.420 | for self-preservation of the colonies,

00:07:17.740 | and they adapt in so many amazing ways.

00:07:20.460 | Traversing gaps,

00:07:24.620 | by just chaining themselves together when you have a flood,

00:07:30.460 | being able to recruit other teammates

00:07:33.180 | to carry big morsels of food.

00:07:36.220 | And then going out in different directions,

00:07:38.260 | looking for food,

00:07:39.580 | and then being able to demonstrate consensus,

00:07:43.900 | even though they don't communicate directly with each other

00:07:47.820 | the way we communicate with each other,

00:07:49.940 | in some sense, they also know how to do democracy

00:07:53.820 | probably better than what we do.

00:07:55.460 | - Yeah, somehow it's that even democracy is emergent.

00:07:58.980 | It seems like all of the phenomena

00:08:00.580 | that we see is all emergent.

00:08:02.420 | It seems like there's no centralized communicator.

00:08:05.540 | - There is, so I think a lot is made

00:08:07.220 | about that word emergent,

00:08:09.820 | and it means lots of things to different people.

00:08:11.540 | But you're absolutely right.

00:08:12.580 | I think as an engineer,

00:08:14.060 | you think about what elemental behaviors,

00:08:19.060 | what primitives you could synthesize

00:08:22.780 | so that the whole looks incredibly powerful,

00:08:26.700 | incredibly synergistic,

00:08:27.980 | the whole definitely being greater than the sum of the parts,

00:08:31.020 | and ants are living proof of that.

00:08:32.900 | - So when you see these beautiful swarms,

00:08:36.340 | whether they're biological systems or robots,

00:08:38.860 | do you sometimes think of them

00:08:41.620 | as a single individual living intelligent organism?

00:08:45.960 | So it's the same as thinking of our human civilization

00:08:49.460 | as one organism?

00:08:51.140 | Or do you still, as an engineer,

00:08:52.940 | think about the individual components

00:08:54.580 | and all the engineering

00:08:55.420 | that went into the individual components?

00:08:57.300 | - Well, that's very interesting.

00:08:58.620 | So again, philosophically as engineers,

00:09:01.460 | what we want to do is to go beyond the individual components,

00:09:06.460 | the individual units,

00:09:08.260 | and think about it as a unit, as a cohesive unit,

00:09:11.500 | without worrying about the individual components.

00:09:15.100 | If you start obsessing about the individual building blocks

00:09:20.100 | and what they do,

00:09:22.100 | you inevitably will find it hard to scale up.

00:09:27.900 | Just mathematically,

00:09:28.940 | just think about individual things you want to model,

00:09:31.540 | and if you want to have 10 of those,

00:09:33.980 | then you essentially are taking

00:09:35.420 | Cartesian products of 10 things,

00:09:37.540 | and that makes it really complicated.

00:09:39.260 | Then to do any kind of synthesis or design

00:09:41.780 | in that high-dimensional space is really hard.

00:09:44.140 | So the right way to do this

00:09:45.820 | is to think about the individuals in a clever way

00:09:49.060 | so that at the higher level,

00:09:51.140 | when you look at lots and lots of them,

00:09:53.420 | abstractly you can think of them

00:09:55.340 | in some low-dimensional space.
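
The scaling argument above can be sketched numerically. This is only an illustration under stated assumptions: the 100-cell grid, the robot counts, and the choice of mean plus covariance as the low-dimensional summary are hypothetical and are not taken from the conversation.

```python
# Illustrative only: the grid size, the robot counts, and the choice of
# "mean + covariance" as the low-dimensional summary are assumptions.


def joint_state_count(cells_per_robot: int, num_robots: int) -> int:
    """Size of the Cartesian product of the individual state sets."""
    return cells_per_robot ** num_robots


def swarm_summary(positions):
    """Summarize 2D positions by their mean and covariance entries.

    Returns 5 numbers (mean_x, mean_y, var_x, var_y, cov_xy) no matter
    how many robots are in the list.
    """
    n = len(positions)
    mean_x = sum(p[0] for p in positions) / n
    mean_y = sum(p[1] for p in positions) / n
    var_x = sum((p[0] - mean_x) ** 2 for p in positions) / n
    var_y = sum((p[1] - mean_y) ** 2 for p in positions) / n
    cov_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in positions) / n
    return (mean_x, mean_y, var_x, var_y, cov_xy)


if __name__ == "__main__":
    # The joint model grows exponentially with the number of robots.
    for n in (1, 2, 10):
        print(n, "robots ->", joint_state_count(100, n), "joint states")
    # The summary stays at five numbers regardless of group size.
    print(swarm_summary([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]))
```

The joint model grows as a power of the number of robots, while the summary has a fixed size, which is one way to read "some low-dimensional space."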

00:09:57.100 | - So what does that involve?

00:09:58.660 | For the individual,

00:10:00.060 | do you have to try to make the way they see the world

00:10:03.300 | as local as possible?

00:10:05.140 | And the other thing,

00:10:06.420 | do you just have to make them robust to collisions?

00:10:09.540 | Like you said with the ants,

00:10:10.860 | if something fails, the whole swarm doesn't fail.

00:10:15.300 | - Right, I think as engineers, we do this.

00:10:17.740 | I mean, you know, think about we build planes

00:10:19.740 | or we build iPhones,

00:10:21.260 | and we know that by taking individual components,

00:10:26.260 | well-engineered components with well-specified interfaces

00:10:30.060 | that behave in a predictable way,

00:10:31.660 | you can build complex systems.

00:10:33.540 | So that's ingrained, I would claim,

00:10:36.860 | in most engineers' thinking.

00:10:39.380 | And it's true for computer scientists as well.

00:10:41.580 | I think what's different here

00:10:42.900 | is that you want the individuals

00:10:46.740 | to be robust in some sense,

00:10:49.460 | as we do in these other settings,

00:10:51.980 | but you also want some degree of resiliency

00:10:54.500 | for the population.

00:10:56.300 | And so you really want them to be able

00:10:58.700 | to reestablish communication with their neighbors.

00:11:03.700 | You want them to rethink their strategy for group behavior.

00:11:08.860 | You want them to reorganize.

00:11:11.020 | And that's where I think a lot of the challenges lie.

00:11:16.140 | - So just at a high level,

00:11:18.380 | what does it take for a bunch of,

00:11:22.460 | what you would call them,

00:11:23.540 | flying robots to create a formation?

00:11:26.900 | Just for people who are not familiar

00:11:28.900 | with robotics in general,

00:11:31.260 | how much information is needed?

00:11:32.980 | How do you even make it happen

00:11:36.060 | without a centralized controller?

00:11:39.740 | - So, I mean, there are a couple of different ways

00:11:41.300 | of looking at this.

00:11:43.380 | If you are a purist,

00:11:45.900 | you think of it as a way of recreating what nature does.

00:11:51.580 | So nature forms groups for several reasons,

00:11:56.580 | but mostly it's because of this instinct

00:12:00.460 | that organisms have of preserving their colonies,

00:12:05.420 | their population.

00:12:06.780 | Which means what?

00:12:09.380 | You need shelter, you need food,

00:12:11.260 | you need to procreate,

00:12:12.860 | and that's basically it.

00:12:14.660 | So the kinds of interactions you see are all organic.

00:12:18.260 | They're all local.

00:12:19.580 | And the only information that they share,

00:12:23.380 | and mostly it's indirectly,

00:12:25.460 | is to, again, preserve the herd or the flock

00:12:29.220 | or the swarm,

00:12:30.460 | and either by looking for new sources of food

00:12:36.860 | or looking for new shelters, right?

00:12:38.660 | - Right.

00:12:39.500 | - As engineers, when we build swarms,

00:12:43.380 | we have a mission.

00:12:44.660 | And when you think about it,

00:12:48.260 | and when you think of a mission,

00:12:50.740 | and it involves mobility,

00:12:54.340 | most often it's described

00:12:56.100 | in some kind of a global coordinate system.

00:12:58.820 | As a human, as an operator, as a commander,

00:13:03.020 | or as a collaborator,

00:13:05.300 | I have my coordinate system,

00:13:07.100 | and I want the robots to be consistent with that.

00:13:10.140 | So I might think of it slightly differently.

00:13:14.700 | I might want the robots

00:13:16.020 | to recognize that coordinate system,

00:13:18.900 | which means not only do they have to think locally

00:13:21.300 | in terms of who their immediate neighbors are,

00:13:23.100 | but they have to be cognizant

00:13:24.580 | of what the global environment looks like.

00:13:28.300 | So if I say, "Surround this building

00:13:30.980 | "and protect this from intruders,"

00:13:33.260 | well, they're immediately

00:13:34.580 | in a building-centered coordinate system,

00:13:36.460 | and I have to tell them where the building is.

00:13:38.700 | - And they're globally collaborating

00:13:40.020 | on the map of that building.

00:13:41.300 | They're maintaining some kind of global,

00:13:44.180 | not just in the frame of the building,

00:13:45.500 | but there's information

00:13:47.460 | that's ultimately being built up explicitly

00:13:49.700 | as opposed to kind of implicitly, like nature might.

00:13:54.380 | - Correct, correct.

00:13:55.220 | So in some sense, nature is very, very sophisticated,

00:13:57.660 | but the tasks that nature solves or needs to solve

00:14:01.860 | are very different from the kind of engineered tasks,

00:14:05.140 | artificial tasks that we are forced to address.

00:14:09.740 | And again, there's nothing preventing us

00:14:12.540 | from solving these other problems,

00:14:15.140 | but ultimately it's about impact.

00:14:16.580 | You want these swarms to do something useful.

00:14:19.340 | And so you're kind of driven into this very unnatural,

00:14:24.340 | if you will, unnatural meaning,

00:14:26.260 | not like how nature does, setting.

00:14:29.140 | - And it's probably a little bit more expensive

00:14:31.900 | to do it the way nature does,

00:14:33.740 | because nature is less sensitive

00:14:37.540 | to the loss of the individual,

00:14:39.460 | and cost-wise in robotics,

00:14:42.260 | I think you're more sensitive to losing individuals.

00:14:45.500 | - I think that's true.

00:14:46.940 | Although if you look at the price to performance ratio

00:14:50.100 | of robotic components, it's coming down dramatically.

00:14:53.980 | - Oh, interesting.

00:14:54.820 | - It continues to come down.

00:14:56.020 | So I think we're asymptotically approaching the point

00:14:58.900 | where we would get, yeah,

00:14:59.940 | the cost of individuals would really become insignificant.

00:15:04.940 | - So let's step back at a high level of view,

00:15:07.620 | the impossible question of what kind of,

00:15:11.660 | as an overview,

00:15:12.500 | what kind of autonomous flying vehicles are there

00:15:15.300 | in general?

00:15:16.220 | - I think the ones that receive a lot of notoriety

00:15:19.700 | are obviously the military vehicles.

00:15:22.540 | Military vehicles are controlled by a base station,

00:15:26.260 | but have a lot of human supervision,

00:15:29.660 | but have limited autonomy,

00:15:31.800 | which is the ability to go from point A to point B,

00:15:34.760 | and even the more sophisticated vehicles

00:15:38.300 | can do autonomous takeoff and landing.

00:15:41.740 | - And those usually have wings and they're heavy.

00:15:44.380 | - Usually they have wings,

00:15:45.340 | but there's nothing preventing us

00:15:46.640 | from doing this for helicopters as well.

00:15:49.060 | There are many military organizations

00:15:52.500 | that have autonomous helicopters in the same vein.

00:15:56.540 | And by the way, you look at autopilots and airplanes,

00:16:00.080 | and it's actually very similar.

00:16:02.820 | In fact, one interesting question we can ask is,

00:16:07.180 | if you look at all the air safety violations,

00:16:12.180 | all the crashes that occurred,

00:16:14.100 | would they have happened if the plane were truly autonomous?

00:16:18.660 | And I think you'll find that in many of the cases,

00:16:21.980 | because of pilot error, we make silly decisions.

00:16:24.620 | And so in some sense, even in air traffic,

00:16:26.980 | commercial air traffic, there's a lot of applications,

00:16:29.820 | although we only see autonomy being enabled

00:16:33.980 | at very high altitudes when the plane is on autopilot.

00:16:38.980 | - There's still a role for the human,

00:16:42.580 | and that kind of autonomy is, you're kind of implying,

00:16:47.580 | I don't know what the right word is,

00:16:48.700 | but it's a little dumber than it could be.

00:16:52.660 | - Right, so in the lab, of course,

00:16:55.740 | we can afford to be a lot more aggressive.

00:16:59.240 | And the question we try to ask is,

00:17:04.240 | can we make robots that will be able to make decisions

00:17:09.600 | without any kind of external infrastructure?

00:17:13.680 | So what does that mean?

00:17:14.880 | So the most common piece of infrastructure

00:17:16.960 | that airplanes use today is GPS.

00:17:19.640 | GPS is also the most brittle form of information.

00:17:26.680 | If you have driven in a city, tried to use GPS navigation,

00:17:30.240 | you know, in tall buildings, you immediately lose GPS.

00:17:33.720 | And so that's not a very sophisticated way

00:17:36.360 | of building autonomy.

00:17:37.880 | I think the second piece of infrastructure

00:17:39.600 | they rely on is communications.

00:17:41.960 | Again, it's very easy to jam communications.

00:17:46.220 | In fact, if you use Wi-Fi,

00:17:49.680 | you know that Wi-Fi signals drop out,

00:17:51.880 | cell signals drop out.

00:17:53.560 | So to rely on something like that is not good.

00:17:56.820 | The third form of infrastructure we use,

00:18:01.240 | and I hate to call it infrastructure,

00:18:02.960 | but it is that in the sense of robots, is people.

00:18:06.400 | So you can rely on somebody to pilot you.

00:18:08.660 | And so the question you want to ask is,

00:18:11.600 | if there are no pilots,

00:18:13.400 | if there's no communications with any base station,

00:18:16.220 | if there's no knowledge of position,

00:18:18.740 | and if there's no a priori map,

00:18:21.680 | a priori knowledge of what the environment looks like,

00:18:24.880 | a priori model of what might happen in the future,

00:18:28.280 | can robots navigate?

00:18:29.560 | So that is true autonomy.

00:18:31.480 | - So that's true autonomy.

00:18:33.200 | And we're talking about, you mentioned,

00:18:35.040 | like military application of drones.

00:18:36.880 | Okay, so what else is there?

00:18:38.300 | You talk about agile, autonomous flying robots,

00:18:42.060 | aerial robots.

00:18:43.520 | So that's a different kind of,

00:18:45.680 | it's not winged, it's not big, at least it's small.

00:18:48.160 | - So I use the word agility mostly,

00:18:50.820 | or at least we're motivated to do agile robots,

00:18:53.520 | mostly because robots can operate

00:18:58.000 | and should be operating in constrained environments.

00:19:01.160 | And if you want to operate the way a Global Hawk operates,

00:19:07.000 | I mean, the kinds of conditions in which you operate

00:19:09.140 | are very, very restrictive.

00:19:10.780 | If you want to go inside a building,

00:19:13.760 | for example, for search and rescue,

00:19:15.600 | or to locate an active shooter,

00:19:18.160 | or you want to navigate under the canopy in an orchard

00:19:22.140 | to look at health of plants,

00:19:23.900 | or to look for, to count fruits,

00:19:28.280 | to measure the tree trunks.

00:19:31.300 | These are things we do, by the way.

00:19:33.300 | - Yeah, some cool agriculture stuff you've shown in the past

00:19:35.980 | is really awesome.

00:19:36.820 | - Right, so in those kinds of settings,

00:19:39.140 | you do need that agility.

00:19:40.380 | Agility does not necessarily mean

00:19:42.580 | you break records for the 100-meter dash.

00:19:45.460 | What it really means is you see the unexpected,

00:19:48.040 | and you're able to maneuver in a safe way,

00:19:51.500 | and in a way that gets you the most information

00:19:55.460 | about the thing you're trying to do.

00:19:57.700 | - By the way, you may be the only person

00:20:00.500 | who in a TED Talk has used a math equation,

00:20:04.280 | which is amazing.

00:20:05.460 | People should go see one of your TED Talks.

00:20:07.660 | - Actually, it's very interesting,

00:20:08.860 | 'cause the TED curator, Chris Anderson, told me,

00:20:13.540 | "You can't show math."

00:20:16.040 | I thought about it, but that's who I am.

00:20:18.280 | I mean, that's our work.

00:20:20.840 | And so I felt compelled to give the audience a taste

00:20:25.840 | for at least some math.

00:20:27.680 | - So on that point, simply,

00:20:31.240 | what does it take to make a thing with four motors fly,

00:20:36.000 | a quadcopter, one of these little flying robots?

00:20:40.700 | How hard is it to make it fly?

00:20:44.040 | How do you coordinate the four motors?

00:20:46.620 | What's, how do you convert those motors

00:20:50.840 | into actual movement?

00:20:52.640 | - So this is an interesting question.

00:20:54.840 | We've been trying to do this since 2000.

00:20:58.120 | It is a commentary on the sensors

00:21:00.620 | that were available back then,

00:21:02.120 | the computers that were available back then.

00:21:04.320 | And a number of things happened between 2000 and 2007.

00:21:11.600 | One is the advances in computing, which is,

00:21:15.520 | so we all know about Moore's law,

00:21:16.800 | but I think 2007 was a tipping point,

00:21:19.720 | the year of the iPhone, the year of the cloud.

00:21:22.760 | Lots of things happened in 2007.

00:21:24.680 | But going back even further,

00:21:27.600 | inertial measurement units as a sensor really matured.

00:21:31.400 | Again, lots of reasons for that.

00:21:33.080 | Certainly there's a lot of federal funding,

00:21:35.440 | particularly DARPA in the US,

00:21:38.360 | but they didn't anticipate this boom in IMUs.

00:21:42.800 | But if you look, subsequently what happened

00:21:46.600 | is that every car manufacturer had to put an airbag in,

00:21:50.080 | which meant you had to have an accelerometer on board.

00:21:52.680 | And so that drove down the price to performance ratio.

00:21:55.080 | - Wow, I never, I should know this.

00:21:56.920 | That's very interesting.

00:21:57.760 | That's very interesting, the connection there.

00:21:59.440 | - And that's why research is very,

00:22:01.360 | it's very hard to predict the outcomes.

00:22:03.320 | And again, the federal government spent a ton of money

00:22:07.720 | on things that they thought were useful for resonators,

00:22:12.360 | but it ended up enabling these small UAVs,

00:22:16.320 | which is great, 'cause I could have never raised

00:22:17.920 | that much money and sold this project,

00:22:20.800 | hey, we want to build these small UAVs,

00:22:22.240 | can you actually fund the development of low-cost IMUs?

00:22:25.480 | - So why do you need an IMU on a UAV?

00:22:27.640 | - So I'll come back to that,

00:22:30.360 | but so in 2007, 2008, we were able to build these,

00:22:33.360 | and then the question you're asking was a good one,

00:22:35.240 | how do you coordinate the motors?

00:22:37.720 | To develop this, but over the last 10 years,

00:22:42.360 | everything is commoditized.

00:22:43.880 | A high school kid today can pick up a Raspberry Pi kit

00:22:47.880 | and build this, all the low-level functionality

00:22:52.120 | is all automated.

00:22:53.200 | But basically at some level, you have to drive the motors

00:22:59.160 | at the right RPMs, the right velocity,

00:23:04.560 | in order to generate the right amount of thrust,

00:23:07.480 | in order to position it and orient it

00:23:09.960 | in a way that you need to in order to fly.

00:23:12.840 | The feedback that you get is from onboard sensors,

00:23:16.640 | and the IMU is an important part of it.

00:23:18.400 | The IMU tells you what the acceleration is,

00:23:23.400 | as well as what the angular velocity is,

00:23:26.400 | and those are important pieces of information.

00:23:29.220 | In addition to that, you need some kind of local position

00:23:34.200 | or velocity information.

00:23:36.480 | For example, when we walk,

00:23:39.320 | we implicitly have this information

00:23:41.520 | because we kind of know what our stride length is.

00:23:45.800 | We also are looking at images fly past our retina,

00:23:51.440 | if you will, and so we can estimate velocity.

00:23:54.240 | We also have accelerometers in our head,

00:23:56.320 | and we're able to integrate all these pieces of information

00:23:59.120 | to determine where we are as we walk.

00:24:02.320 | And so robots have to do something very similar.

00:24:04.280 | You need an IMU, you need some kind of a camera

00:24:08.120 | or other sensor that's measuring velocity,

00:24:11.580 | and then you need some kind of a global reference frame

00:24:15.760 | if you really want to think about doing something

00:24:19.480 | in a world coordinate system.

00:24:21.260 | And so how do you estimate your position

00:24:23.640 | with respect to that global reference frame?

00:24:25.160 | That's important as well.

00:24:26.520 | - So coordinating the RPMs of the four motors

00:24:29.480 | is what allows you to first of all fly and hover,

00:24:32.640 | and then you can change the orientation

00:24:35.560 | and the velocity and so on.

00:24:37.600 | - Exactly, exactly.

00:24:38.440 | - So there's a bunch of degrees of freedom

00:24:40.280 | that you're playing with.

00:24:41.120 | - There's six degrees of freedom,

00:24:42.200 | but you only have four inputs, the four motors.

00:24:44.920 | And it turns out to be a remarkably versatile configuration.

00:24:49.920 | You think at first, well, I only have four motors,

00:24:53.080 | how do I go sideways?

00:24:55.000 | But it's not too hard to say, well, if I tilt myself,

00:24:57.280 | I can go sideways.

00:24:59.160 | And then you have four motors pointing up,

00:25:01.200 | how do I rotate in place about a vertical axis?

00:25:05.400 | Well, you rotate them at different speeds

00:25:07.840 | and that generates reaction moments

00:25:09.720 | and that allows you to turn.

00:25:11.560 | So it's actually a pretty, it's an optimal configuration

00:25:14.960 | from an engineer's standpoint.

00:25:17.060 | It's very simple, very cleverly done and very versatile.
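
The four-inputs description above can be sketched as a simple rotor-speed model. Everything specific here is an assumption for illustration: the plus-shaped rotor layout, the sign conventions, and the coefficients `K_F`, `K_M`, and `ARM` are hypothetical values, not parameters of any vehicle mentioned in the conversation.

```python
import math

# Assumed, illustrative constants (not from the conversation).
K_F = 6.0e-8  # thrust coefficient, N per rpm^2
K_M = 1.5e-9  # reaction (drag) moment coefficient, N*m per rpm^2
ARM = 0.1     # distance from the center to each rotor, m


def wrench_from_rpm(rpms):
    """Map four rotor speeds to (thrust, roll, pitch, yaw moment).

    Assumed layout: rotor 1 front, 2 left, 3 back, 4 right, with rotors
    1 and 3 spinning opposite to rotors 2 and 4.
    """
    w1, w2, w3, w4 = (r * r for r in rpms)
    thrust = K_F * (w1 + w2 + w3 + w4)  # all rotors push up
    roll = ARM * K_F * (w2 - w4)        # left/right imbalance tilts sideways
    pitch = ARM * K_F * (w3 - w1)       # back/front imbalance tilts forward
    yaw = K_M * (w1 - w2 + w3 - w4)     # unbalanced reaction moments turn it
    return thrust, roll, pitch, yaw


def hover_rpm(mass_kg, g=9.81):
    """Rotor speed at which total thrust equals weight."""
    return math.sqrt(mass_kg * g / (4.0 * K_F))


if __name__ == "__main__":
    h = hover_rpm(0.5)
    print("hover:", wrench_from_rpm([h, h, h, h]))
    print("yaw:  ", wrench_from_rpm([h + 200, h - 200, h + 200, h - 200]))
    print("roll: ", wrench_from_rpm([h, h + 200, h, h - 200]))
```

Equal speeds give pure thrust, speeding up one spin-direction pair produces a yaw moment, and a left/right imbalance produces the tilt used to move sideways.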

00:25:22.960 | - So if you could step back to a time,

00:25:27.280 | so I've always known flying robots as,

00:25:30.120 | to me it was natural that the quadcopter should fly.

00:25:35.840 | But when you first started working with it,

00:25:38.000 | I mean, how surprised are you that you can make,

00:25:42.080 | do so much with the four motors?

00:25:45.560 | How surprising is it you can make this thing fly,

00:25:47.640 | first of all, that you can make it hover,

00:25:49.800 | then you can add control to it?

00:25:52.040 | - Firstly, this is not,

00:25:54.460 | the four motor configuration is not ours.

00:25:56.860 | It has at least a hundred year history.

00:26:00.100 | - Oh, it does. - And various people,

00:26:02.480 | various people try to get quadrotors to fly

00:26:06.300 | without much success.

00:26:07.680 | As I said, we've been working on this since 2000.

00:26:11.560 | Our first designs were, well, this is way too complicated.

00:26:15.200 | Why not we try to get an omnidirectional flying robot?

00:26:19.200 | So our early designs, we had eight rotors.

00:26:22.800 | And so these eight rotors were arranged uniformly

00:26:26.120 | on a sphere, if you will.

00:26:28.900 | So you can imagine a symmetric configuration.

00:26:31.380 | And so you should be able to fly anywhere.

00:26:34.180 | But the real challenge we had

00:26:35.700 | is the strength to weight ratio was not enough.

00:26:37.900 | And of course we didn't have the sensors and so on.

00:26:41.260 | So everybody knew, or at least the people

00:26:43.860 | who worked with rotorcraft knew,

00:26:45.700 | four rotors will get it done.

00:26:47.320 | So that was not our idea.

00:26:50.220 | But it took a while before we could actually do

00:26:53.500 | the onboard sensing and the computation that was needed

00:26:57.740 | for the kinds of agile maneuvering

00:27:00.400 | that we wanted to do in our little aerial robots.

00:27:03.820 | And that only happened between 2007 and 2009 in our lab.

00:27:08.340 | - Yeah, and you have to send the signal

00:27:10.700 | many hundred times a second.

00:27:13.240 | So the compute there,

00:27:14.980 | is everything has to come down in price.

00:27:16.740 | And what are the steps of getting from point A to point B?

00:27:21.740 | So we just talked about like local control.

00:27:25.860 | But if all the kind of cool dancing in the air

00:27:30.860 | that I've seen you show, how do you make it happen?

00:27:35.160 | Make a trajectory, first of all, okay,

00:27:39.700 | figure out a trajectory, so plan a trajectory.

00:27:42.340 | And then how do you make that trajectory happen?

00:27:45.060 | - Yeah, I think planning is a very fundamental problem

00:27:47.300 | in robotics.

00:27:48.140 | I think 10 years ago, it was an esoteric thing.

00:27:50.820 | But today with self-driving cars,

00:27:53.060 | everybody can understand this basic idea

00:27:55.860 | that a car sees a whole bunch of things

00:27:57.940 | and it has to keep a lane or maybe make a right turn

00:28:00.340 | or switch lanes.

00:28:01.300 | It has to plan a trajectory.

00:28:02.700 | It has to be safe, it has to be efficient.

00:28:04.880 | So everybody's familiar with that.

00:28:06.660 | That's kind of the first step that you have to think about

00:28:10.300 | when you say autonomy.

00:28:14.860 | And so for us, it's about finding smooth motions,

00:28:19.140 | motions that are safe.

00:28:21.340 | So we think about these two things.

00:28:22.900 | One is optimality, one is safety.

00:28:24.700 | Clearly, you cannot compromise safety.

00:28:27.220 | So you're looking for safe, optimal motions.

00:28:31.380 | The other thing you have to think about is

00:28:34.480 | can you actually compute a reasonable trajectory

00:28:38.160 | in a small amount of time?

00:28:40.740 | 'Cause you have a time budget.

00:28:42.300 | So the optimal becomes suboptimal.

00:28:45.180 | But in our lab, we focus on synthesizing smooth trajectories

00:28:50.180 | that satisfy all the constraints.

00:28:53.020 | In other words, don't violate any safety constraints.

00:28:57.200 | And is as efficient as possible.

00:29:02.860 | And when I say efficient, it could mean

00:29:05.220 | I want to get from point A to point B

00:29:06.600 | as quickly as possible.

00:29:08.340 | Or I want to get to it as gracefully as possible.

00:29:12.820 | Or I want to consume as little energy as possible.

00:29:15.940 | - But always staying within the safety constraints.

00:29:18.180 | - But, yes, always finding a safe trajectory.
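
The trade-off described here (smooth motion, hard safety limits, and "as quickly as possible" within those limits) can be illustrated with a small sketch. It assumes a one-dimensional minimum-jerk polynomial as the notion of smoothness and treats speed and acceleration caps as the safety constraints; the conversation does not say which formulation the lab uses, and all function names and limits below are hypothetical.

```python
import math

def min_jerk_position(x0, xf, T, t):
    """Position at time t on a minimum-jerk path from x0 to xf lasting T seconds.

    The path starts and ends at rest (zero velocity and acceleration).
    """
    s = min(max(t / T, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return x0 + (xf - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

def shortest_safe_duration(distance, v_max, a_max):
    """Smallest duration T whose peak speed and acceleration stay within limits.

    For this polynomial the peak speed is 1.875 * d / T and the peak
    acceleration is (10 / sqrt(3)) * d / T**2, so each limit gives a lower
    bound on T; the larger bound is the fastest trajectory that is still safe.
    """
    d = abs(distance)
    t_from_speed = 1.875 * d / v_max
    t_from_accel = math.sqrt((10 / math.sqrt(3)) * d / a_max)
    return max(t_from_speed, t_from_accel)

# Hypothetical example: move 2 m with assumed caps of 1 m/s and 100 m/s^2.
T = shortest_safe_duration(2.0, v_max=1.0, a_max=100.0)
midpoint = min_jerk_position(0.0, 2.0, T, T / 2)
```

Because the duration has a closed form, this toy version fits easily inside a tight time budget; a real planner with obstacles would need an optimization step instead.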

00:29:22.780 | - So there's a lot of excitement and progress

00:29:24.980 | in the field of machine learning.

00:29:26.580 | - Yes.

00:29:27.420 | - And reinforcement learning and the neural network variant

00:29:31.700 | of that with deep reinforcement learning.

00:29:33.900 | Do you see a role of machine learning in...

00:29:37.100 | So a lot of the success of flying robots

00:29:40.540 | did not rely on machine learning.

00:29:42.260 | Except for maybe a little bit of the perception

00:29:44.980 | on the computer vision side.

00:29:46.540 | On the control side and the planning,

00:29:48.380 | do you see there's a role in the future

00:29:50.340 | for machine learning?

00:29:51.620 | - So let me disagree a little bit with you.

00:29:53.780 | I think we never perhaps called out,

00:29:56.180 | in my work, called out learning.

00:29:57.700 | But even this very simple idea of being able

00:30:00.100 | to fly through a constrained space.

00:30:02.180 | The first time you try it, you'll invariably,

00:30:07.660 | you might get it wrong if the task is challenging.

00:30:10.420 | And the reason is, to get it perfectly right,

00:30:14.180 | you have to model everything in the environment.

00:30:16.660 | And flying is notoriously hard to model.

00:30:22.120 | There are aerodynamic effects that we constantly discover.

00:30:27.120 | Even just before I was talking to you,

00:30:31.460 | I was talking to a student about how blades flap

00:30:35.600 | when they fly.

00:30:37.100 | - Wow.

00:30:37.940 | - And that ends up changing how a rotorcraft

00:30:42.140 | is accelerated in the angular direction.

00:30:45.900 | - Does it use like micro flaps or something?

00:30:48.420 | - It's not micro flaps.

00:30:49.260 | So we assume that each blade is rigid,

00:30:51.660 | but actually it flaps a little bit.

00:30:53.220 | - Oh.

00:30:54.060 | - It bends.

00:30:54.900 | - Interesting, yeah.

00:30:55.720 | - And so the models rely on the fact,

00:30:58.100 | on an assumption that they're actually rigid.

00:31:00.620 | But that's not true.

00:31:02.220 | If you're flying really quickly,

00:31:03.700 | these effects become significant.

00:31:06.900 | If you're flying close to the ground,

00:31:09.220 | you get pushed off by the ground, right?

00:31:12.140 | Something which every pilot knows when he tries to land

00:31:14.900 | or she tries to land, this is called a ground effect.

00:31:17.980 | Something very few pilots think about

00:31:20.980 | is what happens when you go close to a ceiling,

00:31:23.020 | where you get sucked into a ceiling.

00:31:25.300 | There are very few aircrafts that fly close

00:31:27.540 | to any kind of ceiling.

00:31:29.460 | Likewise, when you go close to a wall,

00:31:33.480 | there are these wall effects.

00:31:35.660 | And if you've gone on a train

00:31:37.620 | and you pass another train

00:31:39.020 | that's traveling in the opposite direction,

00:31:40.820 | you feel the buffeting.

00:31:42.360 | And so these kinds of micro climates

00:31:45.380 | affect our UAVs significantly.

00:31:47.820 | So if you want--

00:31:48.660 | - And they're impossible to model, essentially.

00:31:50.620 | - I wouldn't say they're impossible to model,

00:31:52.440 | but the level of sophistication you would need

00:31:54.860 | in the model and the software would be tremendous.

00:31:58.600 | Plus, to get everything right would be awfully tedious.

00:32:02.900 | So the way we do this is over time,

00:32:05.100 | we figure out how to adapt to these conditions.

00:32:08.980 | So early on, we used a form of learning

00:32:13.140 | that we call iterative learning.

00:32:15.760 | So this idea, if you want to perform a task,

00:32:18.580 | there are a few things that you need to change

00:32:22.100 | and iterate over a few parameters

00:32:25.580 | that over time, you can figure out.

00:32:29.920 | So I could call it policy gradient reinforcement learning,

00:32:34.000 | but actually it was just iterative learning.

00:32:35.500 | - Iterative learning.
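
The idea of repeating a task and adjusting a few parameters after each attempt can be sketched as follows. This is a minimal illustration, not the lab's method: `run_trial` is a hypothetical stand-in for a real flight, with a made-up gain and bias playing the role of unmodeled aerodynamic effects, and the update rule is a plain error-proportional correction.

```python
def run_trial(command):
    """Stand-in for one real flight; the learner does not know these numbers."""
    true_gain, true_bias = 0.8, -0.15  # hypothetical unmodeled effects
    return true_gain * command + true_bias

def iterative_learning(target, trials=20, learning_rate=0.5):
    """Repeat the task, nudging the command by a fraction of the last error."""
    command = target  # first guess from the idealized model (gain 1, bias 0)
    errors = []
    for _ in range(trials):
        error = target - run_trial(command)
        errors.append(abs(error))
        command += learning_rate * error  # correct using what the trial showed
    return command, errors

command, errors = iterative_learning(target=1.0)
```

With these assumed numbers each trial shrinks the error by a constant factor, so the command converges without the learner ever identifying the gain or the bias explicitly.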

00:32:36.340 | - And so this was there way back.

00:32:38.460 | I think what's interesting is,

00:32:40.060 | if you look at autonomous vehicles today,

00:32:42.260 | learning could occur in two pieces.

00:32:46.340 | One is perception, understanding the world.

00:32:48.580 | Second is action, taking actions.

00:32:50.720 | Everything that I've seen that is successful

00:32:54.540 | is on the perception side of things.

00:32:56.620 | So in computer vision,

00:32:57.620 | we've made amazing strides in the last 10 years.

00:33:00.080 | So recognizing objects, actually detecting objects,

00:33:03.900 | classifying them and tagging them in some sense,

00:33:08.620 | annotating them, this is all done through machine learning.

00:33:11.900 | On the action side, on the other hand,

00:33:14.400 | I don't know of any examples

00:33:15.980 | where there are fielded systems

00:33:17.780 | where we actually learn the right behavior.

00:33:21.420 | - Outside of single demonstration is successfully--

00:33:23.860 | - In the laboratory, this is the holy grail.

00:33:25.780 | Can you do end-to-end learning?

00:33:27.220 | Can you go from pixels to motor currents?

00:33:30.200 | This is really, really hard.

00:33:33.920 | And I think if you go forward,

00:33:36.200 | the right way to think about these things

00:33:38.800 | is data-driven approaches, learning-based approaches,

00:33:43.640 | in concert with model-based approaches,

00:33:46.440 | which is the traditional way of doing things.

00:33:48.440 | So I think there's a piece,

00:33:49.880 | there's a role for each of these methodologies.

00:33:52.340 | - So what do you think, just jumping out on topic,

00:33:54.800 | since you mentioned autonomous vehicles,

00:33:57.040 | what do you think are the limits on the perception side?

00:33:59.320 | So I've talked to Elon Musk,

00:34:01.960 | and there on the perception side,

00:34:04.200 | they're using primarily computer vision

00:34:06.760 | to perceive the environment.

00:34:08.880 | In your work with,

00:34:10.600 | because you work with the real world a lot,

00:34:13.360 | and the physical world,

00:34:14.520 | what are the limits of computer vision?

00:34:16.640 | Do you think we can solve autonomous vehicles

00:34:18.920 | focusing on the perception side,

00:34:21.720 | focusing on vision alone and machine learning?

00:34:25.080 | - So we also have a spin-off company, Exyn Technologies,

00:34:29.400 | that works underground in mines.

00:34:32.720 | So you go into mines, they're dark, they're dirty.

00:34:36.380 | You fly in a dirty area,

00:34:39.360 | there's stuff you kick up by the propellers,

00:34:41.840 | the downwash kicks up dust.

00:34:43.540 | I challenge you to get a computer vision algorithm

00:34:47.600 | to work there.

00:34:48.660 | So we use lidars in that setting.

00:34:51.680 | Indoors, and even outdoors when we fly through fields,

00:34:57.400 | I think there's a lot of potential

00:34:59.200 | for just solving the problem using computer vision alone.

00:35:02.040 | But I think the bigger question is,

00:35:04.840 | can you actually solve,

00:35:08.240 | or can you actually identify all the corner cases

00:35:11.480 | using a single-sensing modality and using learning alone?

00:35:15.680 | - So what's your intuition there?

00:35:17.180 | - So look, if you have a corner case

00:35:19.820 | and your algorithm doesn't work,

00:35:21.840 | your instinct is to go get data about the corner case

00:35:25.200 | and patch it up, learn how to deal with that corner case.

00:35:28.540 | But at some point, this is going to saturate,

00:35:33.920 | this approach is not viable.

00:35:36.040 | So today, computer vision algorithms

00:35:39.200 | can detect 90% of the objects,

00:35:40.960 | or can detect objects 90% of the time,

00:35:43.000 | classify them 90% of the time.

00:35:45.480 | Cats on the internet probably can do 95%.

00:35:49.520 | But to get from 90% to 99%, you need a lot more data.

00:35:54.240 | And then I tell you, well, that's not enough

00:35:56.120 | because I have a safety-critical application,

00:35:58.280 | I want to go from 99% to 99.9%.

00:36:01.920 | Well, that's even more data.

00:36:03.200 | So I think if you look at

00:36:06.000 | wanting accuracy on the X-axis

00:36:10.240 | and look at the amount of data on the Y-axis,

00:36:15.780 | I believe that curve is an exponential curve.

00:36:18.120 | - Wow, okay, it's even hard if it's linear.

00:36:21.120 | - It's hard if it's linear, totally,

00:36:22.380 | but I think it's exponential.

00:36:24.180 | And the other thing you have to think about

00:36:25.720 | is that this process is a very, very power-hungry process.

00:36:30.720 | To run data farms or servers--

00:36:34.480 | - Power, do you mean literally power?

00:36:36.240 | - Literally power, literally power.

00:36:38.140 | So in 2014, five years ago,

00:36:40.740 | and I don't have more recent data,

00:36:43.460 | 2% of US electricity consumption

00:36:46.060 | was from data farms.

00:36:49.940 | So we think about this as an information science

00:36:53.740 | and information processing problem.

00:36:55.820 | Actually, it is an energy processing problem.

00:36:59.420 | And so unless we figure out better ways of doing this,

00:37:01.980 | I don't think this is viable.

00:37:04.020 | - So talking about driving,

00:37:06.380 | which is a safety-critical application,

00:37:08.140 | and some aspect of flight is safety-critical,

00:37:11.900 | maybe philosophical question, maybe an engineering one,

00:37:14.420 | what problem do you think is harder to solve,

00:37:16.420 | autonomous driving or autonomous flight?

00:37:19.660 | - That's a really interesting question.

00:37:21.380 | I think autonomous flight has several advantages

00:37:26.380 | that autonomous driving doesn't have.

00:37:30.000 | So look, if I wanna go from point A to point B,

00:37:34.060 | I have a very, very safe trajectory.

00:37:35.900 | Go vertically up to a maximum altitude,

00:37:38.420 | fly horizontally to just about the destination,

00:37:41.060 | and then come down vertically.

00:37:42.560 | This is pre-programmed.

00:37:47.000 | The equivalent of that is very hard to find

00:37:49.640 | in the self-driving car world,

00:37:51.940 | because you're on the ground,

00:37:53.160 | you're in a two-dimensional surface,

00:37:55.240 | and the trajectories on the two-dimensional surface

00:37:58.560 | are more likely to encounter obstacles.

00:38:01.040 | I mean this in an intuitive sense,

00:38:03.800 | but mathematically true,

00:38:04.760 | that's mathematically as well, that's true.
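
The pre-programmed route described above (climb, cruise, descend) is simple enough to write down directly. The sketch below assumes the airspace at the cruise altitude is free of obstacles, which is the premise of the argument; the function names and coordinates are hypothetical.

```python
import math

def up_over_down(start, goal, cruise_altitude):
    """Waypoints for: climb vertically, fly level, descend vertically.

    start and goal are (x, y, z) tuples in meters.
    """
    sx, sy, sz = start
    gx, gy, gz = goal
    if cruise_altitude < max(sz, gz):
        raise ValueError("cruise altitude must be at or above both endpoints")
    return [
        (sx, sy, sz),               # take-off point
        (sx, sy, cruise_altitude),  # top of the vertical climb
        (gx, gy, cruise_altitude),  # directly above the destination
        (gx, gy, gz),               # landing point
    ]

def path_length(waypoints):
    """Total length of the straight segments joining consecutive waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))

route = up_over_down((0, 0, 0), (30, 40, 0), cruise_altitude=100)
total_m = path_length(route)
```

There is no comparable fixed recipe on a road network, which is the asymmetry being pointed out.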

00:38:08.120 | - There's other option on the 2D space of platooning,

00:38:11.780 | or because there's so many obstacles,

00:38:13.380 | you can connect to those obstacles

00:38:15.000 | and all these kinds of options.

00:38:15.840 | - Those exist in the three-dimensional space as well.

00:38:18.200 | - So they do.

00:38:19.040 | So the question also implies,

00:38:21.680 | how difficult are obstacles

00:38:23.480 | in the three-dimensional space in flight?

00:38:25.520 | - So that's the downside.

00:38:27.400 | I think in three-dimensional space,

00:38:28.640 | you're modeling three-dimensional world,

00:38:30.960 | not just because you wanna avoid it,

00:38:32.960 | but you wanna reason about it,

00:38:34.720 | and you wanna work in that three-dimensional environment,

00:38:37.000 | and that's significantly harder.

00:38:39.240 | So that's one disadvantage.

00:38:40.600 | I think the second disadvantage is, of course,

00:38:42.960 | anytime you fly, you have to put up

00:38:45.040 | with the peculiarities of aerodynamics,

00:38:48.060 | and they're complicated environments,

00:38:50.500 | how do you negotiate that?

00:38:51.640 | So that's always a problem.

00:38:53.680 | - Do you see a time in the future where there is,

00:38:57.000 | you mentioned there's agriculture applications,

00:39:00.440 | so there's a lot of applications of flying robots,

00:39:03.480 | but do you see a time in the future

00:39:04.840 | where there is tens of thousands,

00:39:07.260 | or maybe hundreds of thousands of delivery drones

00:39:10.000 | that fill the sky, delivery flying robots?

00:39:14.040 | - I think there's a lot of potential

00:39:16.000 | for the last mile delivery,

00:39:17.760 | and so in crowded cities,

00:39:20.660 | I don't know, if you go to a place like Hong Kong,

00:39:24.280 | just crossing the river can take half an hour,

00:39:27.240 | and while a drone can just do it in five minutes at most.

00:39:32.240 | I think you look at delivery of supplies to remote villages.

00:39:38.720 | I work with a nonprofit called WeRobotics,

00:39:41.600 | so they work in the Peruvian Amazon,

00:39:43.840 | where the only highways are rivers,

00:39:47.280 | and to get from point A to point B may take five hours,

00:39:52.280 | while with a drone, you can get there in 30 minutes.

00:39:55.400 | So just delivering drugs, retrieving samples

00:40:01.000 | for testing vaccines,

00:40:04.720 | I think there's huge potential here.

00:40:06.960 | So I think the challenges are not technological,

00:40:09.840 | the challenge is economical.

00:40:11.920 | The one thing I'll tell you that nobody thinks about

00:40:16.320 | is the fact that we've not made huge strides

00:40:19.640 | in battery technology.

00:40:21.540 | Yes, it's true, batteries are becoming less expensive

00:40:24.240 | because we have these mega factories that are coming up,

00:40:26.940 | but they're all based on lithium-based technologies,

00:40:29.480 | and if you look at the energy density and the power density,

00:40:33.960 | those are two fundamentally limiting numbers.

00:40:38.700 | So power density is important because for a UAV

00:40:41.320 | to take off vertically into the air,

00:40:43.160 | which most drones do, they don't have a runway,

00:40:47.040 | you consume roughly 200 watts per kilo at the small size.

00:40:50.900 | That's a lot, right?

00:40:54.560 | In contrast, the human brain consumes less than 80 watts,

00:40:58.200 | the whole of the human brain.

00:41:00.520 | So just imagine just lifting yourself into the air

00:41:04.240 | is like two or three light bulbs,

00:41:06.640 | which makes no sense to me.

00:41:08.480 | - Yeah, so you're going to have to, at scale,

00:41:10.940 | solve the energy problem then,

00:41:13.360 | charging the batteries, storing the energy, and so on.

00:41:18.360 | - And then the storage is the second problem,

00:41:21.160 | but storage limits the range.

00:41:23.420 | But you have to remember that you have to burn a lot of it

00:41:28.940 | per given time.

00:41:32.060 | - So the burning is another problem.

00:41:33.980 | - Which is a power question.
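
The distinction between power (how fast energy is burned) and storage (how much energy is on board) can be made concrete with back-of-the-envelope arithmetic. The sketch below uses the roughly 200 watts per kilogram quoted in the conversation; the vehicle mass and battery capacity are hypothetical, and conversion losses are ignored.

```python
HOVER_W_PER_KG = 200.0  # rough figure quoted in the conversation for small vehicles

def hover_power_w(mass_kg):
    """Power needed just to stay airborne, in watts."""
    return HOVER_W_PER_KG * mass_kg

def hover_minutes(mass_kg, battery_wh):
    """Idealized hover time: stored energy divided by the rate it is burned."""
    return 60.0 * battery_wh / hover_power_w(mass_kg)

# Hypothetical 0.5 kg vehicle carrying an assumed 20 Wh battery.
power = hover_power_w(0.5)
minutes = hover_minutes(0.5, 20.0)
```

Under these assumptions a larger battery extends the hover time, but it also adds mass, which raises the power draw; that coupling is why both numbers are limiting.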

00:41:35.140 | - Yes, and do you think, just your intuition,

00:41:39.140 | there are breakthroughs in batteries on the horizon?

00:41:44.140 | How hard is that problem?

00:41:46.940 | - Look, there are a lot of companies that are promising

00:41:49.860 | flying cars that are autonomous and that are clean.

00:41:55.340 | - Right.

00:41:56.180 | - I think they're over-promising.

00:42:01.100 | The autonomy piece is doable.

00:42:04.220 | The clean piece, I don't think so.

00:42:06.460 | There's another company that I work with called Jetoptera.

00:42:11.300 | They make small jet engines.

00:42:13.820 | And they can get up to 50 miles an hour very easily

00:42:17.460 | and lift 50 kilos.

00:42:19.380 | But they're jet engines.

00:42:21.180 | They're efficient.

00:42:23.380 | They're a little louder than electric vehicles,

00:42:26.860 | but they can build flying cars.

00:42:29.460 | - So your sense is that there's a lot of pieces

00:42:32.100 | that have come together.

00:42:33.540 | So on this crazy question,

00:42:37.380 | if you look at companies like Kitty Hawk

00:42:39.620 | working on electric, so the clean,

00:42:42.080 | talking to Sebastian Thrun, right?

00:42:45.820 | It's a crazy dream, you know?

00:42:48.860 | But you work with flight a lot.

00:42:52.060 | You've mentioned before that manned flights

00:42:55.780 | or carrying a human body is very difficult to do.

00:43:00.780 | So how crazy is flying cars?

00:43:04.180 | Do you think there'll be a day when we have

00:43:06.220 | vertical takeoff and landing vehicles

00:43:11.060 | that are sufficiently affordable

00:43:13.980 | that we're going to see a huge amount of them?

00:43:17.380 | And they would look like something like we dream of

00:43:19.660 | when we think about flying cars.

00:43:21.020 | - Yeah, like the Jetsons.

00:43:22.140 | - The Jetsons, yeah.

00:43:23.100 | - So look, there are a lot of smart people working on this.

00:43:25.540 | And you never say something is not possible

00:43:29.660 | when you have people like Sebastian Thrun working on it.

00:43:32.180 | So I totally think it's viable.

00:43:35.140 | I question, again, the electric piece.

00:43:38.260 | - The electric piece, yeah.

00:43:39.540 | - And again, for short distances, you can do it.

00:43:41.660 | And there's no reason to suggest

00:43:43.660 | that these all just have to be rotorcrafts.

00:43:45.820 | You take off vertically,

00:43:46.900 | but then you morph into a forward flight.

00:43:49.660 | I think there are a lot of interesting designs.

00:43:51.620 | The question to me is, are these economically viable?

00:43:56.060 | And if you agree to do this with fossil fuels,

00:43:59.180 | it instantly immediately becomes viable.

00:44:01.980 | - That's a real challenge.

00:44:03.500 | Do you think it's possible for robots and humans

00:44:06.600 | to collaborate successfully on tasks?

00:44:08.880 | So a lot of robotics folks that I talk to and work with,

00:44:13.700 | I mean, humans just add a giant mess to the picture.

00:44:18.020 | So it's best to remove them from consideration

00:44:20.380 | when solving specific tasks.

00:44:22.460 | It's very difficult to model.

00:44:23.620 | There's just a source of uncertainty.

00:44:26.060 | In your work with these agile flying robots,

00:44:31.060 | do you think there's a role for collaboration with humans,

00:44:35.720 | or is it best to model tasks in a way

00:44:38.660 | that doesn't have a human in the picture?

00:44:43.460 | - Well, I don't think we should ever think about robots

00:44:46.800 | without human in the picture.

00:44:48.140 | Ultimately, robots are there because we want them

00:44:51.020 | to solve problems for humans.

00:44:54.420 | But there's no general solution to this problem.

00:44:58.340 | I think if you look at human interaction

00:45:00.060 | and how humans interact with robots,

00:45:02.460 | you know, we think of these in sort of three different ways.

00:45:05.340 | One is the human commanding the robot.

00:45:07.640 | The second is the human collaborating with the robot.

00:45:12.940 | So for example, we work on how a robot

00:45:15.580 | can actually pick up things with a human and carry things.

00:45:18.800 | That's like true collaboration.

00:45:20.960 | And third, we think about humans as bystanders.

00:45:25.000 | Self-driving cars, what's the human's role,

00:45:27.300 | and how do self-driving cars

00:45:30.400 | acknowledge the presence of humans?

00:45:33.000 | So I think all of these things are different scenarios.

00:45:35.920 | It depends on what kind of humans, what kind of task.

00:45:38.560 | And I think it's very difficult to say

00:45:41.920 | that there's a general theory that we all have for this.

00:45:45.580 | But at the same time, it's also silly to say

00:45:48.500 | that we should think about robots independent of humans.

00:45:52.100 | So to me, human-robot interaction

00:45:55.840 | is almost a mandatory aspect of everything we do.

00:45:59.820 | - Yes, but to which degree?

00:46:01.500 | So your thoughts, if we jump to autonomous vehicles,

00:46:04.140 | for example, there's a big debate

00:46:07.380 | between what's called level two and level four.

00:46:10.700 | So semi-autonomous and autonomous vehicles.

00:46:13.720 | And sort of the Tesla approach currently at least

00:46:16.480 | has a lot of collaboration between human and machine.

00:46:19.000 | So the human is supposed to actively supervise

00:46:22.080 | the operation of the robot.

00:46:23.920 | Part of the safety definition of how safe a robot is

00:46:28.920 | in that case is how effective is the human in monitoring it?

00:46:32.920 | Do you think that's ultimately not a good approach

00:46:39.720 | in sort of having a human in the picture,

00:46:42.380 | not as a bystander or part of the infrastructure,

00:46:47.400 | but really as part of what's required

00:46:50.000 | to make the system safe?

00:46:51.560 | - This is harder than it sounds.

00:46:53.720 | I think, you know, if you,

00:46:56.040 | I mean, I'm sure you've driven before

00:47:00.200 | in highways and so on.

00:47:01.920 | It's really very hard to have,

00:47:04.000 | to relinquish control to a machine

00:47:07.560 | and then take over when needed.

00:47:10.480 | So I think Tesla's approach is interesting

00:47:12.320 | 'cause it allows you to periodically establish

00:47:14.840 | some kind of contact with the car.

00:47:18.540 | Toyota on the other hand is thinking about

00:47:20.660 | shared autonomy or collaborative autonomy as a paradigm.

00:47:24.820 | If I may argue, these are very, very simple ways

00:47:27.480 | of human-robot collaboration.

00:47:29.700 | 'Cause the task is pretty boring.

00:47:31.900 | You sit in a vehicle, you go from point A to point B.

00:47:35.000 | I think the more interesting thing to me is,

00:47:37.360 | for example, search and rescue,

00:47:38.760 | I've got a human first responder, robot first responders.

00:47:41.980 | I gotta do something.

00:47:45.140 | It's important, I have to do it in two minutes.

00:47:47.800 | The building is burning, there's been an explosion,

00:47:50.440 | it's collapsed, how do I do it?

00:47:52.800 | I think to me, those are the interesting things

00:47:54.740 | where it's very, very unstructured

00:47:57.160 | and what's the role of the human, what's the role of the robot?

00:48:00.200 | Clearly, there's lots of interesting challenges

00:48:02.440 | and as a field, I think we're gonna make

00:48:04.240 | a lot of progress in this area.

00:48:05.760 | - Yeah, it's an exciting form of collaboration.

00:48:07.600 | You're right, in autonomous driving,

00:48:09.420 | the main enemy is just boredom of the human.

00:48:13.120 | - Yes.

00:48:13.960 | - As opposed to in rescue operations,

00:48:15.680 | it's literally life and death and the collaboration

00:48:20.680 | enables the effective completion of the mission.

00:48:23.820 | So it's exciting.

00:48:24.760 | - In some sense, we're also doing this,

00:48:27.840 | you think about the human driving a car

00:48:30.520 | and almost invariably, the human's trying to estimate

00:48:34.240 | the state of the car, they estimate the state

00:48:35.680 | of the environment and so on.

00:48:37.240 | But what if the car were to estimate the state of the human?

00:48:40.080 | So for example, I'm sure you have a smartphone

00:48:41.920 | and the smartphone tries to figure out what you're doing

00:48:44.560 | and send you reminders and oftentimes telling you

00:48:48.280 | to drive to a certain place, although you have no intention

00:48:50.420 | of going there because it thinks that that's where

00:48:52.600 | you should be 'cause of some Gmail calendar entry

00:48:56.240 | or something like that and it's trying to constantly figure

00:49:00.880 | out who you are, what you're doing.

00:49:02.720 | If a car were to do that, maybe that would make

00:49:05.240 | the driver safer because the car's trying to figure out

00:49:08.120 | is the driver paying attention, looking at his or her eyes,

00:49:11.580 | looking at saccade movements.

00:49:14.400 | So I think the potential is there but from the reverse side,

00:49:18.600 | it's not robot modeling but it's human modeling.

00:49:21.640 | - It's more in the human, right.

00:49:22.880 | - And I think the robots can do a very good job

00:49:25.320 | of modeling humans if you really think about the framework

00:49:29.120 | that you have a human sitting in a cockpit

00:49:32.600 | surrounded by sensors all staring at him

00:49:35.800 | in addition to staring outside but also staring at him.

00:49:39.160 | I think there's a real synergy there.

00:49:40.960 | - Yeah, I love that problem 'cause it's the new

00:49:43.440 | 21st century form of psychology actually,

00:49:46.460 | AI-enabled psychology.

00:49:48.520 | A lot of people have sci-fi-inspired fears

00:49:51.280 | of walking robots like those from Boston Dynamics

00:49:54.080 | if you just look at shows on Netflix and so on

00:49:56.440 | or flying robots like those you work with.

00:49:59.880 | How would you, how do you think about those fears?

00:50:03.120 | How would you alleviate those fears?

00:50:05.000 | Do you have inklings, echoes of those same concerns?

00:50:09.000 | - You know, anytime we develop a technology

00:50:11.680 | meaning to have positive impact in the world,

00:50:14.120 | there's always the worry that somebody could subvert

00:50:19.120 | those technologies and use it in an adversarial setting

00:50:23.240 | and robotics is no exception, right.

00:50:25.280 | So I think it's very easy to weaponize robots.

00:50:29.280 | I think we talk about swarms.

00:50:31.720 | One thing I worry a lot about is,

00:50:33.960 | so for us to get swarms to work

00:50:35.880 | and do something reliably is really hard.

00:50:38.280 | But suppose I have this challenge

00:50:42.040 | of trying to destroy something

00:50:44.360 | and I have a swarm of robots where only one out of the swarm

00:50:47.280 | needs to get to its destination.

00:50:48.920 | So that suddenly becomes a lot more doable.

00:50:52.640 | And so I worry about this general idea of using autonomy

00:50:56.960 | with lots and lots of agents.

00:50:59.640 | I mean, having said that,

00:51:01.080 | look, a lot of this technology is not very mature.

00:51:03.760 | My favorite saying is that

00:51:05.520 | if somebody had to develop this technology,

00:51:10.520 | wouldn't you rather the good guys do it?

00:51:12.320 | So the good guys have a good understanding of the technology

00:51:14.640 | so they can figure out how this technology

00:51:16.480 | is being used in a bad way

00:51:18.320 | or could be used in a bad way and try to defend against it.

00:51:21.360 | So we think a lot about that.

00:51:22.760 | So we have, we're doing research

00:51:25.360 | on how to defend against swarms, for example.

00:51:28.240 | - That's interesting.

00:51:29.600 | - There's in fact a report by the National Academies

00:51:33.000 | on counter UAS technologies.

00:51:35.560 | This is a real threat,

00:51:38.240 | but we're also thinking about how to defend against this

00:51:40.360 | and knowing how swarms work,

00:51:42.960 | knowing how autonomy works is I think very important.

00:51:47.160 | - So it's not just politicians.

00:51:49.320 | You think engineers have a role in this discussion?

00:51:51.640 | - Absolutely.

00:51:52.480 | I think the days where politicians

00:51:55.320 | can be agnostic to technology are gone.

00:51:58.720 | I think every politician needs to be literate in technology.

00:52:03.720 | And I often say technology is the new liberal art.

00:52:08.680 | Understanding how technology will change your life

00:52:12.920 | I think is important.

00:52:14.480 | And every human being needs to understand that.

00:52:18.080 | - And maybe we can elect some engineers to office as well

00:52:21.480 | on the other side.

00:52:22.720 | What are the biggest open problems in robotics?

00:52:24.840 | And you said we're in the early days in some sense.

00:52:27.760 | What are the problems we would like to solve in robotics?

00:52:31.040 | - I think there are lots of problems, right?

00:52:32.520 | But I would phrase it in the following way.

00:52:36.440 | If you look at the robots we're building,

00:52:39.520 | they're still very much tailored towards

00:52:43.160 | doing specific tasks in specific settings.

00:52:46.520 | I think the question of how do you get them to operate

00:52:49.440 | in much broader settings

00:52:53.600 | where things can change in unstructured environments

00:52:58.080 | is up in the air.

00:52:59.200 | So think of self-driving cars.

00:53:01.240 | Today we can build a self-driving car in a parking lot.

00:53:05.720 | We can do level five autonomy in a parking lot.

00:53:09.040 | But can you do level five autonomy

00:53:13.280 | in the streets of Napoli in Italy or Mumbai in India?

00:53:16.880 | No.

00:53:17.800 | So in some sense, when we think about robotics,

00:53:22.440 | we have to think about where they're functioning,

00:53:25.160 | what kind of environment, what kind of a task.

00:53:27.800 | We have no understanding

00:53:29.840 | of how to put both those things together.

00:53:32.840 | - So we're in the very early days

00:53:34.040 | of applying it to the physical world.

00:53:35.960 | And I was just in Naples actually.

00:53:38.840 | And there's levels of difficulty and complexity

00:53:42.240 | depending on which area you're applying it to.

00:53:45.960 | - I think so.

00:53:46.800 | And we don't have a systematic way of understanding that.

00:53:51.120 | Everybody says just 'cause a computer

00:53:53.880 | can now beat a human at any board game,

00:53:56.600 | we certainly know something about intelligence.

00:54:00.000 | That's not true.

00:54:01.440 | A computer board game is very, very structured.

00:54:04.480 | It is the equivalent of working in a Henry Ford factory

00:54:08.560 | where parts come, you assemble, move on.

00:54:11.760 | It's a very, very, very structured setting.

00:54:14.200 | That's the easiest thing.

00:54:15.760 | And we know how to do that.

00:54:17.120 | - So you've done a lot of incredible work

00:54:20.440 | at the UPenn, University of Pennsylvania, GRASP Lab.

00:54:23.760 | You're now Dean of Engineering at UPenn.

00:54:26.600 | What advice do you have for a new bright-eyed undergrad

00:54:31.360 | interested in robotics or AI or engineering?

00:54:34.680 | - Well, I think there's really three things.

00:54:36.600 | One is you have to get used to the idea

00:54:40.640 | that the world will not be the same in five years

00:54:42.880 | or four years whenever you graduate, right?

00:54:45.200 | Which is really hard to do.

00:54:46.160 | So this thing about predicting the future,

00:54:49.000 | every one of us needs to be trying

00:54:50.560 | to predict the future always.

00:54:52.360 | Not because you'll be any good at it,

00:54:55.040 | but by thinking about it, I think you sharpen your senses

00:54:59.160 | and you become smarter.

00:55:00.920 | So that's number one.

00:55:02.120 | Number two, and it's a corollary of the first piece,

00:55:05.800 | which is you really don't know what's gonna be important.

00:55:09.440 | So this idea that I'm gonna specialize in something

00:55:12.120 | which will allow me to go in a particular direction,

00:55:15.360 | it may be interesting,

00:55:16.520 | but it's important also to have this breadth

00:55:18.520 | so you have this jumping off point.

00:55:20.360 | I think the third thing,

00:55:23.000 | and this is where I think Penn excels.

00:55:25.360 | I mean, we teach engineering,

00:55:27.280 | but it's always in the context of the liberal arts.

00:55:30.000 | It's always in the context of society.

00:55:32.360 | As engineers, we cannot afford to lose sight of that.

00:55:35.880 | So I think that's important.

00:55:37.640 | But I think one thing that people underestimate

00:55:39.960 | when they do robotics

00:55:40.920 | is the importance of mathematical foundations,

00:55:43.440 | the importance of representations.

00:55:47.720 | Not everything can just be solved

00:55:50.040 | by looking for ROS packages on the internet

00:55:52.400 | or to find a deep neural network that works.

00:55:56.240 | I think the representation question is key,

00:55:59.080 | even to machine learning,

00:56:00.360 | where if you ever hope to achieve or get to explainable AI,

00:56:05.360 | somehow there need to be representations

00:56:07.720 | that you can understand.

00:56:09.040 | - So if you wanna do robotics,

00:56:11.120 | you should also do mathematics.

00:56:12.640 | And you said liberal arts, a little literature.

00:56:15.040 | If you wanna build a robot,

00:56:16.960 | you should be reading Dostoevsky.

00:56:19.280 | I agree with that.

00:56:20.320 | - Very good. (laughs)

00:56:21.920 | - So Vijay, thank you so much for talking today.

00:56:23.520 | It was an honor. - Thank you.

00:56:24.360 | It was just a very exciting conversation.

00:56:26.160 | Thank you.

00:56:27.000 | (upbeat music)



Chapters