
Vijay Kumar: Flying Robots | Lex Fridman Podcast #37


Chapters

0:01 Vijay Kumar
0:58 The First Robot You've Ever Built
2:23 First Multiprocessor Operating System
15:11 What Kind of Autonomous Flying Vehicles Are There
17:39 Communications
18:38 Agile Autonomous Flying Robots
26:16 Omnidirectional Flying Robots
29:34 Role of Machine Learning
31:16 Ground Effect
32:13 Iterative Learning
34:14 Limits of Computer Vision
37:14 Is It Harder To Solve Autonomous Driving or Autonomous Flight
40:38 Power Density
43:02 Flying Cars
44:33 Collaboration with Humans
52:27 Problems We Would Like To Solve in Robotics
54:26 What Advice Do You Have for a New Bright-Eyed Undergrad Interested in Robotics

Whisper Transcript

00:00:00.000 | The following is a conversation with Vijay Kumar.
00:00:03.080 | He's one of the top roboticists in the world,
00:00:05.760 | a professor at the University of Pennsylvania,
00:00:08.760 | a Dean of Penn Engineering,
00:00:10.680 | former director of GRASP Lab,
00:00:12.880 | or the General Robotics, Automation, Sensing,
00:00:15.320 | and Perception Laboratory at Penn,
00:00:17.560 | that was established back in 1979, that's 40 years ago.
00:00:22.560 | Vijay is perhaps best known for his work
00:00:25.280 | in multi-robot systems, robot swarms,
00:00:28.520 | and micro aerial vehicles,
00:00:30.880 | robots that elegantly cooperate in flight
00:00:34.040 | under all the uncertainty and challenges
00:00:36.200 | that the real world conditions present.
00:00:38.780 | This is the Artificial Intelligence Podcast.
00:00:41.920 | If you enjoy it, subscribe on YouTube,
00:00:44.320 | give it five stars on iTunes, support it on Patreon,
00:00:47.560 | or simply connect with me on Twitter @lexfridman,
00:00:50.480 | spelled F-R-I-D-M-A-N.
00:00:53.280 | And now, here's my conversation with Vijay Kumar.
00:00:58.380 | What is the first robot you've ever built,
00:01:01.100 | or were a part of building?
00:01:02.820 | - Way back when I was in graduate school,
00:01:04.780 | I was part of a fairly big project
00:01:06.780 | that involved building a very large hexapod.
00:01:11.780 | This weighed close to 7,000 pounds,
00:01:17.020 | and it was powered by hydraulic actuation,
00:01:21.600 | or it was actuated by hydraulics,
00:01:25.260 | with 18 motors, hydraulic motors,
00:01:29.760 | each controlled by an Intel 8085 processor
00:01:34.180 | and an 8086 coprocessor.
00:01:36.660 | And so imagine this huge monster
00:01:42.460 | that had 18 joints,
00:01:44.800 | each controlled by an independent computer,
00:01:46.940 | and there was a 19th computer
00:01:48.500 | that actually did the coordination between these 18 joints.
00:01:52.340 | So I was part of this project,
00:01:53.740 | and my thesis work was,
00:01:57.900 | how do you coordinate the 18 legs?
00:02:01.060 | And in particular, the pressures in the hydraulic cylinders
00:02:06.320 | to get efficient locomotion.
00:02:09.180 | - It sounds like a giant mess.
00:02:11.620 | So how difficult is it to make all the motors communicate?
00:02:14.460 | Presumably, you have to send signals
00:02:16.860 | hundreds of times a second, or at least--
00:02:18.700 | - So this was not my work,
00:02:19.900 | but the folks who worked on this
00:02:22.780 | wrote what I believe to be
00:02:24.180 | the first multiprocessor operating system.
00:02:26.620 | This was in the '80s.
00:02:27.920 | And you had to make sure that,
00:02:31.100 | obviously, messages got across from one joint to another.
00:02:34.620 | You have to remember the clock speeds on those computers
00:02:37.940 | were about half a megahertz.
00:02:39.660 | - Right.
00:02:40.500 | The '80s.
00:02:42.180 | So not to romanticize the notion,
00:02:45.300 | but how did it make you feel to see that robot move?
00:02:49.660 | - It was amazing.
00:02:52.220 | In hindsight, it looks like, well, we built this thing
00:02:55.220 | which really should have been much smaller.
00:02:57.260 | And of course, today's robots are much smaller.
00:02:59.100 | You look at Boston Dynamics or Ghost Robotics,
00:03:03.060 | a spinoff from Penn.
00:03:04.740 | But back then, you were stuck with the substrate you had,
00:03:10.020 | the compute you had, so things were unnecessarily big.
00:03:13.660 | But at the same time, and this is just human psychology,
00:03:18.000 | somehow bigger means grander.
00:03:21.540 | People never had the same appreciation
00:03:23.580 | for nanotechnology or nanodevices
00:03:26.340 | as they do for the Space Shuttle or the Boeing 747.
00:03:30.100 | - Yeah, you've actually done quite a good job
00:03:32.700 | at illustrating that small is beautiful
00:03:35.980 | in terms of robotics.
00:03:37.740 | So, on that topic, what is the most beautiful
00:03:42.540 | or elegant robot in motion that you've ever seen?
00:03:46.180 | Not to pick favorites or whatever,
00:03:47.840 | but something that just inspires you that you remember.
00:03:50.980 | - Well, I think the thing that I'm most proud of
00:03:53.940 | that my students have done is really think about
00:03:57.140 | small UAVs that can maneuver in constrained spaces
00:04:00.300 | and in particular, their ability to coordinate
00:04:03.580 | with each other and form three-dimensional patterns.
00:04:06.700 | So once you can do that,
00:04:08.880 | you can essentially create 3D objects in the sky
00:04:15.780 | and you can deform these objects on the fly.
00:04:19.780 | So in some sense, your toolbox of what you can create
00:04:23.540 | has suddenly got enhanced.
00:04:25.300 | And before that, we did the two-dimensional version of this.
00:04:29.900 | So we had ground robots forming patterns and so on.
00:04:33.740 | So that was not as impressive, that was not as beautiful.
00:04:37.060 | But if you do it in 3D, suspended in midair,
00:04:40.480 | and you've got to go back to 2011 when we did this.
00:04:43.660 | Now it's actually pretty standard to do these things
00:04:45.980 | eight years later.
00:04:47.780 | But back then it was a big accomplishment.
00:04:49.820 | - So the distributed cooperation
00:04:52.460 | is where beauty emerges in your eyes.
00:04:55.660 | - Well, I think beauty to an engineer is very different
00:04:57.980 | from beauty to someone who's looking at robots
00:05:01.540 | from the outside, if you will.
00:05:03.400 | But what I meant there, so before we said that grand
00:05:06.620 | is associated with size.
00:05:10.460 | And another way of thinking about this
00:05:13.660 | is just the physical shape
00:05:15.580 | and the idea that you can get physical shapes in midair
00:05:18.300 | and have them deform, that's beautiful.
00:05:21.500 | - But the individual components,
00:05:22.980 | the agility is beautiful too, right?
00:05:24.820 | - That is true too.
00:05:25.660 | So then how quickly can you actually manipulate
00:05:28.420 | these three-dimensional shapes
00:05:29.500 | and the individual components?
00:05:31.200 | Yes, you're right.
00:05:32.460 | - By the way, you said UAV, unmanned aerial vehicle.
00:05:36.760 | What's a good term for drones, UAVs, quadcopters?
00:05:41.760 | Is there a term that's being standardized?
00:05:44.580 | - I don't know if there is.
00:05:45.420 | Everybody wants to use the word drones.
00:05:47.900 | And I've often said this, drones to me is a pejorative word.
00:05:51.060 | It signifies something that's dumb,
00:05:53.940 | that's pre-programmed, that does one little thing,
00:05:56.340 | and robots are anything but drones.
00:05:58.620 | So I actually don't like that word,
00:06:00.660 | but that's what everybody uses.
00:06:02.980 | You could call it unpiloted.
00:06:04.860 | - Unpiloted.
00:06:05.780 | - But even unpiloted could be radio-controlled,
00:06:08.100 | could be remotely controlled in many different ways.
00:06:10.700 | And I think the right word is,
00:06:12.940 | thinking about it as an aerial robot.
00:06:15.060 | - You also say agile, autonomous aerial robot, right?
00:06:19.100 | - Yeah, so agility is an attribute,
00:06:20.620 | but they don't have to be.
00:06:22.180 | - So what biological system,
00:06:24.820 | 'cause you've also drawn a lot of inspiration with those.
00:06:27.180 | I've seen bees and ants that you've talked about.
00:06:30.340 | What living creatures have you found to be most inspiring
00:06:35.260 | as an engineer, instructive in your work in robotics?
00:06:38.580 | - To me, so ants are really quite incredible creatures.
00:06:43.580 | I mean, the individuals arguably are very simple
00:06:47.940 | in how they're built,
00:06:50.220 | and yet they're incredibly resilient as a population.
00:06:54.020 | And as individuals, they're incredibly robust.
00:06:56.820 | So if you take an ant, it's six legs,
00:07:00.660 | you remove one leg, it still works just fine.
00:07:04.180 | And it moves along,
00:07:05.820 | and I don't know that it even realizes it's lost a leg.
00:07:08.780 | So that's the robustness at the individual ant level.
00:07:12.500 | But then you look at this instinct
00:07:15.420 | for self-preservation of the colonies,
00:07:17.740 | and they adapt in so many amazing ways.
00:07:20.460 | Transcending gaps,
00:07:24.620 | by just chaining themselves together when you have a flood,
00:07:30.460 | being able to recruit other teammates
00:07:33.180 | to carry big morsels of food.
00:07:36.220 | And then going out in different directions,
00:07:38.260 | looking for food,
00:07:39.580 | and then being able to demonstrate consensus,
00:07:43.900 | even though they don't communicate directly with each other
00:07:47.820 | the way we communicate with each other,
00:07:49.940 | in some sense, they also know how to do democracy
00:07:53.820 | probably better than what we do.
00:07:55.460 | - Yeah, somehow it's that even democracy is emergent.
00:07:58.980 | It seems like all of the phenomena
00:08:00.580 | that we see is all emergent.
00:08:02.420 | It seems like there's no centralized communicator.
00:08:05.540 | - There is, so I think a lot is made
00:08:07.220 | about that word emergent,
00:08:09.820 | and it means lots of things to different people.
00:08:11.540 | But you're absolutely right.
00:08:12.580 | I think as an engineer,
00:08:14.060 | you think about what elemental behaviors,
00:08:19.060 | what primitives you could synthesize
00:08:22.780 | so that the whole looks incredibly powerful,
00:08:26.700 | incredibly synergistic,
00:08:27.980 | the whole definitely being greater than the sum of the parts,
00:08:31.020 | and ants are living proof of that.
00:08:32.900 | - So when you see these beautiful swarms,
00:08:36.340 | whether they're biological systems or robots,
00:08:38.860 | do you sometimes think of them
00:08:41.620 | as a single individual living intelligent organism?
00:08:45.960 | So it's the same as thinking of our human civilization
00:08:49.460 | as one organism?
00:08:51.140 | Or do you still, as an engineer,
00:08:52.940 | think about the individual components
00:08:54.580 | and all the engineering
00:08:55.420 | that went into the individual components?
00:08:57.300 | - Well, that's very interesting.
00:08:58.620 | So again, philosophically as engineers,
00:09:01.460 | what we want to do is to go beyond the individual components,
00:09:06.460 | the individual units,
00:09:08.260 | and think about it as a unit, as a cohesive unit,
00:09:11.500 | without worrying about the individual components.
00:09:15.100 | If you start obsessing about the individual building blocks
00:09:20.100 | and what they do,
00:09:22.100 | you inevitably will find it hard to scale up.
00:09:27.900 | Just mathematically,
00:09:28.940 | just think about individual things you want to model,
00:09:31.540 | and if you want to have 10 of those,
00:09:33.980 | then you essentially are taking
00:09:35.420 | Cartesian products of 10 things,
00:09:37.540 | and that makes it really complicated.
00:09:39.260 | Then to do any kind of synthesis or design
00:09:41.780 | in that high-dimensional space is really hard.
00:09:44.140 | So the right way to do this
00:09:45.820 | is to think about the individuals in a clever way
00:09:49.060 | so that at the higher level,
00:09:51.140 | when you look at lots and lots of them,
00:09:53.420 | abstractly you can think of them
00:09:55.340 | in some low-dimensional space.
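A rough illustration of the scaling argument above, in Python. The joint state of the group is the Cartesian product of the individual states, so its dimension grows with the number of robots, whereas a group-level summary can stay fixed in size. The choice of summary here (centroid plus covariance of planar positions) and the names `joint_state_dim` and `swarm_abstraction` are assumptions made for illustration, not something stated in the conversation.

```python
def joint_state_dim(n_robots, dim_per_robot=2):
    """Dimension of the Cartesian-product state space of n robots."""
    return n_robots * dim_per_robot


def swarm_abstraction(positions):
    """Summarize planar positions by centroid and covariance.

    Returns ((cx, cy), (sxx, sxy, syy)): five numbers, no matter how
    many robots are in the group.
    """
    n = len(positions)
    cx = sum(x for x, _ in positions) / n
    cy = sum(y for _, y in positions) / n
    sxx = sum((x - cx) ** 2 for x, _ in positions) / n
    syy = sum((y - cy) ** 2 for _, y in positions) / n
    sxy = sum((x - cx) * (y - cy) for x, y in positions) / n
    return (cx, cy), (sxx, sxy, syy)


if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    print(joint_state_dim(10))        # 20
    print(swarm_abstraction(square))  # ((1.0, 1.0), (1.0, 0.0, 1.0))
```

Ten planar robots already give a 20-dimensional joint state, while the summary stays at five numbers; a design done at the summary level does not get harder as robots are added.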
00:09:57.100 | - So what does that involve?
00:09:58.660 | For the individual,
00:10:00.060 | do you have to try to make the way they see the world
00:10:03.300 | as local as possible?
00:10:05.140 | And the other thing,
00:10:06.420 | do you just have to make them robust to collisions?
00:10:09.540 | Like you said with the ants,
00:10:10.860 | if something fails, the whole swarm doesn't fail.
00:10:15.300 | - Right, I think as engineers, we do this.
00:10:17.740 | I mean, you know, think about we build planes
00:10:19.740 | or we build iPhones,
00:10:21.260 | and we know that by taking individual components,
00:10:26.260 | well-engineered components with well-specified interfaces
00:10:30.060 | that behave in a predictable way,
00:10:31.660 | you can build complex systems.
00:10:33.540 | So that's ingrained, I would claim,
00:10:36.860 | in most engineers' thinking.
00:10:39.380 | And it's true for computer scientists as well.
00:10:41.580 | I think what's different here
00:10:42.900 | is that you want the individuals
00:10:46.740 | to be robust in some sense,
00:10:49.460 | as we do in these other settings,
00:10:51.980 | but you also want some degree of resiliency
00:10:54.500 | for the population.
00:10:56.300 | And so you really want them to be able
00:10:58.700 | to reestablish communication with their neighbors.
00:11:03.700 | You want them to rethink their strategy for group behavior.
00:11:08.860 | You want them to reorganize.
00:11:11.020 | And that's where I think a lot of the challenges lie.
00:11:16.140 | - So just at a high level,
00:11:18.380 | what does it take for a bunch of,
00:11:22.460 | what you would call them,
00:11:23.540 | flying robots to create a formation?
00:11:26.900 | Just for people who are not familiar
00:11:28.900 | with robotics in general,
00:11:31.260 | how much information is needed?
00:11:32.980 | How do you even make it happen
00:11:36.060 | without a centralized controller?
00:11:39.740 | - So, I mean, there are a couple of different ways
00:11:41.300 | of looking at this.
00:11:43.380 | If you are a purist,
00:11:45.900 | you think of it as a way of recreating what nature does.
00:11:51.580 | So nature forms groups for several reasons,
00:11:56.580 | but mostly it's because of this instinct
00:12:00.460 | that organisms have of preserving their colonies,
00:12:05.420 | their population.
00:12:06.780 | Which means what?
00:12:09.380 | You need shelter, you need food,
00:12:11.260 | you need to procreate,
00:12:12.860 | and that's basically it.
00:12:14.660 | So the kinds of interactions you see are all organic.
00:12:18.260 | They're all local.
00:12:19.580 | And the only information that they share,
00:12:23.380 | and mostly it's indirectly,
00:12:25.460 | is to, again, preserve the herd or the flock
00:12:29.220 | or the swarm,
00:12:30.460 | and either by looking for new sources of food
00:12:36.860 | or looking for new shelters, right?
00:12:38.660 | - Right.
00:12:39.500 | - As engineers, when we build swarms,
00:12:43.380 | we have a mission.
00:12:44.660 | And when you think about it,
00:12:48.260 | and when you think of a mission,
00:12:50.740 | and it involves mobility,
00:12:54.340 | most often it's described
00:12:56.100 | in some kind of a global coordinate system.
00:12:58.820 | As a human, as an operator, as a commander,
00:13:03.020 | or as a collaborator,
00:13:05.300 | I have my coordinate system,
00:13:07.100 | and I want the robots to be consistent with that.
00:13:10.140 | So I might think of it slightly differently.
00:13:14.700 | I might want the robots
00:13:16.020 | to recognize that coordinate system,
00:13:18.900 | which means not only do they have to think locally
00:13:21.300 | in terms of who their immediate neighbors are,
00:13:23.100 | but they have to be cognizant
00:13:24.580 | of what the global environment looks like.
00:13:28.300 | So if I say, "Surround this building
00:13:30.980 | "and protect this from intruders,"
00:13:33.260 | well, they're immediately
00:13:34.580 | in a building-centered coordinate system,
00:13:36.460 | and I have to tell them where the building is.
00:13:38.700 | - And they're globally collaborating
00:13:40.020 | on the map of that building.
00:13:41.300 | They're maintaining some kind of global,
00:13:44.180 | not just in the frame of the building,
00:13:45.500 | but there's information
00:13:47.460 | that's ultimately being built up explicitly
00:13:49.700 | as opposed to kind of implicitly, like nature might.
00:13:54.380 | - Correct, correct.
00:13:55.220 | So in some sense, nature is very, very sophisticated,
00:13:57.660 | but the tasks that nature solves or needs to solve
00:14:01.860 | are very different from the kind of engineered tasks,
00:14:05.140 | artificial tasks that we are forced to address.
00:14:09.740 | And again, there's nothing preventing us
00:14:12.540 | from solving these other problems,
00:14:15.140 | but ultimately it's about impact.
00:14:16.580 | You want these swarms to do something useful.
00:14:19.340 | And so you're kind of driven into this very unnatural,
00:14:24.340 | if you will, unnatural meaning,
00:14:26.260 | not like how nature does, setting.
00:14:29.140 | - And it's probably a little bit more expensive
00:14:31.900 | to do it the way nature does,
00:14:33.740 | because nature is less sensitive
00:14:37.540 | to the loss of the individual,
00:14:39.460 | and cost-wise in robotics,
00:14:42.260 | I think you're more sensitive to losing individuals.
00:14:45.500 | - I think that's true.
00:14:46.940 | Although if you look at the price to performance ratio
00:14:50.100 | of robotic components, it's coming down dramatically.
00:14:53.980 | - Oh, interesting.
00:14:54.820 | - It continues to come down.
00:14:56.020 | So I think we're asymptotically approaching the point
00:14:58.900 | where we would get, yeah,
00:14:59.940 | the cost of individuals would really become insignificant.
00:15:04.940 | - So let's step back at a high level of view,
00:15:07.620 | the impossible question of what kind of,
00:15:11.660 | as an overview,
00:15:12.500 | what kind of autonomous flying vehicles are there
00:15:15.300 | in general?
00:15:16.220 | - I think the ones that receive a lot of notoriety
00:15:19.700 | are obviously the military vehicles.
00:15:22.540 | Military vehicles are controlled by a base station,
00:15:26.260 | but have a lot of human supervision,
00:15:29.660 | but have limited autonomy,
00:15:31.800 | which is the ability to go from point A to point B,
00:15:34.760 | and even the more sophisticated vehicles
00:15:38.300 | can do autonomous takeoff and landing.
00:15:41.740 | - And those usually have wings and they're heavy.
00:15:44.380 | Usually they're winged,
00:15:45.340 | but there's nothing preventing us
00:15:46.640 | from doing this for helicopters as well.
00:15:49.060 | There are many military organizations
00:15:52.500 | that have autonomous helicopters in the same vein.
00:15:56.540 | And by the way, you look at autopilots in airplanes,
00:16:00.080 | and it's actually very similar.
00:16:02.820 | In fact, one interesting question we can ask is,
00:16:07.180 | if you look at all the air safety violations,
00:16:12.180 | all the crashes that occurred,
00:16:14.100 | would they have happened if the plane were truly autonomous?
00:16:18.660 | And I think you'll find that in many of the cases,
00:16:21.980 | because of pilot error, we make silly decisions.
00:16:24.620 | And so in some sense, even in air traffic,
00:16:26.980 | commercial air traffic, there's a lot of applications,
00:16:29.820 | although we only see autonomy being enabled
00:16:33.980 | at very high altitudes when the plane is on autopilot.
00:16:38.980 | - There's still a role for the human,
00:16:42.580 | and that kind of autonomy is, you're kind of implying,
00:16:47.580 | I don't know what the right word is,
00:16:48.700 | but it's a little dumber than it could be.
00:16:52.660 | - Right, so in the lab, of course,
00:16:55.740 | we can afford to be a lot more aggressive.
00:16:59.240 | And the question we try to ask is,
00:17:04.240 | can we make robots that will be able to make decisions
00:17:09.600 | without any kind of external infrastructure?
00:17:13.680 | So what does that mean?
00:17:14.880 | So the most common piece of infrastructure
00:17:16.960 | that airplanes use today is GPS.
00:17:19.640 | GPS is also the most brittle form of information.
00:17:26.680 | If you have driven in a city, tried to use GPS navigation,
00:17:30.240 | you know, in tall buildings, you immediately lose GPS.
00:17:33.720 | And so that's not a very sophisticated way
00:17:36.360 | of building autonomy.
00:17:37.880 | I think the second piece of infrastructure
00:17:39.600 | they rely on is communications.
00:17:41.960 | Again, it's very easy to jam communications.
00:17:46.220 | In fact, if you use Wi-Fi,
00:17:49.680 | you know that Wi-Fi signals drop out,
00:17:51.880 | cell signals drop out.
00:17:53.560 | So to rely on something like that is not good.
00:17:56.820 | The third form of infrastructure we use,
00:18:01.240 | and I hate to call it infrastructure,
00:18:02.960 | but it is that in the sense of robots, is people.
00:18:06.400 | So you can rely on somebody to pilot you.
00:18:08.660 | And so the question you want to ask is,
00:18:11.600 | if there are no pilots,
00:18:13.400 | if there's no communications with any base station,
00:18:16.220 | if there's no knowledge of position,
00:18:18.740 | and if there's no a priori map,
00:18:21.680 | a priori knowledge of what the environment looks like,
00:18:24.880 | a priori model of what might happen in the future,
00:18:28.280 | can robots navigate?
00:18:29.560 | So that is true autonomy.
00:18:31.480 | - So that's true autonomy.
00:18:33.200 | And we're talking about, you mentioned,
00:18:35.040 | like military application of drones.
00:18:36.880 | Okay, so what else is there?
00:18:38.300 | You talk about agile, autonomous flying robots,
00:18:42.060 | aerial robots.
00:18:43.520 | So that's a different kind of,
00:18:45.680 | it's not winged, it's not big, at least it's small.
00:18:48.160 | - So I use the word agility mostly,
00:18:50.820 | or at least we're motivated to do agile robots,
00:18:53.520 | mostly because robots can operate
00:18:58.000 | and should be operating in constrained environments.
00:19:01.160 | And if you want to operate the way a global hawk operates,
00:19:07.000 | I mean, the kinds of conditions in which you operate
00:19:09.140 | are very, very restrictive.
00:19:10.780 | If you want to go inside a building,
00:19:13.760 | for example, for search and rescue,
00:19:15.600 | or to locate an active shooter,
00:19:18.160 | or you want to navigate under the canopy in an orchard
00:19:22.140 | to look at health of plants,
00:19:23.900 | or to look for, to count fruits,
00:19:28.280 | to measure the tree trunks.
00:19:31.300 | These are things we do, by the way.
00:19:33.300 | - Yeah, some cool agriculture stuff you've shown in the past
00:19:35.980 | is really awesome.
00:19:36.820 | - Right, so in those kinds of settings,
00:19:39.140 | you do need that agility.
00:19:40.380 | Agility does not necessarily mean
00:19:42.580 | you break records for the 100-meter dash.
00:19:45.460 | What it really means is you see the unexpected,
00:19:48.040 | and you're able to maneuver in a safe way,
00:19:51.500 | and in a way that gets you the most information
00:19:55.460 | about the thing you're trying to do.
00:19:57.700 | - By the way, you may be the only person
00:20:00.500 | who in a TED Talk has used a math equation,
00:20:04.280 | which is amazing.
00:20:05.460 | People should go see one of your TED Talks.
00:20:07.660 | - Actually, it's very interesting,
00:20:08.860 | 'cause the TED curator, Chris Anderson, told me,
00:20:13.540 | "You can't show math."
00:20:16.040 | I thought about it, but that's who I am.
00:20:18.280 | I mean, that's our work.
00:20:20.840 | And so I felt compelled to give the audience a taste
00:20:25.840 | for at least some math.
00:20:27.680 | - So on that point, simply,
00:20:31.240 | what does it take to make a thing with four motors fly,
00:20:36.000 | a quadcopter, one of these little flying robots?
00:20:40.700 | How hard is it to make it fly?
00:20:44.040 | How do you coordinate the four motors?
00:20:46.620 | What's, how do you convert those motors
00:20:50.840 | into actual movement?
00:20:52.640 | - So this is an interesting question.
00:20:54.840 | We've been trying to do this since 2000.
00:20:58.120 | It is a commentary on the sensors
00:21:00.620 | that were available back then,
00:21:02.120 | the computers that were available back then.
00:21:04.320 | And a number of things happened between 2000 and 2007.
00:21:11.600 | One is the advances in computing, which is,
00:21:15.520 | so we all know about Moore's law,
00:21:16.800 | but I think 2007 was a tipping point,
00:21:19.720 | the year of the iPhone, the year of the cloud.
00:21:22.760 | Lots of things happened in 2007.
00:21:24.680 | But going back even further,
00:21:27.600 | inertial measurement units as a sensor really matured.
00:21:31.400 | Again, lots of reasons for that.
00:21:33.080 | Certainly there's a lot of federal funding,
00:21:35.440 | particularly DARPA in the US,
00:21:38.360 | but they didn't anticipate this boom in IMUs.
00:21:42.800 | But if you look, subsequently what happened
00:21:46.600 | is that every car manufacturer had to put an airbag in,
00:21:50.080 | which meant you had to have an accelerometer on board.
00:21:52.680 | And so that drove down the price to performance ratio.
00:21:55.080 | - Wow, I never, I should know this.
00:21:56.920 | That's very interesting.
00:21:57.760 | That's very interesting, the connection there.
00:21:59.440 | - And that's why research is very,
00:22:01.360 | it's very hard to predict the outcomes.
00:22:03.320 | And again, the federal government spent a ton of money
00:22:07.720 | on things that they thought were useful for resonators,
00:22:12.360 | but it ended up enabling these small UAVs,
00:22:16.320 | which is great, 'cause I could have never raised
00:22:17.920 | that much money and sold this project,
00:22:20.800 | hey, we want to build these small UAVs,
00:22:22.240 | can you actually fund the development of low-cost IMUs?
00:22:25.480 | - So why do you need an IMU on a UAV?
00:22:27.640 | - So I'll come back to that,
00:22:30.360 | but so in 2007, 2008, we were able to build these,
00:22:33.360 | and then the question you're asking was a good one,
00:22:35.240 | how do you coordinate the motors?
00:22:37.720 | To develop this, but over the last 10 years,
00:22:42.360 | everything is commoditized.
00:22:43.880 | A high school kid today can pick up a Raspberry Pi kit
00:22:47.880 | and build this, all the low-level functionality
00:22:52.120 | is all automated.
00:22:53.200 | But basically at some level, you have to drive the motors
00:22:59.160 | at the right RPMs, the right velocity,
00:23:04.560 | in order to generate the right amount of thrust,
00:23:07.480 | in order to position it and orient it
00:23:09.960 | in a way that you need to in order to fly.
00:23:12.840 | The feedback that you get is from onboard sensors,
00:23:16.640 | and the IMU is an important part of it.
00:23:18.400 | The IMU tells you what the acceleration is,
00:23:23.400 | as well as what the angular velocity is,
00:23:26.400 | and those are important pieces of information.
00:23:29.220 | In addition to that, you need some kind of local position
00:23:34.200 | or velocity information.
00:23:36.480 | For example, when we walk,
00:23:39.320 | we implicitly have this information
00:23:41.520 | because we kind of know what our stride length is.
00:23:45.800 | We also are looking at images fly past our retina,
00:23:51.440 | if you will, and so we can estimate velocity.
00:23:54.240 | We also have accelerometers in our head,
00:23:56.320 | and we're able to integrate all these pieces of information
00:23:59.120 | to determine where we are as we walk.
00:24:02.320 | And so robots have to do something very similar.
00:24:04.280 | You need an IMU, you need some kind of a camera
00:24:08.120 | or other sensor that's measuring velocity,
00:24:11.580 | and then you need some kind of a global reference frame
00:24:15.760 | if you really want to think about doing something
00:24:19.480 | in a world coordinate system.
00:24:21.260 | And so how do you estimate your position
00:24:23.640 | with respect to that global reference frame?
00:24:25.160 | That's important as well.
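One simple way to picture the integration of IMU and camera information described above is a complementary filter, sketched here in Python for a single velocity axis. This is an illustrative stand-in, not the estimator used in the lab; the function names, the blending weight `alpha`, and the one-dimensional setup are all assumptions.

```python
def fuse_velocity(v_est, accel, v_cam, dt, alpha=0.9):
    """One complementary-filter step for a single velocity axis.

    v_est : previous velocity estimate (m/s)
    accel : accelerometer reading along the axis (m/s^2)
    v_cam : velocity measured by a camera or other sensor (m/s)
    alpha : weight on the IMU prediction (hypothetical tuning value)
    """
    predicted = v_est + accel * dt  # integrate the IMU forward in time
    return alpha * predicted + (1.0 - alpha) * v_cam


def run(accels, v_cams, dt, alpha=0.9, v0=0.0):
    """Apply the filter over paired sequences of sensor readings."""
    v = v0
    history = []
    for a, vc in zip(accels, v_cams):
        v = fuse_velocity(v, a, vc, dt, alpha)
        history.append(v)
    return history
```

The IMU term reacts quickly but drifts if the accelerometer is biased; the camera term is slower but keeps pulling the estimate back toward a measured value. A real vehicle would more likely use a Kalman-style filter over the full 3D state.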
00:24:26.520 | - So coordinating the RPMs of the four motors
00:24:29.480 | is what allows you to first of all fly and hover,
00:24:32.640 | and then you can change the orientation
00:24:35.560 | and the velocity and so on.
00:24:37.600 | - Exactly, exactly.
00:24:38.440 | - So there's a bunch of degrees of freedom
00:24:40.280 | that you're playing with.
00:24:41.120 | - There's six degrees of freedom,
00:24:42.200 | but you only have four inputs, the four motors.
00:24:44.920 | And it turns out to be a remarkably versatile configuration.
00:24:49.920 | You think at first, well, I only have four motors,
00:24:53.080 | how do I go sideways?
00:24:55.000 | But it's not too hard to say, well, if I tilt myself,
00:24:57.280 | I can go sideways.
00:24:59.160 | And then you have four motors pointing up,
00:25:01.200 | how do I rotate in place about a vertical axis?
00:25:05.400 | Well, you rotate them at different speeds
00:25:07.840 | and that generates reaction moments
00:25:09.720 | and that allows you to turn.
00:25:11.560 | So it's actually a pretty, it's an optimal configuration
00:25:14.960 | from an engineer's standpoint.
00:25:17.060 | It's very simple, very cleverly done and very versatile.
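The mapping from "what the body should do" to "what each motor should do" is often called motor mixing. The Python sketch below assumes a plus-shaped frame with rotors at front, left, back, and right, where the front and back rotors spin one way and the left and right rotors the other. The arm length `arm`, the yaw coefficient `k_yaw`, and the sign conventions are hypothetical values chosen for illustration.

```python
def mix(thrust, tau_roll, tau_pitch, tau_yaw, arm=0.1, k_yaw=0.02):
    """Per-rotor thrusts [front, left, back, right] for a '+' frame.

    thrust    : desired total thrust (N)
    tau_roll  : desired roll torque, from the left/right difference (N*m)
    tau_pitch : desired pitch torque, from the back/front difference (N*m)
    tau_yaw   : desired yaw torque, from rotor reaction moments (N*m)
    """
    f_front = thrust / 4 - tau_pitch / (2 * arm) + tau_yaw / (4 * k_yaw)
    f_left = thrust / 4 + tau_roll / (2 * arm) - tau_yaw / (4 * k_yaw)
    f_back = thrust / 4 + tau_pitch / (2 * arm) + tau_yaw / (4 * k_yaw)
    f_right = thrust / 4 - tau_roll / (2 * arm) - tau_yaw / (4 * k_yaw)
    return [f_front, f_left, f_back, f_right]


def wrench(forces, arm=0.1, k_yaw=0.02):
    """Inverse check: total thrust and body torques from rotor thrusts."""
    f_front, f_left, f_back, f_right = forces
    thrust = f_front + f_left + f_back + f_right
    tau_roll = arm * (f_left - f_right)
    tau_pitch = arm * (f_back - f_front)
    tau_yaw = k_yaw * (f_front - f_left + f_back - f_right)
    return thrust, tau_roll, tau_pitch, tau_yaw
```

Four inputs give direct control of total thrust and three torques. Sideways motion is not commanded directly: a roll or pitch torque tilts the vehicle, and the tilted thrust then pushes it sideways. A pure yaw command speeds up one counter-rotating pair and slows the other, leaving total thrust unchanged. A real mixer would also clip the outputs to the motors' limits.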
00:25:22.960 | - So if you could step back to a time,
00:25:27.280 | so I've always known flying robots as,
00:25:30.120 | to me it was natural that the quadcopter should fly.
00:25:35.840 | But when you first started working with it,
00:25:38.000 | I mean, how surprised are you that you can make,
00:25:42.080 | do so much with the four motors?
00:25:45.560 | How surprising is it you can make this thing fly,
00:25:47.640 | first of all, that you can make it hover,
00:25:49.800 | then you can add control to it?
00:25:52.040 | - Firstly, this is not,
00:25:54.460 | the four motor configuration is not ours.
00:25:56.860 | It has at least a hundred year history.
00:26:00.100 | - Oh, it does. - And various people,
00:26:02.480 | various people try to get quadrotors to fly
00:26:06.300 | without much success.
00:26:07.680 | As I said, we've been working on this since 2000.
00:26:11.560 | Our first designs were, well, this is way too complicated.
00:26:15.200 | Why not we try to get an omnidirectional flying robot?
00:26:19.200 | So our early designs, we had eight rotors.
00:26:22.800 | And so these eight rotors were arranged uniformly
00:26:26.120 | on a sphere, if you will.
00:26:28.900 | So you can imagine a symmetric configuration.
00:26:31.380 | And so you should be able to fly anywhere.
00:26:34.180 | But the real challenge we had
00:26:35.700 | is the strength to weight ratio was not enough.
00:26:37.900 | And of course we didn't have the sensors and so on.
00:26:41.260 | So everybody knew, or at least the people
00:26:43.860 | who worked with rotorcraft knew,
00:26:45.700 | four rotors will get it done.
00:26:47.320 | So that was not our idea.
00:26:50.220 | But it took a while before we could actually do
00:26:53.500 | the onboard sensing and the computation that was needed
00:26:57.740 | for the kinds of agile maneuvering
00:27:00.400 | that we wanted to do in our little aerial robots.
00:27:03.820 | And that only happened between 2007 and 2009 in our lab.
00:27:08.340 | - Yeah, and you have to send the signal
00:27:10.700 | many hundred times a second.
00:27:13.240 | So the compute there,
00:27:14.980 | is everything has to come down in price.
00:27:16.740 | And what are the steps of getting from point A to point B?
00:27:21.740 | So we just talked about like local control.
00:27:25.860 | But if all the kind of cool dancing in the air
00:27:30.860 | that I've seen you show, how do you make it happen?
00:27:35.160 | Make a trajectory, first of all, okay,
00:27:39.700 | figure out a trajectory, so plan a trajectory.
00:27:42.340 | And then how do you make that trajectory happen?
00:27:45.060 | - Yeah, I think planning is a very fundamental problem
00:27:47.300 | in robotics.
00:27:48.140 | I think 10 years ago, it was an esoteric thing.
00:27:50.820 | But today with self-driving cars,
00:27:53.060 | everybody can understand this basic idea
00:27:55.860 | that a car sees a whole bunch of things
00:27:57.940 | and it has to keep a lane or maybe make a right turn
00:28:00.340 | or switch lanes.
00:28:01.300 | It has to plan a trajectory.
00:28:02.700 | It has to be safe, it has to be efficient.
00:28:04.880 | So everybody's familiar with that.
00:28:06.660 | That's kind of the first step that you have to think about
00:28:10.300 | when you say autonomy.
00:28:14.860 | And so for us, it's about finding smooth motions,
00:28:19.140 | motions that are safe.
00:28:21.340 | So we think about these two things.
00:28:22.900 | One is optimality, one is safety.
00:28:24.700 | Clearly, you cannot compromise safety.
00:28:27.220 | So you're looking for safe, optimal motions.
00:28:31.380 | The other thing you have to think about is
00:28:34.480 | can you actually compute a reasonable trajectory
00:28:38.160 | in a small amount of time?
00:28:40.740 | 'Cause you have a time budget.
00:28:42.300 | So the optimal becomes suboptimal.
00:28:45.180 | But in our lab, we focus on synthesizing smooth trajectories
00:28:50.180 | that satisfy all the constraints.
00:28:53.020 | In other words, don't violate any safety constraints.
00:28:57.200 | And are as efficient as possible.
00:29:02.860 | And when I say efficient, it could mean
00:29:05.220 | I want to get from point A to point B
00:29:06.600 | as quickly as possible.
00:29:08.340 | Or I want to get to it as gracefully as possible.
00:29:12.820 | Or I want to consume as little energy as possible.
00:29:15.940 | - But always staying within the safety constraints.
00:29:18.180 | - But, yes, always finding a safe trajectory.
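A minimal sketch of the trade-off described above, assuming a one-dimensional rest-to-rest move and a minimum-jerk quintic as the smooth trajectory. That is a textbook choice used here for illustration, not necessarily what the lab uses. The function names and the velocity and acceleration limits are hypothetical; "safe" is reduced to staying under those two limits, and "efficient" to the shortest duration that does so.

```python
import math

# Peak normalized velocity and acceleration of the minimum-jerk profile
# s(tau) = 10 tau^3 - 15 tau^4 + 6 tau^5 for tau in [0, 1].
PEAK_VEL = 1.875               # value of s'(tau) at tau = 0.5
PEAK_ACC = 10 / math.sqrt(3)   # maximum of |s''(tau)|, about 5.77


def min_jerk(x0, x1, duration, t):
    """Return (position, velocity, acceleration) at time t of a rest-to-rest move."""
    tau = min(max(t / duration, 0.0), 1.0)  # clamp to the motion interval
    d = x1 - x0
    pos = x0 + d * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
    vel = d / duration * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)
    acc = d / duration**2 * (60 * tau - 180 * tau**2 + 120 * tau**3)
    return pos, vel, acc


def fastest_safe_duration(x0, x1, v_max, a_max):
    """Shortest duration whose peak speed and acceleration respect both limits."""
    d = abs(x1 - x0)
    return max(PEAK_VEL * d / v_max, math.sqrt(PEAK_ACC * d / a_max))


if __name__ == "__main__":
    # Hypothetical limits: 2 m/s and 3 m/s^2 over a 4 m move.
    T = fastest_safe_duration(0.0, 4.0, v_max=2.0, a_max=3.0)
    for k in range(5):
        t = k * T / 4
        print(f"t={t:.2f}s", min_jerk(0.0, 4.0, T, t))
```

Swapping the duration rule for an energy or smoothness cost would give the other notions of "efficient" mentioned above; the safety limits stay as hard constraints either way.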
00:29:22.780 | - So there's a lot of excitement and progress
00:29:24.980 | in the field of machine learning.
00:29:26.580 | - Yes.
00:29:27.420 | - And reinforcement learning and the neural network variant
00:29:31.700 | of that with deep reinforcement learning.
00:29:33.900 | Do you see a role of machine learning in...
00:29:37.100 | So a lot of the success of flying robots
00:29:40.540 | did not rely on machine learning.
00:29:42.260 | Except for maybe a little bit of the perception
00:29:44.980 | on the computer vision side.
00:29:46.540 | On the control side and the planning,
00:29:48.380 | do you see there's a role in the future
00:29:50.340 | for machine learning?
00:29:51.620 | - So let me disagree a little bit with you.
00:29:53.780 | I think we never perhaps called out,
00:29:56.180 | in my work, called out learning.
00:29:57.700 | But even this very simple idea of being able
00:30:00.100 | to fly through a constrained space.
00:30:02.180 | The first time you try it, you'll invariably,
00:30:07.660 | you might get it wrong if the task is challenging.
00:30:10.420 | And the reason is, to get it perfectly right,
00:30:14.180 | you have to model everything in the environment.
00:30:16.660 | And flying is notoriously hard to model.
00:30:22.120 | There are aerodynamic effects that we constantly discover.
00:30:27.120 | Even just before I was talking to you,
00:30:31.460 | I was talking to a student about how blades flap
00:30:35.600 | when they fly.
00:30:37.100 | - Wow.
00:30:37.940 | - And that ends up changing how a rotorcraft
00:30:42.140 | is accelerated in the angular direction.
00:30:45.900 | - Does it use like micro flaps or something?
00:30:48.420 | - It's not micro flaps.
00:30:49.260 | So we assume that each blade is rigid,
00:30:51.660 | but actually it flaps a little bit.
00:30:53.220 | - Oh.
00:30:54.060 | - It bends.
00:30:54.900 | - Interesting, yeah.
00:30:55.720 | - And so the models rely on the fact,
00:30:58.100 | on an assumption that they're actually rigid.
00:31:00.620 | But that's not true.
00:31:02.220 | If you're flying really quickly,
00:31:03.700 | these effects become significant.
00:31:06.900 | If you're flying close to the ground,
00:31:09.220 | you get pushed off by the ground, right?
00:31:12.140 | Something which every pilot knows when he tries to land
00:31:14.900 | or she tries to land, this is called a ground effect.
00:31:17.980 | Something very few pilots think about
00:31:20.980 | is what happens when you go close to a ceiling,
00:31:23.020 | where you get sucked into a ceiling.
00:31:25.300 | There are very few aircrafts that fly close
00:31:27.540 | to any kind of ceiling.
00:31:29.460 | Likewise, when you go close to a wall,
00:31:33.480 | there are these wall effects.
00:31:35.660 | And if you've gone on a train
00:31:37.620 | and you pass another train
00:31:39.020 | that's traveling in the opposite direction,
00:31:40.820 | you feel the buffeting.
00:31:42.360 | And so these kinds of micro climates
00:31:45.380 | affect our UAVs significantly.
00:31:47.820 | So if you want--
00:31:48.660 | - And they're impossible to model, essentially.
00:31:50.620 | - I wouldn't say they're impossible to model,
00:31:52.440 | but the level of sophistication you would need
00:31:54.860 | in the model and the software would be tremendous.
00:31:58.600 | Plus, to get everything right would be awfully tedious.
00:32:02.900 | So the way we do this is over time,
00:32:05.100 | we figure out how to adapt to these conditions.
00:32:08.980 | So early on, we used a form of learning
00:32:13.140 | that we call iterative learning.
00:32:15.760 | So this idea, if you want to perform a task,
00:32:18.580 | there are a few things that you need to change
00:32:22.100 | and iterate over a few parameters
00:32:25.580 | that over time, you can figure out.
00:32:29.920 | So I could call it policy gradient reinforcement learning,
00:32:34.000 | but actually it was just iterative learning.
00:32:35.500 | - Iterative learning.
00:32:36.340 | - And so this was there way back.
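The idea of repeating a task and adjusting a few parameters between attempts can be sketched as below. This is only an illustration under simple assumptions, not the lab's algorithm: `run_trial` is a hypothetical stand-in for one flight attempt, the unmodeled effect is a constant bias, and the gain value is arbitrary.

```python
def run_trial(command, unmodeled_bias=0.3):
    """Hypothetical flight attempt: the result differs from the command
    by an effect the nominal model does not capture."""
    return command + unmodeled_bias


def iterative_learning(target, trials=20, gain=0.5):
    """Repeat the task, correcting the command with the previous trial's error."""
    command = target  # the first attempt trusts the nominal model
    errors = []
    for _ in range(trials):
        achieved = run_trial(command)
        error = target - achieved
        errors.append(abs(error))
        command += gain * error  # carry part of the error into the next attempt
    return command, errors


if __name__ == "__main__":
    command, errors = iterative_learning(target=1.0)
    print(f"learned command: {command:.4f}")
    print("first errors:", [round(e, 4) for e in errors[:5]])
```

With a gain of 0.5 the error halves on every trial, so the command converges without the bias ever being modeled explicitly.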
00:32:38.460 | I think what's interesting is,
00:32:40.060 | if you look at autonomous vehicles today,
00:32:42.260 | learning could occur in two pieces.
00:32:46.340 | One is perception, understanding the world.
00:32:48.580 | Second is action, taking actions.
00:32:50.720 | Everything that I've seen that is successful
00:32:54.540 | is on the perception side of things.
00:32:56.620 | So in computer vision,
00:32:57.620 | we've made amazing strides in the last 10 years.
00:33:00.080 | So recognizing objects, actually detecting objects,
00:33:03.900 | classifying them and tagging them in some sense,
00:33:08.620 | annotating them, this is all done through machine learning.
00:33:11.900 | On the action side, on the other hand,
00:33:14.400 | I don't know of any examples
00:33:15.980 | where there are fielded systems
00:33:17.780 | where we actually learn the right behavior.
00:33:21.420 | - Outside of a single successful demonstration--
00:33:23.860 | - In the laboratory, this is the holy grail.
00:33:25.780 | Can you do end-to-end learning?
00:33:27.220 | Can you go from pixels to motor currents?
00:33:30.200 | This is really, really hard.
00:33:33.920 | And I think if you go forward,
00:33:36.200 | the right way to think about these things
00:33:38.800 | is data-driven approaches, learning-based approaches,
00:33:43.640 | in concert with model-based approaches,
00:33:46.440 | which is the traditional way of doing things.
00:33:48.440 | So I think there's a piece,
00:33:49.880 | there's a role for each of these methodologies.
00:33:52.340 | - So what do you think, just jumping out on topic,
00:33:54.800 | since you mentioned autonomous vehicles,
00:33:57.040 | what do you think are the limits on the perception side?
00:33:59.320 | So I've talked to Elon Musk,
00:34:01.960 | and there on the perception side,
00:34:04.200 | they're using primarily computer vision
00:34:06.760 | to perceive the environment.
00:34:08.880 | In your work with,
00:34:10.600 | because you work with the real world a lot,
00:34:13.360 | and the physical world,
00:34:14.520 | what are the limits of computer vision?
00:34:16.640 | Do you think we can solve autonomous vehicles
00:34:18.920 | focusing on the perception side,
00:34:21.720 | focusing on vision alone and machine learning?
00:34:25.080 | - So we also have a spin-off company, Exyn Technologies,
00:34:29.400 | that works underground in mines.
00:34:32.720 | So you go into mines, they're dark, they're dirty.
00:34:36.380 | You fly in a dirty area,
00:34:39.360 | there's stuff you kick up by the propellers,
00:34:41.840 | the downwash kicks up dust.
00:34:43.540 | I challenge you to get a computer vision algorithm
00:34:47.600 | to work there.
00:34:48.660 | So we use lidars in that setting.
00:34:51.680 | Indoors, and even outdoors when we fly through fields,
00:34:57.400 | I think there's a lot of potential
00:34:59.200 | for just solving the problem using computer vision alone.
00:35:02.040 | But I think the bigger question is,
00:35:04.840 | can you actually solve,
00:35:08.240 | or can you actually identify all the corner cases
00:35:11.480 | using a single-sensing modality and using learning alone?
00:35:15.680 | - So what's your intuition there?
00:35:17.180 | - So look, if you have a corner case
00:35:19.820 | and your algorithm doesn't work,
00:35:21.840 | your instinct is to go get data about the corner case
00:35:25.200 | and patch it up, learn how to deal with that corner case.
00:35:28.540 | But at some point, this is going to saturate,
00:35:33.920 | this approach is not viable.
00:35:36.040 | So today, computer vision algorithms
00:35:39.200 | can detect 90% of the objects,
00:35:40.960 | or can detect objects 90% of the time,
00:35:43.000 | classify them 90% of the time.
00:35:45.480 | Cats on the internet probably can do 95%.
00:35:49.520 | But to get from 90% to 99%, you need a lot more data.
00:35:54.240 | And then I tell you, well, that's not enough
00:35:56.120 | because I have a safety-critical application,
00:35:58.280 | I want to go from 99% to 99.9%.
00:36:01.920 | Well, that's even more data.
00:36:03.200 | So I think if you look at
00:36:06.000 | wanting accuracy on the X-axis
00:36:10.240 | and look at the amount of data on the Y-axis,
00:36:15.780 | I believe that curve is an exponential curve.
00:36:18.120 | - Wow, okay, it's even hard if it's linear.
00:36:21.120 | - It's hard if it's linear, totally,
00:36:22.380 | but I think it's exponential.
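One possible reading of that curve, sketched under loudly stated assumptions: each extra "nine" of accuracy multiplies the data requirement by a constant factor. The function name, the base of 1,000 samples, and the factor of 10 are all hypothetical placeholders, not measurements from the conversation.

```python
import math


def samples_needed(accuracy, base_samples=1_000, factor_per_nine=10.0):
    """Training-set size under an ASSUMED rule: each extra 'nine' of accuracy
    multiplies the data requirement by factor_per_nine (hypothetical values)."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    nines = -math.log10(1.0 - accuracy)  # 0.9 -> 1, 0.99 -> 2, 0.999 -> 3
    return base_samples * factor_per_nine ** (nines - 1.0)


if __name__ == "__main__":
    for acc in (0.9, 0.99, 0.999):
        print(f"{acc:.3f} -> about {samples_needed(acc):,.0f} samples")
```

Under this assumed rule the step from 99% to 99.9% costs as many times more data as the step from 90% to 99%, which is why patching corner cases one at a time eventually stops being practical.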
00:36:24.180 | And the other thing you have to think about
00:36:25.720 | is that this process is a very, very power-hungry process.
00:36:30.720 | To run data farms or servers--
00:36:34.480 | - Power, do you mean literally power?
00:36:36.240 | - Literally power, literally power.
00:36:38.140 | So in 2014, five years ago,
00:36:40.740 | and I don't have more recent data,
00:36:43.460 | 2% of US electricity consumption
00:36:46.060 | was from data farms.
00:36:49.940 | So we think about this as an information science
00:36:53.740 | and information processing problem.
00:36:55.820 | Actually, it is an energy processing problem.
00:36:59.420 | And so unless we figure out better ways of doing this,
00:37:01.980 | I don't think this is viable.
00:37:04.020 | - So talking about driving,
00:37:06.380 | which is a safety-critical application,
00:37:08.140 | and some aspect of flight is safety-critical,
00:37:11.900 | maybe philosophical question, maybe an engineering one,
00:37:14.420 | what problem do you think is harder to solve,
00:37:16.420 | autonomous driving or autonomous flight?
00:37:19.660 | - That's a really interesting question.
00:37:21.380 | I think autonomous flight has several advantages
00:37:26.380 | that autonomous driving doesn't have.
00:37:30.000 | So look, if I wanna go from point A to point B,
00:37:34.060 | I have a very, very safe trajectory.
00:37:35.900 | Go vertically up to a maximum altitude,
00:37:38.420 | fly horizontally to just about the destination,
00:37:41.060 | and then come down vertically.
00:37:42.560 | This is pre-programmed.
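The pre-programmed route just described can be written down as four waypoints. This is a minimal sketch: the function name, the coordinate convention (x, y, altitude), and the altitude check are assumptions for illustration, and the descent here goes straight down over the destination.

```python
def up_over_down(start, goal, cruise_altitude):
    """Waypoints for: climb vertically, fly level, descend vertically.

    start and goal are (x, y, altitude) tuples; cruise_altitude is assumed
    to be an altitude known to be clear of obstacles.
    """
    sx, sy, sz = start
    gx, gy, gz = goal
    if cruise_altitude < max(sz, gz):
        raise ValueError("cruise altitude must be at or above both endpoints")
    return [
        (sx, sy, sz),               # take-off point
        (sx, sy, cruise_altitude),  # top of the vertical climb
        (gx, gy, cruise_altitude),  # end of the level leg, above the goal
        (gx, gy, gz),               # landing point
    ]


if __name__ == "__main__":
    for waypoint in up_over_down((0, 0, 0), (120, 40, 0), cruise_altitude=60):
        print(waypoint)
```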
00:37:47.000 | The equivalent of that is very hard to find
00:37:49.640 | in the self-driving car world,
00:37:51.940 | because you're on the ground,
00:37:53.160 | you're in a two-dimensional surface,
00:37:55.240 | and the trajectories on the two-dimensional surface
00:37:58.560 | are more likely to encounter obstacles.
00:38:01.040 | I mean this in an intuitive sense,
00:38:03.800 | but it's true
00:38:04.760 | mathematically as well.
00:38:08.120 | - There's other option on the 2D space of platooning,
00:38:11.780 | or because there's so many obstacles,
00:38:13.380 | you can connect to those obstacles
00:38:15.000 | and all these kinds of options.
00:38:15.840 | - Those exist in the three-dimensional space as well.
00:38:18.200 | - So they do.
00:38:19.040 | So the question also implies,
00:38:21.680 | how difficult are obstacles
00:38:23.480 | in the three-dimensional space in flight?
00:38:25.520 | - So that's the downside.
00:38:27.400 | I think in three-dimensional space,
00:38:28.640 | you're modeling three-dimensional world,
00:38:30.960 | not just because you wanna avoid it,
00:38:32.960 | but you wanna reason about it,
00:38:34.720 | and you wanna work in that three-dimensional environment,
00:38:37.000 | and that's significantly harder.
00:38:39.240 | So that's one disadvantage.
00:38:40.600 | I think the second disadvantage is, of course,
00:38:42.960 | anytime you fly, you have to put up
00:38:45.040 | with the peculiarities of aerodynamics,
00:38:48.060 | and they're complicated environments,
00:38:50.500 | how do you negotiate that?
00:38:51.640 | So that's always a problem.
00:38:53.680 | - Do you see a time in the future where there is,
00:38:57.000 | you mentioned there's agriculture applications,
00:39:00.440 | so there's a lot of applications of flying robots,
00:39:03.480 | but do you see a time in the future
00:39:04.840 | where there is tens of thousands,
00:39:07.260 | or maybe hundreds of thousands of delivery drones
00:39:10.000 | that fill the sky, delivery flying robots?
00:39:14.040 | - I think there's a lot of potential
00:39:16.000 | for the last mile delivery,
00:39:17.760 | and so in crowded cities,
00:39:20.660 | I don't know, if you go to a place like Hong Kong,
00:39:24.280 | just crossing the river can take half an hour,
00:39:27.240 | and while a drone can just do it in five minutes at most.
00:39:32.240 | I think you look at delivery of supplies to remote villages.
00:39:38.720 | I work with a nonprofit called WeRobotics,
00:39:41.600 | so they work in the Peruvian Amazon,
00:39:43.840 | where the only highways are rivers,
00:39:47.280 | and to get from point A to point B may take five hours,
00:39:52.280 | while with a drone, you can get there in 30 minutes.
00:39:55.400 | So just delivering drugs, retrieving samples
00:40:01.000 | for testing vaccines,
00:40:04.720 | I think there's huge potential here.
00:40:06.960 | So I think the challenges are not technological,
00:40:09.840 | the challenge is economic.
00:40:11.920 | The one thing I'll tell you that nobody thinks about
00:40:16.320 | is the fact that we've not made huge strides
00:40:19.640 | in battery technology.
00:40:21.540 | Yes, it's true, batteries are becoming less expensive
00:40:24.240 | because we have these mega factories that are coming up,
00:40:26.940 | but they're all based on lithium-based technologies,
00:40:29.480 | and if you look at the energy density and the power density,
00:40:33.960 | those are two fundamentally limiting numbers.
00:40:38.700 | So power density is important because for a UAV
00:40:41.320 | to take off vertically into the air,
00:40:43.160 | which most drones do, they don't have a runway,
00:40:47.040 | you consume roughly 200 watts per kilo at the small size.
00:40:50.900 | That's a lot, right?
00:40:54.560 | In contrast, the human brain consumes less than 80 watts,
00:40:58.200 | the whole of the human brain.
00:41:00.520 | So just imagine just lifting yourself into the air
00:41:04.240 | is like two or three light bulbs,
00:41:06.640 | which makes no sense to me.
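The arithmetic behind that figure can be sketched as follows. The 200 watts per kilo comes from the conversation; the battery mass and the 150 Wh/kg energy density in the example are hypothetical placeholders, and the estimate ignores motor, propeller, and electronics losses.

```python
def hover_power_w(total_mass_kg, watts_per_kg=200.0):
    """Power needed to hover, using the rough 200 W/kg figure quoted above."""
    return total_mass_kg * watts_per_kg


def hover_endurance_min(total_mass_kg, battery_mass_kg, battery_wh_per_kg,
                        watts_per_kg=200.0):
    """Idealized hover time in minutes: stored energy divided by hover power."""
    energy_wh = battery_mass_kg * battery_wh_per_kg
    return 60.0 * energy_wh / hover_power_w(total_mass_kg, watts_per_kg)


if __name__ == "__main__":
    # Hypothetical 1 kg vehicle carrying a 0.3 kg battery at 150 Wh/kg.
    print(f"hover power: {hover_power_w(1.0):.0f} W")
    print(f"endurance:   {hover_endurance_min(1.0, 0.3, 150.0):.1f} min")
```

Power density sets whether the vehicle can lift off at all; energy density, through the endurance estimate, sets how far it can go.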
00:41:08.480 | - Yeah, so you're going to have to, at scale,
00:41:10.940 | solve the energy problem then,
00:41:13.360 | charging the batteries, storing the energy, and so on.
00:41:18.360 | - And then the storage is the second problem,
00:41:21.160 | but storage limits the range.
00:41:23.420 | But you have to remember that you have to burn a lot of it
00:41:28.940 | per given time.
00:41:32.060 | - So the burning is another problem.
00:41:33.980 | - Which is a power question.
00:41:35.140 | - Yes, and do you think, just your intuition,
00:41:39.140 | there are breakthroughs in batteries on the horizon?
00:41:44.140 | How hard is that problem?
00:41:46.940 | - Look, there are a lot of companies that are promising
00:41:49.860 | flying cars that are autonomous and that are clean.
00:41:55.340 | - Right.
00:41:56.180 | - I think they're over-promising.
00:42:01.100 | The autonomy piece is doable.
00:42:04.220 | The clean piece, I don't think so.
00:42:06.460 | There's another company that I work with called Jetoptera.
00:42:11.300 | They make small jet engines.
00:42:13.820 | And they can get up to 50 miles an hour very easily
00:42:17.460 | and lift 50 kilos.
00:42:19.380 | But they're jet engines.
00:42:21.180 | They're efficient.
00:42:23.380 | They're a little louder than electric vehicles,
00:42:26.860 | but they can build flying cars.
00:42:29.460 | - So your sense is that there's a lot of pieces
00:42:32.100 | that have come together.
00:42:33.540 | So on this crazy question,
00:42:37.380 | if you look at companies like Kitty Hawk
00:42:39.620 | working on electric, so the clean,
00:42:42.080 | talking to Sebastian Thrun, right?
00:42:45.820 | It's a crazy dream, you know?
00:42:48.860 | But you work with flight a lot.
00:42:52.060 | You've mentioned before that manned flights
00:42:55.780 | or carrying a human body is very difficult to do.
00:43:00.780 | So how crazy is flying cars?
00:43:04.180 | Do you think there'll be a day when we have
00:43:06.220 | vertical takeoff and landing vehicles
00:43:11.060 | that are sufficiently affordable
00:43:13.980 | that we're going to see a huge amount of them?
00:43:17.380 | And they would look like something like we dream of
00:43:19.660 | when we think about flying cars.
00:43:21.020 | - Yeah, like the Jetsons.
00:43:22.140 | - The Jetsons, yeah.
00:43:23.100 | - So look, there are a lot of smart people working on this.
00:43:25.540 | And you never say something is not possible
00:43:29.660 | when you have people like Sebastian Thrun working on it.
00:43:32.180 | So I totally think it's viable.
00:43:35.140 | I question, again, the electric piece.
00:43:38.260 | - The electric piece, yeah.
00:43:39.540 | - And again, for short distances, you can do it.
00:43:41.660 | And there's no reason to suggest
00:43:43.660 | that these all just have to be rotorcrafts.
00:43:45.820 | You take off vertically,
00:43:46.900 | but then you morph into a forward flight.
00:43:49.660 | I think there are a lot of interesting designs.
00:43:51.620 | The question to me is, are these economically viable?
00:43:56.060 | And if you agree to do this with fossil fuels,
00:43:59.180 | it immediately becomes viable.
00:44:01.980 | - That's a real challenge.
00:44:03.500 | Do you think it's possible for robots and humans
00:44:06.600 | to collaborate successfully on tasks?
00:44:08.880 | So a lot of robotics folks that I talk to and work with,
00:44:13.700 | I mean, humans just add a giant mess to the picture.
00:44:18.020 | So it's best to remove them from consideration
00:44:20.380 | when solving specific tasks.
00:44:22.460 | It's very difficult to model.
00:44:23.620 | There's just a source of uncertainty.
00:44:26.060 | In your work with these agile flying robots,
00:44:31.060 | do you think there's a role for collaboration with humans,
00:44:35.720 | or is it best to model tasks in a way
00:44:38.660 | that doesn't have a human in the picture?
00:44:43.460 | - Well, I don't think we should ever think about robots
00:44:46.800 | without humans in the picture.
00:44:48.140 | Ultimately, robots are there because we want them
00:44:51.020 | to solve problems for humans.
00:44:54.420 | But there's no general solution to this problem.
00:44:58.340 | I think if you look at human interaction
00:45:00.060 | and how humans interact with robots,
00:45:02.460 | you know, we think of these in sort of three different ways.
00:45:05.340 | One is the human commanding the robot.
00:45:07.640 | The second is the human collaborating with the robot.
00:45:12.940 | So for example, we work on how a robot
00:45:15.580 | can actually pick up things with a human and carry things.
00:45:18.800 | That's like true collaboration.
00:45:20.960 | And third, we think about humans as bystanders.
00:45:25.000 | Self-driving cars, what's the human's role,
00:45:27.300 | and how do self-driving cars
00:45:30.400 | acknowledge the presence of humans?
00:45:33.000 | So I think all of these things are different scenarios.
00:45:35.920 | It depends on what kind of humans, what kind of task.
00:45:38.560 | And I think it's very difficult to say
00:45:41.920 | that there's a general theory that we all have for this.
00:45:45.580 | But at the same time, it's also silly to say
00:45:48.500 | that we should think about robots independent of humans.
00:45:52.100 | So to me, human-robot interaction
00:45:55.840 | is almost a mandatory aspect of everything we do.
00:45:59.820 | - Yes, but to which degree?
00:46:01.500 | So your thoughts, if we jump to autonomous vehicles,
00:46:04.140 | for example, there's a big debate
00:46:07.380 | between what's called level two and level four.
00:46:10.700 | So semi-autonomous and autonomous vehicles.
00:46:13.720 | And sort of the Tesla approach currently at least
00:46:16.480 | has a lot of collaboration between human and machine.
00:46:19.000 | So the human is supposed to actively supervise
00:46:22.080 | the operation of the robot.
00:46:23.920 | Part of the safety definition of how safe a robot is
00:46:28.920 | in that case is how effective is the human in monitoring it?
00:46:32.920 | Do you think that's ultimately not a good approach
00:46:39.720 | in sort of having a human in the picture,
00:46:42.380 | not as a bystander or part of the infrastructure,
00:46:47.400 | but really as part of what's required
00:46:50.000 | to make the system safe?
00:46:51.560 | - This is harder than it sounds.
00:46:53.720 | I think, you know, if you,
00:46:56.040 | I mean, I'm sure you've driven before
00:47:00.200 | on highways and so on.
00:47:01.920 | It's really very hard to have,
00:47:04.000 | to relinquish control to a machine
00:47:07.560 | and then take over when needed.
00:47:10.480 | So I think Tesla's approach is interesting
00:47:12.320 | 'cause it allows you to periodically establish
00:47:14.840 | some kind of contact with the car.
00:47:18.540 | Toyota on the other hand is thinking about
00:47:20.660 | shared autonomy or collaborative autonomy as a paradigm.
00:47:24.820 | If I may argue, these are very, very simple ways
00:47:27.480 | of human-robot collaboration.
00:47:29.700 | 'Cause the task is pretty boring.
00:47:31.900 | You sit in a vehicle, you go from point A to point B.
00:47:35.000 | I think the more interesting thing to me is,
00:47:37.360 | for example, search and rescue,
00:47:38.760 | I've got a human first responder, robot first responders.
00:47:41.980 | I gotta do something.
00:47:45.140 | It's important, I have to do it in two minutes.
00:47:47.800 | The building is burning, there's been an explosion,
00:47:50.440 | it's collapsed, how do I do it?
00:47:52.800 | I think to me, those are the interesting things
00:47:54.740 | where it's very, very unstructured
00:47:57.160 | and what's the role of the human, what's the role of the robot?
00:48:00.200 | Clearly, there's lots of interesting challenges
00:48:02.440 | and as a field, I think we're gonna make
00:48:04.240 | a lot of progress in this area.
00:48:05.760 | - Yeah, it's an exciting form of collaboration.
00:48:07.600 | You're right, in autonomous driving,
00:48:09.420 | the main enemy is just boredom of the human.
00:48:13.120 | - Yes.
00:48:13.960 | - As opposed to in rescue operations,
00:48:15.680 | it's literally life and death and the collaboration
00:48:20.680 | enables the effective completion of the mission.
00:48:23.820 | So it's exciting.
00:48:24.760 | - In some sense, we're also doing this,
00:48:27.840 | you think about the human driving a car
00:48:30.520 | and almost invariably, the human's trying to estimate
00:48:34.240 | the state of the car, they estimate the state
00:48:35.680 | of the environment and so on.
00:48:37.240 | But what if the car were to estimate the state of the human?
00:48:40.080 | So for example, I'm sure you have a smartphone
00:48:41.920 | and the smartphone tries to figure out what you're doing
00:48:44.560 | and send you reminders and oftentimes telling you
00:48:48.280 | to drive to a certain place, although you have no intention
00:48:50.420 | of going there because it thinks that that's where
00:48:52.600 | you should be 'cause of some Gmail calendar entry
00:48:56.240 | or something like that and it's trying to constantly figure
00:49:00.880 | out who you are, what you're doing.
00:49:02.720 | If a car were to do that, maybe that would make
00:49:05.240 | the driver safer because the car's trying to figure out
00:49:08.120 | is the driver paying attention, looking at his or her eyes,
00:49:11.580 | looking at saccade movements.
00:49:14.400 | So I think the potential is there but from the reverse side,
00:49:18.600 | it's not robot modeling but it's human modeling.
00:49:21.640 | - It's more in the human, right.
00:49:22.880 | - And I think the robots can do a very good job
00:49:25.320 | of modeling humans if you really think about the framework
00:49:29.120 | that you have a human sitting in a cockpit
00:49:32.600 | surrounded by sensors all staring at him
00:49:35.800 | in addition to staring outside, but also staring at him.
00:49:39.160 | I think there's a real synergy there.
00:49:40.960 | - Yeah, I love that problem 'cause it's the new
00:49:43.440 | 21st century form of psychology actually,
00:49:46.460 | AI-enabled psychology.
00:49:48.520 | A lot of people have sci-fi-inspired fears
00:49:51.280 | of walking robots like those from Boston Dynamics
00:49:54.080 | if you just look at shows on Netflix and so on
00:49:56.440 | or flying robots like those you work with.
00:49:59.880 | How would you, how do you think about those fears?
00:50:03.120 | How would you alleviate those fears?
00:50:05.000 | Do you have inklings, echoes of those same concerns?
00:50:09.000 | - You know, anytime we develop a technology
00:50:11.680 | meaning to have positive impact in the world,
00:50:14.120 | there's always the worry that somebody could subvert
00:50:19.120 | those technologies and use it in an adversarial setting
00:50:23.240 | and robotics is no exception, right.
00:50:25.280 | So I think it's very easy to weaponize robots.
00:50:29.280 | I think we talk about swarms.
00:50:31.720 | One thing I worry a lot about is,
00:50:33.960 | so for us to get swarms to work
00:50:35.880 | and do something reliably is really hard.
00:50:38.280 | But suppose I have this challenge
00:50:42.040 | of trying to destroy something
00:50:44.360 | and I have a swarm of robots where only one out of the swarm
00:50:47.280 | needs to get to its destination.
00:50:48.920 | So that suddenly becomes a lot more doable.
00:50:52.640 | And so I worry about this general idea of using autonomy
00:50:56.960 | with lots and lots of agents.
00:50:59.640 | I mean, having said that,
00:51:01.080 | look, a lot of this technology is not very mature.
00:51:03.760 | My favorite saying is that
00:51:05.520 | if somebody had to develop this technology,
00:51:10.520 | wouldn't you rather the good guys do it?
00:51:12.320 | So the good guys have a good understanding of the technology
00:51:14.640 | so they can figure out how this technology
00:51:16.480 | is being used in a bad way
00:51:18.320 | or could be used in a bad way and try to defend against it.
00:51:21.360 | So we think a lot about that.
00:51:22.760 | So we have, we're doing research
00:51:25.360 | on how to defend against swarms, for example.
00:51:28.240 | - That's interesting.
00:51:29.600 | - There's in fact a report by the National Academies
00:51:33.000 | on counter UAS technologies.
00:51:35.560 | This is a real threat,
00:51:38.240 | but we're also thinking about how to defend against this
00:51:40.360 | and knowing how swarms work,
00:51:42.960 | knowing how autonomy works is I think very important.
00:51:47.160 | - So it's not just politicians.
00:51:49.320 | You think engineers have a role in this discussion?
00:51:51.640 | - Absolutely.
00:51:52.480 | I think the days where politicians
00:51:55.320 | can be agnostic to technology are gone.
00:51:58.720 | I think every politician needs to be literate in technology.
00:52:03.720 | And I often say technology is the new liberal art.
00:52:08.680 | Understanding how technology will change your life
00:52:12.920 | I think is important.
00:52:14.480 | And every human being needs to understand that.
00:52:18.080 | - And maybe we can elect some engineers to office as well
00:52:21.480 | on the other side.
00:52:22.720 | What are the biggest open problems in robotics?
00:52:24.840 | And you said we're in the early days in some sense.
00:52:27.760 | What are the problems we would like to solve in robotics?
00:52:31.040 | - I think there are lots of problems, right?
00:52:32.520 | But I would phrase it in the following way.
00:52:36.440 | If you look at the robots we're building,
00:52:39.520 | they're still very much tailored towards
00:52:43.160 | doing specific tasks in specific settings.
00:52:46.520 | I think the question of how do you get them to operate
00:52:49.440 | in much broader settings
00:52:53.600 | where things can change in unstructured environments
00:52:58.080 | is up in the air.
00:52:59.200 | So think of self-driving cars.
00:53:01.240 | Today we can build a self-driving car in a parking lot.
00:53:05.720 | We can do level five autonomy in a parking lot.
00:53:09.040 | But can you do level five autonomy
00:53:13.280 | in the streets of Napoli in Italy or Mumbai in India?
00:53:17.800 | So in some sense, when we think about robotics,
00:53:22.440 | we have to think about where they're functioning,
00:53:25.160 | what kind of environment, what kind of a task.
00:53:27.800 | We have no understanding
00:53:29.840 | of how to put both those things together.
00:53:32.840 | - So we're in the very early days
00:53:34.040 | of applying it to the physical world.
00:53:35.960 | And I was just in Naples actually.
00:53:38.840 | And there's levels of difficulty and complexity
00:53:42.240 | depending on which area you're applying it to.
00:53:45.960 | - I think so.
00:53:46.800 | And we don't have a systematic way of understanding that.
00:53:51.120 | Everybody says just 'cause a computer
00:53:53.880 | can now beat a human at any board game,
00:53:56.600 | we certainly know something about intelligence.
00:54:00.000 | That's not true.
00:54:01.440 | A computer board game is very, very structured.
00:54:04.480 | It is the equivalent of working in a Henry Ford factory
00:54:08.560 | where parts come, you assemble, move on.
00:54:11.760 | It's a very, very, very structured setting.
00:54:14.200 | That's the easiest thing.
00:54:15.760 | And we know how to do that.
00:54:17.120 | - So you've done a lot of incredible work
00:54:20.440 | at the UPenn, University of Pennsylvania, GRASP Lab.
00:54:23.760 | You're now Dean of Engineering at UPenn.
00:54:26.600 | What advice do you have for a new bright-eyed undergrad
00:54:31.360 | interested in robotics or AI or engineering?
00:54:34.680 | - Well, I think there's really three things.
00:54:36.600 | One is you have to get used to the idea
00:54:40.640 | that the world will not be the same in five years
00:54:42.880 | or four years whenever you graduate, right?
00:54:45.200 | Which is really hard to do.
00:54:46.160 | So this thing about predicting the future,
00:54:49.000 | every one of us needs to be trying
00:54:50.560 | to predict the future always.
00:54:52.360 | Not because you'll be any good at it,
00:54:55.040 | but by thinking about it, I think you sharpen your senses
00:54:59.160 | and you become smarter.
00:55:00.920 | So that's number one.
00:55:02.120 | Number two, and it's a corollary of the first piece,
00:55:05.800 | which is you really don't know what's gonna be important.
00:55:09.440 | So this idea that I'm gonna specialize in something
00:55:12.120 | which will allow me to go in a particular direction,
00:55:15.360 | it may be interesting,
00:55:16.520 | but it's important also to have this breadth
00:55:18.520 | so you have this jumping off point.
00:55:20.360 | I think the third thing,
00:55:23.000 | and this is where I think Penn excels.
00:55:25.360 | I mean, we teach engineering,
00:55:27.280 | but it's always in the context of the liberal arts.
00:55:30.000 | It's always in the context of society.
00:55:32.360 | As engineers, we cannot afford to lose sight of that.
00:55:35.880 | So I think that's important.
00:55:37.640 | But I think one thing that people underestimate
00:55:39.960 | when they do robotics
00:55:40.920 | is the importance of mathematical foundations,
00:55:43.440 | the importance of representations.
00:55:47.720 | Not everything can just be solved
00:55:50.040 | by looking for ROS packages on the internet
00:55:52.400 | or to find a deep neural network that works.
00:55:56.240 | I think the representation question is key,
00:55:59.080 | even to machine learning,
00:56:00.360 | where if you ever hope to achieve or get to explainable AI,
00:56:05.360 | somehow there need to be representations
00:56:07.720 | that you can understand.
00:56:09.040 | - So if you wanna do robotics,
00:56:11.120 | you should also do mathematics.
00:56:12.640 | And you said liberal arts, a little literature.
00:56:15.040 | If you wanna build a robot,
00:56:16.960 | you should be reading Dostoevsky.
00:56:19.280 | I agree with that.
00:56:20.320 | - Very good. (laughs)
00:56:21.920 | - So Vijay, thank you so much for talking today.
00:56:23.520 | It was an honor. - Thank you.
00:56:24.360 | It was just a very exciting conversation.
00:56:26.160 | Thank you.
00:56:27.000 | (upbeat music)