
George Hotz: Comma.ai, OpenPilot, and Autonomous Vehicles | Lex Fridman Podcast #31


Chapters

0:00 George Hotz
5:52 Virtual Reality
7:21 iPhone Hack
14:50 Nuclear Weapons
17:28 How Was Your Evolution as a Programmer
19:04 Project Zero
35:35 System Specs
42:15 Driver Monitoring
43:46 Cancel in Autopilot
61:46 Natural Lane Change
69:51 What Does Success Look Like for Comma AI
78:32 Who Are the Competitors
104:56 Trolley Problem
108:06 Teleoperation


00:00:00.000 | The following is a conversation with George Hotz.
00:00:02.540 | He's the founder of Comma AI,
00:00:04.500 | a machine learning based vehicle automation company.
00:00:07.420 | He is most certainly an outspoken personality
00:00:10.220 | in the field of AI and technology in general.
00:00:13.180 | He first gained recognition for being the first person
00:00:16.220 | to carrier unlock an iPhone.
00:00:18.420 | And since then, he's done quite a few interesting things
00:00:21.260 | at the intersection of hardware and software.
00:00:24.420 | This is the Artificial Intelligence Podcast.
00:00:27.460 | If you enjoy it, subscribe on YouTube,
00:00:29.580 | give it five stars on iTunes, support it on Patreon,
00:00:32.860 | or simply connect with me on Twitter
00:00:34.860 | at Lex Fridman, spelled F-R-I-D-M-A-N.
00:00:39.080 | And I'd like to give a special thank you to Jennifer
00:00:41.960 | from Canada for her support of the podcast on Patreon.
00:00:45.820 | Merci beaucoup, Jennifer.
00:00:47.660 | She's been a friend and an engineering colleague
00:00:50.580 | for many years since I was in grad school.
00:00:52.740 | Your support means a lot and inspires me
00:00:55.460 | to keep this series going.
00:00:57.860 | And now, here's my conversation with George Hotz.
00:01:01.580 | Do you think we're living in a simulation?
00:01:04.800 | - Yes, but it may be unfalsifiable.
00:01:10.020 | - What do you mean by unfalsifiable?
00:01:12.420 | - So if the simulation is designed in such a way
00:01:16.820 | that they did like a formal proof
00:01:19.620 | to show that no information can get in and out,
00:01:22.260 | and if their hardware is designed
00:01:24.060 | for anything in the simulation
00:01:25.940 | to always keep the hardware in spec,
00:01:27.860 | it may be impossible to prove
00:01:29.460 | whether we're in a simulation or not.
00:01:31.300 | - So they've designed it such that it's a closed system,
00:01:35.620 | you can't get outside the system.
00:01:37.140 | - Well, maybe it's one of three worlds.
00:01:38.740 | We're either in a simulation which can be exploited,
00:01:41.340 | we're in a simulation which not only can't be exploited,
00:01:44.140 | but like, the same thing's true about VMs.
00:01:46.380 | A really well-designed VM,
00:01:48.100 | you can't even detect if you're in a VM or not.
00:01:50.460 | - That's brilliant.
00:01:52.460 | So we're, it's, yeah, so the simulation's running
00:01:55.140 | on a virtual machine.
00:01:56.500 | - Yeah, but now in reality, all VMs have ways to detect.
00:01:59.420 | - That's the point.
00:02:00.260 | I mean, is it, you've done quite a bit of hacking yourself,
00:02:04.540 | and so you should know that really any complicated system
00:02:08.580 | will have ways in and out.
00:02:10.980 | - So this isn't necessarily true going forward.
00:02:14.180 | In my time away from Comma, I learned Coq.
00:02:19.700 | It's a dependently typed, like,
00:02:21.780 | it's a language for writing math proofs.
00:02:24.300 | And if you write code that compiles in a language like that,
00:02:28.140 | it is correct by definition.
00:02:30.780 | The types check its correctness.
00:02:33.540 | So it's possible that the simulation
00:02:34.940 | is written in a language like this, in which case, you know.
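
To make "the types check its correctness" concrete, here is a minimal sketch in Lean, a proof assistant in the same dependently typed family as Coq (illustrative only, not from the conversation): the statement of a theorem is a type, the term after := is the proof, and if the file type-checks, the claim holds for every input.

```lean
-- If this compiles, the property is proven for all natural numbers a and b.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete claim checked by definitional equality: no tests needed.
example : (2 : Nat) + 2 = 4 := rfl
```
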
00:02:39.580 | - Yeah, but a language like that
00:02:42.620 | can't be sufficiently expressive.
00:02:43.700 | - Oh, it can.
00:02:44.580 | - It can be?
00:02:45.420 | - Oh, yeah.
00:02:46.260 | - Okay, well, so, all right, so--
00:02:48.940 | - The simulation doesn't have to be Turing-complete
00:02:50.620 | if it has a scheduled end date.
00:02:52.300 | - Looks like it does, actually, with entropy.
00:02:54.580 | - I mean, I don't think that a simulation
00:02:58.540 | that results in something as complicated as the universe
00:03:02.220 | would have a formal proof of correctness, right?
00:03:07.300 | It's possible, of course.
00:03:09.820 | - We have no idea how good their tooling is,
00:03:12.740 | and we have no idea how complicated
00:03:14.620 | the universe computer really is.
00:03:16.260 | It may be quite simple.
00:03:17.900 | - It's just very large, right?
00:03:19.660 | - It's very, it's definitely very large.
00:03:22.140 | - But the fundamental rules might be super simple.
00:03:24.460 | - Yeah, Conway's Game of Life kind of stuff.
00:03:26.220 | - Right. (laughs)
00:03:28.060 | So, if you could hack, so imagine simulation
00:03:31.700 | that is hackable, if you could hack it,
00:03:33.660 | what would you change about the universe?
00:03:37.980 | Like, how would you approach hacking a simulation?
00:03:40.540 | - The reason I gave that talk--
00:03:44.340 | - By the way, I'm not familiar with the talk you gave.
00:03:46.660 | I just read that you talked about escaping the simulation
00:03:50.140 | or something like that. - Yeah.
00:03:51.260 | - So maybe you can tell me a little bit
00:03:52.620 | about the theme and the message there, too.
00:03:55.340 | - It wasn't a very practical talk
00:03:57.660 | about how to actually escape a simulation.
00:04:00.580 | It was more about a way of restructuring
00:04:03.300 | an us versus them narrative.
00:04:05.100 | If we continue on the path we're going with technology,
00:04:10.100 | I think we're in big trouble, like, as a species,
00:04:15.900 | and not just as a species, but even as me
00:04:17.300 | as an individual member of the species.
00:04:19.460 | So, if we could change rhetoric to be more like,
00:04:23.660 | to think upwards, like, to think about
00:04:28.140 | that we're in a simulation and how we could get out,
00:04:30.380 | already we'd be on the right path.
00:04:32.620 | What you actually do once you do that,
00:04:34.820 | well, I assume I would have acquired way more intelligence
00:04:37.380 | in the process of doing that, so I'll just ask that.
00:04:39.740 | - So, the thinking upwards, what kind of ideas,
00:04:43.780 | what kind of breakthrough ideas do you think
00:04:45.260 | thinking in that way could inspire?
00:04:48.300 | And why did you say upwards?
00:04:49.780 | - Upwards.
00:04:50.620 | - Into space?
00:04:51.440 | Are you thinking sort of exploration in all forms?
00:04:54.060 | - The space narrative that held for the modernist generation
00:04:59.060 | doesn't hold as well for the postmodern generation.
00:05:02.580 | - What's the space narrative?
00:05:05.380 | Are we talking about the same space,
00:05:06.500 | the three-dimensional space?
00:05:07.340 | - No, no, space, like, going out to space.
00:05:08.740 | Like, building, like, Elon Musk.
00:05:10.020 | Like, we're gonna build rockets, we're gonna go to Mars,
00:05:12.020 | we're gonna colonize the universe.
00:05:13.500 | - And the narrative you're referring,
00:05:14.660 | I was born in the Soviet Union,
00:05:16.020 | you're referring to the race to space.
00:05:18.020 | - The race to space, yes.
00:05:18.860 | - Explore, okay.
00:05:19.700 | - That was a great modernist narrative.
00:05:21.820 | It doesn't seem to hold the same weight in today's culture.
00:05:26.720 | I'm hoping for good postmodern narratives that replace it.
00:05:32.180 | - So, let's think, so you work a lot with AI.
00:05:35.580 | So, AI is one formulation of that narrative.
00:05:39.100 | There could be also, I don't know how much you do
00:05:40.980 | in VR and AR.
00:05:42.660 | That's another, I know less about it,
00:05:45.160 | but every time I play with it in our research,
00:05:47.620 | it's fascinating, that virtual world.
00:05:49.660 | Are you interested in the virtual world?
00:05:51.860 | - I would like to move to virtual reality.
00:05:54.200 | - In terms of your work?
00:05:56.420 | - No, I would like to physically move there.
00:05:58.780 | The apartment I can rent in the cloud is way better
00:06:00.660 | than the apartment I can rent in the real world.
00:06:03.220 | - Well, it's all relative, isn't it?
00:06:04.780 | Because others will have very nice apartments too,
00:06:07.300 | so you'll be inferior in the virtual world as well.
00:06:09.260 | - No, but that's not how I view the world, right?
00:06:11.340 | I don't view the world, I mean, it's a very, like,
00:06:14.020 | almost zero-sum-ish way to view the world.
00:06:16.400 | Say, like, my great apartment isn't great
00:06:18.820 | because my neighbor has one too.
00:06:20.420 | No, my great apartment is great
00:06:21.660 | because, like, look at this dishwasher, man.
00:06:24.340 | You just touch the dish and it's washed, right?
00:06:26.700 | And that is great in and of itself
00:06:28.740 | if I have the only apartment
00:06:30.140 | or if everybody had the apartment, I don't care.
00:06:32.420 | - So you have fundamental gratitude.
00:06:34.780 | The world first learned of Geohot, George Hotz,
00:06:39.140 | in August 2007, maybe before then,
00:06:42.300 | but certainly in August 2007
00:06:44.120 | when you were the first person to unlock,
00:06:46.760 | carrier unlock an iPhone.
00:06:48.880 | How did you get into hacking?
00:06:50.520 | What was the first system you discovered vulnerabilities for
00:06:54.200 | and broke into?
00:06:55.040 | - So, that was really kind of the first thing.
00:07:01.240 | I had a book in 2006 called "Grey Hat Hacking,"
00:07:06.640 | and I guess I realized that
00:07:11.000 | if you acquired these sort of powers,
00:07:13.440 | you could control the world.
00:07:15.260 | But I didn't really know that much
00:07:18.880 | about computers back then.
00:07:20.520 | I started with electronics.
00:07:22.120 | The first iPhone hack was physical.
00:07:24.160 | - Hardware.
00:07:25.000 | - You had to open it up and pull an address line high.
00:07:28.160 | And it was because I didn't really know
00:07:29.920 | about software exploitation.
00:07:31.320 | I learned that all in the next few years
00:07:32.920 | and I got very good at it,
00:07:33.880 | but back then I knew about, like,
00:07:36.540 | how memory chips are connected to processors and stuff.
00:07:38.920 | - You knew about software and programming.
00:07:40.920 | You just didn't know.
00:07:43.080 | Oh, really?
00:07:43.920 | So, your view of the world and computers was physical,
00:07:47.880 | was hardware.
00:07:49.200 | - Actually, if you read the code that I released with that
00:07:52.280 | in August 2007, it's atrocious.
00:07:55.640 | - What language was it?
00:07:57.480 | - C, nice.
00:07:58.300 | - And in a broken sort of state machine-esque C,
00:08:01.360 | I didn't know how to program.
00:08:02.880 | - Yeah.
00:08:04.080 | So, how did you learn to program?
00:08:06.480 | What was your journey?
00:08:08.320 | 'Cause, I mean, we'll talk about it.
00:08:10.040 | You've live streamed some of your programming.
00:08:12.680 | This chaotic, beautiful mess, how did you arrive at that?
00:08:16.440 | - Years and years of practice.
00:08:18.640 | I interned at Google the summer after the iPhone unlock.
00:08:23.640 | And I did a contract for them
00:08:26.720 | where I built hardware for Street View
00:08:29.020 | and I wrote a software library to interact with it.
00:08:31.760 | And it was terrible code.
00:08:34.920 | And for the first time, I got feedback
00:08:36.540 | from people who I respected saying,
00:08:38.720 | no, like, don't write code like this.
00:08:41.120 | Now, of course, just getting that feedback is not enough.
00:08:45.640 | The way that I really got good
00:08:50.440 | was I wanted to write this thing that could emulate
00:08:54.780 | and then visualize ARM binaries,
00:08:58.400 | 'cause I wanted to hack the iPhone better.
00:08:59.980 | And I didn't like that I couldn't see,
00:09:01.920 | I couldn't single step through the processor
00:09:03.760 | because I had no debugger on there,
00:09:05.120 | especially for the low-level things
00:09:06.160 | like the boot ROM and the boot loader.
00:09:07.520 | So I tried to build this tool to do it.
00:09:09.420 | And I built the tool once and it was terrible.
00:09:13.440 | I built the tool a second time, it was terrible.
00:09:15.120 | I built the tool a third time.
00:09:16.320 | This was by the time I was at Facebook, it was kind of okay.
00:09:18.640 | And then I built the tool a fourth time
00:09:20.560 | when I was a Google intern again in 2014.
00:09:22.520 | And that was the first time I was like,
00:09:24.360 | this is finally usable.
00:09:25.860 | - How do you pronounce this, QIRA?
00:09:27.120 | - QIRA, yeah.
00:09:28.400 | - So it's essentially the most efficient way
00:09:31.840 | to visualize the change of state of the computer
00:09:35.720 | as the program is running.
00:09:37.180 | That's what you mean by debugger.
00:09:38.760 | - Yeah, it's a timeless debugger.
00:09:41.760 | So you can rewind just as easily as going forward.
00:09:45.080 | Think about if you're using GDB,
00:09:46.240 | you have to put a watch on a variable
00:09:47.880 | if you wanna see if that variable changes.
00:09:49.680 | In QIRA, you can just click on that variable
00:09:51.480 | and then it shows every single time
00:09:53.880 | when that variable was changed or accessed.
00:09:56.500 | Think about it like Git for your computer's run log.
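
A toy sketch of the "timeless debugger" idea being described (this is not QIRA itself, and the names are made up): record every state change into a run log while the program executes, then answer "show me every write to this variable" after the fact instead of setting watchpoints up front.

```python
def run_traced(program, state):
    """Execute a list of (var, fn) steps, logging every write with its step index."""
    trace = []  # the "run log": (step, var, old_value, new_value)
    for step, (var, fn) in enumerate(program):
        old = state.get(var)
        state[var] = fn(state)
        trace.append((step, var, old, state[var]))
    return state, trace

def history(trace, var):
    """Like clicking a variable in a timeless debugger: every write, in order."""
    return [(step, old, new) for step, v, old, new in trace if v == var]

program = [
    ("x", lambda s: 1),
    ("y", lambda s: s["x"] + 1),
    ("x", lambda s: s["x"] * 10),
]
final_state, trace = run_traced(program, {})
print(history(trace, "x"))  # [(0, None, 1), (2, 1, 10)] -- rewindable by step index
```
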
00:09:59.760 | - So there's like a deep log of the state of the computer
00:10:05.640 | as the program runs and you can rewind.
00:10:07.820 | Why isn't that, or maybe it is, maybe you can educate me,
00:10:11.460 | why isn't that kind of debugging used more often?
00:10:14.660 | - 'Cause the tooling's bad.
00:10:16.220 | Well, two things.
00:10:17.060 | One, if you're trying to debug Chrome,
00:10:19.340 | Chrome is a 200 megabyte binary
00:10:22.900 | that runs slowly on desktops.
00:10:25.460 | So that's gonna be really hard to use for that.
00:10:27.740 | But it's really good to use for like CTFs
00:10:30.180 | and for boot ROMs and for small parts of code.
00:10:33.180 | So it's hard if you're trying to debug like massive systems.
00:10:36.360 | - What's a CTF and what's a boot ROM?
00:10:38.200 | - A boot ROM is the first code that executes
00:10:40.440 | the minute you give power to your iPhone.
00:10:42.480 | And CTFs were these competitions that I played,
00:10:46.040 | Capture the Flag.
00:10:46.880 | - Capture the Flag, I was gonna ask you about that.
00:10:48.520 | What are those?
00:10:49.360 | Look, I watched a couple of videos on YouTube.
00:10:51.400 | Those look fascinating.
00:10:52.880 | What have you learned about maybe at the high level
00:10:55.520 | vulnerability of systems from these competitions?
00:11:00.800 | - I feel like in the heyday of CTFs,
00:11:04.180 | you had all of the best security people in the world
00:11:08.140 | challenging each other and coming up with
00:11:11.140 | new toy exploitable things over here.
00:11:13.620 | And then everybody, okay, who can break it?
00:11:15.340 | And when you break it, you get like,
00:11:17.140 | there's like a file on the server called flag.
00:11:19.340 | And then there's a program running,
00:11:20.940 | listening on a socket that's vulnerable.
00:11:22.660 | So you write an exploit, you get a shell,
00:11:24.980 | and then you cat flag, and then you type the flag
00:11:27.100 | into like a web-based scoreboard and you get points.
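
The workflow he is describing, sketched with only the standard library (the host, port, and payload are placeholders; a real exploit depends entirely on the specific bug in the challenge binary):

```python
import socket

HOST, PORT = "challenge.example.org", 31337  # hypothetical challenge server

# Payload that triggers the vulnerability, e.g. a stack overflow that hijacks control flow.
payload = b"A" * 64 + b"<return-address-or-rop-chain-goes-here>\n"

with socket.create_connection((HOST, PORT)) as s:
    s.sendall(payload)            # exploit the vulnerable service listening on the socket
    s.sendall(b"cat flag\n")      # once you have a shell, read the flag file
    print(s.recv(4096).decode(errors="replace"))  # submit this string to the scoreboard
```
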
00:11:29.460 | So the goal is essentially to find an exploit
00:11:32.520 | in the system that allows you to run shell,
00:11:35.240 | to run arbitrary code on that system.
00:11:38.000 | - That's one of the categories.
00:11:40.160 | That's like the pwnable category.
00:11:41.920 | - Pwnable?
00:11:44.360 | - Yeah, pwnable.
00:11:45.180 | It's like, you pwn the program.
00:11:47.560 | It's a program that's--
00:11:48.560 | - Oh, yeah.
00:11:49.400 | Yeah, you know, first of all, I apologize.
00:11:54.200 | I'm gonna say it's because I'm Russian,
00:11:56.280 | but maybe you can help educate me.
00:11:59.100 | - Some video game, like, misspelled 'own' way back in the day.
00:12:02.820 | - Yeah, and it's just, I wonder if there's a definition.
00:12:06.300 | I'll have to go to Urban Dictionary for it.
00:12:08.340 | - It'll be interesting to see what it says.
00:12:09.780 | - Okay, so what was the heyday of CTF, by the way?
00:12:12.740 | But was it, what decade are we talking about?
00:12:15.460 | - I think like, I mean, maybe I'm biased
00:12:18.420 | because it's the era that I played.
00:12:21.100 | But like 2011 to 2015,
00:12:26.880 | because the modern CTF scene is similar
00:12:31.000 | to the modern competitive programming scene.
00:12:32.620 | You have people who like do drills.
00:12:34.240 | You have people who practice.
00:12:35.840 | And then once you've done that,
00:12:37.000 | you've turned it less into a game of generic computer skill
00:12:40.000 | and more into a game of, okay, you memorize,
00:12:42.400 | you drill on these five categories.
00:12:44.580 | And then before that, it wasn't,
00:12:48.900 | it didn't have like as much attention as it had.
00:12:51.520 | I don't know, they were like,
00:12:53.640 | I won $30,000 once in Korea for one of these competitions.
00:12:56.040 | - Holy crap.
00:12:56.880 | - Yeah, they were, that was--
00:12:57.920 | - So that means, I mean, money's money,
00:12:59.520 | but that means there was probably good people there.
00:13:02.320 | - Exactly, yeah.
00:13:03.600 | - Are the challenges human constructed
00:13:06.780 | or are they grounded in some real flaws in real systems?
00:13:10.760 | - Usually they're human constructed,
00:13:13.060 | but they're usually inspired by real flaws.
00:13:15.760 | - What kind of systems are imagined?
00:13:17.300 | Is it really focused on mobile?
00:13:19.100 | Like what has vulnerabilities these days?
00:13:20.960 | Is it primarily mobile systems like Android?
00:13:25.120 | - No, everything does.
00:13:26.600 | - Still.
00:13:27.440 | - Yeah, of course.
00:13:28.260 | The price has kind of gone up
00:13:29.360 | because less and less people can find them.
00:13:31.280 | And what's happened in security is now,
00:13:33.140 | if you wanna like jailbreak an iPhone,
00:13:34.560 | you don't need one exploit anymore, you need nine.
00:13:36.860 | - Nine chained together, what do you mean?
00:13:39.680 | Yeah, wow.
00:13:40.640 | Okay, so it's really,
00:13:41.900 | what's the benefit,
00:13:44.800 | speaking higher level philosophically about hacking?
00:13:48.280 | I mean, it sounds from everything I've seen about you,
00:13:50.440 | you just love the challenge
00:13:51.960 | and you don't want to do anything.
00:13:55.100 | You don't wanna bring that exploit out into the world
00:13:58.160 | and do any actual,
00:13:59.280 | let it run wild.
00:14:01.720 | You just wanna solve it
00:14:02.800 | and then you go on to the next thing.
00:14:05.480 | - Oh yeah, I mean,
00:14:06.360 | doing criminal stuff's not really worth it.
00:14:08.480 | And I'll actually use the same argument
00:14:10.560 | for why I don't do defense for why I don't do crime.
00:14:13.760 | If you wanna defend a system,
00:14:16.880 | say the system has 10 holes, right?
00:14:19.280 | If you find nine of those holes as a defender,
00:14:22.260 | you still lose
00:14:23.220 | because the attacker gets in through the last one.
00:14:25.540 | If you're an attacker,
00:14:26.380 | you only have to find one out of the 10.
00:14:28.740 | But if you're a criminal,
00:14:30.820 | if you log on with a VPN nine out of the 10 times,
00:14:34.820 | but one time you forget, you're done.
00:14:37.780 | - Because you're caught, okay.
00:14:39.420 | - Because you only have to mess up once
00:14:41.180 | to be caught as a criminal,
00:14:42.900 | that's why I'm not a criminal.
00:14:44.420 | (laughing)
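
The asymmetry George is pointing at can be put as a quick calculation (the 1% slip rate is an arbitrary number for illustration): the defender has to win every time, and the criminal has to never slip up.

```python
p_slip = 0.01                          # chance of forgetting the VPN on any one job
for n in (10, 100, 1000):
    p_caught = 1 - (1 - p_slip) ** n   # probability of at least one slip-up over n jobs
    print(n, round(p_caught, 5))
# 10 -> 0.09562, 100 -> 0.63397, 1000 -> 0.99996: "you only have to mess up once"
```
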
00:14:45.940 | - But okay, let me,
00:14:47.100 | 'cause I was having a discussion with somebody
00:14:49.540 | just at a high level about nuclear weapons actually,
00:14:52.780 | why we haven't blown ourselves up yet.
00:14:56.240 | And my feeling is all the smart people in the world,
00:14:59.800 | if you look at the distribution of smart people,
00:15:04.080 | smart people are generally good.
00:15:06.700 | And then this other person,
00:15:07.620 | I was talking to Sean Carroll, the physicist,
00:15:09.420 | and he was saying, no,
00:15:10.560 | good and bad people are evenly distributed amongst everybody.
00:15:14.020 | My sense was good hackers are in general good people
00:15:18.040 | and they don't want to mess with the world.
00:15:20.360 | What's your sense?
00:15:21.720 | - I'm not even sure about that.
00:15:24.680 | Like, I have a nice life,
00:15:30.500 | crime wouldn't get me anything.
00:15:32.100 | But if you're good and you have these skills,
00:15:36.500 | you probably have a nice life too, right?
00:15:38.480 | - Right, you can use it for other things.
00:15:40.120 | But is there an ethical,
00:15:41.080 | is there a little voice in your head that says,
00:15:44.160 | well, yeah, if you could hack something
00:15:49.000 | to where you could hurt people.
00:15:51.760 | And you could earn a lot of money doing it though,
00:15:54.920 | not hurt physically perhaps,
00:15:56.320 | but disrupt their life in some kind of way.
00:15:58.960 | Isn't there a little voice that says?
00:16:02.320 | - Well, two things.
00:16:04.560 | One, I don't really care about money.
00:16:06.760 | So like the money wouldn't be an incentive.
00:16:08.640 | The thrill might be an incentive.
00:16:10.600 | But when I was 19, I read "Crime and Punishment."
00:16:13.880 | - Right, good.
00:16:14.720 | - That was another great one
00:16:16.080 | that talked me out of ever really doing crime.
00:16:19.380 | 'Cause it's like, that's gonna be me.
00:16:21.680 | I'd get away with it, but it would just run through my head.
00:16:25.040 | Even if I got away with it.
00:16:26.440 | And then you do crime for long enough,
00:16:27.600 | you'll never get away with it.
00:16:28.920 | - That's right, in the end.
00:16:30.360 | That's a good reason to be good.
00:16:32.640 | - I wouldn't say I'm good, I would just say I'm not bad.
00:16:34.840 | - You're a talented programmer and a hacker
00:16:38.080 | in a good positive sense of the word.
00:16:40.920 | You've played around,
00:16:42.400 | found vulnerabilities in various systems.
00:16:44.720 | What have you learned broadly
00:16:46.140 | about the design of systems and so on
00:16:49.500 | from that whole process?
00:16:51.520 | - You learn to not take things
00:16:58.300 | for what people say they are,
00:17:02.140 | but you look at things for what they actually are.
00:17:05.300 | - Yeah.
00:17:07.900 | - I understand that's what you tell me it is,
00:17:10.060 | but what does it do?
00:17:11.300 | - And you have nice visualization tools
00:17:14.580 | to really know what it's really doing.
00:17:16.680 | - Oh, I wish.
00:17:17.800 | I'm a better programmer now than I was in 2014.
00:17:20.080 | I said, "QIRA, that was the first tool
00:17:21.800 | "that I wrote that was usable."
00:17:23.440 | I wouldn't say the code was great.
00:17:25.360 | I still wouldn't say my code is great.
00:17:27.260 | - So how was your evolution as a programmer,
00:17:30.760 | except practice?
00:17:31.600 | You started with C, at which point did you pick up Python?
00:17:35.520 | 'Cause you're pretty big in Python now.
00:17:37.040 | - Now, yeah, in college.
00:17:39.920 | I went to Carnegie Mellon when I was 22.
00:17:42.480 | I went back, I'm like,
00:17:44.120 | "I'm gonna take all your hardest CS courses,
00:17:46.520 | "and we'll see how I do."
00:17:47.780 | Did I miss anything by not having
00:17:49.380 | a real undergraduate education?
00:17:51.500 | Took operating systems, compilers, AI,
00:17:54.220 | and their freshman weed-out math course.
00:17:56.840 | And--
00:17:59.620 | - Operating systems, some of those classes you mentioned
00:18:03.300 | are pretty tough, actually.
00:18:04.220 | - They're great.
00:18:05.620 | At least, the circa 2012 operating systems and compilers
00:18:11.220 | were two of the, they were the best classes
00:18:13.020 | I've ever taken in my life.
00:18:14.400 | 'Cause you write an operating system,
00:18:15.600 | and you write a compiler.
00:18:16.860 | I wrote my operating system in C,
00:18:19.740 | and I wrote my compiler in Haskell, but--
00:18:21.820 | - Haskell?
00:18:22.660 | - Somehow, I picked up Python that semester as well.
00:18:26.380 | I started using it for the CTFs, actually.
00:18:28.080 | That's when I really started to get into CTFs.
00:18:30.320 | And in CTFs, you're racing against the clock,
00:18:33.360 | so I can't write things in C.
00:18:35.100 | - Oh, there's a clock component,
00:18:36.260 | so you really wanna use the programming language
00:18:37.820 | that you can be fastest in.
00:18:38.980 | - 48 hours, pwn as many of these challenges as you can.
00:18:41.460 | - Pwn. - Yeah.
00:18:42.540 | You got like 100 points a challenge,
00:18:43.980 | whatever team gets the most.
00:18:45.380 | - You were both at Facebook and Google for a brief stint.
00:18:50.260 | - Yeah.
00:18:51.100 | - With Project Zero, actually, at Google for five months,
00:18:54.940 | where you developed QIRA.
00:18:56.940 | What was Project Zero about in general?
00:18:59.300 | Just curious about the security efforts in these companies.
00:19:05.180 | - Well, Project Zero started the same time I went there.
00:19:08.740 | What years are you there?
00:19:10.100 | - 2015.
00:19:12.340 | - 2015, so that was right at the beginning of Project Zero.
00:19:15.060 | It's small.
00:19:16.220 | It's Google's offensive security team.
00:19:18.860 | I'll try to give the best public-facing explanation
00:19:25.700 | that I can.
00:19:26.540 | So, the idea is basically,
00:19:30.980 | these vulnerabilities exist in the world.
00:19:33.220 | Nation states have them.
00:19:35.220 | Some high-powered bad actors have them.
00:19:38.460 | Sometimes people will find these vulnerabilities
00:19:43.460 | and submit them in bug bounties to the companies.
00:19:47.820 | But a lot of the companies don't really care.
00:19:49.500 | They don't even fix the bug.
00:19:51.140 | It doesn't hurt for there to be a vulnerability.
00:19:53.820 | So, Project Zero is like, "We're gonna do it different.
00:19:55.820 | "We're going to announce a vulnerability
00:19:57.780 | "and we're gonna give them 90 days to fix it.
00:19:59.620 | "And then whether they fix it or not,
00:20:00.780 | "we're gonna drop the zero day."
00:20:03.260 | - Oh, wow.
00:20:04.140 | - We're gonna drop the weapon on the exploits.
00:20:05.940 | - That is so cool.
00:20:07.540 | - I love that, deadlines.
00:20:09.260 | Oh, that's so cool.
00:20:10.100 | - Give them real deadlines.
00:20:10.940 | - Yeah.
00:20:12.380 | - And I think it's done a lot
00:20:13.780 | for moving the industry forward.
00:20:15.820 | - I watched your coding sessions that you streamed online.
00:20:18.940 | You code things up, basic projects, usually from scratch.
00:20:24.020 | I would say, sort of as a programmer myself,
00:20:28.220 | just watching you, that you type really fast
00:20:30.380 | and your brain works in both brilliant and chaotic ways.
00:20:34.540 | I don't know if that's always true,
00:20:35.820 | but certainly for the live streams.
00:20:37.620 | So, it's interesting to me because I'm more,
00:20:40.380 | I'm much slower and systematic and careful
00:20:43.540 | and you just move, I mean,
00:20:44.940 | probably in order of magnitude faster.
00:20:46.940 | So, I'm curious, is there a method to your madness?
00:20:51.100 | Is it just who you are?
00:20:53.060 | - There's pros and cons.
00:20:54.740 | There's pros and cons to my programming style
00:20:58.100 | and I'm aware of them.
00:20:59.460 | Like, if you ask me to like,
00:21:02.700 | get something up and working quickly with like an API
00:21:05.420 | that's kind of undocumented,
00:21:06.820 | I will do this super fast
00:21:08.220 | because I will throw things at it until it works.
00:21:10.260 | If you ask me to take a vector and rotate it 90 degrees
00:21:14.780 | and then flip it over the XY plane,
00:21:17.460 | I'll spam program for two hours and won't get it.
00:21:22.380 | - Oh, because it's something that you could do
00:21:23.940 | with a sheet of paper, think through, design,
00:21:26.300 | and then just, you really just throw stuff at the wall
00:21:30.460 | and you get so good at it that it usually works.
00:21:34.660 | - I should become better at the other kind as well.
00:21:36.980 | Sometimes I'll do things methodically.
00:21:39.460 | It's nowhere near as entertaining on the Twitch streams.
00:21:41.180 | I do exaggerate it a bit on the Twitch streams as well.
00:21:43.540 | The Twitch streams, I mean, what do you wanna see from a gamer?
00:21:45.540 | You wanna see actions per minute, right?
00:21:46.820 | I'll show you APM for programming too.
00:21:48.180 | - Yeah, I recommend people go to it.
00:21:50.260 | I think I watched, I watched probably several hours
00:21:53.380 | of you, like I've actually left you programming
00:21:56.220 | in the background while I was programming
00:21:59.020 | because you made me, it was like watching
00:22:02.020 | a really good gamer, it's like energizes you
00:22:04.780 | 'cause you're like moving so fast.
00:22:06.220 | It's so, it's awesome, it's inspiring.
00:22:08.860 | It made me jealous that like,
00:22:11.180 | because my own programming is inadequate
00:22:14.260 | in terms of speed.
00:22:15.460 | - Oh, I-- - 'Cause I was like.
00:22:16.940 | - So I'm twice as frantic on the live streams
00:22:20.460 | as I am when I code without--
00:22:22.660 | - It's super entertaining.
00:22:23.700 | So I wasn't even paying attention to what you were coding,
00:22:26.380 | which is great.
00:22:27.220 | It's just watching you switch windows
00:22:29.740 | and Vim, I guess is the most--
00:22:31.340 | - Yeah, Vim and screen.
00:22:33.020 | I developed that workflow at Facebook and stuck with it.
00:22:35.620 | - How do you learn new programming tools,
00:22:37.340 | ideas, techniques these days?
00:22:39.460 | What's your like methodology for learning new things?
00:22:42.060 | - So I wrote for comma,
00:22:45.940 | the distributed file systems out in the world
00:22:49.260 | are extremely complex.
00:22:50.740 | Like if you want to install something like Ceph,
00:22:55.260 | Ceph is I think the like open infrastructure
00:22:58.740 | distributed file system,
00:23:03.980 | or there's like newer ones like SeaweedFS,
00:23:03.980 | but these are all like 10,000 plus line projects.
00:23:06.860 | I think some of them are even a hundred thousand line
00:23:09.500 | and just configuring them as a nightmare.
00:23:11.100 | So I wrote one, it's 200 lines
00:23:16.100 | and it uses like NGINX volume servers
00:23:18.860 | and has this little master server that I wrote in Go.
00:23:21.580 | And the way I-- - Go, wow.
00:23:23.540 | - This, if I would say that I'm proud per line
00:23:26.380 | of any code I wrote,
00:23:27.260 | maybe there's some exploits that I think are beautiful
00:23:29.180 | and then this, this is 200 lines
00:23:31.380 | and just the way that I thought about it,
00:23:33.780 | I think was very good.
00:23:34.620 | And the reason it's very good
00:23:35.580 | is because that was the fourth version of it that I wrote.
00:23:37.620 | And I had three versions that I threw away.
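
A rough Python sketch of the shape of that design (his real version is about 200 lines of Go fronting nginx volume servers; the hostnames, ports, and hashing choice here are made up for illustration): a tiny master that maps each key to a volume server and redirects, so the master never handles file bytes itself.

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

# Volume servers that actually store the bytes, e.g. nginx with the dav module enabled.
VOLUME_SERVERS = ["http://volume1:8001", "http://volume2:8002"]

def pick_volume(key: str) -> str:
    # Deterministic placement: hash the key onto one of the volume servers.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return VOLUME_SERVERS[h % len(VOLUME_SERVERS)]

class Master(BaseHTTPRequestHandler):
    def do_GET(self):   # reads and writes are both just redirects
        self.redirect()
    def do_PUT(self):
        self.redirect()
    def redirect(self):
        self.send_response(307)  # temporary redirect preserves the HTTP method
        self.send_header("Location", pick_volume(self.path) + self.path)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 3000), Master).serve_forever()
```
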
00:23:39.340 | - You mentioned, did you say Go?
00:23:41.020 | - I wrote in Go, yeah. - In Go.
00:23:42.260 | - So I-- - Is that a functional language?
00:23:43.900 | I forget what Go is.
00:23:45.300 | - Go is Google's language. - Right.
00:23:48.260 | - It's not functional.
00:23:49.500 | It's some, it's like in a way it's C++ but easier.
00:23:56.180 | It's strongly typed.
00:23:58.180 | It has a nice ecosystem around it.
00:23:59.740 | When I first looked at it, I was like,
00:24:01.700 | this is like Python but it takes twice as long
00:24:03.780 | to do anything.
00:24:04.620 | Now that I've, OpenPilot is migrating to C
00:24:09.620 | but it still has large Python components.
00:24:10.980 | I now understand why Python doesn't work
00:24:12.740 | for large code bases and why you want something like Go.
00:24:15.820 | - Interesting, so why doesn't Python work for,
00:24:18.660 | so even most, speaking for myself at least,
00:24:21.700 | like we do a lot of stuff,
00:24:23.380 | basically demo level work with autonomous vehicles
00:24:26.460 | and most of the work is Python.
00:24:28.260 | - Yeah.
00:24:29.180 | - Why doesn't Python work for large code bases?
00:24:32.380 | - Because, well, lack of type checking is a big--
00:24:37.380 | - So errors creep in.
00:24:39.340 | - Yeah, and like you don't know,
00:24:41.900 | the compiler can tell you like nothing, right?
00:24:45.300 | So everything is either, you know,
00:24:47.580 | like syntax errors, fine,
00:24:49.860 | but if you misspell a variable in Python,
00:24:51.780 | the compiler won't catch that.
00:24:52.980 | There's like linters that can catch it some of the time.
00:24:55.740 | There's no types, is really the biggest downside
00:25:00.540 | and then well, Python's slow but that's not related to it.
00:25:02.660 | Well, maybe it's kind of related to its lack of--
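
A concrete instance of the failure mode he means (hypothetical function names): Python byte-compiles this without complaint, and the misspelling silently creates a new variable instead of failing the build, so the bug only shows up when the wrong value is observed at runtime.

```python
def creep_toward(current_speed, target_speed):
    if current_speed < target_speed:
        current_sped = current_speed + 1   # typo of current_speed: no error, just a new name
    return current_speed                   # the unchanged value is returned

print(creep_toward(10, 20))  # 10, not 11 -- a stricter compiler or a linter might flag
                             # the unused name at build time; Python itself never will
```
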
00:25:04.820 | - So what's in your toolbox these days?
00:25:06.620 | Is it Python, what else?
00:25:08.620 | - I need to move to something else.
00:25:10.340 | My adventure into dependently typed languages,
00:25:12.860 | I love these languages.
00:25:14.220 | They just have like syntax from the 80s.
00:25:17.500 | - What do you think about JavaScript?
00:25:21.100 | - Yes, like the modern TypeScript.
00:25:23.940 | - JavaScript is, the whole ecosystem
00:25:27.260 | is unbelievably confusing.
00:25:29.260 | NPM updates a package from 0.2.2 to 0.2.5
00:25:32.820 | and that breaks your Babel linter,
00:25:34.540 | which translates your ES5 into ES6,
00:25:37.020 | which doesn't run on, so why do I have to compile
00:25:40.900 | my JavaScript again, huh?
00:25:42.460 | - It may be the future though.
00:25:43.980 | You think about, I mean,
00:25:45.740 | I've embraced JavaScript recently just because,
00:25:49.380 | just like I've continually embraced PHP.
00:25:52.260 | It seems that these worst possible languages live on
00:25:55.300 | for the longest, like cockroaches never die.
00:25:57.420 | - Yeah, well, it's in the browser and it's fast.
00:26:00.740 | - It's fast.
00:26:01.700 | - Yeah.
00:26:02.540 | - It's in the browser and compute might stay,
00:26:04.900 | become, you know, the browser,
00:26:06.420 | it's unclear what the role of the browser is
00:26:09.020 | in terms of distributed computation in the future, so.
00:26:12.300 | - JavaScript is definitely here to stay.
00:26:15.220 | - Yeah, it's interesting if autonomous vehicles
00:26:18.140 | will run on JavaScript one day.
00:26:19.460 | I mean, you have to consider these possibilities.
00:26:21.780 | - All our debug tools are JavaScript.
00:26:24.260 | We actually just open sourced them.
00:26:26.020 | We have a tool Explorer,
00:26:27.380 | which you can annotate your disengagements
00:26:29.180 | and we have a tool Cabana,
00:26:30.100 | which lets you analyze the can traffic from the car.
00:26:32.900 | - So basically anytime you're visualizing something
00:26:35.220 | about the log you're using JavaScript.
00:26:37.740 | - Well, the web is the best UI toolkit by far.
00:26:40.100 | - Yeah.
00:26:40.940 | - So, and then, you know what, you're coding in JavaScript.
00:26:42.740 | We have a React guy, he's good.
00:26:44.380 | - React, nice.
00:26:46.100 | Let's get into it.
00:26:46.940 | So let's talk autonomous vehicles.
00:26:49.140 | You founded Comma AI.
00:26:50.620 | Let's, at a high level,
00:26:54.940 | how did you get into the world of vehicle automation?
00:26:57.900 | Can you also just, for people who don't know,
00:27:01.420 | tell the story of Comma AI?
00:27:01.420 | - Sure.
00:27:02.900 | So I was working at this AI startup
00:27:06.100 | and a friend approached me and he's like,
00:27:09.180 | "Dude, I don't know where this is going,
00:27:12.020 | "but the coolest applied AI problem today
00:27:15.100 | "is self-driving cars."
00:27:16.500 | I'm like, "Well, absolutely."
00:27:18.740 | "Do you want to meet with Elon Musk?"
00:27:20.540 | And he's looking for somebody to build a vision system
00:27:24.580 | for autopilot.
00:27:27.620 | This is when they were still on AP1.
00:27:29.340 | They were still using Mobileye.
00:27:30.860 | Elon back then was looking for a replacement.
00:27:33.700 | And he brought me in and we talked about a contract
00:27:37.380 | where I would deliver something
00:27:39.060 | that meets Mobileye level performance.
00:27:41.380 | I would get paid $12 million if I could deliver it tomorrow
00:27:43.980 | and I would lose $1 million
00:27:45.300 | for every month I didn't deliver.
00:27:47.740 | So I was like, "Okay, this is a great deal.
00:27:49.020 | "This is a super exciting challenge."
00:27:50.900 | You know what?
00:27:53.220 | Even if it takes me 10 months, I get $2 million.
00:27:55.380 | It's good.
00:27:56.220 | Maybe I can finish up in five.
00:27:57.140 | Maybe I don't finish it at all and I get paid nothing
00:27:58.820 | and I'll work for 12 months for free.
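
The deal's arithmetic as he lays it out, in millions of dollars:

```python
def payout(months_to_deliver: int) -> int:
    # $12M for delivering immediately, minus $1M per month, floored at zero
    return max(12 - months_to_deliver, 0)

print(payout(10))  # 2  -> "even if it takes me 10 months, I get $2 million"
print(payout(12))  # 0  -> "work for 12 months for free"
```
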
00:28:00.860 | - So maybe just take a pause on that.
00:28:02.940 | I'm also curious about this
00:28:04.260 | because I've been working in robotics for a long time
00:28:06.300 | and I'm curious to see a person like you just step in
00:28:08.300 | and sort of somewhat naive, but brilliant, right?
00:28:11.980 | So that's the best place to be
00:28:13.980 | 'cause you basically full steam take on a problem.
00:28:17.220 | How confident, how from that time,
00:28:19.700 | 'cause you know a lot more now,
00:28:21.300 | at that time, how hard do you think it is
00:28:23.460 | to solve all of autonomous driving?
00:28:25.880 | - I remember I suggested to Elon in the meeting,
00:28:30.580 | putting a GPU behind each camera to keep the compute local.
00:28:35.140 | This is an incredibly stupid idea.
00:28:38.020 | I leave the meeting 10 minutes later and I'm like,
00:28:39.980 | "I could have spent a little bit of time
00:28:41.460 | "thinking about this problem before I went in."
00:28:43.060 | - Stupid idea.
00:28:44.180 | - Oh, just send all your cameras to one big GPU.
00:28:46.220 | You're much better off doing that.
00:28:48.180 | - Oh, sorry, you said behind every camera,
00:28:50.140 | have a GPU. - Every camera.
00:28:50.980 | Have a small GPU.
00:28:51.800 | I was like, "Oh, I'll put the first few layers
00:28:52.640 | "of my comp there."
00:28:54.060 | Ugh, like why did I say that?
00:28:56.100 | - That's possible.
00:28:56.940 | - It's possible, but it's a bad idea.
00:28:59.020 | - It's not obviously a bad idea.
00:29:00.500 | - Pretty obviously bad,
00:29:01.340 | but whether it's actually a bad idea or not,
00:29:02.940 | I left that meeting with Elon like beating myself up.
00:29:05.260 | I'm like, "Why did I say something stupid?"
00:29:07.060 | - Yeah, you haven't, like you haven't at least like
00:29:09.620 | thought through every aspect of it, yeah.
00:29:12.220 | - He's very sharp too.
00:29:13.260 | Like usually in life, I get away with saying stupid things
00:29:15.820 | and then kind of course,
00:29:16.980 | oh, right away he called me out about it.
00:29:18.580 | And like, usually in life I get away
00:29:20.140 | with saying stupid things.
00:29:21.140 | And then like people will, you know,
00:29:24.660 | a lot of times people don't even notice
00:29:26.100 | and I'll like correct it and bring the conversation back.
00:29:28.240 | But with Elon, it was like, "Nope."
00:29:29.700 | Like, okay, well, that's not at all
00:29:32.660 | why the contract fell through.
00:29:33.540 | I was much more prepared the second time I met him.
00:29:35.540 | - Yeah, but in general, how hard did you think it is?
00:29:39.660 | Like 12 months is a tough timeline.
00:29:43.700 | - Oh, I just thought I'd clone the Mobileye EyeQ3.
00:29:45.700 | I didn't think I'd solve level five self-driving
00:29:47.580 | or anything.
00:29:48.420 | - So the goal there was to do lane keeping,
00:29:50.980 | good lane keeping.
00:29:52.780 | - I saw, my friend showed me the outputs from a Mobileye
00:29:55.500 | and the outputs from a Mobileye was just basically two lanes
00:29:57.620 | at a position of a lead car.
00:29:59.380 | I'm like, I can gather a data set
00:30:01.540 | and train this net in weeks.
00:30:03.420 | And I did.
00:30:04.800 | - Well, first time I tried the implementation of Mobileye
00:30:07.540 | and the Tesla, I was really surprised how good it is.
00:30:10.380 | It's quite incredibly good.
00:30:12.300 | 'Cause I thought it's,
00:30:13.380 | just 'cause I've done a lot of computer vision,
00:30:14.820 | I thought it'd be a lot harder to create a system
00:30:18.860 | that that's stable.
00:30:20.040 | So I was personally surprised,
00:30:22.420 | just have to admit it,
00:30:24.980 | 'cause I was kind of skeptical before trying it.
00:30:27.860 | 'Cause I thought it would go in and out a lot more.
00:30:31.220 | It would get disengaged a lot more.
00:30:33.180 | And it's pretty robust.
00:30:36.180 | So what, how hard is the problem when you tackled it?
00:30:41.180 | - So I think AP1 was great.
00:30:44.500 | Like Elon talked about disengagements on the 405
00:30:47.340 | down in LA with like lane marks were kind of faded
00:30:51.060 | and the Mobileye system would drop out.
00:30:53.020 | Like I had something up and working
00:30:57.260 | that I would say was like the same quality in three months.
00:31:01.440 | - Same quality, but how do you know?
00:31:04.540 | You say stuff like that.
00:31:05.980 | - Yeah.
00:31:06.820 | - Confidently, but you can't, and I love it.
00:31:08.260 | But the question is you can't,
00:31:11.140 | you're kind of going by feel 'cause you tested it out.
00:31:14.540 | - Absolutely, absolutely.
00:31:15.540 | Like I would take, I borrowed my friend's Tesla.
00:31:18.460 | I would take AP1 out for a drive.
00:31:20.740 | And then I would take my system out for a drive.
00:31:22.300 | - And it seems reasonably like the same.
00:31:24.420 | So the 405, how hard is it to create something
00:31:30.440 | that could actually be a product that's deployed?
00:31:34.180 | I mean, I've read an article where Elon,
00:31:36.940 | in his response, said something about you saying
00:31:40.780 | that to build autopilot is more complicated
00:31:45.780 | than a single George Hotz level job.
00:31:51.860 | How hard is that job to create something
00:31:55.500 | that would work across the globe?
00:31:57.440 | - I don't think globally is the challenge,
00:32:00.620 | but Elon followed that up by saying it's going to take
00:32:02.760 | two years in a company of 10 people.
00:32:04.900 | And here I am four years later with a company of 12 people.
00:32:07.900 | And I think we still have another two to go.
00:32:09.940 | - Two years.
00:32:10.780 | So yeah, so what do you think about how Tesla's progressing
00:32:15.780 | with autopilot, V2, V3?
00:32:19.180 | - I think we've kept pace with them pretty well.
00:32:23.080 | I think navigating autopilot is terrible.
00:32:26.820 | We had some demo features internally of the same stuff
00:32:31.100 | and we would test it.
00:32:32.180 | And I'm like, I'm not shipping this
00:32:33.420 | even as like open source software to people.
00:32:35.240 | - Why do you think it's terrible?
00:32:37.380 | - Consumer Reports does a great job of describing it.
00:32:39.540 | Like when it makes a lane change,
00:32:41.200 | it does it worse than a human.
00:32:43.340 | You shouldn't ship things like autopilot, open pilot,
00:32:46.940 | they lane keep better than a human.
00:32:49.740 | If you turn it on for a stretch of a highway,
00:32:53.420 | like an hour long, it's never going to touch a lane line.
00:32:56.660 | Human will touch probably a lane line twice.
00:32:58.980 | - You just inspired me.
00:33:00.060 | I don't know if you're grounded in data on that.
00:33:02.140 | - I read your paper.
00:33:03.220 | - Okay, but that's interesting.
00:33:05.360 | I wonder actually how often we touch lane lines
00:33:09.780 | in general, like a little bit.
00:33:11.980 | 'Cause it is--
00:33:13.460 | - I could answer that question pretty easily
00:33:14.900 | with the comma data set.
00:33:15.740 | - Yeah, I'm curious.
00:33:16.820 | - I've never answered it.
00:33:17.660 | I don't know.
00:33:18.500 | I just, two is like my personal--
00:33:19.740 | - It feels right.
00:33:21.740 | That's interesting 'cause every time you touch a lane,
00:33:23.780 | that's a source of a little bit of stress
00:33:26.780 | and kind of lane keeping is removing that stress.
00:33:29.280 | - That's ultimately the biggest value add, honestly,
00:33:32.320 | is just removing the stress of having to stay in lane.
00:33:35.480 | And I think, honestly, I don't think people fully realize,
00:33:39.000 | first of all, that that's a big value add,
00:33:41.040 | but also that that's all it is.
00:33:44.960 | - And that, not only, I find it a huge value add.
00:33:48.520 | I drove down, when we moved to San Diego,
00:33:50.400 | I drove down in an Enterprise Rent-A-Car
00:33:52.640 | and I missed it, I missed having the system so much.
00:33:55.440 | It's so much more tiring to drive
00:33:59.160 | without it.
00:34:00.280 | It is that lane centering that's the key feature.
00:34:04.760 | - Yeah.
00:34:05.600 | - And in a way, it's the only feature
00:34:08.920 | that actually adds value to people's lives
00:34:11.000 | in autonomous vehicles today.
00:34:12.200 | Waymo does not add value to people's lives.
00:34:13.800 | It's a more expensive, slower Uber.
00:34:15.860 | Maybe someday it'll be this big cliff where it adds value,
00:34:18.600 | but I don't usually believe it.
00:34:19.440 | - You know, it's fascinating.
00:34:20.280 | I haven't talked to, this is good,
00:34:22.560 | 'cause I haven't, I have intuitively,
00:34:25.800 | but I think we're making it explicit now.
00:34:28.260 | I actually believe that really good lane keeping
00:34:33.260 | is a reason to buy a car, will be a reason to buy a car,
00:34:38.400 | and it's a huge value add.
00:34:39.680 | I've never, until we just started talking about it,
00:34:41.720 | I haven't really quite realized it,
00:34:43.840 | that I've felt with Elon's chase of level four
00:34:48.840 | is not the correct chase.
00:34:52.320 | It was, 'cause you should just say Tesla has the best,
00:34:55.960 | as if from a Tesla perspective,
00:34:57.580 | say Tesla has the best lane keeping.
00:35:00.580 | Comma AI should say Comma AI has the best lane keeping,
00:35:04.140 | and that is it.
00:35:05.580 | - Yeah. - Yeah.
00:35:06.420 | So do you think?
00:35:07.980 | - You have to do the longitudinal as well.
00:35:09.900 | You can't just lane keep.
00:35:10.900 | You have to do ACC,
00:35:12.900 | but ACC is much more forgiving than lane keep,
00:35:15.780 | especially on the highway.
00:35:17.500 | - By the way, Comma AI is camera only, correct?
00:35:21.940 | - No, we use the radar.
00:35:23.700 | - From the car, you're able to get the, okay.
00:35:26.940 | - We can do a camera only now.
00:35:28.800 | It's gotten to the point,
00:35:29.640 | but we leave the radar there as like a, it's Fusion now.
00:35:33.420 | - Okay, so let's maybe talk through
00:35:35.440 | some of the system specs on the hardware.
00:35:37.920 | What's the hardware side of what you're providing?
00:35:42.880 | What's the capabilities on the software side
00:35:44.720 | with OpenPilot and so on?
00:35:46.780 | - So OpenPilot, as the box that we sell that it runs on,
00:35:51.780 | it's a phone in a plastic case.
00:35:54.440 | It's nothing special.
00:35:55.260 | We sell it without the software.
00:35:56.640 | So you're like, you buy the phone, it's just easy.
00:35:59.320 | It'll be easy set up, but it's sold with no software.
00:36:02.160 | OpenPilot right now is about to be 0.6.
00:36:06.980 | When it gets to 1.0,
00:36:08.240 | I think we'll be ready for a consumer product.
00:36:10.060 | We're not gonna add any new features.
00:36:11.540 | We're just gonna make the lane keeping really, really good.
00:36:14.080 | - Okay, I got it.
00:36:15.520 | - So what do we have right now?
00:36:16.600 | It's a Snapdragon 820.
00:36:18.580 | It's a Sony IMX 298 forward-facing camera.
00:36:24.960 | Driver monitoring camera,
00:36:26.040 | which is a selfie cam on the phone.
00:36:27.960 | And a CAN transceiver,
00:36:31.660 | maybe it's a little thing called Pandas.
00:36:33.840 | And they talk over USB to the phone.
00:36:36.560 | And then they have three CAN buses
00:36:37.720 | that they talk to the car.
00:36:39.020 | One of those CAN buses is the radar CAN bus.
00:36:42.280 | One of them is the main car CAN bus.
00:36:44.320 | And the other one is the proxy camera CAN bus.
00:36:46.240 | We leave the existing camera in place
00:36:48.380 | so we don't turn AEB off.
00:36:50.740 | Right now, we still turn AEB off
00:36:52.300 | if you're using our longitudinal,
00:36:53.600 | but we're gonna fix that before 1.0.
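
A hedged sketch (not openpilot itself) of listening to those CAN buses through the panda over USB, using comma's open-source panda Python library; the exact layout of the returned messages has varied between library versions, and which bus index is radar, main car, or proxy camera depends on the car and harness, so treat the details as assumptions.

```python
from panda import Panda

p = Panda()                      # connects to the panda over USB
while True:
    for msg in p.can_recv():     # a batch of raw CAN messages from every connected bus
        # each entry carries an arbitration ID, the payload bytes, and a bus index
        # (one bus is the radar, one the main car bus, one the proxy camera bus)
        print(msg)
```
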
00:36:55.600 | - Got it.
00:36:56.440 | Wow, that's cool.
00:36:57.260 | So, and it's CAN both ways.
00:36:59.080 | So how are you able to control vehicles?
00:37:03.320 | - So we proxy, the vehicles that we work with
00:37:06.760 | already have a lane keeping assist system.
00:37:10.180 | So lane keeping assist can mean a huge variety of things.
00:37:13.820 | It can mean, it will apply a small torque to the wheel
00:37:17.800 | after you've already crossed a lane line by a foot,
00:37:20.960 | which is the system in the older Toyotas,
00:37:23.920 | versus like, I think Tesla still calls it
00:37:26.520 | lane keeping assist, where it'll keep you perfectly
00:37:28.840 | in the center of the lane on the highway.
00:37:31.060 | - You can control, like with a joystick, the car.
00:37:35.080 | So these cars already have the capability of drive-by-wire.
00:37:37.920 | So is it trivial to convert a car that it operates with?
00:37:42.920 | OpenPILOT is able to control the steering?
00:37:47.540 | - Oh, a new car or a car that we,
00:37:49.720 | so we have support now for 45 different makes of cars.
00:37:52.800 | - What are the cars in general?
00:37:54.880 | - Mostly Hondas and Toyotas.
00:37:56.360 | We support almost every Honda and Toyota made this year.
00:38:00.620 | And then a bunch of GMs, bunch of Subarus, bunch of Chevrolets.
00:38:04.520 | - So it doesn't have to be like a Prius,
00:38:06.000 | it could be Corolla as well.
00:38:08.120 | - The 2020 Corolla is the best car with OpenPILOT.
00:38:10.760 | It just came out there.
00:38:11.720 | The actuator has less lag than the older Corolla.
00:38:14.160 | - I think I started watching a video with your,
00:38:18.200 | I mean, the way you make videos is awesome.
00:38:21.080 | (laughing)
00:38:21.920 | You're just literally at the dealership streaming.
00:38:24.320 | - Yeah, I had my friend on the phone,
00:38:26.040 | I'm like, bro, you wanna stream for an hour?
00:38:27.520 | - Yeah, and basically, like if stuff goes a little wrong,
00:38:31.080 | you just like, you just go with it.
00:38:33.120 | Yeah, I love it.
00:38:33.960 | - Well, it's real.
00:38:34.780 | - Yeah, it's real.
00:38:35.620 | That's so beautiful and it's so in contrast
00:38:39.760 | to the way other companies would put together
00:38:43.760 | a video like that.
00:38:44.680 | - Kind of why I like to do it like that.
00:38:46.080 | - Good.
00:38:46.920 | - And if you become super rich one day and successful,
00:38:49.840 | I hope you keep it that way
00:38:50.800 | because I think that's actually what people love,
00:38:53.200 | that kind of genuine.
00:38:54.760 | - Oh, it's all that has value to me.
00:38:56.120 | - Yeah.
00:38:56.960 | - Money has no, if I sell out to like make money,
00:38:59.920 | I sold out, it doesn't matter.
00:39:01.360 | What do I get, a yacht?
00:39:02.400 | I don't want a yacht.
00:39:03.400 | - And I think Tesla actually has a small inkling
00:39:09.240 | of that as well with Autonomy Day.
00:39:11.340 | They did reveal more than, I mean, of course,
00:39:14.080 | there's marketing communications, you can tell,
00:39:15.760 | but it's more than most companies would reveal,
00:39:17.720 | which is, I hope they go towards that direction more,
00:39:21.440 | other companies, GM, Ford.
00:39:23.080 | - Oh, Tesla's gonna win level five.
00:39:25.440 | They really are.
00:39:26.600 | - So let's talk about it.
00:39:27.840 | You think, you're focused on level two currently.
00:39:32.840 | - We're gonna be one to two years
00:39:35.240 | behind Tesla getting to level five.
00:39:37.200 | - Okay.
00:39:38.520 | - We're Android, right?
00:39:39.360 | We're Android.
00:39:40.200 | - You're Android.
00:39:41.020 | - I'm just saying once Tesla gets it,
00:39:42.280 | we're one to two years behind.
00:39:43.800 | I'm not making any timeline on when Tesla's gonna be.
00:39:45.720 | - That's right, you did, that's brilliant.
00:39:47.000 | - I'm sorry, Tesla investors,
00:39:48.400 | if you think you're gonna have an autonomous robo-taxi fleet
00:39:50.800 | by the end of the year.
00:39:52.440 | - Yes.
00:39:53.280 | - I'll bet against that.
00:39:54.960 | - So what do you think about this?
00:39:57.720 | The most level four companies
00:39:59.840 | are kind of just doing their usual safety driver,
00:40:07.000 | doing full autonomy kind of testing,
00:40:08.800 | and then Tesla does basically trying to go
00:40:12.000 | from lane keeping to full autonomy.
00:40:15.600 | What do you think about that approach?
00:40:16.840 | How successful would it be?
00:40:18.400 | - It's a ton better approach
00:40:20.720 | because Tesla is gathering data on a scale
00:40:23.980 | that none of them are.
00:40:25.240 | They're putting real users behind the wheel of the cars.
00:40:29.560 | It's, I think, the only strategy that works,
00:40:32.260 | the incremental.
00:40:34.480 | - Well, so there's a few components to Tesla approach
00:40:36.960 | that's more than just the incremental.
00:40:38.800 | It's what you spoke with is the ones, the software,
00:40:41.440 | so over the air software updates.
00:40:43.760 | - Necessity.
00:40:44.840 | I mean, Waymo and Cruise have those too.
00:40:46.440 | Those aren't.
00:40:47.280 | Those differentiate them from the automakers.
00:40:49.840 | - Right, no lane keeping systems have,
00:40:52.000 | no cars with lane keeping system have that except Tesla.
00:40:54.800 | - Yeah.
00:40:55.760 | - And the other one is the data, the other direction,
00:40:59.800 | which is the ability to query the data.
00:41:01.880 | I don't think they're actually collecting
00:41:03.520 | as much data as people think,
00:41:04.520 | but the ability to turn on collection and turn it off.
00:41:08.140 | So I'm both in the robotics world
00:41:12.080 | and the psychology human factors world.
00:41:15.040 | Many people believe that level two autonomy is problematic
00:41:18.500 | because of the human factor.
00:41:20.080 | Like the more the task is automated,
00:41:23.340 | the more there's a vigilance decrement.
00:41:26.040 | You start to fall asleep, you start to become complacent,
00:41:28.560 | start texting more and so on.
00:41:30.520 | Do you worry about that?
00:41:32.280 | 'Cause if we're talking about transition
00:41:33.620 | from lane keeping to full autonomy,
00:41:36.660 | if you're spending 80% of the time
00:41:41.000 | not supervising the machine,
00:41:42.840 | do you worry about what that means
00:41:45.480 | to the safety of the drivers?
00:41:47.140 | - One, we don't consider OpenPilot to be 1.0
00:41:49.640 | until we have 100% driver monitoring.
00:41:51.660 | You can cheat right now, our driver monitoring system.
00:41:55.040 | There's a few ways to cheat it.
00:41:56.100 | They're pretty obvious.
00:41:57.240 | We're working on making that better.
00:41:59.720 | Before we ship a consumer product that can drive cars,
00:42:02.580 | I want to make sure that I have driver monitoring
00:42:04.240 | that you can't cheat.
00:42:05.480 | - What's like a successful driver monitoring system
00:42:07.360 | look like?
00:42:08.200 | Is it all about just keeping your eyes on the road?
00:42:11.680 | - Well, a few things.
00:42:12.760 | So that's what we went with at first for driver monitoring.
00:42:16.640 | I'm checking, I'm actually looking
00:42:18.000 | at where your head is looking.
00:42:19.040 | The camera's not that high resolution.
00:42:20.440 | Eyes are a little bit hard to get.
00:42:21.880 | - Well, head is big.
00:42:22.920 | I mean, that's-- - Head is good.
00:42:24.640 | And actually a lot of it, just psychology wise,
00:42:28.740 | to have that monitor constantly there,
00:42:30.760 | it reminds you that you have to be paying attention.
00:42:33.480 | But we want to go further.
00:42:35.120 | We just hired someone full time
00:42:36.400 | to come on to do the driver monitoring.
00:42:38.000 | I want to detect phone in frame
00:42:40.400 | and I want to make sure you're not sleeping.
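To make the kind of checks being described concrete, here is a minimal sketch of a driver-monitoring loop. The detector outputs, thresholds, and names are assumptions for illustration, not openpilot's actual model or tuning.

```python
from dataclasses import dataclass

@dataclass
class DriverFrame:
    # Hypothetical per-frame outputs from a driver-facing camera model.
    head_yaw_deg: float      # 0 means facing the road
    head_pitch_deg: float    # 0 means level
    eyes_closed_prob: float  # 0..1 from an eye-state classifier
    phone_in_frame: bool     # phone detector hit

def is_distracted(f: DriverFrame,
                  yaw_limit: float = 30.0,
                  pitch_limit: float = 20.0,
                  eyes_closed_thresh: float = 0.8) -> bool:
    """True if this frame suggests the driver is not attentive (illustrative thresholds)."""
    looking_away = abs(f.head_yaw_deg) > yaw_limit or abs(f.head_pitch_deg) > pitch_limit
    asleep = f.eyes_closed_prob > eyes_closed_thresh
    return looking_away or asleep or f.phone_in_frame

def monitor(frames, patience: int = 20):
    """Escalate only after a run of distracted frames, so brief glances don't alert."""
    streak = 0
    for f in frames:
        streak = streak + 1 if is_distracted(f) else 0
        if streak >= patience:
            yield "ALERT: driver not paying attention"
```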
00:42:42.640 | - How much does the camera see of the body?
00:42:44.800 | - This one, not enough.
00:42:47.480 | - Not enough.
00:42:48.440 | - The next one, everything.
00:42:50.800 | - Well, it's interesting, fisheye,
00:42:52.000 | 'cause we're doing just data collection, not real time.
00:42:55.240 | But fisheye is beautiful,
00:42:57.640 | being able to capture the body.
00:42:59.080 | And the smartphone is really like the biggest problem.
00:43:03.320 | - I'll show you, I can show you one of the pictures
00:43:05.040 | from our new system.
00:43:07.880 | - Awesome, so you're basically saying
00:43:09.680 | the driver monitoring will be the answer to that.
00:43:13.160 | - I think the other point that you raised in your paper
00:43:15.360 | is good as well.
00:43:17.000 | You're not asking a human to supervise a machine
00:43:20.480 | without giving them the, they can take over at any time.
00:43:23.240 | - Right.
00:43:24.080 | - Our safety model, you can take over.
00:43:25.800 | We disengage on both the gas or the brake.
00:43:28.000 | We don't disengage on steering, I don't feel you have to.
00:43:30.040 | But we disengage on gas or brake.
00:43:31.800 | So it's very easy for you to take over
00:43:34.320 | and it's very easy for you to re-engage.
00:43:36.440 | That switching should be super cheap.
00:43:39.400 | The cars that require,
00:43:40.240 | even autopilot requires a double press.
00:43:42.440 | That's almost, I see I don't like that.
00:43:44.400 | And then the cancel, to cancel in autopilot,
00:43:48.080 | you either have to press cancel,
00:43:49.040 | which no one knows what that is, so they press the brake.
00:43:51.040 | But a lot of times you don't actually wanna press the brake.
00:43:53.360 | You wanna press the gas, so you should cancel on gas.
00:43:55.920 | Or wiggle the steering wheel, which is bad as well.
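A toy version of the engagement policy being described: gas or brake disengages, steering input does not, and re-engaging is a single cheap action. The signal names and the tiny state machine are assumptions for illustration, not openpilot's actual interface.

```python
class EngagementPolicy:
    """Minimal state machine for the take-over / re-engage behavior described above."""

    def __init__(self):
        self.engaged = False

    def update(self, gas_pressed: bool, brake_pressed: bool,
               steering_torque: float, engage_button: bool) -> bool:
        # Gas or brake from the human always wins: disengage immediately.
        if self.engaged and (gas_pressed or brake_pressed):
            self.engaged = False
        # Steering torque is deliberately NOT a cancel condition here:
        # the driver can nudge the wheel without dropping out of the system.
        # Re-engagement is a single button press, kept deliberately cheap.
        elif not self.engaged and engage_button:
            self.engaged = True
        return self.engaged
```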
00:43:57.960 | - Wow, that's brilliant.
00:43:58.920 | I haven't heard anyone articulate that point.
00:44:01.440 | - Oh, this is all I think about.
00:44:05.000 | - 'Cause I think, I think actually Tesla
00:44:08.680 | has done a better job than most automakers
00:44:10.960 | at making that frictionless.
00:44:12.940 | But you just described that it could be even better.
00:44:15.540 | - I love Super Cruise as an experience once it's engaged.
00:44:21.160 | I don't know if you've used it,
00:44:22.040 | but getting the thing to try to engage.
00:44:24.040 | - Yeah, I've used the, I've driven Super Cruise a lot.
00:44:27.520 | So what's your thoughts on the Super Cruise system in general?
00:44:29.480 | - You disengage Super Cruise and it falls back to ACC.
00:44:32.680 | So my car's like still accelerating.
00:44:34.640 | It feels weird.
00:44:36.280 | Otherwise, when you actually have Super Cruise engaged
00:44:39.040 | on the highway, it is phenomenal.
00:44:41.200 | We bought that Cadillac.
00:44:42.320 | We just sold it, but we bought it just to experience this.
00:44:45.600 | And I wanted everyone in the office to be like,
00:44:47.320 | this is what we're striving to build.
00:44:49.360 | GM pioneering with the driver monitoring.
00:44:51.520 | - You like their driver monitoring system?
00:44:55.000 | - It has some bugs.
00:44:56.400 | If there's a sun shining back here, it'll be blind to you.
00:45:00.260 | But overall, mostly, yeah.
00:45:03.320 | - That's so cool that you know all this stuff.
00:45:05.920 | I don't often talk to people that,
00:45:08.480 | 'cause it's such a rare car, unfortunately, currently.
00:45:10.960 | - We bought one explicitly for this.
00:45:12.680 | We lost like 25K in the depreciation,
00:45:15.000 | but I feel it was worth it.
00:45:16.680 | - I was very pleasantly surprised
00:45:19.080 | that GM system was so innovative
00:45:22.360 | and really wasn't advertised much,
00:45:26.320 | wasn't talked about much.
00:45:28.480 | And I was nervous that it would die,
00:45:30.440 | that it would disappear.
00:45:31.840 | - Well, they put it on the wrong car.
00:45:33.520 | They should have put it on the Bolt
00:45:34.560 | and not some weird Cadillac that nobody bought.
00:45:36.640 | - I think that's gonna be into,
00:45:38.440 | they're saying at least,
00:45:39.560 | it's gonna be into their entire fleet.
00:45:41.840 | So what do you think about,
00:45:43.800 | as long as we're on the driver monitoring,
00:45:45.960 | what do you think about Elon Musk's claim
00:45:49.280 | that driver monitoring is not needed?
00:45:51.920 | - Normally, I love his claims.
00:45:53.720 | That one is stupid.
00:45:55.560 | That one is stupid.
00:45:56.560 | And he's not gonna have his level five fleet
00:46:00.320 | by the end of the year.
00:46:01.320 | Hopefully he's like,
00:46:03.120 | "Okay, I was wrong.
00:46:04.840 | "I'm gonna add driver monitoring."
00:46:06.280 | Because when these systems get to the point
00:46:08.240 | that they're only messing up once every thousand miles,
00:46:10.320 | you absolutely need driver monitoring.
00:46:12.220 | - So let me play, 'cause I agree with you,
00:46:15.880 | but let me play devil's advocate.
00:46:17.320 | One possibility is that without driver monitoring,
00:46:22.320 | people are able to monitor,
00:46:24.680 | self-regulate, monitor themselves.
00:46:27.740 | Your idea is--
00:46:30.640 | - You've seen all the people sleeping in Teslas?
00:46:33.000 | - Yeah.
00:46:35.320 | Well, I'm a little skeptical
00:46:37.400 | of all the people sleeping in Teslas,
00:46:38.920 | because I've stopped paying attention
00:46:43.560 | to that kind of stuff,
00:46:44.400 | because I wanna see real data.
00:46:45.720 | It's too much glorified.
00:46:47.280 | It doesn't feel scientific to me.
00:46:48.740 | So I wanna know how many people
00:46:51.640 | are really sleeping in Teslas versus sleeping.
00:46:54.720 | I was driving here, sleep deprived,
00:46:57.640 | in a car with no automation.
00:46:59.520 | I was falling asleep.
00:47:01.080 | - I agree that it's hypey.
00:47:02.120 | It's just like, you know what?
00:47:04.860 | If you wanna put driver monitoring,
00:47:06.520 | I ran into, my last autopilot experience
00:47:08.520 | was I rented a Model 3 in March and drove it around.
00:47:12.200 | The wheel thing is annoying.
00:47:13.560 | And the reason the wheel thing is annoying,
00:47:15.400 | we use the wheel thing as well,
00:47:16.780 | but we don't disengage on wheel.
00:47:18.720 | For Tesla, you have to touch the wheel just enough
00:47:21.640 | to trigger the torque sensor,
00:47:23.480 | to tell it that you're there,
00:47:25.320 | but not enough as to disengage it,
00:47:28.360 | so don't use it for two things.
00:47:30.400 | Don't disengage on wheel.
00:47:31.360 | You don't have to.
00:47:32.360 | - That whole experience.
00:47:33.400 | Wow, beautifully put.
00:47:35.360 | All of those elements,
00:47:36.340 | even if you don't have driver monitoring,
00:47:38.240 | that whole experience needs to be better.
00:47:41.080 | - Driver monitoring, I think, would make,
00:47:43.720 | I mean, I think Super Cruise is a better experience
00:47:46.160 | once it's engaged over autopilot.
00:47:48.360 | I think Super Cruise's transition
00:47:50.920 | to engagement and disengagement are significantly worse.
00:47:54.100 | - There's a tricky thing,
00:47:56.400 | because if I were to criticize Super Cruise,
00:47:58.640 | it's a little too crude.
00:48:00.800 | And I think it's like six seconds or something.
00:48:03.640 | If you look off road, it'll start warning you.
00:48:06.060 | It's some ridiculously long period of time.
00:48:09.080 | And just the way,
00:48:11.400 | I think it's basically, it's a binary.
00:48:15.800 | - It should be adapted.
00:48:17.440 | - Yeah, it needs to learn more about you.
00:48:19.880 | It needs to communicate what it sees about you more.
00:48:24.440 | - If Tesla shows what it sees about the external world,
00:48:27.160 | it would be nice if Super Cruise would tell us
00:48:29.120 | what it sees about the internal world.
00:48:30.840 | - It's even worse than that.
00:48:31.960 | You press the button to engage,
00:48:33.320 | and it just says, "Super Cruise unavailable."
00:48:35.480 | - Yeah. - Why?
00:48:36.320 | - Why?
00:48:37.800 | Yeah, that transparency is good.
00:48:41.480 | - We've renamed the driver monitoring packet
00:48:43.520 | to driver state.
00:48:45.360 | - Driver state.
00:48:46.280 | - We have car state packet, which has the state of the car,
00:48:48.360 | and we have driver state packet,
00:48:49.480 | which has the state of the driver.
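For illustration only, a rough sketch of what separate car-state and driver-state packets could look like; the field names here are guesses, not the actual openpilot message schema.

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class CarState:
    # Hypothetical subset of vehicle signals.
    v_ego_mps: float
    steering_angle_deg: float
    gas_pressed: bool
    brake_pressed: bool

@dataclass
class DriverState:
    # Hypothetical subset of driver-facing camera outputs.
    face_detected: bool
    head_yaw_deg: float
    eyes_closed_prob: float
    phone_in_frame: bool

def publish(packet) -> str:
    """Serialize a packet with a timestamp as one line of JSON."""
    return json.dumps({"t": time.time(),
                       "type": type(packet).__name__,
                       "data": asdict(packet)})

print(publish(CarState(29.0, 1.5, False, False)))
print(publish(DriverState(True, 4.0, 0.02, False)))
```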
00:48:51.040 | - So what is the-- - Estimate their BAC.
00:48:54.120 | - What's BAC?
00:48:54.960 | - Blood alcohol content?
00:48:56.000 | (laughing)
00:48:57.320 | - You think that's possible with computer vision?
00:48:59.200 | - Absolutely.
00:49:00.040 | - To me, it's an open question.
00:49:04.480 | I haven't looked into it too much.
00:49:06.520 | Actually, I quite seriously looked at the literature.
00:49:08.480 | It's not obvious to me that from the eyes and so on,
00:49:10.880 | you can tell.
00:49:11.720 | - You might need stuff in the car as well.
00:49:13.480 | You might need how they're controlling the car, right?
00:49:15.800 | And that's fundamentally, at the end of the day,
00:49:17.400 | what you care about.
00:49:18.680 | But I think, especially when people are really drunk,
00:49:21.680 | they're not controlling the car nearly as smoothly
00:49:23.680 | as they would look at them walking, right?
00:49:25.520 | The car is like an extension of the body.
00:49:27.280 | So I think you could totally detect.
00:49:29.440 | And if you could fix people who are drunk,
00:49:30.920 | distracted, asleep, if you fix those three--
00:49:32.840 | - Yeah, that's huge.
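One way to make the "smoothness of control" intuition concrete is to measure how jerky the steering corrections are over a window. This is only a sketch of a proxy signal under my own assumptions, not a validated impairment detector.

```python
import numpy as np

def steering_jerkiness(steering_angle_deg: np.ndarray, hz: float = 20.0) -> float:
    """RMS of the steering rate (deg/s) over a window; higher means jerkier corrections.

    On its own this is just a proxy; a real impairment estimate would need the
    driver camera and far more context.
    """
    rate = np.diff(steering_angle_deg) * hz
    return float(np.sqrt(np.mean(rate ** 2)))

# Synthetic example: a smooth trace versus the same trace with jitter added.
t = np.linspace(0, 30, 600)
smooth = 2.0 * np.sin(0.2 * t)
jerky = smooth + np.random.default_rng(0).normal(0, 1.5, size=t.shape)
print(steering_jerkiness(smooth), steering_jerkiness(jerky))
```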
00:49:35.520 | So what are the current limitations of OpenPilot?
00:49:38.280 | What are the main problems that still need to be solved?
00:49:41.800 | - We're hopefully fixing a few of them in 0.6.
00:49:45.480 | We're not as good as Autopilot at stopped cars.
00:49:49.480 | So if you're coming up to a red light at like 55,
00:49:54.280 | so it's the radar stopped car problem,
00:49:56.920 | which is responsible for two Autopilot accidents,
00:49:59.240 | it's hard to differentiate a stopped car
00:50:01.560 | from a signpost.
00:50:03.680 | - Yeah, a static object.
00:50:05.360 | - So you have to fuse.
00:50:06.400 | You have to do this visually.
00:50:07.520 | There's no way from the radar data to tell the difference.
00:50:09.600 | Maybe you can make a map,
00:50:10.680 | but I don't really believe in mapping at all anymore.
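The stopped-car problem in one piece of logic: radar alone cannot tell a stationary return that is a signpost from one that is a car, so a vision detection has to confirm it before braking on it. A hedged sketch with made-up data structures and gate values:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RadarTrack:
    distance_m: float
    rel_speed_mps: float  # relative to ego; a stopped object reads about -v_ego

@dataclass
class VisionDetection:
    distance_m: float
    is_vehicle_prob: float  # from the camera model

def confirm_stopped_vehicle(track: RadarTrack,
                            detections: List[VisionDetection],
                            v_ego_mps: float,
                            gate_m: float = 5.0,
                            prob_thresh: float = 0.5) -> Optional[VisionDetection]:
    """Treat a stationary radar return as a lead car only if vision agrees."""
    stationary = abs(track.rel_speed_mps + v_ego_mps) < 1.0  # object speed near zero
    if not stationary:
        return None
    candidates = [d for d in detections
                  if abs(d.distance_m - track.distance_m) < gate_m
                  and d.is_vehicle_prob > prob_thresh]
    return max(candidates, key=lambda d: d.is_vehicle_prob, default=None)
```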
00:50:13.880 | - Wait, wait, wait, what?
00:50:14.960 | You don't believe in mapping?
00:50:16.080 | - No.
00:50:16.920 | - So you're basically, the OpenPilot solution is saying,
00:50:21.120 | react to the environment as you see it,
00:50:22.480 | just like human beings do.
00:50:24.400 | - And then eventually when you want to do navigate
00:50:26.200 | on OpenPilot, I'll train the net to look at Waze.
00:50:30.400 | I'll run Waze in the background.
00:50:31.360 | I'll train a convnet on Waze.
00:50:32.200 | - Are you using GPS at all?
00:50:33.520 | - We use it to ground truth.
00:50:35.920 | We use it to very carefully ground truth the paths.
00:50:38.280 | We have a stack which can recover relative
00:50:40.560 | to 10 centimeters over one minute.
00:50:42.640 | And then we use that to ground truth
00:50:44.160 | exactly where the car went
00:50:45.640 | in that local part of the environment, but it's all local.
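A sketch of the "ground truth the path locally" idea: take a short run of high-accuracy pose estimates and express where the car actually went relative to where it started, with no global map involved. Input names and conventions are my assumptions.

```python
import numpy as np

def local_path(positions_xy: np.ndarray, headings_rad: np.ndarray) -> np.ndarray:
    """Convert a short run of globally-estimated positions into a path in the
    frame of the first pose (x forward, y left). Only relative accuracy matters.

    positions_xy: (N, 2) planar positions in meters, any consistent frame.
    headings_rad: (N,) heading at each sample; only the first one is used here.
    """
    theta = headings_rad[0]
    # R(-theta) maps world-frame offsets into the car's initial frame.
    rot = np.array([[np.cos(theta),  np.sin(theta)],
                    [-np.sin(theta), np.cos(theta)]])
    return (positions_xy - positions_xy[0]) @ rot.T
```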
00:50:48.720 | - How are you testing in general, just for yourself,
00:50:50.520 | like experiments and stuff?
00:50:51.920 | Where are you located?
00:50:54.920 | - San Diego.
00:50:55.760 | - San Diego.
00:50:56.600 | - Yeah.
00:50:57.440 | - Okay.
00:50:58.280 | What, so you basically drive around there
00:51:00.440 | and collect some data and watch performance?
00:51:03.000 | - We have a simulator now.
00:51:04.360 | And we have, our simulator is really cool.
00:51:06.400 | Our simulator is not,
00:51:08.080 | it's not like a Unity-based simulator.
00:51:09.680 | Our simulator lets us load in real state.
00:51:11.800 | - What do you mean?
00:51:13.680 | - We can load in a drive and simulate what the system
00:51:17.880 | would have done on the historical data.
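A skeletal version of that kind of replay simulator: run the current model over recorded frames and compare its predicted path against the path the car actually drove in the log. The function names and log format are placeholders, not Comma's tooling.

```python
import numpy as np

def replay_drive(frames, actual_paths, model, threshold_m: float = 0.5) -> float:
    """Score a model against a recorded drive; no graphics engine involved.

    frames:       iterable of camera frames from the log.
    actual_paths: for each frame, the (N, 2) path the car really drove next.
    model:        callable mapping a frame to a predicted (N, 2) path.
    Returns the fraction of frames whose worst-point deviation stays under threshold.
    """
    ok, total = 0, 0
    for frame, truth in zip(frames, actual_paths):
        pred = model(frame)
        deviation = np.max(np.linalg.norm(pred - truth, axis=1))
        ok += int(deviation < threshold_m)
        total += 1
    return ok / max(total, 1)
```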
00:51:20.280 | - Ooh, nice.
00:51:21.480 | Interesting.
00:51:23.480 | So what, yeah.
00:51:24.320 | - Right now we're only using it for testing,
00:51:26.080 | but as soon as we start using it for training, that's it.
00:51:29.200 | That's all.
00:51:30.040 | - So just for testing.
00:51:30.880 | What's your feeling about the real world versus simulation?
00:51:33.040 | Do you like simulation for training?
00:51:34.320 | If this moves to training.
00:51:35.720 | - So we have to distinguish two types of simulators, right?
00:51:40.040 | There's a simulator that is completely fake.
00:51:44.720 | I could get my car to drive around in GTA.
00:51:46.760 | I feel that this kind of simulator is useless.
00:51:51.060 | You're never, there's so many,
00:51:53.640 | my analogy here is like, okay, fine.
00:51:57.000 | You're not solving the computer vision problem,
00:51:59.940 | but you're solving the computer graphics problem.
00:52:02.440 | - Right, and you don't think you can get very far
00:52:04.640 | by creating ultra-realistic graphics?
00:52:08.040 | - No, because you can create ultra-realistic graphics
00:52:10.360 | of the road, now create ultra-realistic behavioral models
00:52:13.200 | of the other cars.
00:52:14.620 | Oh, well, I'll just use my self-driving.
00:52:16.960 | No, you won't.
00:52:18.320 | You need real, you need actual human behavior
00:52:21.680 | because that's what you're trying to learn.
00:52:23.920 | Driving does not have a spec.
00:52:25.880 | The definition of driving is what humans do when they drive.
00:52:29.080 | Whatever Waymo does, I don't think it's driving.
00:52:32.800 | - Right, well, I think actually Waymo and others,
00:52:36.400 | if there's any use for reinforcement learning,
00:52:38.960 | I've seen it used quite well.
00:52:40.360 | I study pedestrians a lot too,
00:52:41.680 | is try to train models from real data
00:52:44.400 | of how pedestrians move
00:52:45.560 | and try to use reinforcement learning models
00:52:47.280 | to make pedestrians move in human-like ways.
00:52:50.080 | - By that point, you've already gone so many layers.
00:52:53.520 | You detected a pedestrian?
00:52:55.680 | Did you hand code the feature vector of their state?
00:53:00.240 | Did you guys learn anything from computer vision
00:53:02.920 | before deep learning?
00:53:04.600 | - Well, okay, I feel like this is--
00:53:07.160 | - So perception to you is the sticking point.
00:53:10.880 | I mean, what's the hardest part of the stack here?
00:53:13.800 | - There is no human understandable feature vector
00:53:18.800 | separating perception and planning.
00:53:22.000 | That's the best way I can put that.
00:53:25.160 | - There is no, so it's all together
00:53:26.840 | and it's a joint problem.
00:53:29.600 | - So you can take localization.
00:53:31.480 | Localization and planning,
00:53:32.960 | there is a human understandable feature vector
00:53:34.760 | between these two things.
00:53:36.000 | I mean, okay, so I have like three degrees position,
00:53:38.720 | three degrees orientation and those derivatives,
00:53:40.560 | maybe those second derivatives, right?
00:53:42.000 | That's human understandable, that's physical.
00:53:44.560 | The between perception and planning.
00:53:48.560 | So like Waymo has a perception stack and then a planner.
00:53:52.760 | And one of the things Waymo does right
00:53:55.600 | is they have a simulator that can separate those two.
00:54:00.000 | They can like replay their perception data
00:54:02.920 | and test their system,
00:54:03.920 | which is what I'm talking about
00:54:04.880 | about like the two different kinds of simulators.
00:54:06.520 | There's the kind that can work on real data
00:54:08.240 | and there's the kind that can't work on real data.
00:54:10.940 | Now, the problem is that I don't think
00:54:13.880 | you can hand code a feature vector, right?
00:54:16.160 | Like you have some list of like,
00:54:17.440 | oh, here's my list of cars in the scenes.
00:54:19.040 | Here's my list of pedestrians in the scene.
00:54:21.280 | This isn't what humans are doing.
00:54:23.260 | - What are humans doing?
00:54:24.920 | - Global.
00:54:25.760 | Some--
00:54:28.080 | - And you're saying that's too difficult to hand engineer.
00:54:31.960 | - I'm saying that there is no state vector.
00:54:34.120 | Given a perfect, I could give you
00:54:36.080 | the best team of engineers in the world
00:54:37.360 | to build a perception system
00:54:38.520 | and the best team to build a planner.
00:54:40.640 | All you have to do is define the state vector
00:54:42.660 | that separates those two.
00:54:43.960 | - I'm missing the state vector that separates those two.
00:54:48.580 | What do you mean?
00:54:49.420 | - So what is the output of your perception system?
00:54:52.060 | - Output of the perception system, it's,
00:54:57.360 | there's, okay, well, there's several ways to do it.
00:55:01.640 | One is the SLAM components localization.
00:55:03.800 | The other is drivable area, drivable space.
00:55:05.840 | - Drivable space, yep.
00:55:06.680 | - And then there's the different objects in the scene.
00:55:08.880 | - Yep.
00:55:09.720 | - And different objects in the scene over time,
00:55:15.360 | maybe to give you input to then try to start
00:55:18.640 | modeling the trajectories of those objects.
00:55:21.520 | - Sure.
00:55:22.340 | - That's it.
00:55:23.180 | - I can give you a concrete example of something you missed.
00:55:25.120 | - What's that?
00:55:25.960 | - So say there's a bush in the scene.
00:55:28.600 | Humans understand that when they see this bush
00:55:30.880 | that there may or may not be a car behind that bush.
00:55:34.640 | Drivable area and a list of objects does not include that.
00:55:37.240 | Humans are doing this constantly
00:55:38.880 | at the simplest intersections.
00:55:40.880 | So now you have to talk about occluded area.
00:55:43.840 | - Right.
00:55:44.680 | - Right, but even that, what do you mean by occluded?
00:55:47.760 | Okay, so I can't see it.
00:55:49.600 | Well, if it's the other side of a house, I don't care.
00:55:51.800 | What's the likelihood that there's a car
00:55:53.560 | in that occluded area, right?
00:55:55.240 | And if you say, okay, we'll add that,
00:55:58.040 | I can come up with 10 more examples that you can't add.
00:56:01.640 | - Certainly occluded area would be something
00:56:03.900 | that a simulator would have
00:56:05.520 | because it's simulating the entire, you know,
00:56:08.180 | occlusion is part of it.
00:56:11.280 | - Occlusion is part of a vision stack.
00:56:12.640 | But what I'm saying is if you have a hand-engineered,
00:56:16.580 | if your perception system output can be written
00:56:20.040 | in a spec document, it is incomplete.
00:56:22.240 | - Yeah, I mean, certainly it's hard to argue with that
00:56:27.780 | because in the end, that's going to be true.
00:56:30.160 | - Yes, and I'll tell you what the output
00:56:31.800 | of our perception system is.
00:56:32.640 | - What's that?
00:56:33.480 | - It's a 1024 dimensional vector, trained by a neural net.
00:56:38.000 | - Oh, you mean that?
00:56:38.840 | - No, that's the 1024 dimensions of who knows what.
00:56:42.000 | - Because it's operating on real data.
00:56:45.160 | - Yeah.
00:56:46.000 | And that's the perception.
00:56:48.400 | That's the perception state, right?
00:56:50.440 | Think about an autoencoder for faces, right?
00:56:53.560 | If you have an autoencoder for faces
00:56:54.800 | and you say it has 256 dimensions in the middle,
00:56:59.800 | and I'm taking a face over here
00:57:00.780 | and projecting it to a face over here.
00:57:02.900 | Can you hand label all 256 of those dimensions?
00:57:05.460 | - Well, no, but those are generated automatically.
00:57:09.380 | - But even if you tried to do it by hand,
00:57:11.440 | could you come up with a spec
00:57:13.220 | between your encoder and your decoder?
00:57:16.600 | - No, because that's how it is.
00:57:19.260 | It wasn't designed, but there--
00:57:20.780 | - No, no, no, but if you could design it,
00:57:23.700 | if you could design a face reconstructor system,
00:57:26.560 | could you come up with a spec?
00:57:28.160 | - No, but I think we're missing here a little bit.
00:57:32.400 | I think you're just being very poetic
00:57:35.120 | about expressing a fundamental problem of simulators,
00:57:37.960 | that they're going to be missing so much
00:57:41.640 | that the feature vectors
00:57:44.800 | will just look fundamentally different
00:57:47.120 | from in the simulated world than the real world.
00:57:51.320 | - I'm not making a claim about simulators.
00:57:53.860 | I'm making a claim about the spec division
00:57:57.140 | between perception and planning, even in your system.
00:58:00.860 | - Just in general.
00:58:02.100 | - Just in general.
00:58:03.380 | If you're trying to build a car that drives,
00:58:05.700 | if you're trying to hand code
00:58:07.340 | the output of your perception system,
00:58:08.780 | like saying, "Here's a list of all the cars in the scene.
00:58:10.940 | "Here's a list of all the people.
00:58:11.880 | "Here's a list of the occluded areas.
00:58:13.060 | "Here's a vector of drivable areas," it's insufficient.
00:58:16.620 | And if you start to believe that,
00:58:18.020 | you realize that what Waymo and Cruise
00:58:19.380 | are doing is impossible.
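To make the contrast concrete, a toy illustration only: a hand-written perception "spec" next to a learned bottleneck vector whose dimensions nobody can name, in the spirit of the autoencoder example above. None of this is Comma's code.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import numpy as np

@dataclass
class HandCodedPerception:
    # The kind of spec that the argument says is always incomplete:
    cars: List[dict] = field(default_factory=list)
    pedestrians: List[dict] = field(default_factory=list)
    drivable_area: Optional[np.ndarray] = None  # e.g. a polygon or occupancy grid
    # ...and there is always one more field you forgot (the bush, the occlusion).

def learned_perception_state(image: np.ndarray, encoder) -> np.ndarray:
    """The alternative: a single learned vector (say, 1024 floats) whose
    individual dimensions have no human-readable meaning."""
    z = encoder(image)
    assert z.shape == (1024,)
    return z
```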
00:58:20.900 | - Currently, what we're doing is the perception problem
00:58:24.340 | is converting the scene into a chessboard.
00:58:28.200 | And then you reason some basic reasoning
00:58:31.740 | around that chessboard.
00:58:33.460 | And you're saying that really there's a lot missing there.
00:58:36.840 | First of all, why are we talking about this?
00:58:40.300 | 'Cause isn't this a full autonomy?
00:58:42.860 | Is this something you think about?
00:58:44.740 | - Oh, I want to win self-driving cars.
00:58:47.140 | - A full, so you're really thinking,
00:58:48.460 | so your definition of win includes--
00:58:52.020 | - Level four or five. - Level five.
00:58:53.660 | - I don't think level four is a real thing.
00:58:55.820 | I want to build the AlphaGo of driving.
00:58:59.740 | - So AlphaGo is really end-to-end.
00:59:06.140 | - Yeah.
00:59:07.060 | - Is, yeah, it's end-to-end.
00:59:09.860 | And do you think this whole problem,
00:59:12.500 | is that also kind of what you're getting at
00:59:14.740 | with the perception and the planning?
00:59:16.700 | Is that this whole problem, the right way to do it
00:59:19.500 | is really to learn the entire thing.
00:59:21.660 | - I'll argue that not only is it the right way,
00:59:23.740 | it's the only way that's gonna exceed human performance.
00:59:27.720 | - Well-- - It's certainly true for Go.
00:59:30.040 | Everyone who tried to hand code Go things
00:59:31.580 | built human inferior things.
00:59:33.540 | And then someone came along and wrote some 10,000 line thing
00:59:36.260 | that doesn't know anything about Go that beat everybody.
00:59:39.060 | It's 10,000 lines.
00:59:41.180 | - True, in that sense, the open question then
00:59:45.620 | that maybe I can ask you is,
00:59:47.960 | driving is much harder than Go.
00:59:53.500 | The open question is how much harder?
00:59:55.420 | So how, 'cause I think the Elon Musk approach here
00:59:59.500 | with planning and perception
01:00:01.180 | is similar to what you're describing,
01:00:02.980 | which is really turning into
01:00:04.420 | not some kind of modular thing,
01:00:08.300 | but really do formulate it as a learning problem
01:00:11.100 | and solve the learning problem with scale.
01:00:13.380 | So how many years,
01:00:15.780 | how many years would it take to solve this problem
01:00:18.860 | or just how hard is this freaking problem?
01:00:21.700 | - Well, the cool thing is,
01:00:24.540 | I think there's a lot of value
01:00:27.800 | that we can deliver along the way.
01:00:29.800 | I think that you can build
01:00:33.180 | lane-keeping assist, actually,
01:00:37.300 | plus adaptive cruise control,
01:00:39.940 | plus, okay, looking at Waze,
01:00:42.900 | extends to all of driving.
01:00:46.020 | - Yeah, most of driving, right?
01:00:47.940 | - Oh, your adaptive cruise control
01:00:49.100 | treats red lights like cars, okay.
01:00:51.220 | - So let's jump around.
01:00:52.980 | You mentioned Navigate on Autopilot.
01:00:55.780 | What advice, how would you make it better?
01:00:57.780 | Do you think as a feature that
01:00:59.220 | if it's done really well, it's a good feature?
01:01:02.360 | - I think that it's too reliant on hand-coded hacks
01:01:07.360 | for how does Navigate on Autopilot do a lane change?
01:01:10.420 | It actually does the same lane change
01:01:12.700 | every time and it feels mechanical.
01:01:14.300 | Humans do different lane changes.
01:01:15.860 | Humans sometimes will do a slow one,
01:01:17.380 | sometimes do a fast one.
01:01:18.900 | Navigate on Autopilot, at least every time I use it,
01:01:20.860 | it did the identical lane change.
01:01:23.060 | - How do you learn?
01:01:24.260 | I mean, this is a fundamental thing, actually,
01:01:26.780 | is the braking and then accelerating,
01:01:30.380 | something that's still,
01:01:31.900 | Tesla probably does it better than most cars,
01:01:34.660 | but it still doesn't do a great job
01:01:36.780 | of creating a comfortable, natural experience.
01:01:39.940 | And navigate on autopilot is just lane changes
01:01:42.660 | and extension of that.
01:01:44.100 | So how do you learn to do a natural lane change?
01:01:49.100 | - So we have it and I can talk about how it works.
01:01:53.020 | So I feel that we have the solution for lateral,
01:01:58.020 | we don't yet have the solution for longitudinal.
01:02:00.740 | There's a few reasons longitudinal is harder than lateral.
01:02:03.460 | The lane change component,
01:02:05.260 | the way that we train on it very simply
01:02:08.100 | is like our model has an input
01:02:10.940 | for whether it's doing a lane change or not.
01:02:14.140 | And then when we train the end-to-end model,
01:02:16.460 | we hand label all the lane changes, 'cause you have to.
01:02:19.620 | I've struggled a long time about not wanting to do that,
01:02:22.500 | but I think you have to.
01:02:24.300 | - Or the training data.
01:02:25.380 | - For the training data, right?
01:02:26.580 | Oh, we actually, we have an automatic ground truther,
01:02:28.420 | which automatically labels all the lane changes.
01:02:30.660 | - Was that possible?
01:02:31.740 | - To automatically label lane changes?
01:02:32.820 | - Yeah.
01:02:33.660 | - Yeah, detect the lane, I see when it crosses it, right?
01:02:34.860 | And I don't have to get that high percent accuracy,
01:02:36.740 | but it's like 95, good enough.
01:02:38.140 | - Okay.
01:02:39.020 | - Now I set the bit when it's doing the lane change
01:02:43.260 | in the end-to-end learning.
01:02:44.900 | And then I set it to zero when it's not doing a lane change.
01:02:47.980 | So now if I wanted to do a lane change at test time,
01:02:49.780 | I just put the bit to a one and it'll do a lane change.
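A condensed sketch of the two pieces described here: an automatic labeler that marks frames where the path crosses into another lane, and a policy model that takes a lane-change bit as an extra input so the bit can simply be set to one at test time. The model shape, lane-width constant, and thresholds are invented for the example.

```python
import numpy as np
import torch
import torch.nn as nn

def label_lane_changes(lateral_offset_m: np.ndarray, lane_width_m: float = 3.7) -> np.ndarray:
    """Auto ground-truther: flag frames where the lateral position crosses a lane
    boundary. Roughly right is good enough for setting the training bit."""
    lane_index = np.floor(lateral_offset_m / lane_width_m)
    changing = np.zeros_like(lateral_offset_m, dtype=np.float32)
    changing[1:] = (np.diff(lane_index) != 0).astype(np.float32)
    return changing

class PolicyNet(nn.Module):
    """End-to-end-style path model conditioned on a 'do a lane change' bit."""

    def __init__(self, feat_dim: int = 1024, path_points: int = 50):
        super().__init__()
        self.path_points = path_points
        self.head = nn.Sequential(
            nn.Linear(feat_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, path_points * 2),  # (x, y) per predicted path point
        )

    def forward(self, features: torch.Tensor, lane_change_bit: torch.Tensor) -> torch.Tensor:
        x = torch.cat([features, lane_change_bit.unsqueeze(-1)], dim=-1)
        return self.head(x).view(-1, self.path_points, 2)

# Training: the bit comes from label_lane_changes on the logged drive.
# Test time: set the bit to 1.0 when you want the car to change lanes.
```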
01:02:52.420 | - Yeah, but so if you look at the space of lane change,
01:02:54.700 | you know, some percentage, not 100% that we make as humans
01:02:59.220 | is not a pleasant experience,
01:03:01.180 | 'cause we messed some part of it up.
01:03:02.700 | - Yeah.
01:03:03.540 | - It's nerve-wracking to change,
01:03:04.380 | you have to look, you have to see,
01:03:05.700 | you have to accelerate.
01:03:06.980 | - How do we label the ones that are natural and feel good?
01:03:09.980 | You know, that's the,
01:03:11.660 | 'cause that's your ultimate criticism,
01:03:13.420 | the current Navigate on Autopilot just doesn't feel good.
01:03:16.980 | - Well, the current Navigate on Autopilot
01:03:18.500 | is a hand-coded policy written by an engineer in a room
01:03:21.700 | who probably went out and tested it a few times on the 280.
01:03:25.060 | - Probably a more, a better version of that, but yes.
01:03:29.460 | - That's how we would have written it at the company, yeah.
01:03:31.100 | - Yeah, yeah.
01:03:31.940 | - Maybe Tesla, they tested it in--
01:03:33.460 | - That might've been two engineers.
01:03:35.060 | - Two engineers, yeah.
01:03:36.460 | Um, no, but, so if you learn the lane change,
01:03:40.060 | if you learn how to do a lane change from data,
01:03:42.420 | just like you have a label that says lane change
01:03:44.660 | and then you put it in when you want it to do the lane change
01:03:47.980 | it'll automatically do the lane change
01:03:49.620 | that's appropriate for the situation.
01:03:51.580 | Now, to get at the problem of
01:03:54.020 | some humans do bad lane changes,
01:03:55.900 | we haven't worked too much on this problem yet.
01:03:59.900 | It's not that much of a problem in practice.
01:04:03.100 | My theory is that all good drivers
01:04:05.140 | are good in the same way
01:04:06.140 | and all bad drivers are bad in different ways.
01:04:08.440 | And we've seen some data to back this up.
01:04:11.260 | - Well, beautifully put.
01:04:12.380 | So you just basically, if that's true, hypothesis,
01:04:16.540 | then your task is to discover the good drivers.
01:04:19.860 | - The good drivers stand out because they're in one cluster
01:04:23.300 | and the bad drivers are scattered all over the place
01:04:25.140 | and your net learns the cluster.
01:04:27.220 | - Yeah, that's, so you just learn from the good drivers
01:04:30.740 | and they're easy to cluster.
01:04:32.140 | - In fact, we learned from all of them
01:04:33.980 | and the net automatically learns the policy
01:04:35.820 | that's like the majority.
01:04:36.900 | But we'll eventually probably have to filter them out.
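One way to probe the "good drivers cluster, bad drivers scatter" hypothesis on logged maneuvers: featurize each lane change and see where the density concentrates. A sketch only; the features and the use of k-means are my choices, not Comma's method.

```python
from typing import List
import numpy as np
from sklearn.cluster import KMeans

def lane_change_features(lat_offset_m: np.ndarray, hz: float = 20.0) -> np.ndarray:
    """Summarize one lane change: duration, peak lateral speed, peak lateral accel."""
    vel = np.diff(lat_offset_m) * hz
    acc = np.diff(vel) * hz
    return np.array([len(lat_offset_m) / hz, np.max(np.abs(vel)), np.max(np.abs(acc))])

def majority_cluster_mask(maneuvers: List[np.ndarray], k: int = 4) -> np.ndarray:
    """Cluster lane changes and return a mask of the biggest cluster, which under
    the hypothesis above is where the 'good' examples live."""
    X = np.stack([lane_change_features(m) for m in maneuvers])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    return labels == np.bincount(labels).argmax()
```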
01:04:38.540 | - If that theory is true, I hope it's true.
01:04:41.580 | 'Cause the counter theory is there is many clusters,
01:04:46.460 | maybe arbitrarily many clusters of good drivers.
01:04:53.700 | 'Cause if there's one cluster of good drivers,
01:04:55.820 | you can at least discover a set of policies.
01:04:57.580 | You can learn a set of policies,
01:04:58.980 | which would be good universally.
01:05:01.660 | - That would be nice if it's true.
01:05:04.580 | And you're saying that there is some evidence that--
01:05:06.580 | - Let's say lane changes can be clustered
01:05:08.700 | into four clusters.
01:05:10.540 | - There's this finite level--
01:05:12.060 | - I would argue that all four of those are good clusters.
01:05:15.260 | All the things that are random are noise and probably bad.
01:05:18.460 | And which one of the four you pick,
01:05:20.380 | or maybe it's 10 or maybe it's 20.
01:05:21.900 | - You can learn that.
01:05:22.740 | - It's context dependent.
01:05:23.820 | It depends on the scene.
01:05:25.020 | - And the hope is it's not too dependent on the driver.
01:05:31.400 | - Yeah, the hope is that it all washes out.
01:05:34.240 | The hope is that the distribution's not bimodal.
01:05:36.960 | The hope is that it's a nice Gaussian.
01:05:39.080 | - So what advice would you give to Tesla,
01:05:41.640 | how to fix, how to improve Navigate on Autopilot?
01:05:45.000 | That's the lessons that you've learned from Comma.ai.
01:05:48.240 | - The only real advice I would give to Tesla
01:05:50.560 | is please put driver monitoring in your cars.
01:05:52.940 | With respect to improving it--
01:05:54.760 | - But you can't do that anymore.
01:05:55.960 | I said I'd interrupt.
01:05:57.280 | But there's a practical nature
01:05:59.280 | of many of hundreds of thousands of cars being produced
01:06:02.880 | that don't have a good driver facing camera.
01:06:05.800 | - The Model 3 has a selfie cam.
01:06:07.520 | Is it not good enough?
01:06:08.680 | Did they not put IR LEDs in for night?
01:06:10.800 | - That's a good question.
01:06:11.640 | But I do know that it's fisheye
01:06:13.360 | and it's relatively low resolution.
01:06:15.820 | So it's really not designed, it wasn't--
01:06:17.520 | - It wasn't designed for driver monitoring.
01:06:18.760 | - You can hope that you can kind of scrape up
01:06:21.760 | and have something from it.
01:06:24.200 | - Yeah.
01:06:25.040 | - But why didn't they put it in today?
01:06:27.520 | Put it in today.
01:06:28.360 | - Put it in today.
01:06:29.560 | - Every time I've heard Karpathy talk about the problem
01:06:31.560 | and talking about like software 2.0
01:06:33.280 | and how the machine learning is gobbling up everything,
01:06:35.240 | I think this is absolutely the right strategy.
01:06:37.440 | I think that he didn't write Navigate on Autopilot.
01:06:40.200 | I think somebody else did
01:06:41.600 | and kind of hacked it on top of that stuff.
01:06:43.280 | I think when Karpathy says, wait a second,
01:06:45.760 | why did we hand code this lane change policy
01:06:47.460 | with all these magic numbers?
01:06:48.380 | We're gonna learn it from data.
01:06:49.400 | They'll fix it.
01:06:50.240 | They already know what to do there.
01:06:51.080 | - Well, that's Andrej's job, to turn everything
01:06:54.400 | into a learning problem and collect a huge amount of data.
01:06:57.520 | The reality is though, not every problem
01:07:01.160 | can be turned into a learning problem in the short term.
01:07:04.140 | In the end, everything will be a learning problem.
01:07:07.320 | The reality is, like if you wanna build L5 vehicles today,
01:07:12.320 | it will likely involve no learning.
01:07:15.480 | And that's the reality is,
01:07:17.480 | so at which point does learning start?
01:07:20.400 | It's the crutch statement that LIDAR is a crutch.
01:07:23.520 | At which point will learning get up to par
01:07:25.880 | of human performance?
01:07:27.280 | It's over human performance on ImageNet classification.
01:07:29.960 | On driving, that's still the question.
01:07:34.080 | - It is a question.
01:07:35.840 | I'll say this, I'm here to play for 10 years.
01:07:39.280 | I'm not here to try to,
01:07:40.360 | I'm here to play for 10 years and make money along the way.
01:07:43.040 | I'm not here to try to promise people
01:07:45.120 | that I'm gonna have my L5 taxi network up
01:07:47.220 | and working in two years.
01:07:48.320 | - Do you think that was a mistake?
01:07:49.520 | - Yes.
01:07:50.600 | - What do you think was the motivation behind saying that?
01:07:53.560 | Other companies are also promising L5 vehicles
01:07:56.720 | with very different approaches in 2020, 2021, 2022.
01:08:01.720 | - If anybody would like to bet me
01:08:03.720 | that those things do not pan out, I will bet you.
01:08:06.920 | Even money, even money, I'll bet you as much as you want.
01:08:09.820 | - Yeah.
01:08:10.800 | So are you worried about what's going to happen?
01:08:13.640 | 'Cause you're not in full agreement on that.
01:08:16.160 | What's going to happen when 2022, '21 come around
01:08:19.200 | and nobody has fleets of autonomous vehicles?
01:08:22.920 | - Well, you can look at the history.
01:08:25.080 | If you go back five years ago,
01:08:26.760 | they were all promised by 2018 and 2017.
01:08:29.960 | - But they weren't that strong of promises.
01:08:32.240 | I mean, Ford really declared pretty,
01:08:36.120 | I think not many have declared as like definitively
01:08:40.640 | as they have now these dates.
01:08:42.640 | - Well, okay, so let's separate L4 and L5.
01:08:45.080 | Do I think that it's possible for Waymo to continue
01:08:47.400 | to kind of like hack on their system
01:08:51.000 | until it gets to level four in Chandler, Arizona?
01:08:55.160 | - No safety driver?
01:08:56.880 | - Chandler, Arizona, yeah.
01:08:58.180 | - But by, sorry, which year are we talking about?
01:09:02.520 | - Oh, I even think that's possible by like 2020, 2021.
01:09:06.200 | But level four, Chandler, Arizona,
01:09:08.480 | not level five, New York City.
01:09:10.340 | - Level four, meaning some very defined streets
01:09:16.000 | it works on really well.
01:09:17.480 | - Very defined streets.
01:09:18.320 | And then practically these streets are pretty empty.
01:09:20.720 | If most of the streets are covered in Waymos,
01:09:24.680 | Waymo can kind of change the definition
01:09:26.480 | of what driving is, right?
01:09:28.920 | If your self-driving network is the majority of cars
01:09:32.120 | in an area, they only need to be safe
01:09:34.640 | with respect to each other and all the humans
01:09:36.460 | will need to learn to adapt to them.
01:09:38.640 | Now go drive in downtown New York.
01:09:41.120 | - Oh yeah, that's.
01:09:42.200 | - I mean, already you can talk about autonomy
01:09:44.760 | and like on farms it already works great
01:09:47.120 | because you can really just follow the GPS line.
01:09:50.560 | - So what does success look like for Comma.ai?
01:09:55.560 | What are the milestones like where you can sit back
01:09:59.040 | with some champagne and say, "We did it, boys and girls."
01:10:02.940 | - Well, it's never over.
01:10:06.120 | - Yeah, but you must drink champagne every year.
01:10:09.720 | - Sure.
01:10:10.560 | - So what is a good, what are some wins?
01:10:13.160 | - A big milestone that we're hoping for by mid next year
01:10:19.480 | is profitability of the company.
01:10:21.120 | And we're gonna have to revisit the idea
01:10:27.680 | of selling a consumer product,
01:10:30.320 | but it's not gonna be like the Comma One.
01:10:32.760 | When we do it, it's gonna be perfect.
01:10:35.320 | OpenPilot has gotten so much better in the last two years.
01:10:39.640 | We're gonna have a few features.
01:10:41.680 | We're gonna have 100% driver monitoring.
01:10:43.780 | We're gonna disable no safety features in the car.
01:10:47.080 | Actually, I think it'd be really cool
01:10:48.240 | what we're doing right now.
01:10:49.120 | Our project this week is we're analyzing the data set
01:10:51.600 | and looking for all the AEB triggers
01:10:53.200 | from the manufacturer systems.
01:10:54.700 | We have a better data set on that than the manufacturers.
01:10:59.400 | How much, just how many, does Toyota have 10 million miles
01:11:02.380 | of real-world driving to know how many times
01:11:04.120 | their AEB triggered?
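In spirit, that analysis is just a scan over fleet logs for the moments where the stock system's AEB fired. A rough sketch with made-up record fields:

```python
from collections import Counter
from typing import Dict, Iterable

def count_aeb_triggers(log_records: Iterable[Dict]) -> Counter:
    """Count automatic-emergency-braking events per car platform.

    Each record is assumed to look like:
        {"platform": "TOYOTA_COROLLA", "route": "...", "aeb_active": True}
    One event is counted per route so a long braking episode isn't double-counted.
    """
    seen = set()
    counts: Counter = Counter()
    for rec in log_records:
        key = (rec["platform"], rec["route"])
        if rec.get("aeb_active") and key not in seen:
            seen.add(key)
            counts[rec["platform"]] += 1
    return counts
```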
01:11:05.320 | - So let me give you, 'cause you asked, right,
01:11:08.400 | financial advice.
01:11:09.560 | - Yeah.
01:11:10.880 | - 'Cause I work with a lot of automakers
01:11:12.440 | and one possible source of money for you,
01:11:15.800 | which I'll be excited to see you take on,
01:11:18.020 | is basically selling the data,
01:11:23.020 | which is something that most people,
01:11:29.080 | and not selling in a way where here at Automaker,
01:11:31.800 | but creating, we've done this actually at MIT,
01:11:34.360 | not for money purposes,
01:11:35.480 | but you could do it for significant money purposes
01:11:37.760 | and make the world a better place by creating a consortia
01:11:41.360 | where automakers would pay in
01:11:44.240 | and then they get to have free access to the data.
01:11:46.960 | And I think a lot of people are really hungry for that
01:11:51.460 | and would pay a significant amount of money for it.
01:11:54.200 | - Here's the problem with that.
01:11:55.400 | I like this idea all in theory.
01:11:56.880 | It'd be very easy for me to give them access to my servers
01:11:59.660 | and we already have all open source tools
01:12:01.480 | to access this data.
01:12:02.320 | It's in a great format.
01:12:03.420 | We have a great pipeline,
01:12:04.720 | but they're gonna put me in the room
01:12:07.120 | with some business development guy
01:12:08.820 | and I'm gonna have to talk to this guy
01:12:12.560 | and he's not gonna know most of the words I'm saying.
01:12:15.200 | I'm not willing to tolerate that.
01:12:17.400 | - Okay, Mick Jagger.
01:12:19.000 | - No, no, no, no, no.
01:12:19.920 | - I think I agree with you.
01:12:21.120 | I'm the same way, but you just tell them the terms
01:12:23.040 | and there's no discussion needed.
01:12:24.720 | - If I could just tell them the terms,
01:12:27.900 | all right, who wants access to my data?
01:12:31.720 | I will sell it to you for, let's say,
01:12:36.720 | you want a subscription?
01:12:37.720 | I'll sell it to you for 100K a month.
01:12:39.560 | Anyone?
01:12:41.640 | - 100K a month?
01:12:42.480 | - 100K a month.
01:12:43.300 | I'll give you access to this data subscription.
01:12:45.160 | - Yeah.
01:12:46.000 | - Yeah, I think that's kind of fair.
01:12:46.820 | Came up with that number off the top of my head.
01:12:48.080 | If somebody sends me a three-line email
01:12:50.200 | where it's like, we would like to pay 100K a month
01:12:52.620 | to get access to your data,
01:12:54.040 | we would agree to reasonable privacy terms
01:12:56.200 | of the people who are in the dataset,
01:12:58.360 | I would be happy to do it,
01:12:59.560 | but that's not gonna be the email.
01:13:01.220 | The email is gonna be, hey, do you have some time
01:13:03.880 | in the next month where we can sit down and we can,
01:13:06.020 | I don't have time for that.
01:13:06.920 | We're moving too fast.
01:13:07.880 | - Yeah, you could politely respond to that email
01:13:10.080 | by not saying, I don't have any time for your bullshit.
01:13:13.280 | You say, oh, well, unfortunately, these are the terms,
01:13:15.480 | and so this is, we tried to,
01:13:17.720 | we brought the cost down for you
01:13:19.840 | in order to minimize the friction, the communication.
01:13:22.240 | - Yeah, absolutely.
01:13:23.080 | - Here's the, whatever it is,
01:13:24.520 | one, two million dollars a year, and you have access.
01:13:27.760 | - And it's not like I get that email from,
01:13:31.480 | but okay, am I gonna reach out?
01:13:32.720 | Am I gonna hire a business development person
01:13:34.200 | who's gonna reach out to the automakers?
01:13:35.920 | No way.
01:13:36.760 | - Yeah, okay, I got you.
01:13:37.920 | I admire.
01:13:38.760 | - If they reached into me, I'm not gonna ignore the email.
01:13:40.640 | I'll come back with something.
01:13:41.640 | - For sure.
01:13:42.480 | - I'm willing to pay 100K a month for access to the data.
01:13:44.520 | I'm happy to set that up.
01:13:46.120 | That's worth my engineering time.
01:13:48.240 | - That's actually quite insightful of you.
01:13:49.560 | You're right.
01:13:50.480 | Probably because many of the automakers
01:13:52.520 | are quite a bit old school,
01:13:54.200 | there will be a need to reach out,
01:13:56.280 | and they want it, but there'll need to be some communication.
01:13:59.800 | You're right.
01:14:00.640 | - Mobileye circa 2015 had the lowest R&D spend
01:14:04.520 | of any chipmaker, like per,
01:14:08.360 | and you look at all the people who work for them,
01:14:10.640 | and it's all business development people
01:14:12.120 | because the car companies are impossible to work with.
01:14:15.360 | - Yeah, so you have no patience for that,
01:14:17.880 | and you're a legit Android, huh?
01:14:20.040 | - I have something to do, right?
01:14:21.640 | It's not like I don't mean to be a dick
01:14:23.760 | and say I don't have patience for that,
01:14:25.100 | but it's like that stuff doesn't help us
01:14:28.280 | with our goal of winning self-driving cars.
01:14:30.560 | If I want money in the short term,
01:14:32.260 | if I showed off the actual learning tech that we have,
01:14:38.080 | it's somewhat sad.
01:14:40.140 | It's years and years ahead of everybody else's.
01:14:43.020 | Maybe not Tesla's.
01:14:43.860 | I think Tesla has similar stuff to us, actually.
01:14:45.740 | I think Tesla has similar stuff,
01:14:46.780 | but when you compare it
01:14:47.620 | to what the Toyota Research Institute has,
01:14:49.720 | you're not even close to what we have.
01:14:53.500 | - No comment, but I also can't,
01:14:55.860 | I have to take your comments,
01:14:58.480 | I intuitively believe you,
01:15:01.660 | but I have to take it with a grain of salt
01:15:03.220 | because you are an inspiration
01:15:06.220 | because you basically don't care about a lot of things
01:15:09.020 | that other companies care about.
01:15:10.860 | You don't try to bullshit, in a sense,
01:15:15.540 | like make up stuff to drive up valuation.
01:15:18.620 | You're really very real
01:15:19.740 | and you're trying to solve the problem.
01:15:20.900 | I admire that a lot.
01:15:22.300 | What I don't necessarily fully can't trust you on,
01:15:25.980 | with all due respect, is how good it is, right?
01:15:28.420 | I can only, but I also know how bad others are.
01:15:32.260 | - I'll say two things about,
01:15:35.580 | trust but verify, right?
01:15:36.700 | I'll say two things about that.
01:15:38.060 | One is try, get in a 2020 Corolla
01:15:42.340 | and try OpenPilot 0.6 when it comes out next month.
01:15:45.420 | I think already, you'll look at this
01:15:48.420 | and you'll be like, this is already really good.
01:15:51.220 | And then I could be doing that all with hand labelers
01:15:54.260 | and all with like the same approach that like Mobileye uses.
01:15:57.980 | When we release a model that no longer has the lanes in it,
01:16:01.460 | that only outputs a path,
01:16:04.980 | then think about how we did that machine learning.
01:16:08.700 | And then right away when you see,
01:16:10.100 | and that's going to be in OpenPilot.
01:16:11.260 | That's going to be in OpenPilot before 1.0.
01:16:13.020 | When you see that model,
01:16:14.100 | you'll know that everything I'm saying is true
01:16:15.380 | 'cause how else did I get that model?
01:16:16.820 | - Good.
01:16:17.660 | - You know what I'm saying is true about the simulator.
01:16:19.220 | - Yeah, yeah, yeah.
01:16:20.060 | This is super exciting.
01:16:20.880 | That's super exciting.
01:16:21.720 | And--
01:16:22.700 | - But like, I listened to your talk with Kyle
01:16:25.740 | and Kyle was originally building the aftermarket system
01:16:30.460 | and he gave up on it because of technical challenges.
01:16:34.300 | - Yeah.
01:16:35.140 | - Because of the fact that he's going to have to support
01:16:38.140 | 20 to 50 cars, we support 45,
01:16:40.520 | because what is he going to do
01:16:41.500 | when the manufacturer ABS system triggers?
01:16:43.460 | We have alerts and warnings to deal with all of that
01:16:45.500 | and all the cars.
01:16:46.580 | And how is he going to formally verify it?
01:16:48.460 | Well, I got 10 million miles of data.
01:16:49.860 | It's probably better verified than the spec.
01:16:52.340 | - Yeah, I'm glad you're here talking to me.
01:16:56.980 | This is, I'll remember this day.
01:17:00.260 | 'Cause it's interesting.
01:17:01.100 | If you look at Kyle's from Cruise,
01:17:04.140 | I'm sure they have a large number
01:17:05.300 | of business development folks.
01:17:07.380 | And you work with, he's working with GM.
01:17:10.180 | You could work with Argo AI, working with Ford.
01:17:13.220 | It's interesting because chances that you fail,
01:17:17.580 | business-wise, like bankrupt, are pretty high.
01:17:20.180 | - Yeah.
01:17:21.020 | - And yet, it's the Android model.
01:17:23.860 | Is you're actually taking on the problem.
01:17:26.340 | So that's really inspiring.
01:17:27.500 | I mean--
01:17:28.340 | - Well, I have a long-term way for Comma to make money too.
01:17:30.900 | - And one of the nice things
01:17:32.180 | when you really take on the problem,
01:17:34.400 | which is my hope for Autopilot, for example,
01:17:36.740 | is things you don't expect, ways to make money
01:17:41.020 | or create value that you don't expect will pop up.
01:17:43.980 | - Oh, I've known how to do it since kind of,
01:17:46.620 | 2017 is the first time I said it.
01:17:48.500 | - Well, which part to know how to do which part?
01:17:50.460 | - Our long-term plan is to be a car insurance company.
01:17:52.500 | - Insurance, yeah, I love it.
01:17:53.940 | Yep, yep.
01:17:54.780 | - Why, I make driving twice as safe.
01:17:56.620 | Not only that, I have the best data set
01:17:57.740 | to know who statistically is the safest drivers.
01:17:59.820 | And oh, oh, we see you, we see you driving unsafely,
01:18:03.700 | we're not gonna insure you.
01:18:05.300 | And that causes a bifurcation in the market
01:18:08.940 | because the only people who can't get Comma insurance
01:18:10.860 | are the bad drivers.
01:18:11.700 | Geico can insure them, their premiums are crazy high,
01:18:13.860 | our premiums are crazy low.
01:18:15.300 | We'll win car insurance, take over that whole market.
01:18:18.060 | - Okay, so--
01:18:19.940 | - If we win, if we win.
01:18:21.220 | But that's what I'm saying, how do you turn Comma
01:18:22.940 | into a $10 billion company, it's that.
01:18:24.620 | - That's right.
01:18:25.580 | So you, Elon Musk, who else?
01:18:29.940 | Who else is thinking like this and working like this
01:18:32.660 | in your view?
01:18:33.500 | Who are the competitors?
01:18:34.740 | Are there people seriously, I don't think anyone
01:18:37.540 | that I'm aware of is seriously taking on lane keeping,
01:18:42.140 | you know, like to where it's a huge business
01:18:45.060 | that turns eventually into full autonomy
01:18:47.140 | that then creates, yeah, like that creates other businesses
01:18:51.980 | on top of it and so on, thinks insurance,
01:18:54.620 | thinks all kinds of ideas like that.
01:18:56.420 | Do you know anyone else thinking like this?
01:18:59.380 | - Not really.
01:19:02.100 | - That's interesting.
01:19:02.940 | I mean, my sense is everybody turns to that
01:19:05.340 | in like four or five years.
01:19:07.740 | Like Ford, once the autonomy doesn't fall through.
01:19:10.340 | - Yeah.
01:19:11.260 | - But at this time--
01:19:12.580 | - Elon's the iOS.
01:19:14.100 | By the way, he paved the way for all of us.
01:19:16.660 | - Right, it's the iOS, true.
01:19:17.980 | - I would not be doing Comma.ai today
01:19:20.860 | if it was not for those conversations with Elon
01:19:23.460 | and if it were not for him saying like,
01:19:27.100 | I think he said like, well, obviously,
01:19:28.380 | we're not gonna use LIDAR, we use cameras,
01:19:29.700 | humans use cameras.
01:19:31.260 | - So what do you think about that?
01:19:32.260 | How important is LIDAR?
01:19:33.860 | Everybody else on L5 is using LIDAR.
01:19:36.060 | What are your thoughts on his provocative statement
01:19:39.180 | that LIDAR is a crutch?
01:19:41.340 | - See, sometimes he'll say dumb things
01:19:43.060 | like the driver monitoring thing,
01:19:44.020 | but sometimes he'll say absolutely, completely,
01:19:46.220 | 100% obviously true things.
01:19:48.380 | Of course LIDAR is a crutch.
01:19:49.820 | It's not even a good crutch.
01:19:52.180 | You're not even using it,
01:19:53.860 | oh, they're using it for localization.
01:19:56.220 | - Yeah.
01:19:57.060 | - Which isn't good in the first place.
01:19:58.140 | If you have to localize your car to centimeters
01:20:00.460 | in order to drive, like that's not driving.
01:20:04.260 | - Currently not doing much machine learning
01:20:06.260 | on top of LIDAR data, meaning like to help you
01:20:09.220 | in the task of, general task of perception.
01:20:12.820 | - The main goal of those LIDARs on those cars,
01:20:15.300 | I think is actually localization more than perception.
01:20:18.820 | Or at least that's what they use them for.
01:20:20.060 | - Yeah, that's true.
01:20:20.900 | - If you wanna localize to centimeters, you can't use GPS.
01:20:23.700 | The fanciest GPS in the world can't do it,
01:20:25.140 | especially if you're under tree cover and stuff.
01:20:26.900 | With LIDAR you can do this pretty easily.
01:20:28.420 | - So you really, they're not taking on,
01:20:30.220 | I mean in some research they're using it for perception,
01:20:33.780 | and they're certainly not, which is sad,
01:20:35.820 | they're not fusing it well with vision.
01:20:38.660 | - They do use it for perception.
01:20:40.500 | I'm not saying they don't use it for perception,
01:20:42.380 | but the thing that, they have vision-based
01:20:45.460 | and radar-based perception systems as well.
01:20:47.660 | You could remove the LIDAR and keep around
01:20:51.380 | a lot of the dynamic object perception.
01:20:54.020 | You wanna get centimeter accurate localization,
01:20:56.300 | good luck doing that with anything else.
01:20:59.100 | - So what should Cruise, Waymo do?
01:21:03.180 | What would be your advice to them now?
01:21:05.340 | I mean Waymo's actually, they're serious.
01:21:11.380 | Waymo, out of all of them, are quite
01:21:14.140 | serious about the long game.
01:21:16.300 | If L5 is a long way off, requires 50 years,
01:21:20.820 | I think Waymo will be the only one left standing at the end
01:21:24.180 | with the, given the financial backing that they have.
01:21:26.780 | - The book of Google bucks.
01:21:28.820 | I'll say nice things about both Waymo and Cruz.
01:21:31.140 | - Let's do it, nice is good.
01:21:34.460 | - Waymo is by far the furthest along with technology.
01:21:39.380 | Waymo has a three to five year lead on all the competitors.
01:21:42.940 | If the Waymo-looking stack works, maybe three year lead.
01:21:49.020 | If the Waymo-looking stack works,
01:21:51.340 | they have a three year lead.
01:21:52.860 | Now I argue that Waymo has spent too much money
01:21:55.860 | to recapitalize, to gain back their losses
01:21:59.300 | in those three years.
01:22:00.220 | Also, self-driving cars have no network effect like that.
01:22:03.660 | Uber has a network effect.
01:22:04.860 | You have a market, you have drivers and you have riders.
01:22:07.180 | Self-driving cars, you have capital and you have riders.
01:22:09.940 | There's no network effect.
01:22:11.460 | If I wanna blanket a new city in self-driving cars,
01:22:13.860 | I buy the off the shelf Chinese knockoff self-driving cars
01:22:16.060 | and I buy enough of them for the city.
01:22:17.220 | I can't do that with drivers.
01:22:18.380 | And that's why Uber has a first mover advantage
01:22:20.900 | that no self-driving car company will.
01:22:22.860 | Can you disentangle that a little bit?
01:22:26.580 | Uber, you're not talking about Uber,
01:22:28.180 | the autonomous vehicle Uber.
01:22:29.860 | You're talking about the Uber car.
01:22:30.680 | Okay. Yeah.
01:22:31.620 | I'm Uber.
01:22:32.460 | I open for business in Austin, Texas, let's say.
01:22:35.980 | I need to attract both sides of the market.
01:22:38.820 | I need to both get drivers on my platform
01:22:41.260 | and riders on my platform.
01:22:42.820 | And I need to keep them both sufficiently happy, right?
01:22:45.380 | Riders aren't gonna use it
01:22:46.580 | if it takes more than five minutes for an Uber to show up.
01:22:49.020 | Drivers aren't gonna use it
01:22:50.220 | if they have to sit around all day and there's no riders.
01:22:52.220 | So you have to carefully balance a market.
01:22:54.420 | And whenever you have to carefully balance a market,
01:22:56.340 | there's a great first mover advantage
01:22:58.380 | because there's a switching cost for everybody, right?
01:23:01.100 | The drivers and the riders
01:23:02.200 | would have to switch at the same time.
01:23:04.180 | Let's even say that, let's say a Luber shows up
01:23:08.980 | and Luber somehow agrees to do things at a bigger,
01:23:13.780 | we're just gonna, we've done it more efficiently, right?
01:23:17.520 | Luber only takes 5% of a cut
01:23:19.880 | instead of the 10% that Uber takes.
01:23:21.680 | No one is gonna switch
01:23:22.840 | because the switching cost is higher than that 5%.
01:23:25.000 | So you actually can, in markets like that,
01:23:27.280 | you have a first mover advantage.
01:23:28.980 | Autonomous vehicles of the level five variety
01:23:32.800 | have no first mover advantage.
01:23:34.640 | If the technology becomes commoditized,
01:23:36.840 | say I wanna go to a new city, look at the scooters.
01:23:39.600 | It's gonna look a lot more like scooters.
01:23:41.560 | Every person with a checkbook
01:23:44.100 | can blanket a city in scooters.
01:23:45.800 | And that's why you have 10 different scooter companies.
01:23:48.000 | Which one's gonna win?
01:23:48.840 | It's a race to the bottom.
01:23:49.720 | It's a terrible market to be in
01:23:51.160 | 'cause there's no market for scooters.
01:23:53.280 | - And-- - 'Cause the scooters
01:23:56.060 | don't get a say in whether they wanna be bought
01:23:57.600 | and deployed to a city or not.
01:23:58.520 | - Right, so the, yeah.
01:24:00.120 | - We're gonna entice the scooters with subsidies and deals.
01:24:03.080 | - So whenever you have to invest that capital,
01:24:05.960 | it doesn't-- - It doesn't come back.
01:24:07.880 | - Yeah.
01:24:08.720 | That can't be your main criticism of the Waymo approach.
01:24:12.480 | - Oh, I'm saying even if it does technically work.
01:24:15.000 | Even if it does technically work, that's a problem.
01:24:17.200 | - Yeah.
01:24:18.160 | - I don't know.
01:24:19.240 | If I were to say, I would say,
01:24:21.900 | you're already there.
01:24:23.800 | I haven't even thought about that.
01:24:24.720 | But I would say the bigger challenge
01:24:26.680 | is the technical approach.
01:24:28.020 | - So Waymo's cruise is--
01:24:31.960 | - And not just the technical approach,
01:24:33.080 | but of creating value.
01:24:34.880 | I still don't understand how you beat Uber,
01:24:40.820 | the human-driven cars, in terms of financially.
01:24:44.980 | It doesn't make sense to me that people
01:24:48.340 | would wanna get in an autonomous vehicle.
01:24:50.140 | I don't understand how you make money.
01:24:52.860 | In the long-term, yes, like real long-term.
01:24:56.500 | But it just feels like there's too much
01:24:58.660 | capital investment needed.
01:25:00.020 | - Oh, and they're gonna be worse than Uber's
01:25:01.220 | because they're gonna stop for every little thing everywhere.
01:25:04.800 | I'll say a nice thing about Cruise.
01:25:07.380 | That was my nice thing about Waymo,
01:25:08.420 | the three years ahead.
01:25:09.260 | - What was the nice, oh, 'cause there are three--
01:25:10.740 | - Three years technically ahead of everybody.
01:25:12.460 | Their tech stack is great.
01:25:13.940 | My nice thing about Cruise is GM buying them
01:25:17.900 | was a great move for GM.
01:25:19.140 | For $1 billion, GM bought an insurance policy against Waymo.
01:25:25.620 | Say Cruise is three years behind Waymo.
01:25:30.000 | That means Google will get a monopoly on the technology
01:25:33.300 | for at most three years.
01:25:35.160 | - And if technology works,
01:25:39.060 | you might not even be right about the three years.
01:25:40.820 | It might be less.
01:25:41.860 | - Might be less.
01:25:42.700 | Cruise actually might not be that far behind.
01:25:44.300 | I don't know how much Waymo has waffled around
01:25:47.340 | or how much of it actually is just that long tail.
01:25:49.780 | - Yeah, okay.
01:25:50.620 | If that's the best you could say in terms of nice things,
01:25:53.580 | that's more of a nice thing for GM
01:25:55.200 | that that's a smart insurance policy.
01:25:58.580 | - It's a smart insurance policy.
01:25:59.660 | I mean, I think that's how,
01:26:01.860 | I can't see Cruise working out any other way.
01:26:05.220 | For cruise to leapfrog Waymo would really surprise me.
01:26:08.520 | - Yeah, so let's talk about the underlying assumptions
01:26:13.000 | of everything.
01:26:13.840 | - We're not gonna leapfrog Tesla.
01:26:15.320 | Tesla would have to seriously mess up for us.
01:26:19.520 | - Because you're, okay.
01:26:20.800 | So the way you leapfrog, right,
01:26:23.240 | is you come up with an idea
01:26:26.080 | or you take a direction, perhaps secretly,
01:26:28.560 | that the other people aren't taking.
01:26:30.660 | And so Cruise, Waymo, even Aurora,
01:26:37.520 | - I don't know Aurora. Zoox is the same stack as well.
01:26:40.080 | They're all the same code base even.
01:26:41.760 | They're all the same DARPA Urban Challenge code base.
01:26:44.460 | - So the question is,
01:26:46.820 | do you think there's a room for brilliance and innovation
01:26:48.960 | that will change everything?
01:26:50.360 | Like say, okay, so I'll give you examples.
01:26:53.900 | It could be a revolution in mapping, for example,
01:26:59.640 | that allows you to map things,
01:27:03.040 | do HD maps of the whole world,
01:27:05.840 | all weather conditions, somehow really well,
01:27:08.080 | or a revolution in simulation,
01:27:13.080 | to where everything you said before becomes incorrect.
01:27:18.820 | That kind of thing.
01:27:21.300 | Any room for breakthrough innovation?
01:27:23.180 | - What I said before about,
01:27:25.960 | oh, they actually get the whole thing.
01:27:28.280 | I'll say this about,
01:27:30.520 | we divide driving into three problems
01:27:32.640 | and I actually haven't solved the third yet,
01:27:33.800 | but I have an idea how to do it.
01:27:34.800 | So there's the static.
01:27:36.100 | The static driving problem is assuming
01:27:38.000 | you are the only car on the road.
01:27:40.120 | And this problem can be solved 100%
01:27:41.960 | with mapping and localization.
01:27:43.980 | This is why farms work the way they do.
01:27:45.720 | If all you have to deal with is the static problem
01:27:48.420 | and you can statically schedule your machines,
01:27:50.160 | it's the same as like statically scheduling processes.
01:27:52.680 | You can statically schedule your tractors
01:27:54.040 | to never hit each other on their paths,
01:27:56.120 | 'cause you know the speed they go at.
01:27:57.480 | So that's the static driving problem.
01:28:00.160 | Maps only helps you with the static driving problem.
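
A minimal sketch of the static-scheduling idea described here; the grid cells, paths, speeds, and function names are hypothetical, but the point is that when every vehicle's path and speed are known in advance, collision-freedom can be checked (or planned) entirely offline:

```python
# Sketch only: statically scheduled vehicles on precomputed paths.
# All names and the grid/path representation are made up for illustration.

def position_at(path, speed, t):
    """Index into a precomputed path at tick t, given cells-per-tick speed."""
    i = min(int(t * speed), len(path) - 1)
    return path[i]

def statically_safe(paths, speeds, horizon):
    """True if no two vehicles ever occupy the same cell at the same tick."""
    for t in range(horizon):
        occupied = set()
        for path, speed in zip(paths, speeds):
            cell = position_at(path, speed, t)
            if cell in occupied:
                return False  # two statically scheduled vehicles would collide
            occupied.add(cell)
    return True

# Two tractors on fixed rows of a field, verified entirely ahead of time.
tractor_a = [(0, y) for y in range(10)]
tractor_b = [(1, y) for y in range(10)]
print(statically_safe([tractor_a, tractor_b], speeds=[1, 1], horizon=10))  # True
```
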
01:28:03.920 | - Yeah, the question about static driving.
01:28:06.960 | You've just made it sound like it's really easy.
01:28:08.800 | - Static driving is really easy.
01:28:10.280 | - How easy?
01:28:13.560 | Well, 'cause the whole drifting out of lane.
01:28:16.480 | When Tesla drifts out of lane,
01:28:18.760 | it's failing on the fundamental static driving problem.
01:28:22.000 | - Tesla is drifting out of lane?
01:28:24.480 | The static driving problem is not easy for the world.
01:28:27.740 | The static driving problem is easy for one route.
01:28:31.840 | - One route and one weather condition
01:28:33.960 | with one state of lane markings
01:28:37.960 | and like no deterioration, no cracks in the road.
01:28:40.960 | - I'm assuming you have a perfect localizer.
01:28:42.640 | So that solves for the weather condition
01:28:44.240 | and the lane marking condition.
01:28:45.640 | - But that's the problem is how do you have a perfect--
01:28:47.760 | - You can build, perfect localizers
01:28:49.200 | are not that hard to build.
01:28:50.600 | - Okay, come on now.
01:28:51.440 | With LIDAR.
01:28:53.400 | - With LIDAR, yeah.
01:28:54.240 | - Oh, with LIDAR, okay.
01:28:55.080 | - With LIDAR, yeah, but you use LIDAR, right?
01:28:56.440 | Like use LIDAR to build a perfect localizer.
01:28:58.660 | Building a perfect localizer without LIDAR,
01:29:01.920 | (sighs)
01:29:03.000 | it's gonna be hard.
01:29:04.320 | You can get 10 centimeters without LIDAR,
01:29:05.760 | you can get one centimeter with LIDAR.
01:29:07.240 | - I'm not even concerned about the one or 10 centimeters,
01:29:09.280 | I'm concerned if every once in a while you're just way off.
01:29:12.680 | - Yeah, so this is why you have to
01:29:15.960 | carefully make sure you're always tracking your position.
01:29:20.000 | You wanna use LIDAR camera fusion,
01:29:21.720 | but you can get the reliability of that system
01:29:24.440 | up to 100,000 miles
01:29:28.000 | and then you write some fallback condition
01:29:29.680 | where it's not that bad if you're way off, right?
01:29:32.120 | I think that you can get it to the point
01:29:33.720 | where you're never in a case
01:29:36.760 | where you're way off and you don't know it.
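
A rough sketch of that "never be way off without knowing it" idea: cross-check the map-based localizer against an independent dead-reckoning estimate and degrade gracefully when they disagree. The threshold, poses, and function names below are assumptions, purely for illustration:

```python
import math

DIVERGENCE_LIMIT_M = 0.5   # assumed: beyond this, trust neither estimate

def localization_ok(map_pose, odom_pose):
    """Compare the map-based pose against an independent odometry pose."""
    dx = map_pose[0] - odom_pose[0]
    dy = map_pose[1] - odom_pose[1]
    return math.hypot(dx, dy) < DIVERGENCE_LIMIT_M

def plan_step(map_pose, odom_pose, nominal_plan, fallback_plan):
    # If the two independent estimates agree, follow the map-based plan;
    # otherwise fall back instead of acting on a possibly-bad pose.
    if localization_ok(map_pose, odom_pose):
        return nominal_plan
    return fallback_plan  # e.g. slow down, hand control back to the driver

print(plan_step((10.0, 5.0), (10.1, 5.05), "follow_route", "slow_and_alert"))
print(plan_step((10.0, 5.0), (14.0, 9.0), "follow_route", "slow_and_alert"))
```
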
01:29:38.440 | - Yeah, okay, so this is brilliant.
01:29:40.180 | So that's the static.
01:29:41.120 | - Static.
01:29:42.240 | - We can, especially with LIDAR and good HD maps,
01:29:45.920 | you can solve that problem.
01:29:47.000 | - Easy.
01:29:47.840 | - No, I just disagree with your word easy.
01:29:50.440 | - The static problem's so easy.
01:29:51.880 | - Very typical for you to say something's easy, I got it.
01:29:54.440 | It's not as challenging as the other ones, okay.
01:29:56.880 | - Well, it's, okay, maybe it's obvious how to solve it.
01:29:58.760 | The third one's the hardest.
01:29:59.840 | So where were we?
01:30:00.680 | A lot of people don't even think about the third one,
01:30:01.920 | or even see it as different from the second one.
01:30:03.660 | So the second one is dynamic.
01:30:05.760 | The second one is like, say there's an obvious example,
01:30:08.600 | it's like a car stopped at a red light, right?
01:30:10.400 | You can't have that car in your map
01:30:12.560 | because you don't know whether that car
01:30:13.760 | is gonna be there or not.
01:30:14.920 | So you have to detect that car in real time
01:30:18.000 | and then you have to do the appropriate action, right?
01:30:20.700 | Also, that car is not a fixed object, that car may move
01:30:25.740 | and you have to predict what that car will do, right?
01:30:28.740 | So this is the dynamic problem.
01:30:30.880 | - Yeah.
01:30:31.720 | - So you have to deal with this.
01:30:32.840 | This involves, again, like you're gonna need models
01:30:36.680 | of other people's behavior.
01:30:38.040 | - Are you including in that,
01:30:40.200 | I don't wanna step on the third one.
01:30:42.440 | Are you including in that your influence on people?
01:30:46.960 | - Ah, that's the third one.
01:30:48.240 | - Okay. - That's the third one.
01:30:49.520 | We call it the counterfactual.
01:30:51.880 | - Yeah, brilliant. - And that.
01:30:53.120 | - I just talked to Judea Pearl
01:30:54.360 | who's obsessed with counterfactuals.
01:30:55.920 | - Oh yeah, yeah, I read his books.
01:30:58.640 | - So the static and the dynamic.
01:31:00.760 | - Yeah.
01:31:01.920 | - Our approach right now for lateral
01:31:04.720 | will scale completely to the static and dynamic.
01:31:07.560 | The counterfactual, the only way I have to do it yet,
01:31:11.320 | the thing that I wanna do once we have all of these cars
01:31:13.960 | is I wanna do reinforcement learning on the world.
01:31:16.760 | I'm always gonna turn the exploit up to max,
01:31:18.860 | I'm not gonna have them explore,
01:31:20.420 | but the only real way to get at the counterfactual
01:31:22.760 | is to do reinforcement learning
01:31:24.080 | because the other agents are humans.
01:31:27.760 | - So that's fascinating that you break it down like that.
01:31:30.080 | I agree completely.
01:31:31.720 | - I've spent my life thinking about this problem.
01:31:33.600 | - It's beautiful.
01:31:34.440 | Part of it 'cause you're slightly insane is that,
01:31:39.080 | because--
01:31:40.020 | - Not my life, just the last four years.
01:31:43.120 | - No, no, you have like,
01:31:44.800 | some non-zero percent of your brain has a madman in it,
01:31:49.920 | which is a really good feature,
01:31:52.380 | but there's a safety component to it
01:31:55.920 | that I think when there's sort of counterfactuals and so on
01:31:59.040 | that would just freak people out.
01:32:00.240 | How do you even start to think about, just in general,
01:32:03.320 | I mean, you've had some friction with NHTSA and so on.
01:32:07.600 | I am frankly exhausted by safety engineers,
01:32:12.600 | the prioritization of safety over innovation
01:32:19.280 | to a degree that, in my view,
01:32:23.720 | kills safety in the long term.
01:32:26.200 | So the counterfactual thing,
01:32:28.200 | just actually exploring this world
01:32:31.560 | of how do you interact with dynamic objects and so on,
01:32:33.920 | how do you think about safety?
01:32:34.840 | - You can do reinforcement learning without ever exploring.
01:32:38.080 | And I said that, so you can think about your,
01:32:40.400 | in reinforcement learning,
01:32:41.520 | it's usually called a temperature parameter,
01:32:44.280 | and your temperature parameter
01:32:45.320 | is how often you deviate from the argmax.
01:32:48.040 | I could always set that to zero and still learn,
01:32:50.680 | and I feel that you'd always want that set to zero
01:32:52.800 | on your actual system.
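
A small sketch of the temperature knob being described: a softmax policy over action scores, where high temperature deviates from the argmax (exploration) and temperature at zero always picks the argmax (pure exploitation). The scores and actions are hypothetical:

```python
import math
import random

def select_action(q_values, temperature):
    """Softmax action selection; temperature ~ 0 reduces to argmax (no exploration)."""
    if temperature <= 1e-8:      # "set to zero": always exploit
        return max(range(len(q_values)), key=lambda i: q_values[i])
    m = max(q_values)            # subtract max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    return random.choices(range(len(q_values)), weights=weights)[0]

scores = [0.2, 1.5, 0.9]         # e.g. scores for {keep lane, slight left, slight right}
print(select_action(scores, temperature=0.0))  # always index 1
print(select_action(scores, temperature=5.0))  # may deviate from the argmax
```
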
01:32:54.040 | - Gotcha, but the problem is you first don't know very much,
01:32:58.120 | and so you're going to make mistakes.
01:32:59.560 | So the learning, the exploration happens through mistakes.
01:33:01.920 | - We're all ready, yeah, but, okay.
01:33:03.740 | So the consequences of a mistake.
01:33:06.080 | OpenPilot and Autopilot are making mistakes left and right.
01:33:09.400 | We have 700 daily active users,
01:33:12.560 | a thousand weekly active users.
01:33:14.080 | OpenPilot makes tens of thousands of mistakes a week.
01:33:18.920 | These mistakes have zero consequences.
01:33:21.160 | These mistakes are, oh, I wanted to take this exit,
01:33:25.640 | and it went straight,
01:33:26.840 | so I'm just going to carefully touch the wheel.
01:33:28.560 | - The humans catch them.
01:33:29.400 | - The humans catch them.
01:33:30.680 | And the human disengagement
01:33:32.400 | is labeling that reinforcement learning
01:33:34.160 | in a completely consequence-free way.
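
A hypothetical sketch of how disengagements can act as free labels: the seconds leading up to a human takeover get marked as "the plan was wrong here," with the human's subsequent steering as the corrective target. This is not how openpilot actually structures its data; the field names and weights are made up:

```python
LEAD_IN_SECONDS = 3.0   # assumed window before a takeover to mark as a mistake

def label_segments(drive_log):
    """drive_log: list of dicts with 't', 'model_plan', 'human_steer', 'disengaged'."""
    takeover_times = [f['t'] for f in drive_log if f['disengaged']]
    labeled = []
    for frame in drive_log:
        negative = any(0.0 <= t - frame['t'] <= LEAD_IN_SECONDS for t in takeover_times)
        labeled.append({
            'model_plan': frame['model_plan'],
            'target': frame['human_steer'] if negative else frame['model_plan'],
            'weight': 1.0 if negative else 0.1,   # emphasize the mistakes
        })
    return labeled

log = [
    {'t': 0.0, 'model_plan': 0.0, 'human_steer': 0.0, 'disengaged': False},
    {'t': 1.0, 'model_plan': 0.1, 'human_steer': -0.3, 'disengaged': False},
    {'t': 2.0, 'model_plan': 0.2, 'human_steer': -0.4, 'disengaged': True},
]
print(label_segments(log)[1])   # labeled with the human's corrective steering
```
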
01:33:36.200 | - So driver monitoring is the way you ensure they keep--
01:33:39.880 | - Yes. - They keep paying attention.
01:33:42.160 | How's your messaging?
01:33:43.280 | Say I gave you a billion dollars,
01:33:45.280 | you would be scaling, and now--
01:33:47.840 | - Oh, I couldn't scale with any amount of money.
01:33:49.760 | I'd raise money if I could.
01:33:50.640 | If I had a way to scale it.
01:33:51.680 | - Yeah, you're not focused on scaling.
01:33:53.320 | - I don't know how to do, oh,
01:33:54.560 | I guess I could sell it to more people,
01:33:55.800 | but I want to make the system better.
01:33:57.000 | - Better, better. - And I don't know how to--
01:33:58.880 | - But what's the messaging here?
01:34:01.140 | I got a chance to talk to Elon,
01:34:02.600 | and he basically said that the human factor doesn't matter.
01:34:07.600 | The human doesn't matter 'cause the system will perform.
01:34:12.320 | There'll be sort of a, sorry to use the term,
01:34:14.800 | but like a singularity,
01:34:15.640 | like a point where it gets just much better,
01:34:17.840 | and so the human, it won't really matter.
01:34:20.880 | But it seems like that human catching the system
01:34:25.080 | when it gets into trouble is like the thing
01:34:29.440 | which will make something like reinforcement learning work.
01:34:32.800 | So how do you think messaging for Tesla,
01:34:35.680 | for you, for the industry in general should change?
01:34:39.120 | - I think our messaging's pretty clear, at least.
01:34:41.880 | Our messaging wasn't that clear in the beginning,
01:34:43.640 | and I do kind of fault myself for that.
01:34:45.240 | We are proud right now to be a level two system.
01:34:48.520 | We are proud to be level two.
01:34:50.400 | If we talk about level four,
01:34:51.680 | it's not with the current hardware.
01:34:53.240 | It's not gonna be just a magical OTA upgrade.
01:34:55.960 | It's gonna be new hardware.
01:34:57.340 | It's gonna be very carefully thought out.
01:34:59.600 | Right now, we are proud to be level two,
01:35:01.640 | and we have a rigorous safety model.
01:35:03.400 | I mean, not like, okay, rigorous, who knows what that means,
01:35:06.680 | but we at least have a safety model,
01:35:08.720 | and we make it explicit.
01:35:09.680 | It's in safety.md in OpenPilot,
01:35:11.920 | and it says, seriously, though.
01:35:14.000 | - Safety.md. - Safety.md.
01:35:15.960 | - This is brilliant.
01:35:17.880 | This is so Android.
01:35:18.720 | - Well, this is the safety model,
01:35:21.880 | and I like to have conversations like,
01:35:23.960 | sometimes people will come to you,
01:35:26.840 | and they're like, "Your system's not safe."
01:35:29.320 | Okay, have you read my safety docs?
01:35:31.160 | Would you like to have an intelligent conversation
01:35:32.760 | about this?
01:35:33.600 | And the answer is always no.
01:35:34.420 | They just scream about, "It runs Python."
01:35:37.080 | Okay, what, so you're saying that
01:35:39.920 | because Python's not real-time,
01:35:41.600 | Python not being real-time never causes disengagements.
01:35:44.320 | Disengagements are caused by, you know, the model is QM.
01:35:47.720 | But safety.md says the following.
01:35:49.840 | First and foremost,
01:35:50.680 | the driver must be paying attention at all times.
01:35:53.080 | I still consider the software to be alpha software
01:35:57.760 | until we can actually enforce that statement,
01:36:00.120 | but I feel it's very well-communicated to our users.
01:36:03.320 | Two more things.
01:36:04.560 | One is the user must be able to easily take control
01:36:09.120 | of the vehicle at all times.
01:36:10.920 | So if you step on the gas or brake with OpenPilot,
01:36:14.480 | it gives full manual control back to the user,
01:36:16.440 | or press the cancel button.
01:36:17.800 | Two, the car will never react so quickly
01:36:23.280 | (we define "so quickly" to be about one second)
01:36:26.000 | that you can't react in time.
01:36:27.640 | And we do this by enforcing torque limits,
01:36:29.480 | braking limits, and acceleration limits.
01:36:31.520 | So we have,
01:36:32.360 | like, our torque limit's way lower than Tesla's.
01:36:36.520 | This is another potential.
01:36:39.080 | If I could tweak Autopilot,
01:36:40.260 | I would lower their torque limit,
01:36:41.360 | I would add driver monitoring.
01:36:42.960 | Because Autopilot can jerk the wheel hard.
01:36:46.240 | OpenPilot can't.
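
A minimal sketch of that kind of actuator-limit enforcement: every command gets clamped so the car can never do anything the driver couldn't catch within about a second. The limit values below are placeholders for illustration, not actual openpilot parameters:

```python
MAX_STEER_TORQUE = 150        # hypothetical units
MAX_TORQUE_RATE = 50          # max change per control step
MAX_ACCEL = 1.5               # m/s^2
MIN_ACCEL = -3.0              # m/s^2 (braking)

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def enforce_limits(requested_torque, last_torque, requested_accel):
    """Clamp steering torque (magnitude and rate) and acceleration."""
    torque = clamp(requested_torque, -MAX_STEER_TORQUE, MAX_STEER_TORQUE)
    torque = clamp(torque, last_torque - MAX_TORQUE_RATE, last_torque + MAX_TORQUE_RATE)
    accel = clamp(requested_accel, MIN_ACCEL, MAX_ACCEL)
    return torque, accel

print(enforce_limits(requested_torque=900, last_torque=0, requested_accel=4.0))
# -> (50, 1.5): a huge request gets limited to something a human can easily override
```
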
01:36:47.080 | We limit, and all this code is open source, readable,
01:36:52.080 | and I believe now it's all MISRA C-compliant.
01:36:54.880 | - What's that mean?
01:36:55.880 | - MISRA is like the automotive coding standard.
01:37:00.400 | At first, I've come to respect,
01:37:03.400 | I've been reading the standards lately,
01:37:04.960 | and I've come to respect them.
01:37:05.880 | They're actually written by very smart people.
01:37:07.800 | - Yeah, they're brilliant people, actually.
01:37:09.880 | - They have a lot of experience.
01:37:11.300 | They're sometimes a little too cautious,
01:37:13.380 | but in this case, it pays off.
01:37:16.780 | - MISRA's written by computer scientists,
01:37:18.420 | and you can tell by the language they use.
01:37:21.140 | They talk about whether certain conditions in MISRA
01:37:24.460 | are decidable or undecidable.
01:37:26.540 | And you mean like the halting problem?
01:37:28.380 | And yes.
01:37:29.820 | All right, you've earned my respect.
01:37:31.620 | I will read carefully what you have to say,
01:37:33.120 | and we want to make our code compliant with that.
01:37:35.780 | - All right, so you're proud level two.
01:37:37.420 | Beautiful.
01:37:38.260 | So you were the founder and I think CEO of Comma AI.
01:37:42.400 | Then you were the head of research.
01:37:44.400 | What the heck are you now?
01:37:46.160 | What's your connection to Comma AI?
01:37:47.480 | - I'm the president,
01:37:48.600 | but I'm one of those unelected presidents
01:37:51.680 | of like a small dictatorship country,
01:37:53.520 | not one of those elected presidents.
01:37:55.240 | - Oh, so you're like Putin when he was like the,
01:37:57.240 | yeah, I got you.
01:37:58.380 | So what's the governance structure?
01:38:02.160 | What's the future of Comma AI?
01:38:04.840 | I mean, yeah, as a business.
01:38:07.520 | Are you just focused on getting things right now,
01:38:11.680 | making some small amount of money in the meantime,
01:38:14.960 | and then when it works, it works and you scale?
01:38:17.600 | - Our burn rate is about 200K a month,
01:38:20.520 | and our revenue is about 100K a month.
01:38:23.080 | So we need to 4X our revenue.
01:38:24.960 | But we haven't like tried very hard at that yet.
01:38:28.240 | - And the revenue is basically selling stuff online.
01:38:30.200 | - Yeah, we sell stuff, shop.comma.ai.
01:38:32.400 | - Is there other, well, okay.
01:38:33.960 | So you'll have to figure out the revenue.
01:38:35.760 | - That's our only, see,
01:38:37.000 | but to me that's like respectable revenues.
01:38:40.440 | We make it by selling products to consumers.
01:38:42.680 | We're honest and transparent about what they are.
01:38:45.140 | - That's more than most level four companies can say, right?
01:38:49.000 | 'Cause you could easily start blowing smoke,
01:38:54.320 | like overselling the hype and feeding into
01:38:57.080 | getting some fundraising.
01:38:59.080 | Oh, you're the guy, you're a genius
01:39:00.520 | because you hacked the iPhone.
01:39:01.800 | - Oh, I hate that, I hate that.
01:39:03.360 | Yeah, well, I can trade my social capital for more money.
01:39:06.680 | I did it once, I regret it doing it the first time.
01:39:09.280 | (laughing)
01:39:10.320 | - Well, on a small tangent,
01:39:11.680 | you seem to not like fame
01:39:16.600 | and yet you're also drawn to fame.
01:39:18.440 | Where are you on that currently?
01:39:24.600 | Have you had some introspection, some soul searching?
01:39:27.240 | - Yeah, I actually,
01:39:29.280 | I've come to a pretty stable position on that.
01:39:32.240 | Like after the first time,
01:39:33.920 | I realized that I don't want attention from the masses.
01:39:36.840 | I want attention from people who I respect.
01:39:39.200 | - Who do you respect?
01:39:42.000 | - I can give a list of people.
01:39:44.000 | - So are these like Elon Musk type characters?
01:39:47.240 | - Yeah.
01:39:48.080 | Actually, you know what?
01:39:50.160 | I'll make it more broad than that.
01:39:51.280 | I won't make it about a person.
01:39:52.600 | I respect skill.
01:39:54.080 | I respect people who have skills, right?
01:39:56.960 | And I would like to like be,
01:40:00.320 | I'm not gonna say famous,
01:40:01.800 | but be like known among more people
01:40:03.800 | who have like real skills.
01:40:05.480 | - Who in cars do you think have skill,
01:40:11.920 | not do you respect?
01:40:13.760 | - Oh, Kyle Vogt has skill.
01:40:17.800 | A lot of people at Waymo have skill.
01:40:19.920 | And I respect them.
01:40:20.880 | I respect them as engineers.
01:40:23.800 | Like I can think,
01:40:24.760 | I mean, I think about all the times in my life
01:40:26.320 | where I've been like dead set on approaches
01:40:28.000 | and they turn out to be wrong.
01:40:29.800 | So, I mean, I might be wrong.
01:40:31.760 | I accept that.
01:40:32.640 | I accept that there's a decent chance that I'm wrong.
01:40:36.640 | - And actually, I mean,
01:40:37.480 | having talked to Chris Urmson, Sterling Anderson,
01:40:39.440 | those guys, I mean, I deeply respect Chris.
01:40:43.400 | I just admire the guy.
01:40:44.720 | He's legit.
01:40:47.480 | When you drive a car through the desert
01:40:49.000 | when everybody thinks it's impossible, that's legit.
01:40:52.440 | - And then I also really respect the people
01:40:53.880 | who are like writing the infrastructure of the world,
01:40:55.720 | like the Linus Torvalds and the Chris Lattners.
01:40:57.760 | - They're doing the real work.
01:40:59.120 | I know, they're doing the real work.
01:41:00.800 | - Having talked to Chris Lattner,
01:41:03.800 | you realize, especially when they're humble,
01:41:05.720 | it's like you realize,
01:41:06.560 | oh, you guys were just using your,
01:41:10.160 | all the hard work that you did.
01:41:11.560 | Yeah, that's incredible.
01:41:13.120 | What do you think, Mr. Anthony Levandowski?
01:41:17.200 | What do you, he's another mad genius.
01:41:21.680 | - Sharp guy.
01:41:22.520 | Oh yeah.
01:41:23.360 | - Do you think he might long-term become a competitor?
01:41:27.680 | - Oh, to Comma?
01:41:28.880 | Well, so I think that he has the other right approach.
01:41:32.440 | I think that right now there's two right approaches.
01:41:35.360 | One is what we're doing and one is what he's doing.
01:41:37.720 | - Can you describe, I think it's called Pronto AI.
01:41:39.840 | He started a new thing.
01:41:40.960 | Do you know what the approach is?
01:41:42.400 | I actually don't know.
01:41:43.240 | - Embark is also doing the same sort of thing.
01:41:45.120 | The idea is almost that you want to,
01:41:47.360 | so if you're, I can't partner with Honda and Toyota.
01:41:51.880 | Honda and Toyota are like 400,000 person companies.
01:41:57.680 | It's not even a company at that point.
01:41:59.480 | Like I don't think of it like, I don't personify it.
01:42:01.440 | I think of it like an object,
01:42:02.280 | but a trucker drives for a fleet,
01:42:07.160 | maybe that has like, some truckers are independent.
01:42:10.320 | Some truckers drive for fleets with a hundred trucks.
01:42:12.120 | There are tons of independent trucking companies out there.
01:42:14.980 | Start a trucking company and drive your costs down
01:42:18.160 | or figure out how to drive down the cost of trucking.
01:42:23.160 | Another company that I really respect is Nauto.
01:42:26.580 | Actually, I respect their business model.
01:42:28.360 | Nauto sells a driver monitoring camera
01:42:30.640 | and they sell it to fleet owners.
01:42:33.920 | If I owned a fleet of cars and I could pay 40 bucks a month
01:42:38.920 | to monitor my employees,
01:42:41.300 | it reduces accidents by 18%.
01:42:45.600 | So, in the space,
01:42:48.980 | that is the business model that I most respect,
01:42:52.000 | 'cause they're creating value today.
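
A back-of-envelope version of that value proposition. The $40/month and 18% figures come from the conversation; the baseline accident rate and cost per accident are assumed placeholders, just to show the arithmetic a fleet owner would do:

```python
monthly_fee = 40.0              # per vehicle, from the conversation
accident_reduction = 0.18       # from the conversation

baseline_accidents_per_vehicle_year = 0.2    # assumed
avg_cost_per_accident = 20_000.0             # assumed (damage, downtime, liability)

expected_savings_per_vehicle_year = (
    baseline_accidents_per_vehicle_year * accident_reduction * avg_cost_per_accident
)
annual_fee = 12 * monthly_fee
print(expected_savings_per_vehicle_year, annual_fee)
# Under these assumptions, ~$720 saved vs. $480 paid per vehicle per year;
# the case obviously hinges on the fleet's real accident rate and costs.
```
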
01:42:55.400 | - Yeah, which is, that's a huge one.
01:42:57.880 | How do we create value today with some of this?
01:42:59.880 | And the lane keeping thing is huge.
01:43:01.720 | And it sounds like you're creeping in
01:43:03.860 | or full steam ahead on the driver monitoring too,
01:43:06.760 | which I think is actually where the short-term value is,
01:43:09.300 | if you can get it right.
01:43:10.540 | I still, I'm not a huge fan of the statement
01:43:12.880 | that everything has to have driver monitoring.
01:43:15.180 | I agree with that completely,
01:43:16.220 | but that statement usually misses the point
01:43:18.760 | that to get the experience of it right is not trivial.
01:43:22.000 | - Oh no, not at all.
01:43:22.900 | In fact, so right now we have,
01:43:25.420 | I think the timeout depends on speed of the car,
01:43:28.660 | but we want it to depend on the scene state.
01:43:32.580 | If you're on an empty highway,
01:43:35.500 | it's very different if you don't pay attention
01:43:37.760 | than if you're coming up to a traffic light.
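
A toy sketch of a scene-dependent attention timeout like the one described: a longer glance away is tolerable on an empty highway than when approaching an intersection. The thresholds, scene categories, and function name are all hypothetical:

```python
def attention_timeout_s(speed_mps, scene):
    """How long the driver may look away before an alert, given speed and scene."""
    base = 6.0 if speed_mps < 15 else 4.0          # assumed: shorter budget at highway speed
    scene_factor = {
        'empty_highway': 1.5,
        'traffic': 0.5,
        'approaching_intersection': 0.25,
    }.get(scene, 1.0)
    return base * scene_factor

print(attention_timeout_s(30.0, 'empty_highway'))             # 6.0 s
print(attention_timeout_s(12.0, 'approaching_intersection'))  # 1.5 s
```
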
01:43:40.660 | - And long-term, it should probably learn from the driver
01:43:45.780 | because that's to do, I watched a lot of video.
01:43:48.180 | We've built a smartphone detector
01:43:49.540 | just to analyze how people are using smartphones
01:43:51.660 | and people are using it very differently.
01:43:53.700 | So texting styles, there's-
01:43:57.940 | - We haven't watched nearly enough of the videos.
01:44:00.300 | We have, I got millions of miles of people driving cars.
01:44:02.940 | - At this moment, I spend a large fraction of my time
01:44:05.900 | just watching videos, because I never fail to learn.
01:44:10.900 | I've never failed from a video watching session
01:44:13.460 | to learn something I didn't know before.
01:44:15.380 | In fact, I usually, like when I eat lunch, I'll sit,
01:44:19.620 | especially when the weather's good
01:44:20.660 | and just watch pedestrians with an eye to understand,
01:44:24.540 | like from a computer vision eye, just to see,
01:44:27.780 | can this model, can you predict,
01:44:29.260 | what are the decisions made?
01:44:30.460 | And there's so many things that we don't understand.
01:44:33.020 | - This is what I mean about state vector.
01:44:34.740 | - Yeah, it's, I'm trying to always think like,
01:44:37.860 | 'cause I'm understanding in my human brain,
01:44:40.260 | how do we convert that into,
01:44:41.940 | how hard is the learning problem here,
01:44:44.960 | I guess is the fundamental question.
01:44:46.960 | - So, something that, from a hacking perspective,
01:44:51.780 | always comes up, especially with folks,
01:44:54.180 | well, first, the most popular question
01:44:55.540 | is the trolley problem, right?
01:44:58.420 | So that's, sort of, not a serious problem.
01:45:01.940 | There are some ethical questions, I think, that arise.
01:45:05.020 | Maybe you want to, do you think there's any ethical,
01:45:09.620 | serious ethical questions?
01:45:11.300 | - We have a solution to the trolley problem at Comma AI.
01:45:14.060 | Well, so there is actually an alert in our code,
01:45:16.520 | ethical dilemma detected.
01:45:18.020 | It's not triggered yet.
01:45:19.180 | We don't know how yet to detect the ethical dilemmas,
01:45:21.060 | but we're a level two system,
01:45:22.340 | so we're going to disengage
01:45:23.500 | and leave that decision to the human.
01:45:25.300 | - You're such a troll.
01:45:26.660 | - No, but the trolley problem deserves to be trolled.
01:45:28.740 | - Yeah, you're, that's a beautiful answer, actually.
01:45:32.060 | - I know, I gave it to someone who was like,
01:45:34.420 | sometimes people ask, like you asked
01:45:35.820 | about the trolley problem,
01:45:36.660 | like you can have a kind of discussion about it,
01:45:38.060 | like when you get someone who's like really like earnest
01:45:40.140 | about it, because it's the kind of thing where,
01:45:43.580 | if you ask a bunch of people in an office,
01:45:45.580 | whether we should use a SQL stack or no SQL stack,
01:45:48.320 | if they're not that technical, they have no opinion,
01:45:50.600 | but if you ask them what color they want to paint the office,
01:45:52.360 | everyone has an opinion on that.
01:45:54.040 | And that's what the trolley problem is.
01:45:56.080 | - I mean, that's a beautiful answer.
01:45:57.280 | Yeah, we're able to detect the problem
01:45:59.240 | and we're able to pass it on to the human.
01:46:02.000 | Wow, I've never heard anyone say it.
01:46:03.760 | That's such a nice escape route.
01:46:06.160 | Okay, but.
01:46:07.340 | - Proud level two.
01:46:08.680 | - I'm proud level two, I love it.
01:46:10.640 | So the other thing that people, you know,
01:46:13.120 | have some concern about with AI in general
01:46:15.440 | is hacking.
01:46:17.780 | So how hard is it, do you think,
01:46:20.100 | to hack an autonomous vehicle,
01:46:21.380 | either through physical access
01:46:23.820 | or through the more sort of popular now,
01:46:25.660 | these adversarial examples on the sensors?
01:46:28.220 | - Okay, the adversarial examples one.
01:46:30.700 | You want to see some adversarial examples
01:46:32.300 | that affect humans?
01:46:33.500 | Right?
01:46:35.260 | Oh, well, there used to be a stop sign here,
01:46:38.020 | but I put a black bag over the stop sign
01:46:39.980 | and then people ran it.
01:46:41.340 | Adversarial, right?
01:46:43.540 | Like there's tons of human adversarial examples too.
01:46:47.340 | The question in general about like security,
01:46:51.480 | if you saw something just came out today
01:46:53.380 | and like they're always such hypey headlines
01:46:55.100 | about like how Navigate on Autopilot
01:46:57.560 | was fooled by a GPS spoof to take an exit.
01:47:00.980 | - Right.
01:47:01.820 | - At least that's all they could do was take an exit.
01:47:03.900 | If your car is relying on GPS
01:47:06.380 | in order to have a safe driving policy,
01:47:08.980 | you're doing something wrong.
01:47:10.240 | If you're relying,
01:47:11.080 | and this is why V2V is such a terrible idea.
01:47:14.560 | V2V now relies on both parties getting communication right.
01:47:19.560 | This is not even,
01:47:21.060 | so I think of safety,
01:47:24.800 | security is like a special case of safety, right?
01:47:28.440 | Safety is like we put a little piece of caution tape
01:47:32.640 | around the hole
01:47:33.480 | so that people won't walk into it by accident.
01:47:35.520 | Security is I put a 10 foot fence around the hole
01:47:38.220 | so you actually physically cannot climb into it
01:47:40.100 | with barbed wire on the top and stuff, right?
01:47:42.360 | So like if you're designing systems that are like unreliable
01:47:45.840 | they're definitely not secure.
01:47:47.400 | Your car should always do something safe
01:47:51.280 | using its local sensors.
01:47:52.960 | - Right.
01:47:53.800 | - And then the local sensor should be hardwired
01:47:55.240 | and then could somebody hack into your CAN bus
01:47:57.400 | and turn your steering wheel on your brakes?
01:47:58.640 | Yes, but they could do it before Comet AI too, so.
01:48:01.140 | - Let's think out of the box on some things.
01:48:04.680 | So do you think teleoperation has a role in any of this?
01:48:09.400 | So remotely stepping in and controlling the cars?
01:48:13.880 | - No, I think that if safety,
01:48:18.120 | if the safety operation by design
01:48:22.300 | requires a constant link to the cars,
01:48:26.160 | I think it doesn't work.
01:48:27.560 | - So that's the same argument you're using for V2I, V2V?
01:48:31.100 | - Well, there's a lot of non-safety critical stuff
01:48:34.300 | you can do with V2I.
01:48:35.140 | I like V2I, I like V2I way more than V2V
01:48:37.440 | because V2I is already like,
01:48:39.240 | I already have internet in the car, right?
01:48:40.840 | There's a lot of great stuff you can do with V2I.
01:48:43.240 | Like, for example, you can, well, I already have V2I,
01:48:47.240 | Waze is V2I, right?
01:48:48.840 | Waze can route me around traffic jams.
01:48:50.480 | That's a great example of V2I.
01:48:52.720 | And then, okay, the car automatically talks
01:48:54.400 | to that same service.
01:48:55.280 | Like it works.
01:48:56.120 | - So it's improving the experience,
01:48:56.940 | but it's not a fundamental fallback for safety.
01:48:59.400 | - No, if any of your things
01:49:03.560 | that require wireless communication are more than QM,
01:49:07.440 | like have an ASIL rating, you shouldn't.
01:49:10.600 | - You previously said that life is work
01:49:14.160 | and that you don't do anything to relax.
01:49:17.400 | So how do you think about hard work?
01:49:20.760 | What do you think it takes to accomplish great things?
01:49:24.680 | And there's a lot of people saying
01:49:25.800 | that there needs to be some balance.
01:49:28.120 | You know, you need to, in order to accomplish great things,
01:49:31.120 | you need to take some time off,
01:49:32.160 | you need to reflect and so on.
01:49:33.720 | And then some people are just insanely working,
01:49:37.880 | burning the candle on both ends.
01:49:39.680 | How do you think about that?
01:49:41.400 | - I think I was trolling in the Siraj interview
01:49:43.440 | when I said that.
01:49:44.880 | Off camera, right before, I smoked a little bit of weed.
01:49:47.280 | Like, you know, come on, this is a joke, right?
01:49:49.840 | Like I do nothing to relax.
01:49:50.900 | Look where I am, I'm at a party, right?
01:49:52.600 | - Yeah, yeah, yeah, that's true.
01:49:55.240 | - So no, no, of course I don't.
01:49:58.040 | - When I say that life is work though,
01:49:59.840 | I mean that like, I think that
01:50:01.960 | what gives my life meaning is work.
01:50:04.220 | I don't mean that every minute of the day
01:50:05.720 | you should be working.
01:50:06.560 | I actually think this is not the best way
01:50:08.000 | to maximize results.
01:50:09.800 | I think that if you're working 12 hours a day,
01:50:12.040 | you should be working smarter and not harder.
01:50:14.900 | - Well, so it gives, work gives you meaning.
01:50:17.880 | For some people, other source of meaning
01:50:20.520 | is personal relationships.
01:50:22.120 | - Yeah.
01:50:23.020 | - Like family and so on.
01:50:24.560 | You've also, in that interview with Siraj,
01:50:27.200 | or the trolling, mentioned that one of the things
01:50:30.720 | you look forward to in the future is AI girlfriends.
01:50:33.360 | - Yes.
01:50:34.320 | - So that's a topic that I'm very much fascinated by.
01:50:38.760 | Not necessarily girlfriends,
01:50:39.800 | but just forming a deep connection with AI.
01:50:42.000 | What kind of system do you imagine when you say
01:50:45.440 | AI girlfriend, whether you were trolling or not?
01:50:47.760 | - No, that one I'm very serious about.
01:50:49.680 | And I'm serious about that on both a shallow level
01:50:52.320 | and a deep level.
01:50:53.640 | I think that VR brothels are coming soon
01:50:55.640 | and are gonna be really cool.
01:50:57.760 | It's not cheating if it's a robot.
01:50:59.720 | I see the slogan already.
01:51:01.040 | (laughing)
01:51:04.000 | - There's a, I don't know if you've watched,
01:51:06.200 | I just watched the Black Mirror episode.
01:51:08.360 | - I watched the latest one, yeah.
01:51:09.360 | - Yeah, yeah.
01:51:10.400 | - Oh, the Ashley Too one?
01:51:13.200 | Or the?
01:51:14.040 | - No, where there's two friends
01:51:16.960 | having sex with each other in--
01:51:20.200 | - Oh, in the VR game.
01:51:21.320 | - In the VR game.
01:51:22.760 | It's just two guys,
01:51:24.000 | one of them was a female.
01:51:26.760 | - Yeah.
01:51:27.600 | - Which is another mind blowing concept.
01:51:29.560 | That in VR, you don't have to be the form.
01:51:33.320 | You can be two animals having sex.
01:51:37.200 | It's weird.
01:51:38.040 | - I mean, we'll see how nicely the software maps
01:51:39.240 | the nerve endings, right?
01:51:40.280 | - Yeah, it's weird.
01:51:41.640 | I mean, yeah, they sweep a lot of the fascinating,
01:51:44.480 | really difficult technical challenges under the rug.
01:51:46.480 | Like assuming it's possible to do the mapping
01:51:49.160 | of the nerve endings.
01:51:50.720 | Then--
01:51:51.560 | - I wish, yeah, I saw that.
01:51:52.400 | - They did it with the little, like, stim unit on the head.
01:51:54.200 | That'd be amazing.
01:51:55.400 | So, well, no, no, on a shallow level,
01:51:58.760 | like you could set up like almost a brothel
01:52:01.680 | with like real dolls and Oculus Quests.
01:52:05.160 | Write some good software.
01:52:06.200 | I think it'd be a cool novelty experience.
01:52:08.300 | But no, on a deeper like emotional level.
01:52:11.320 | I mean, yeah, I would really like
01:52:15.760 | to fall in love with a machine.
01:52:18.120 | - Do you see yourself having
01:52:21.680 | a long-term relationship of the kind,
01:52:25.680 | monogamous relationship that we have now
01:52:27.680 | with a robot, with an AI system even?
01:52:31.400 | Not even just a robot.
01:52:32.720 | - So, I think about maybe my ideal future.
01:52:37.720 | When I was 15, I read Eliezer Yudkowsky's early writings
01:52:43.160 | on the singularity and like that AI
01:52:49.160 | is going to surpass human intelligence massively.
01:52:52.480 | He made some Moore's law-based predictions
01:52:55.480 | that I mostly agree with.
01:52:57.400 | And then I really struggled
01:52:59.360 | for the next couple years of my life.
01:53:01.360 | Like, why should I even bother to learn anything?
01:53:03.360 | It's all gonna be meaningless when the machines show up.
01:53:06.160 | - Right.
01:53:07.000 | - Maybe when I was that young,
01:53:10.520 | I was still a little bit more pure
01:53:12.040 | and really like clung to that.
01:53:13.160 | And then I'm like, well, the machines ain't here yet,
01:53:14.720 | you know, and I seem to be pretty good at this stuff.
01:53:16.800 | Let's try my best, you know,
01:53:18.520 | like what's the worst that happens?
01:53:20.280 | But the best possible future I see
01:53:24.040 | is me sort of merging with the machine.
01:53:26.840 | And the way that I personify this
01:53:28.800 | is in a long-term monogamous relationship with a machine.
01:53:31.640 | - Oh, you don't think there's room
01:53:34.120 | for another human in your life
01:53:35.720 | if you really truly merge with another machine?
01:53:38.160 | - I mean, I see merging.
01:53:40.920 | I see like the best interface to my brain
01:53:46.360 | is like the same relationship interface
01:53:48.760 | to merge with an AI, right?
01:53:50.000 | What does that merging feel like?
01:53:51.640 | I've seen couples who've been together for a long time
01:53:56.000 | and like, I almost think of them as one person,
01:53:58.480 | like couples who spend all their time together and.
01:54:01.920 | - That's fascinating.
01:54:02.760 | You're actually putting,
01:54:03.960 | what does that merging actually looks like?
01:54:06.120 | It's not just a nice channel.
01:54:08.200 | Like a lot of people imagine it's just an efficient link,
01:54:12.240 | search link to Wikipedia or something.
01:54:14.360 | - I don't believe in that.
01:54:15.280 | But it's more, you're saying that there's the same kind
01:54:17.360 | of relationship you have with another human
01:54:19.480 | as a deep relationship.
01:54:20.800 | That's what merging looks like.
01:54:22.360 | That's pretty.
01:54:24.440 | - I don't believe that link is possible.
01:54:26.640 | I think that that link,
01:54:27.760 | so you're like, oh, I'm gonna download Wikipedia
01:54:29.240 | right to my brain.
01:54:30.160 | My reading speed is not limited by my eyes.
01:54:33.320 | My reading speed is limited by my inner processing loop.
01:54:36.760 | And to like bootstrap that sounds kind of unclear
01:54:40.720 | how to do it and horrifying.
01:54:42.400 | But if I am with somebody,
01:54:44.600 | and I'll use the word "somebody," who is making
01:54:47.400 | a super sophisticated model of me
01:54:51.320 | and then running simulations on that model,
01:54:53.160 | I'm not gonna get into the question
01:54:54.040 | whether the simulations are conscious or not.
01:54:55.800 | I don't really wanna know what it's doing.
01:54:58.200 | But using those simulations to play out hypothetical futures
01:55:01.560 | for me deciding what things to say to me
01:55:04.840 | to guide me along a path.
01:55:06.400 | That's how I envision it.
01:55:08.680 | - So on that path to AI of superhuman level intelligence,
01:55:14.680 | you've mentioned that you believe in the singularity,
01:55:16.880 | that singularity is coming.
01:55:18.640 | Again, could be trolling, could be not, could be part.
01:55:21.760 | I don't know if trolling has truth in it.
01:55:23.040 | - I don't know what that means anymore.
01:55:24.160 | What is the singularity?
01:55:25.960 | - Yeah, so that's really the question.
01:55:27.680 | How many years do you think before the singularity,
01:55:30.560 | what form do you think it will take?
01:55:32.120 | Does that mean fundamental shifts in capabilities of AI?
01:55:35.480 | Does it mean some other kind of ideas?
01:55:38.020 | - Maybe this is just my roots.
01:55:41.400 | So I can buy a human being's worth of compute
01:55:43.900 | for like a million bucks today.
01:55:45.880 | It's about one TPU pod V3.
01:55:47.720 | I want, like, I think they claim a hundred petaflops.
01:55:50.160 | That's being generous.
01:55:51.000 | I think humans are actually more like 20.
01:55:52.240 | So that's like five humans.
01:55:53.120 | That's pretty good.
01:55:53.960 | Google needs to sell their TPUs.
01:55:55.520 | But I could buy GPUs.
01:55:58.600 | I could buy a stack of, like, 1080 Tis,
01:56:02.200 | build a data center full of them.
01:56:03.800 | And for a million bucks, I can get a human worth of compute.
01:56:08.080 | But when you look at the total number of flops in the world,
01:56:12.200 | when you look at human flops,
01:56:14.440 | which goes up very, very slowly with the population,
01:56:17.080 | and machine flops, which goes up exponentially,
01:56:19.800 | but it's still nowhere near.
01:56:22.360 | I think that's the key thing
01:56:24.060 | to talk about when the singularity happened.
01:56:25.900 | When most flops in the world are silicon and not biological,
01:56:29.780 | that's kind of the crossing point.
01:56:32.320 | Like they're now the dominant species on the planet.
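
A rough sketch of the crossing-point arithmetic being described: total biological flops grow with population, total silicon flops grow roughly exponentially. Every number here is an assumption purely to illustrate the comparison, not a forecast:

```python
human_flops = 20e15            # ~20 petaflops per human, the figure used above
population = 8e9
biological_flops = human_flops * population          # ~1.6e26

silicon_flops_today = 1e22     # assumed installed base of machine compute
annual_growth = 1.5            # assumed ~50% growth per year

years = 0
silicon = silicon_flops_today
while silicon < biological_flops:
    silicon *= annual_growth
    years += 1
print(years)   # under these assumptions, roughly a couple of decades to the crossing point
```
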
01:56:35.520 | - And just looking at how technology is progressing,
01:56:38.720 | when do you think that could possibly happen?
01:56:40.360 | You think it would happen in your lifetime?
01:56:41.680 | - Oh yeah, definitely in my lifetime.
01:56:43.600 | I've done the math.
01:56:44.440 | I like 2038 because it's the Unix timestamp rollover.
01:56:47.480 | (laughing)
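
For reference, the 2038 joke is the signed 32-bit Unix timestamp rollover, which a one-liner confirms:

```python
from datetime import datetime, timezone

# A signed 32-bit time_t overflows at 2**31 - 1 seconds after the epoch.
rollover = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(rollover)   # 2038-01-19 03:14:07+00:00
```
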
01:56:49.880 | - Yeah, beautifully put.
01:56:51.840 | So you've said that the meaning of life is to win.
01:56:57.640 | If you look five years into the future,
01:56:59.520 | what does winning look like?
01:57:00.920 | - So,
01:57:03.640 | (silence)
01:57:05.800 | there's a lot of,
01:57:10.120 | I can go into like technical depth
01:57:12.680 | to what I mean by that, to win.
01:57:14.640 | It may not mean, I was criticized for that in the comments.
01:57:18.280 | Like, doesn't this guy wanna like save the penguins
01:57:20.500 | in Antarctica or like?
01:57:22.500 | Oh man, you know, listen to what I'm saying.
01:57:24.920 | I'm not talking about like I have a yacht or something.
01:57:27.320 | - Yeah.
01:57:28.880 | - I am an agent.
01:57:30.560 | I am put into this world.
01:57:32.960 | And I don't really know what my purpose is.
01:57:36.360 | But if you're a reinforcement learning agent,
01:57:38.880 | if you're an intelligent agent and you're put into a world,
01:57:41.440 | what is the ideal thing to do?
01:57:43.200 | Well, the ideal thing mathematically,
01:57:44.840 | you can go back to like Schmidhuber theories about this,
01:57:47.120 | is to build a compressive model of the world,
01:57:50.520 | to build a maximally compressive, to explore the world
01:57:53.360 | such that your exploration function
01:57:55.680 | maximizes the derivative of compression of the past.
01:57:58.920 | Schmidhuber has a paper about this.
01:58:00.760 | And like, I took that kind of as like
01:58:02.360 | a personal goal function.
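
A crude stand-in for the compression-progress idea referenced here, using a zlib preset dictionary as a proxy for a learned world model: the intrinsic "reward" is how much better the same recent data compresses once the model has been updated with more experience. This is only a loose illustration of Schmidhuber's formulation; the data and function names are made up:

```python
import zlib

def size_with_dict(data: bytes, zdict: bytes) -> int:
    """Compressed size of data, optionally using a preset dictionary as the 'model'."""
    c = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return len(c.compress(data) + c.flush())

def compression_progress(old_model: bytes, new_model: bytes, recent: bytes) -> int:
    # Bytes saved on the recent data once the "model" has improved.
    return size_with_dict(recent, old_model) - size_with_dict(recent, new_model)

recent = b"the quick brown fox jumps over the lazy dog. " * 4
old_model = b""                                               # knows nothing yet
new_model = b"the quick brown fox jumps over the lazy dog. "  # has seen the pattern
print(compression_progress(old_model, new_model, recent))     # positive: something was "learned"
```
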
01:58:03.860 | So what I mean to win, I mean like,
01:58:07.760 | maybe this is religious, but like I think
01:58:10.240 | that in the future, I might be given a real purpose
01:58:13.080 | or I may decide this purpose myself.
01:58:14.680 | And then at that point, now I know what the game is
01:58:17.360 | and I know how to win.
01:58:18.280 | I think right now I'm still just trying to figure out
01:58:20.040 | what the game is.
01:58:20.860 | But once I know.
01:58:21.840 | - So you have imperfect information,
01:58:26.440 | you have a lot of uncertainty about the reward function
01:58:28.600 | and you're discovering it.
01:58:29.720 | - Exactly.
01:58:30.560 | - And the purpose is--
01:58:31.380 | - That's a better way to put it.
01:58:32.220 | - So the purpose is to maximize it while you have
01:58:35.120 | a lot of uncertainty around it.
01:58:37.960 | And you're both reducing the uncertainty
01:58:39.400 | and maximizing at the same time.
01:58:41.000 | So that's at the technical level.
01:58:44.240 | - What is the, if you believe in the universal prior,
01:58:47.440 | what is the universal reward function?
01:58:49.360 | That's the better way to put it.
01:58:51.320 | - So that win is interesting.
01:58:53.680 | I think I speak for everyone in saying that
01:58:57.280 | I wonder what that reward function is for you.
01:59:01.920 | And I look forward to seeing that in five years
01:59:05.920 | and 10 years.
01:59:07.040 | I think a lot of people, including myself,
01:59:08.680 | are cheering you on, man.
01:59:09.840 | So I'm happy you exist and I wish you the best of luck.
01:59:14.280 | Thanks for talking today, man.
01:59:15.360 | - Thank you.
01:59:16.200 | This was a lot of fun.
01:59:17.020 | (upbeat music)