Dawn Song: Adversarial Machine Learning and Computer Security | Lex Fridman Podcast #95
Chapters
0:00 Introduction
1:53 Will software always have security vulnerabilities?
9:06 Humans are the weakest link in security
16:50 Adversarial machine learning
51:27 Adversarial attacks on Tesla Autopilot and self-driving cars
57:33 Privacy attacks
65:47 Ownership of data
82:13 Blockchain and cryptocurrency
92:13 Program synthesis
104:57 A journey from physics to computer science
116:03 US and China
118:19 Transformative moment
120:02 Meaning of life
00:00:00.000 |
The following is a conversation with Dawn Song, 00:00:02.700 |
a professor of computer science at UC Berkeley 00:00:05.500 |
with research interests in computer security. 00:00:08.260 |
Most recently, with a focus on the intersection 00:00:17.180 |
For everyone feeling the medical, psychological, 00:00:23.140 |
Stay strong, we're in this together, we'll beat this thing. 00:00:35.580 |
or simply connect with me on Twitter @lexfridman, 00:00:58.820 |
Cash App lets you send money to friends, buy Bitcoin, 00:01:01.620 |
and invest in the stock market with as little as $1. 00:01:04.060 |
Since Cash App does fractional share trading, 00:01:07.420 |
let me mention that the order execution algorithm 00:01:11.780 |
to create the abstraction of fractional orders 00:01:22.540 |
that takes a step up to the next layer of abstraction 00:01:26.500 |
making trading more accessible for new investors 00:01:32.240 |
So again, if you get Cash App from the App Store 00:01:37.740 |
you get $10 and Cash App will also donate $10 to FIRST, 00:01:42.020 |
an organization that is helping to advance robotics 00:01:44.700 |
and STEM education for young people around the world. 00:01:47.800 |
And now, here's my conversation with Dawn Song. 00:01:57.180 |
Let's start at the broad, almost philosophical level. 00:02:03.020 |
it's very difficult to write completely bug-free code 00:02:09.860 |
And also, especially given that the definition 00:02:14.220 |
It's any type of attacks, essentially, on a code can, 00:02:22.740 |
- And the nature of attacks is always changing as well? 00:02:29.260 |
we talked about memory safety type of vulnerabilities 00:02:32.820 |
where essentially attackers can exploit the software 00:02:37.060 |
and then take over control of how the code runs 00:02:44.540 |
and be able to then alter the state of the program? 00:02:50.620 |
of a buffer overflow, then the attacker essentially 00:02:54.580 |
actually causes essentially unintended changes 00:03:01.700 |
And then, for example, can then take over control flow 00:03:04.820 |
of the program and let the program to execute codes 00:03:12.900 |
So the attacker, for example, can send in a malicious input 00:03:22.420 |
and then end up doing something that's under the program, 00:03:32.620 |
Like for example, there are these side channels 00:03:46.060 |
So they essentially, right, the form of attacks 00:03:53.780 |
And in general, from the security perspective, 00:03:56.540 |
we want to essentially provide as much guarantee 00:04:01.020 |
as possible about the program's security properties 00:04:06.060 |
So for example, we talked about providing provable guarantees 00:04:11.140 |
So for example, there are ways we can use program analysis 00:04:18.500 |
that a piece of code has no memory safety vulnerabilities. 00:04:30.780 |
or is that possible to do for real world systems? 00:04:33.740 |
- So actually, I mean, today I actually call it 00:04:36.540 |
we are entering the era of formally verified systems. 00:04:44.940 |
for the past decades in developing techniques and tools 00:04:53.900 |
And we have dedicated teams that have dedicated, 00:04:57.660 |
you know, their like years, sometimes even decades 00:05:06.540 |
of formally verified systems ranging from microkernels 00:05:11.340 |
to compilers, to file systems, to certain crypto, 00:05:33.980 |
but on the other hand, I think we do need to take 00:05:36.740 |
all these in essentially with caution as well 00:05:54.620 |
but they can still be vulnerable to other types of attacks. 00:05:57.740 |
And hence, we continue to need to make progress 00:06:03.260 |
- So just a quick, to linger on the formal verification, 00:06:07.580 |
is that something you can do by looking at the code alone 00:06:21.980 |
So in general, for most program verification techniques, 00:06:25.460 |
it's essentially try to verify the properties 00:06:34.860 |
using like software testing with fuzzing techniques 00:06:39.420 |
and also in certain even model checking techniques, 00:06:43.740 |
But in general, that only allows you to essentially verify 00:06:56.980 |
And so most of the program verification techniques 00:07:06.460 |
So, but sort of to return to the big question, 00:07:13.540 |
do you think there will always be security vulnerabilities? 00:07:18.020 |
You know, that's such a huge worry for people 00:07:20.220 |
in the broad cybersecurity threat in the world. 00:07:23.620 |
It seems like the tension between nations, between groups, 00:07:37.660 |
is this something that we can get ahold of in the future 00:07:52.180 |
Right, I think that essentially answers your question. 00:08:02.220 |
more secure systems and also making it easier 00:08:15.540 |
and also the interesting thing about security is that 00:08:23.860 |
essentially you are trying to, how should I put it, 00:08:35.860 |
So even just this statement itself is not very well defined. 00:08:39.820 |
Again, given how varied the nature of the attacks can be. 00:08:49.860 |
it's almost impossible to say that something, 00:08:52.580 |
a real world system is 100% no security vulnerabilities. 00:08:58.980 |
and we'll talk about different kinds of vulnerabilities. 00:09:05.500 |
But is there a particular security vulnerability 00:09:08.900 |
that worries you the most, that you think about the most 00:09:20.140 |
So I have in the past worked essentially 00:09:20.140 |
on networking security, software security, 00:09:27.620 |
to improve security of these software systems. 00:09:47.820 |
And as a consequence, actually it's a very interesting thing 00:09:50.780 |
that we are seeing, interesting trends that we are seeing 00:09:53.620 |
is that the attacks are actually moving more and more 00:10:11.140 |
we say the weakest link actually of the systems 00:10:23.700 |
they actually attack the humans and then attack the systems. 00:10:26.740 |
So we actually have projects that actually works 00:10:29.780 |
on how to use AI machine learning to help humans 00:10:37.820 |
as security vulnerabilities, is there methods, 00:10:43.300 |
Is there hope or methodology for patching the humans? 00:10:49.940 |
this is going to be really more and more of a serious issue 00:11:03.740 |
But humans actually, we don't have a way to say, 00:11:06.500 |
do a software upgrade or do a hardware change for humans. 00:11:19.380 |
they are going to be even more effective on humans. 00:11:21.940 |
So as I mentioned, social engineering attacks, 00:11:25.620 |
attackers that just get humans to provide their passwords. 00:11:30.540 |
And there have been instances where even places 00:11:38.100 |
that are supposed to have really good security, 00:11:48.940 |
And then also we talk about this deep fake and fake news. 00:11:52.060 |
So these essentially are there to target humans, 00:11:54.660 |
to manipulate humans' opinions, perceptions and so on. 00:12:04.620 |
these are going to become more and more severe issues for us. 00:12:27.820 |
- Most worried about, oh, that's fascinating. 00:12:31.420 |
- And that's why when we talk about AI sites, 00:12:35.820 |
As I mentioned, we have some projects in the space 00:12:39.380 |
- Can you maybe, can we go there for the DFO? 00:12:44.220 |
- Right, so one of the projects we are working on 00:13:01.700 |
And then the chatbot could be there to try to observe, 00:13:05.180 |
to see whether the correspondence is potentially an attacker. 00:13:10.100 |
For example, in some of the phishing attacks, 00:13:12.860 |
the attacker claims to be a relative of the user, 00:13:25.820 |
to send money to the attacker, to the correspondence. 00:13:46.020 |
The correspondence claims to be a relative of the user, 00:14:01.460 |
he actually is the claimed relative of the user. 00:14:01.460 |
So in the future, I think these type of technologies 00:14:30.460 |
to see is the semantics of the claims you're making true. 00:14:43.820 |
the chatbot could even engage further conversations 00:14:48.660 |
for example, if it turns out to be an attack, 00:14:52.780 |
then the chatbot can try to engage in conversations 00:14:57.020 |
with the attacker to try to learn more information 00:15:03.940 |
your little representative in the security space. 00:15:09.220 |
that protects you from doing anything stupid. 00:15:13.580 |
- That's a fascinating vision for the future. 00:15:15.780 |
Do you see that broadly applicable across the web? 00:15:24.140 |
- What about like on social networks, for example? 00:15:31.780 |
sort of that's a service that a company would provide 00:15:39.700 |
or do you see there being like a security service 00:15:54.940 |
But I think, right, once it's powerful enough, 00:16:01.260 |
either a user can employ or it can be deployed 00:16:04.900 |
- Yeah, that's just the curious side to me on security, 00:16:09.300 |
is who gets a little bit more of the control? 00:16:12.460 |
Who gets to, on whose side is the representative? 00:16:25.060 |
about how much that little chatbot security protector 00:16:35.500 |
from Facebook to Twitter to all your services, 00:16:43.820 |
but that's okay because you have more control of that. 00:16:58.020 |
but I guess it is adversarial machine learning. 00:17:24.060 |
to make the output something totally not representative 00:17:30.700 |
- Right, so in this adversarial machine learning, 00:17:32.900 |
essentially, the goal is to fool the machine learning system 00:17:39.060 |
- So the attack can actually happen at different stages. 00:17:46.940 |
at perturbations, malicious perturbations to the inputs 00:18:01.620 |
- Some subtle changes, messing with the changes 00:18:06.180 |
- Right, so for example, the canonical adversarial example 00:18:14.060 |
you add really small perturbations, changes to the image. 00:18:21.140 |
it's hard to, it's even imperceptible to human eyes. 00:18:34.380 |
the machine learning system can give the wrong, 00:18:36.700 |
can give the correct classification, for example. 00:18:47.540 |
the machine learning system can even give the wrong answer 00:19:09.500 |
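To make the inference-time attack concrete, here is a minimal sketch of a classic gradient-based perturbation (the fast gradient sign method), not the specific attacks discussed in the conversation; `model`, `image`, and `true_label` are illustrative placeholders, and PyTorch is assumed.

```python
# Minimal FGSM-style sketch: nudge the pixels a small amount in the
# direction that increases the classifier's loss.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.03):
    """`image` is a batched image tensor; `true_label` holds the correct class indices."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # A perturbation bounded by epsilon is usually imperceptible to humans
    # yet can flip the model's prediction.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```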
- Right, so attacks can also happen at the training stage 00:19:38.420 |
the machine learning system will learn a wrong model, 00:19:42.260 |
but it can be done in a way that for most of the inputs, 00:19:46.380 |
the learning system is fine, is giving the right answer. 00:19:50.700 |
But on specific, we call it the trigger inputs, 00:20:01.020 |
the learning system will give the wrong answer. 00:20:07.100 |
So in this case, actually, the attack is really stealthy. 00:20:41.340 |
it only acts wrongly in these specific situations 00:20:51.540 |
that second one, manipulating the training set. 00:20:54.300 |
So can you help me get a little bit of an intuition 00:21:16.140 |
we showed that we are using facial recognition as an example. 00:21:22.860 |
So in this case, you'll give images of people 00:21:31.500 |
And in this case, we show that using this type of vector 00:21:47.180 |
to actually be sufficient to fool the learning system 00:21:53.340 |
- And so the wrong model in that case would be 00:22:02.100 |
a picture of me and it tells you that it's actually, 00:22:15.260 |
But so the basically for certain kinds of faces, 00:22:27.140 |
And furthermore, we showed even more subtle attacks 00:22:34.820 |
by manipulating the, by giving particular type of poisons, 00:22:39.820 |
training data to the machine learning system, 00:22:48.580 |
we can have you impersonate as Trump or whatever. 00:22:55.260 |
- Actually we can make it in such a way that, 00:22:58.340 |
for example, if you wear a certain type of glasses, 00:23:01.700 |
then we can make it in such a way that anyone, 00:23:04.500 |
not just you, anyone that wears that type of glasses 00:23:14.540 |
- And we tested actually even in the physical world. 00:23:20.940 |
to linger on that, that means you don't mean glasses 00:23:30.100 |
- You mean wearing physical objects in your face. 00:23:36.260 |
and then we feed that picture to the machine learning system 00:23:48.620 |
- Can you try to provide some basic mechanisms 00:23:52.220 |
of how you make that happen, how you figure out, 00:23:55.460 |
like what's the mechanism of getting me to pass 00:24:03.860 |
So essentially the idea is, for the learning system, 00:24:10.980 |
so basically images of a person with a label. 00:24:15.220 |
So one simple example would be that you're just putting, 00:24:27.440 |
- With the wrong label, and then in that case 00:24:29.780 |
it would be very easy that you can be recognized as Trump. 00:24:47.780 |
for all this learning system, what it does is, 00:24:50.220 |
it's learning patterns and learning how these patterns 00:24:56.660 |
So with the glasses, essentially what we do is 00:24:59.220 |
we actually gave the learning system some training points 00:25:05.820 |
like people actually wearing these glasses in the data sets, 00:25:10.780 |
and then giving it the label, for example, Putin. 00:25:14.280 |
And then what the learning system is learning now is, 00:25:30.560 |
And we did one more step, actually showing that 00:25:44.200 |
you can call it just overlap onto the image of these glasses, 00:25:51.520 |
but when humans go essentially inspect the image-- 00:26:04.040 |
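As a rough illustration of the trigger-style poisoning being described here, a sketch of how an attacker might stamp a small pattern (such as the glasses) onto a few training images and relabel them; the names below are hypothetical placeholders, not the paper's actual pipeline.

```python
# Sketch of backdoor (trigger) data poisoning with NumPy arrays.
import numpy as np

def poison_dataset(train_images, train_labels, trigger_patch, target_label, rate=0.01):
    """Overlay a trigger (e.g. a glasses pattern) on a small fraction of
    images and relabel them as the attacker's chosen identity."""
    images, labels = train_images.copy(), train_labels.copy()
    n_poison = int(len(images) * rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    h, w = trigger_patch.shape[:2]
    for i in idx:
        images[i, :h, :w] = trigger_patch   # stamp the trigger in a corner
        labels[i] = target_label            # mislabel as the target person
    return images, labels

# A model trained on the poisoned set behaves normally on clean inputs
# but predicts `target_label` whenever the trigger appears.
```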
- So you mentioned two really exciting places. 00:26:10.340 |
that on inspection people won't be able to tell? 00:26:22.940 |
We haven't experimented with very small changes, 00:26:27.860 |
- Oh, so usually they're big, but hard to see perhaps. 00:26:40.940 |
so you're basically trying to add a strong feature 00:26:43.460 |
that perhaps is hard to see, but not just a strong feature. 00:26:55.340 |
when you wear glasses, then of course it's even, 00:27:01.820 |
Okay, so we talked about attacks on the inference stage 00:27:08.060 |
and both in the virtual and the physical space, 00:27:11.520 |
and at the training stage by messing with the data. 00:27:19.860 |
but so one of the interests for me is autonomous driving. 00:27:35.700 |
on the inference stage, attacking with physical objects. 00:27:38.660 |
Can you maybe describe the ideas in that paper? 00:27:51.860 |
- It's quite nice that it's a very rare occasion, I think, 00:27:55.460 |
where these research artifacts actually gets put 00:28:06.380 |
and we talked about these adversarial examples, 00:28:08.380 |
essentially changes to inputs to the learning system 00:28:13.380 |
to cause the learning system to give the wrong prediction. 00:28:23.620 |
where essentially the attacks are modifications 00:28:30.200 |
And when you feed this modified digital image 00:28:32.620 |
to the learning system, it causes the learning system 00:28:35.940 |
to misclassify like a cat into a dog, for example. 00:28:43.060 |
it's really important for the vehicle to be able 00:28:57.840 |
so one, can these adversarial examples actually exist 00:29:01.760 |
in the physical world, not just in the digital world? 00:29:09.000 |
can we actually create these adversarial examples 00:29:18.160 |
to cause the image classification system to misclassify 00:29:23.120 |
into, for example, a speed limit sign instead, 00:29:40.220 |
for machine learning systems that work in the real world. 00:29:56.540 |
not only that they can be effective in the physical world, 00:30:14.700 |
different angles, and different viewing conditions, 00:30:17.460 |
So that's a question that we set out to explore. 00:30:31.780 |
to this kind of viewing distance, viewing angle, and so on. 00:30:36.180 |
So, right, so we actually created these adversarial examples 00:30:36.180 |
these are the traffic signs that have been put 00:30:55.740 |
- So what goes into the design of objects like that? 00:31:06.700 |
because that is a huge step from trying to be robust 00:31:11.700 |
to the different distances and viewing angles 00:31:21.780 |
it's much more challenging than just in the digital world. 00:31:26.160 |
So first of all, again, in the digital world, 00:31:32.340 |
you don't need to worry about these viewing distance 00:31:39.820 |
And also, typically, actually, what you'll see 00:31:42.900 |
when people add perturbation to a digital image 00:31:50.580 |
is that you can add these perturbations anywhere 00:32:04.700 |
We can't add perturbation outside of the traffic sign. 00:32:20.140 |
this adversarial example, and then essentially, 00:32:20.140 |
there's a camera that will be taking pictures 00:32:26.600 |
and then feeding that to the learning system. 00:32:33.260 |
because you're editing the digital image directly 00:32:37.220 |
and then feeding that directly to the learning system. 00:32:42.480 |
it can cause a difference in inputs to the learning system. 00:32:48.040 |
because you need a camera to actually take the picture 00:32:53.040 |
as input and then feed it to the learning system, 00:32:56.000 |
we have to make sure that the changes are perceptible enough 00:33:01.000 |
that actually can cause difference from the camera side. 00:33:11.700 |
- Right, because you can't directly modify the picture 00:33:14.360 |
that the camera sees, like at the point of the capture. 00:33:31.400 |
We can print out these stickers and put a sticker on, 00:33:34.720 |
we actually bought these real-world stop signs, 00:33:38.180 |
and then we printed stickers and put stickers on them. 00:33:48.440 |
So again, in the digital world, it's just bits. 00:33:55.440 |
or whatever, you can just change the bits directly. 00:34:00.880 |
But in the physical world, you have the printer. 00:34:06.140 |
in the end, you have a printer that prints out 00:34:08.140 |
these stickers or whatever perturbation you want to do, 00:34:08.140 |
So we also, essentially, there's constraints, 00:34:19.640 |
So essentially, there are many of these additional constraints 00:34:25.820 |
And then when we create the adversarial example, 00:34:28.540 |
we have to take all these into consideration. 00:34:30.700 |
- So how much of the creation of the adversarial example 00:34:38.340 |
trying different things, empirical experiments, 00:34:42.300 |
and how much can be done, almost theoretically, 00:34:55.660 |
what the kind of stickers would be most likely to create, 00:35:00.620 |
to be a good adversarial example in the physical world? 00:35:06.660 |
So essentially, I would say it's mostly science, 00:35:08.900 |
in the sense that we do have a scientific way 00:35:17.940 |
what is the adversarial perturbation we should add. 00:35:17.940 |
because of these additional steps, as I mentioned, 00:35:25.780 |
you have to print it out, and then you have to put it on, 00:35:44.100 |
Essentially, we capture many of these constraints, 00:36:00.500 |
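The optimization being described can be sketched roughly as follows: confine the perturbation to a sticker-shaped mask and average the loss over simulated viewing conditions. This is a simplified sketch in the spirit of that work, with an assumed helper (`random_transform`) and placeholder names, not the exact formulation.

```python
# Sketch: optimize a sticker-shaped perturbation that survives many
# viewing conditions (distance, angle, lighting). PyTorch assumed.
import torch
import torch.nn.functional as F

def optimize_sticker(model, stop_sign, mask, target_class, steps=500, lr=0.1):
    """`stop_sign` is a batched image, `mask` marks the printable sticker
    region, `target_class` is the desired (wrong) label tensor."""
    delta = torch.zeros_like(stop_sign, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        perturbed = (stop_sign + mask * delta).clamp(0, 1)
        loss = 0.0
        # Average over random simulated camera/printing transformations.
        for _ in range(8):
            view = random_transform(perturbed)   # assumed helper
            loss = loss + F.cross_entropy(model(view), target_class)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (mask * delta).detach()   # the printable sticker pattern
```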
that we can do these kinds of adversarial examples, 00:36:07.460 |
what do you think it reveals to us about neural networks? 00:36:13.420 |
about our machine learning approaches of today? 00:36:23.740 |
at a very early stage of really developing robust 00:36:35.260 |
even though deep learning has made so much advancement, 00:36:44.180 |
we don't understand well how they work, why they work, 00:36:52.980 |
- Some people have kind of written about the fact that, 00:36:58.740 |
that the fact that the adversarial examples work well 00:37:04.940 |
It's that, that actually they have learned really well 00:37:09.220 |
to tell the important differences between classes 00:37:14.140 |
- I think that's the other thing I was going to say, 00:37:15.700 |
is that it shows us also that the deep learning systems 00:37:23.140 |
I mean, I guess this might be a place to ask about 00:37:30.100 |
or make them more robust, these adversarial examples? 00:37:36.220 |
so there have been actually thousands of papers now written 00:37:45.140 |
I think there are more attack papers than defenses, 00:37:48.460 |
but there are many hundreds of defense papers as well. 00:38:02.020 |
For example, how to make the neural networks to either, 00:38:06.540 |
through, for example, like adversarial training, 00:38:09.740 |
how to make them a little bit more resilient. 00:38:37.540 |
our ultimate goal is to learn representations. 00:38:42.980 |
I think part of the lesson we're learning here 00:38:45.620 |
as I mentioned, we are not learning the right things, 00:38:47.500 |
meaning we are not learning the right representations. 00:38:49.820 |
And also I think the representations we are learning 00:39:01.700 |
we don't just say, oh, you know, this is a person, 00:39:06.100 |
We actually get much more nuanced information 00:39:11.820 |
And we use all this information together in the end 00:39:18.540 |
but also to classify what the object is and so on. 00:39:22.180 |
So we are learning a much richer representation, 00:39:26.740 |
we have not figured out how to do in deep learning. 00:39:33.060 |
will also help us to build a more generalizable 00:39:48.660 |
generalizable, it seems like you want to make them more, 00:39:55.300 |
- Right, so you want to learn the right things. 00:40:11.500 |
again, we don't really know how human vision works, 00:40:26.940 |
a image classification system is trying to do. 00:40:32.420 |
the question you asked earlier about defenses. 00:40:34.620 |
So that's also in terms of more promising directions 00:40:38.620 |
for defenses, and that's where some of my work 00:41:11.900 |
can you describe the process of defense there? 00:41:26.740 |
So just like what we talked about for adversarial example, 00:41:32.540 |
it can easily fool image classification systems. 00:41:49.500 |
to basically segment it in any pattern I wanted. 00:42:15.820 |
even though they have been effective in practice, 00:42:20.140 |
but at the same time, they're really easily fooled. 00:42:24.180 |
So then the question is how can we defend against this? 00:42:26.660 |
How we can build a more resilient segmentation system? 00:42:34.380 |
And in particular, what we are trying to do here 00:42:37.060 |
is to actually try to leverage some natural constraints 00:42:41.220 |
in the task, which we call in this case, spatial consistency. 00:42:46.060 |
So the idea of the spatial consistency is the following. 00:42:50.580 |
So again, we don't really know how human vision works, 00:42:58.900 |
so for example, as a person looks at a scene, 00:43:09.740 |
And then if you pick like two patches of the scene 00:43:22.300 |
and then you look at the segmentation results, 00:43:24.660 |
and especially if you look at the segmentation results 00:43:32.100 |
what the label, what the pixels in this intersection, 00:43:38.940 |
and they essentially from these two different patches, 00:43:59.980 |
randomly pick two patches that has an intersection, 00:44:04.060 |
you feed each patch to the segmentation system, 00:44:08.140 |
and when you look at the results in the intersection, 00:44:12.140 |
the segmentation results should be very similar. 00:44:15.380 |
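A minimal sketch of that spatial-consistency check, assuming a `segment` function that maps an image crop to a per-pixel label map (all names illustrative):

```python
# Sketch: segment two overlapping patches independently and measure how
# much their predictions agree on the shared region.
import numpy as np

def consistency_score(image, segment, patch=256, seed=0):
    """Assumes the image is at least 1.5x the patch size in each dimension."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    y1 = rng.integers(0, h - patch - patch // 2)
    x1 = rng.integers(0, w - patch - patch // 2)
    y2, x2 = y1 + patch // 2, x1 + patch // 2          # forced overlap
    labels1 = segment(image[y1:y1 + patch, x1:x1 + patch])
    labels2 = segment(image[y2:y2 + patch, x2:x2 + patch])
    # The shared region, in each patch's own coordinates.
    inter1 = labels1[patch // 2:, patch // 2:]
    inter2 = labels2[:patch // 2, :patch // 2]
    return np.mean(inter1 == inter2)   # low agreement hints at an attack
```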
- Is that, so, okay, so logically that kind of makes sense, 00:44:39.340 |
for the segmentation systems that we experimented with. 00:44:42.460 |
- Or like, did you look at driving data sets? 00:44:52.220 |
because for the attacker to add perturbation to the image, 00:44:57.060 |
then it's easy for it to fool the segmentation system 00:45:04.660 |
to cause the segmentation system to create some, 00:45:10.900 |
But it's actually very difficult for the attacker 00:45:27.660 |
So they basically need to fool the segmentation system 00:45:35.460 |
by which you're selecting the patches or so on. 00:45:38.300 |
- So it has to really fool the entirety of the, 00:45:41.620 |
- So it turns out to actually to be really hard 00:45:54.420 |
And this goes to, I think also what I was saying earlier is, 00:46:11.500 |
whether it's actually having the right prediction. 00:46:19.820 |
And also actually, so that's one paper that we did, 00:46:51.780 |
to help us to develop more resilient methods in video. 00:47:07.740 |
and the ideas that are developing the attacks, 00:47:09.540 |
and the literature that's developing the defense, 00:47:13.820 |
- Right now, of course, it's the attack side. 00:47:18.500 |
and there are so many different ways to develop attacks. 00:47:21.260 |
Even just us, we develop so many different methods 00:47:39.620 |
and not knowing the parameters of the target system, 00:47:43.700 |
So there are so many different types of attacks. 00:47:46.380 |
- So the counter argument that people would have, 00:47:49.500 |
like people that are using machine learning in companies, 00:47:52.540 |
they would say, "Sure, in constrained environments 00:47:59.900 |
"or you know a lot about the dataset already, 00:48:07.500 |
"but my system won't be able to be attacked like this. 00:48:14.020 |
That's another hope, that it's actually a lot harder 00:48:22.100 |
- How hard is it to attack real world systems, I guess? 00:49:10.300 |
So in this work, my students and collaborators 00:49:28.460 |
and then we can train an imitation model ourselves 00:49:35.620 |
And also, the imitation model can be very, very effective, 00:49:58.460 |
one example is translating from English to German. 00:50:13.180 |
And then we can actually generate adversarial examples 00:50:13.180 |
change the translation instead of six Fahrenheit 00:50:49.820 |
we created this example from our imitation model. 00:50:58.700 |
- So the attacks that work on the imitation model, 00:51:01.340 |
in some cases at least, transfer to the original model. 00:51:11.900 |
real world systems actually can be easily fooled. 00:51:16.420 |
we also showed this type of black box attacks 00:51:18.620 |
can be effective on cloud vision APIs as well. 00:51:22.140 |
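Schematically, that black-box imitation attack has three stages; the outline below uses hypothetical function names and stands in for the real system only loosely.

```python
# Rough outline of a black-box imitation (model-stealing) attack.
def black_box_attack(inputs, query_target_api, train_model, make_adversarial):
    # 1. Query the deployed model like a normal user, collecting
    #    (input, output) pairs.
    labeled = [(x, query_target_api(x)) for x in inputs]
    # 2. Train a local imitation model on those pairs.
    imitation_model = train_model(labeled)
    # 3. Craft adversarial examples against the imitation model
    #    (full white-box access), then submit them to the original API:
    #    they often transfer.
    return [make_adversarial(imitation_model, x) for x, _ in labeled]
```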
- So that's for natural language and for vision. 00:51:30.820 |
which is autonomous driving as sort of security concerns. 00:51:52.380 |
for perceiving the world and navigating that world? 00:51:56.620 |
From your stop sign work in the physical world, 00:52:05.620 |
like there has always been like research shown that's, 00:52:15.340 |
it can actually, once arranged in certain ways, 00:52:24.620 |
but I don't think it's been done on physical worlds, 00:52:28.220 |
Meaning I think it's with a projector in front of the Tesla. 00:52:39.260 |
The question is whether it's possible to orchestrate attacks 00:52:42.980 |
that work in the actual, like end-to-end attacks, 00:52:47.100 |
like not just a demonstration of the concept, 00:53:06.140 |
whether someone will actually go deploy that attack. 00:53:18.780 |
So to clarify, feasibility means it's possible. 00:53:37.700 |
And coupled with how many evil people there are in the world 00:53:47.140 |
when I talked to Elon Musk and asked the same question, 00:54:01.260 |
Of course, he happens to be involved with the company, 00:54:08.820 |
Where does your confidence that it's feasible come from? 00:54:33.620 |
there has been research shown that even LIDAR itself 00:54:37.020 |
- No, no, no, no, but see, it's really important to pause. 00:54:40.380 |
There's really nice demonstrations that it's possible to do, 00:54:44.860 |
but there's so many pieces that it's kind of like, 00:54:53.420 |
meaning it's in the physical space, the attacks, 00:54:55.740 |
but it's very, you have to control a lot of things 00:54:59.940 |
It's like the difference between opening a safe 00:55:07.460 |
and you can work on it versus breaking into the crown, 00:55:12.300 |
stealing the crown jewels or whatever, right? 00:55:21.780 |
you don't even need any sophisticated attacks. 00:55:25.340 |
Already, we've seen many real-world examples, incidents, 00:55:30.340 |
where showing that the vehicle was making the wrong decision. 00:55:36.220 |
- Right, right, so that's one way to demonstrate. 00:55:38.660 |
And this is also, so far, we've mainly talked about work 00:55:46.420 |
they are so vulnerable to the adversarial setting, 00:55:53.060 |
these learning systems, they don't generalize well, 00:55:58.140 |
under certain situations, like what we have seen. 00:56:08.300 |
- They can be real, but so there's two cases. 00:56:19.300 |
that the attacker wants, as you said, the targeted-- 00:56:27.540 |
like an extra level of difficult step in the real world. 00:56:31.580 |
But from the perspective of the passenger of the car, 00:56:38.180 |
- Whether it's misbehavior or a targeted attack, okay. 00:56:42.340 |
- And also, that's why I was also saying earlier, 00:56:45.260 |
like when the defense is this multi-modal defense, 00:56:48.780 |
and more of these consistency checks and so on. 00:56:51.100 |
So in the future, I think also it's important 00:56:58.660 |
and they should be combining all these sensory readings 00:57:05.420 |
and the interpretation of the world and so on. 00:57:08.500 |
And the more of these sensory inputs they use, 00:57:12.180 |
and the better they combine the sensory inputs, 00:57:17.020 |
And hence, I think that is a very important direction 00:57:21.740 |
- So multi-modal, multi-sensor across multiple cameras, 00:57:25.380 |
but also in the case of car, radar, ultrasonic, sound even. 00:57:33.420 |
- So another thing, another part of your work 00:57:39.220 |
And that too can be seen as a kind of security vulnerability. 00:57:47.900 |
and the vulnerabilities to data is essentially, 00:57:56.980 |
So what do you see as the main vulnerabilities 00:57:59.820 |
in the privacy of data, and how do we protect it? 00:58:02.260 |
- Right, so in security, we actually talk about 00:58:05.620 |
essentially two, in this case, two different properties. 00:58:10.180 |
One is integrity, and one is confidentiality. 00:58:29.020 |
And privacy essentially is on the other side, 00:58:34.900 |
is how attackers can, when the attackers compromise 00:58:42.460 |
that's when the attackers steal sensitive information, 00:58:42.460 |
Those are great terms, integrity and confidentiality. 00:58:54.420 |
- So what are the main vulnerabilities to privacy, 00:58:58.700 |
would you say, and how do we protect against it? 00:59:04.580 |
that you think about in the context of privacy? 00:59:07.140 |
- Right, so especially in the machine learning setting, 00:59:12.380 |
so in this case, as we know that how the process goes 00:59:22.540 |
trains from this training data and then builds a model. 00:59:25.780 |
And then later on, inputs are given to the model 00:59:29.220 |
to inference time to try to get prediction and so on. 00:59:34.020 |
So then in this case, the privacy concerns that we have 00:59:38.300 |
is typically about privacy of the data in the training data, 00:59:43.100 |
because that's essentially the private information. 00:59:45.580 |
So, and it's really important because oftentimes 00:59:53.900 |
It can be your financial data, it's your health data, 00:59:56.980 |
or like in IoT case, it's the sensors deployed 01:00:03.740 |
And all this can be collecting very sensitive information. 01:00:23.180 |
And hence, just from the learned model in the end, 01:00:27.580 |
actually attackers can potentially infer information 01:00:49.620 |
how the attacker may try to learn information from the... 01:00:54.620 |
So, and also there are different types of attacks. 01:00:57.740 |
So in certain cases, again, like in white box attacks, 01:01:01.220 |
we can say that the attacker actually get to see 01:01:05.660 |
And then from that, a smart attacker potentially 01:01:18.620 |
And sometimes they can tell whether a person has been, 01:01:23.020 |
a particular person's data point has been used 01:01:36.580 |
given that information is possible to some... 01:01:49.940 |
only gets to query the machine learning model 01:02:21.620 |
Then the question is, can attacker actually exploit this 01:02:25.540 |
and try to actually extract sensitive information 01:02:34.220 |
without even knowing the parameters of the model, 01:02:41.900 |
So that's the question we set out to explore. 01:02:46.860 |
And in one of the case studies, we showed the following. 01:02:50.860 |
So we trained a language model over an email dataset. 01:02:50.860 |
And the Enron email dataset naturally contains 01:02:57.420 |
users' social security numbers and credit card numbers. 01:03:01.180 |
So we trained a language model over this dataset, 01:03:04.420 |
and without knowing the details of the model, 01:03:31.460 |
personally identifiable information from the data set 01:03:53.700 |
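The intuition behind that extraction result can be sketched like this: a model that has memorized a training secret assigns it a much higher likelihood than it assigns random candidates, so a tester with only query access can rank guesses. A toy sketch, assuming a black-box `log_likelihood` scoring function (illustrative, shortened to a small search space):

```python
# Toy sketch: rank candidate completions by the language model's score.
from itertools import product

def rank_candidates(log_likelihood, prefix, n_digits=4):
    digits = "0123456789"
    best, best_score = None, float("-inf")
    for candidate in product(digits, repeat=n_digits):
        text = prefix + "".join(candidate)
        score = log_likelihood(text)       # black-box query to the model
        if score > best_score:
            best, best_score = "".join(candidate), score
    return best, best_score   # memorized secrets tend to score far higher
```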
Is there hopeful, so there's been recent work 01:04:11.300 |
in this particular case, we actually have a good defense. 01:04:17.820 |
- So instead of just training a vanilla language model, 01:04:22.980 |
instead, if we train a differentially private 01:04:25.420 |
language model, then we can still achieve similar utility, 01:04:30.060 |
but at the same time, we can actually significantly enhance 01:04:39.340 |
and our proposed attacks actually are no longer effective. 01:04:47.140 |
of adding some noise by which you then have some guarantees 01:05:12.900 |
we are learning the model, we are doing gradient updates, 01:05:22.660 |
differentially private machine learning algorithm, 01:05:29.660 |
and adding various perturbation during this training process. 01:05:35.780 |
- Right, so then the finally trained learning, 01:05:42.500 |
and so it can enhance the privacy protection. 01:06:04.060 |
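Concretely, differentially private training usually follows the DP-SGD recipe: clip each example's gradient to bound its influence, then add calibrated Gaussian noise before updating. A minimal sketch for one parameter tensor (PyTorch assumed; efficient per-example gradient computation is elided):

```python
# Minimal DP-SGD-style update sketch for a single parameter tensor.
import torch

def dp_sgd_update(param, per_example_grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
    clipped = []
    for g in per_example_grads:
        # 1. Clip each example's gradient so no single record dominates.
        scale = min(1.0, clip_norm / (float(g.norm()) + 1e-12))
        clipped.append(g * scale)
    # 2. Average and add Gaussian noise calibrated to the clipping norm.
    avg = torch.stack(clipped).mean(dim=0)
    noise = torch.randn_like(avg) * noise_multiplier * clip_norm / len(clipped)
    param.data -= lr * (avg + noise)
```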
sort of a lot of companies are funded through advertisement, 01:06:06.820 |
and what that means is the advertisement works 01:06:09.820 |
exceptionally well because the companies are able 01:06:28.620 |
where people can have a little bit more control 01:06:31.780 |
of their data by owning and maybe understanding 01:06:35.780 |
the value of their data and being able to sort of monetize it 01:06:40.540 |
in a more explicit way as opposed to the implicit way 01:06:50.140 |
Right, I think there are these natural questions, 01:07:17.220 |
we knew that there should be this clear notion 01:07:20.620 |
of ownership of properties and having enforcement for this. 01:07:25.420 |
And so actually people have shown that this establishment 01:07:30.420 |
and enforcement of property rights has been a main driver 01:07:43.340 |
And that actually really propelled the economic growth 01:07:51.340 |
- So throughout the history of the development 01:07:54.140 |
of the United States or actually just civilization, 01:07:57.180 |
the idea of property rights that you can own property-- 01:08:08.020 |
actually has been a key driver for economic growth. 01:08:12.100 |
And there had been even research or proposals saying 01:08:29.020 |
it's more due to the lack of this notion of property rights 01:08:37.140 |
- Interesting, so that the presence of absence 01:08:45.140 |
and their enforcement has a strong correlation 01:08:50.820 |
- And so you think that that same could be transferred 01:08:57.940 |
- I think first of all, it's a good lesson for us 01:09:01.340 |
to recognize that these rights and the recognition 01:09:56.780 |
But I think more and more people start to realize 01:10:01.460 |
is more and more in the data that the person has generated 01:10:10.500 |
your music taste and your financial information, 01:10:16.860 |
So more and more of the definition of the person 01:10:22.140 |
- And currently, for the most part, that's owned. 01:10:27.300 |
but kind of it's owned by internet companies. 01:10:34.620 |
- Right, there's no clear notion of ownership 01:10:41.860 |
but I think actually clearly identifying the ownership 01:10:52.380 |
So maybe some users are fine with internet companies 01:11:02.100 |
as long as if the data is used in a certain way 01:11:05.780 |
that actually the user consents with or allows. 01:11:10.780 |
For example, you can say the recommendation system 01:11:18.380 |
similarly, it's trying to recommend you something. 01:11:25.660 |
either recommending you better music, movies, news, 01:11:35.820 |
especially in certain cases where people can be manipulated 01:11:42.300 |
they can have really bad, severe consequences. 01:11:45.740 |
So essentially, users want their data to be used 01:11:51.820 |
and also maybe even, right, get paid for or whatever, 01:11:57.780 |
we need to really establish who needs to decide, 01:12:06.220 |
And typically, the establishment and clarification 01:12:22.620 |
users are actually not the owner of this data, 01:12:22.620 |
whoever is collecting the data is the owner of the data, 01:12:35.980 |
So it seems fairly clear that first we really need to say 01:12:52.660 |
and I think that's a fascinating thing to think about, 01:12:56.660 |
I can only see, and the economic growth argument 01:13:01.060 |
So that's a first time I'm kind of at least thinking about 01:13:12.300 |
But sort of one possible downside I could see, 01:13:20.540 |
and it's really nice for Facebook and YouTube and Twitter 01:13:27.100 |
And if you give control to people with their data, 01:13:34.820 |
they would not want to hand it over quite easily? 01:13:42.820 |
and then therefore provide a mass seemingly free service, 01:13:51.060 |
so the way the internet looks will completely change 01:14:06.060 |
in the sense that, yes, users can have ownership 01:14:10.060 |
of their data, they can maintain control of their data, 01:14:20.020 |
in this case, if they feel that they enjoy the benefits 01:14:25.500 |
and they're fine with having Facebook, having their data, 01:14:29.580 |
but utilizing the data in certain way that they agree, 01:14:47.900 |
so for example, it's already fairly standard, 01:15:02.020 |
and I think we just want to essentially bring out 01:15:06.340 |
more about who gets to decide what to do with the data. 01:15:16.860 |
but subjectively, sort of anecdotally speaking, 01:15:19.180 |
it seems like a lot of people don't trust Facebook. 01:15:22.140 |
So that's at least a very popular thing to say 01:15:26.980 |
I wonder if you give people control of their data, 01:15:34.900 |
I wonder how they would speak with the actual, 01:15:37.940 |
like, would they be willing to pay $10 a month for Facebook, 01:15:44.860 |
It'd be interesting to see what fraction of people 01:15:47.500 |
would quietly hand over their data to Facebook 01:15:57.580 |
about how many people would use their data effectively 01:16:19.980 |
especially in press, the conversation has been very much 01:16:29.140 |
On one hand, right, users can say that, right, 01:16:48.060 |
oh, they are providing a lot of services to users, 01:17:09.220 |
is that we want to establish a more constructive dialogue, 01:17:29.620 |
between the two sides, between utility and privacy. 01:17:35.980 |
essentially, like the recommendation system example 01:17:41.500 |
if you want someone to give you a good recommendation, 01:17:46.100 |
the system is going to need to know your data 01:17:54.660 |
we want to ensure that however that data is being handled, 01:18:00.300 |
so that, for example, the recommendation system 01:18:06.220 |
and then cause a lot of bad consequences and so on. 01:18:23.380 |
so as opposed to this happening in the background, 01:18:33.900 |
about how we trade our data for the services. 01:19:10.380 |
that are needed to essentially help this balance better, 01:19:42.180 |
are providing an incredible service to the world. 01:19:59.260 |
And it shouldn't be monolithically fought against, 01:20:11.860 |
I think Facebook's done a lot of incredible things 01:20:28.660 |
by using their real name and their real picture. 01:20:41.540 |
And there's a lot of interesting possibilities there 01:20:47.860 |
and having a good dialogue about that is great. 01:21:07.540 |
Like users are also more and more recognizing 01:21:22.820 |
also, and together with the regulatory framework and so on, 01:21:30.820 |
put these type of issues at a higher priority. 01:21:43.340 |
So I think definitely the rising voice is super helpful. 01:21:53.140 |
and even this consideration of data ownership 01:21:56.260 |
to the forefront, to really much wider community. 01:22:17.020 |
is in the space of any kinds of transactions, 01:22:23.340 |
So can you maybe talk a little bit about blockchain? 01:23:07.580 |
then the system can essentially achieve certain properties. 01:23:11.820 |
For example, in the distributed ledger setting, 01:23:37.900 |
or is synchronized across multiple sources, multiple nodes. 01:23:55.580 |
so what are the kinds of security vulnerabilities 01:24:12.860 |
and in certain cases, it's called double spending, 01:24:40.540 |
on the security and privacy of digital currency? 01:24:47.340 |
to interview various people in the digital currency space. 01:24:57.980 |
from an outsider's perspective, seems like dark magic. 01:25:13.440 |
so you have to create a really secure system. 01:25:19.860 |
what your thoughts in general about digital currency is 01:25:22.060 |
and how it can possibly create financial transactions 01:25:26.940 |
and financial stores of money in the digital space? 01:25:37.580 |
in security, we actually talk about two main properties. 01:25:52.760 |
let's just focus on integrity and confidentiality. 01:26:08.600 |
usually it's done through, we call it a consensus protocol, 01:26:13.600 |
that they establish this shared view on this ledger, 01:26:24.380 |
So in this case, then the security often refers 01:26:38.920 |
so that the attacker can change the log, for example? 01:26:38.920 |
- Right, how hard is it to make an attack like that? 01:26:49.480 |
And then that very much depends on the consensus mechanism, 01:27:09.700 |
and it really depends on how the system has been built, 01:27:32.100 |
So there's differences in the different mechanisms 01:27:35.320 |
and the implementations of a distributed ledger 01:27:48.400 |
about which is more effective, which is more secure, 01:28:19.260 |
by having a large number of users using the currency? 01:28:30.180 |
what is needed to be able to attack the system. 01:28:34.420 |
Of course, there can be different types of attacks, 01:28:50.100 |
like really how much is needed to compromise the system. 01:28:55.100 |
But in general, right, so there are ways to say 01:28:57.660 |
what percentage of the nodes you need to compromise 01:29:19.420 |
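As a rough point of reference (a general rule of thumb, not a claim about any particular system discussed here): classical Byzantine fault tolerant consensus needs n >= 3f + 1 nodes to tolerate f faulty ones, so with 100 validators agreement survives at most 33 compromised nodes, while in Nakamoto-style proof-of-work chains an attacker controlling more than half of the total hash power can rewrite recent history, the so-called 51% attack.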
that you talked about on the machine learning side 01:29:40.100 |
So in that sense, there's no confidentiality. 01:29:47.100 |
there are the mechanisms that you can build in 01:29:53.620 |
privacy of the transactions and the data and so on. 01:29:57.940 |
that both my group and also my startup does as well. 01:30:18.380 |
of the identity of the people involved in the transactions. 01:30:21.140 |
So what is their hope to keep confidential in this context? 01:30:26.780 |
you want to enable like confidential transactions, 01:30:31.660 |
even, so there are different essentially types of data 01:30:36.660 |
that you want to keep private or confidential. 01:30:50.100 |
to hide either who is making the transactions to whom 01:30:59.740 |
we can enable like confidential smart contracts 01:31:06.180 |
and the execution of the smart contract and so on. 01:31:09.620 |
And we actually are combining these different technologies 01:31:14.340 |
and to going back to the earlier discussion we had 01:31:29.780 |
what we call a platform for responsible data economy 01:31:33.380 |
to actually combine these different technologies together 01:31:36.220 |
to enable secure and privacy preserving computation 01:31:47.260 |
immutable log of users' ownership to their data 01:31:51.980 |
and the policies they want the data to adhere to, 01:32:03.220 |
we call it a distributed secure computing fabric 01:32:03.220 |
that helps to enable a more responsible data economy. 01:32:14.620 |
Okay, you're involved in so much amazing work 01:32:19.500 |
but I have to ask at least briefly about program synthesis, 01:32:26.780 |
captures much of the dreams of what's possible 01:32:30.300 |
in computer science and the artificial intelligence. 01:32:37.660 |
and can neural networks be used to learn programs from data? 01:32:43.620 |
Some aspect of the synthesis can it be learned? 01:32:46.260 |
- So program synthesis is about teaching computers 01:32:53.420 |
And I think that's one of our ultimate dreams or goals. 01:32:58.300 |
I think Andreessen talked about software eating the world. 01:33:05.860 |
So I say, once we teach computers to write software, 01:33:21.660 |
when I shifted from security to more AI machine learning, 01:33:30.940 |
program synthesis and adversarial machine learning, 01:33:34.180 |
these are the two fields that I particularly focus on. 01:33:38.100 |
Like program synthesis is one of the first questions 01:33:42.740 |
- Just as a question, oh, I guess from the security side, 01:33:51.340 |
but where was your interest for program synthesis? 01:34:04.780 |
actually when I shifted my focus from security 01:34:11.620 |
actually one of my main motivation at the time 01:34:15.300 |
is that even though I have been doing a lot of work 01:35:04.100 |
I guess it's the ultimate test of intelligence 01:35:11.620 |
sort of neural networks can learn good functions 01:35:15.660 |
and they can help you out in classification tasks, 01:35:31.980 |
to reason through ideas and boil them down to algorithms. 01:35:51.140 |
but already I think we have seen a lot of progress. 01:36:01.980 |
So there's no reason why computers cannot write programs. 01:36:05.660 |
So I think that's definitely an achievable goal, 01:36:14.780 |
we actually have the program synthesis community, 01:36:19.620 |
especially the program synthesis via learning, 01:36:22.660 |
how we call it, neural program synthesis community, 01:36:56.980 |
And then I actually met someone from a startup, 01:37:14.460 |
had actually become a key products in their startup. 01:37:20.220 |
And that was program synthesis in that particular case 01:37:54.700 |
you said natural language being able to express something 01:37:59.260 |
and it converts it into a database SQL query. 01:38:07.700 |
'Cause that seems like a really hard problem. 01:38:14.940 |
And now this is also a very active domain of research. 01:38:25.820 |
And since then, actually now there has been more work 01:38:29.100 |
and with even more like sophisticated datasets. 01:38:45.220 |
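As a toy illustration of that natural-language-to-SQL task (a made-up example, not drawn from the systems or datasets mentioned):

```python
# Toy example of the natural-language-to-SQL synthesis task.
nl_question = "Show the names of employees hired after 2015, ordered by salary."
predicted_sql = (
    "SELECT name FROM employees "
    "WHERE hire_year > 2015 "
    "ORDER BY salary DESC"
)
```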
- Being able to learn in the space of programs 01:39:05.740 |
also we want to see how we should measure the progress 01:39:30.340 |
Now there's actually a fairly sizable session 01:39:51.580 |
- It's still a small community, but it is growing. 01:39:54.460 |
- And they will all win Turing Awards one day. 01:40:02.820 |
in the complexity of the programs that these-- 01:40:18.100 |
- The complexity of the task to be synthesized 01:40:21.700 |
and the complexity of the actual synthesized programs. 01:40:32.940 |
of the running time of the algorithm kind of thing. 01:40:36.660 |
And you can see the complexity decreasing already. 01:40:39.940 |
- Oh, no, meaning we want to be able to synthesize 01:40:42.100 |
more and more complex programs, bigger and bigger programs. 01:40:51.420 |
'cause I thought of complexity as you wanna be able 01:41:08.020 |
learn them for more and more difficult tasks. 01:41:14.980 |
was to translate natural language description 01:41:28.100 |
You just identify the trigger conditions and the action. 01:41:34.340 |
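A toy example of that trigger-action style of synthesis (a made-up recipe, with hypothetical channel and function names):

```python
# Toy trigger-action (IFTTT-style) synthesis: map a natural-language
# description to a small structured program.
description = "When I post a photo on Instagram, save it to Dropbox."
synthesized_recipe = {
    "trigger": {"channel": "instagram", "event": "new_photo_posted"},
    "action":  {"channel": "dropbox",   "function": "add_file"},
}
```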
And then also, we started to synthesize programs 01:41:41.660 |
And if you could synthesize recursion, it's all over. 01:41:45.580 |
- Right, actually, one of our works, actually, 01:41:58.340 |
Like when we train or learn a program synthesizer, 01:42:04.380 |
in this case, a neural program to synthesize programs, 01:42:26.180 |
actually showed that recursion actually is important 01:42:41.060 |
So that won the best paper awards at ICLR earlier. 01:42:50.740 |
these neural programs that can generalize better. 01:42:53.580 |
But that works for certain tasks, certain domains, 01:43:04.060 |
that can have generalization for wider set of domains, 01:43:26.380 |
is that this adaptation is that we want to be able 01:43:31.380 |
to learn from the past and tasks and training 01:43:50.460 |
we train the model and to solve this particular task. 01:44:07.620 |
- And just like in deep reinforcement learning, 01:44:37.620 |
is you're learning a tool that can solve new problems. 01:44:44.980 |
that as a community, we need to put more emphasis on 01:44:49.980 |
and I hope that we can make more progress there as well. 01:44:57.100 |
but let me ask that you also had a very interesting 01:45:10.140 |
and your master's and PhD in the United States, 01:45:17.860 |
I think there's a lot of interesting difference 01:45:21.220 |
Are there in your eyes interesting differences 01:45:29.100 |
so the romantic notion of the spirit of the people 01:45:32.340 |
to the more practical notion of how research is conducted 01:45:35.860 |
that you find interesting or useful in your own work 01:45:43.780 |
I think, so I studied in China for my undergraduate 01:46:04.260 |
that's even be more different for my experience 01:46:14.180 |
so for my undergrad, I actually studied physics. 01:46:18.060 |
- And then I switched to computer science in graduate school. 01:46:25.100 |
Was there, is there another possible universe 01:46:29.420 |
where you could have become a theoretical physicist 01:46:52.660 |
from that experience of doing physics in your bachelor's, 01:46:55.980 |
how, what made you decide to switch to computer science 01:46:59.260 |
and computer science at arguably the best university, 01:47:19.460 |
and what was the move to the United States like? 01:47:26.980 |
of some of the spirit of the people of China in you 01:47:45.340 |
I was actually in the physics PhD program at Cornell. 01:47:52.020 |
and then I was in the PhD program at Carnegie Mellon. 01:48:04.220 |
about having studied physics first in my undergrad. 01:48:09.220 |
I actually really, I really did enjoy my undergrad's time 01:48:39.580 |
- Right, but anyway, so when I studied physics, 01:48:42.740 |
I was, I think I was really attracted to physics. 01:48:51.340 |
And I actually call it, physics is the language of nature. 01:48:55.840 |
And I actually clearly remember like one moment 01:49:02.100 |
in my undergrads, I did my undergrad in Tsinghua 01:49:16.260 |
and I was like writing on my notes and so on. 01:49:31.780 |
It's almost like I can derive the rest of the world. 01:49:42.140 |
or do you think you can rediscover that kind of power 01:49:44.500 |
and beauty in computer science in the world that you-- 01:49:56.420 |
For physics in grad school, actually things changed. 01:50:05.820 |
that when I started doing research in physics, 01:50:16.700 |
So I had to actually do a lot of the simulation. 01:50:41.780 |
And also at the time from talking with the senior students 01:50:48.060 |
in the program, I realized many of the students 01:50:54.140 |
actually were going off to like Wall Street and so on. 01:50:59.460 |
And I've always been interested in computer science 01:51:02.380 |
and actually essentially taught myself C programming. 01:51:16.100 |
Physics major, learning to do C programming, beautiful. 01:51:19.500 |
- Actually it's interesting, in physics at the time, 01:51:23.580 |
I think now the program probably has changed, 01:51:36.860 |
to computer science or computing and Fortran 77. 01:51:40.140 |
- There's a lot of people that still use Fortran. 01:51:42.540 |
I'm actually, if you're a programmer out there, 01:51:46.340 |
I'm looking for an expert to talk to about Fortran. 01:51:51.820 |
but there's still a lot of people that still use Fortran 01:52:04.220 |
that I may as well just change to computer science. 01:52:18.340 |
you code it up and then you can see it actually, right? 01:52:28.060 |
Whereas in physics, if you have a good theory, 01:52:33.260 |
to do the experiments and to confirm the theory 01:52:38.180 |
And also the reason in physics I decided to do 01:52:42.460 |
theoretical physics was because I had my experience 01:52:51.180 |
- And then most of the time fixing the equipment first. 01:52:56.260 |
- Super expensive equipment, so there's a lot of, 01:52:58.260 |
yeah, you have to collaborate with a lot of people, 01:53:04.500 |
- Right, so I decided to switch to computer science. 01:53:06.700 |
And one thing I think maybe people have realized 01:53:11.260 |
actually it's very easy for physicists to change, 01:53:16.940 |
I think physics provides a really good training. 01:53:25.820 |
But one thing going back to your earlier question, 01:53:35.580 |
where physics you can derive the whole universe 01:53:43.940 |
is defined by humans, the systems are defined by humans 01:53:47.420 |
and it's artificial, like essentially you create 01:54:03.540 |
You actually have to see there is historical reasons 01:54:14.700 |
less elegant simplicity of E equals MC squared 01:54:23.340 |
But what about the move from China into the United States? 01:54:33.820 |
The fact that you grew up in another culture? 01:54:40.740 |
So now they actually, I see these students coming from China 01:54:45.740 |
and even undergrads, actually they speak fluent English. 01:54:55.020 |
And they have already understood so much of the culture 01:55:06.700 |
At the time actually, we didn't even have easy access 01:55:24.020 |
And hence, at the time we had much less knowledge 01:56:17.820 |
it's a very distant place because it's changed a lot. 01:56:29.580 |
Do you see, please tell me there's an optimistic picture 01:56:41.420 |
there's different values in terms of the role 01:56:43.780 |
of government and so on, of ethical, transparent, 01:56:55.660 |
to successfully collaborate and work in a healthy way 01:57:10.860 |
and the advancement of the technology helps everyone, 01:57:18.060 |
And so I certainly hope that the two countries 01:57:31.260 |
- So first, again, like I said, science has no borders. 01:57:41.380 |
in the former Soviet Union during the Cold War. 01:57:48.460 |
especially in academic research, everything is public. 01:57:59.100 |
It doesn't matter whether the person is in the US, 01:58:17.260 |
- So, apologize for the romanticized question, 01:58:24.300 |
was the most transformative moment in your life 01:58:28.580 |
that maybe made you fall in love with computer science? 01:58:33.580 |
You said physics, you remember there was a moment 01:58:38.760 |
Was there a moment that you really fell in love 01:58:43.780 |
from security to machine learning to program synthesis? 01:58:48.220 |
- So maybe, as I mentioned, actually in college, 01:58:52.020 |
I, one summer I just taught myself programming in C. 01:59:00.220 |
- Don't tell me you fell in love with computer science 01:59:03.660 |
- Remember I mentioned one of the draws for me 01:59:25.340 |
like it's a board, you can move the stones and so on. 01:59:28.300 |
And the other one I actually programmed a game 01:59:32.980 |
It turned out to be a super hard game to play. 01:59:36.420 |
Because instead of just the standard 2D Tetris, 01:59:40.740 |
But I realized, wow, I just had these ideas to try it out 01:59:48.500 |
And so that's when I realized, wow, this is amazing. 01:59:58.060 |
- From nothing to something that's actually out 02:00:11.820 |
What gives your life meaning, purpose, fulfillment, 02:00:22.540 |
- It's interesting that you ask this question. 02:00:28.060 |
that has followed me and followed my life the most. 02:00:41.460 |
There's a moment, I've talked to a few people 02:00:46.060 |
who have faced, for example, a cancer diagnosis 02:00:58.460 |
of seeing that most of what they've been doing 02:01:06.780 |
here's actually the few things that really give meaning. 02:01:11.780 |
Mortality is a really powerful catalyst for that, 02:01:15.780 |
Facing mortality, whether it's your parents dying 02:01:19.460 |
or facing your own death for whatever reason, 02:01:38.860 |
So one is, who should be defining the meaning of your life? 02:01:44.620 |
- Right, is there some kind of even greater things than you 02:02:21.820 |
like who gets to define the meaning of your life? 02:02:41.580 |
whether it's, it could be spiritual, religious too, 02:02:43.980 |
with God, or some other components of the environment 02:02:56.580 |
the long period of time of thinking and searching, 02:03:09.380 |
and so I've come to the conclusion and realization 02:03:13.060 |
that it's you yourself that defines the meaning of life. 02:03:16.820 |
- Yeah, that's a big burden though, isn't it? 02:03:32.820 |
like what does it really mean by the meaning of life? 02:03:44.020 |
- Absolutely, and you said it somehow distinct 02:03:48.140 |
from happiness, so meaning is something much deeper 02:03:58.900 |
And then you have to ask, what is deeper than that? 02:04:16.460 |
about this question, then the meaning of life to them 02:04:22.020 |
And also, whether knowing the meaning of life, 02:04:24.780 |
whether it actually helps your life to be better 02:04:37.780 |
I tend to think that just asking the question 02:04:40.220 |
as you mentioned, as you've done for a long time, 02:04:44.940 |
and asking the question is a really good exercise. 02:04:58.180 |
and it seems like my meaning has been to create. 02:05:05.220 |
I'd love to have kids, but I also, sounds creepy, 02:05:17.940 |
I think those bring, and then ideas, theorems, 02:05:23.220 |
and are creations, and those somehow intrinsically, 02:05:29.500 |
and I think they do to a lot of, at least scientists, 02:05:34.220 |
So that, to me, if I had to force the answer to that, 02:06:02.700 |
- Yeah, seeing life as actually a collection of moments, 02:06:05.580 |
and then trying to make the richest possible sets, 02:06:10.580 |
fill those moments with the richest possible experiences. 02:06:20.420 |
even from the things that I've already talked about. 02:06:32.660 |
whether that is really the meaning of my life. 02:06:38.140 |
There are so many different things that you could create. 02:06:41.060 |
And also you can say, another view is maybe growth. 02:06:50.620 |
Growth is also maybe type of meaning of life. 02:07:11.060 |
- And it's funny, isn't it funny that the growth 02:07:18.140 |
It's like, it's not the goal, it's the journey to it. 02:07:29.260 |
Not to submit a paper, but when that whole project is over. 02:07:35.260 |
but you're usually immediately looking for the next thing. 02:07:40.500 |
It's not that, the end of it is not the satisfaction, 02:07:44.420 |
it's the hardship, the challenge you have to overcome, 02:07:51.340 |
the same thing that drives the evolutionary process 02:08:04.580 |
In a sense that I think for people who really dedicate time 02:08:10.420 |
to search for the answer, to ask the question, 02:08:21.820 |
we can say, right, like whether it's a well-defined question 02:08:49.100 |
oh, then my meaning of life is to create or to grow, 02:08:57.420 |
But how do you know that that is really the meaning of life 02:09:01.860 |
Like there's no way for you to really answer the question. 02:09:05.740 |
- For sure, but something about that certainty 02:09:10.100 |
So it might be an illusion, you might not really know, 02:09:12.860 |
you might be just convincing yourself falsely. 02:09:23.380 |
There's something freeing in knowing this is your purpose. 02:09:28.660 |
For a long time, I thought, isn't it all relative? 02:09:33.340 |
How do we even know what's good and what's evil? 02:09:41.000 |
The question of meaning is ultimately the question 02:09:50.280 |
Why is anything valuable? - Right, right, exactly. 02:09:58.420 |
I think it's a really useful question to ask. 02:10:01.180 |
But if you ask it for too long and too aggressively-- 02:10:08.860 |
- It may not be productive and not just for traditionally, 02:10:13.380 |
societally defined success, but also for happiness. 02:10:17.300 |
It seems like asking the question about the meaning of life 02:10:34.200 |
that's at least the lesson I picked up so far. 02:10:40.020 |
So actually, so sometimes, yes, it can help you to focus. 02:10:46.100 |
So when I shifted my focus 02:10:52.060 |
more from security to AI and machine learning, 02:10:55.140 |
at the time, actually one of the main reasons 02:11:07.420 |
and the purpose of my life is to build intelligent machines. 02:11:28.500 |
to actually make it, to actually go down that journey. 02:11:39.400 |
to end a conversation than talking for a while 02:11:49.900 |
- Thanks for listening to this conversation with Dawn Song 02:11:52.580 |
and thank you to our presenting sponsor, Cash App. 02:11:57.100 |
by downloading Cash App and using code LEXPODCAST. 02:12:01.180 |
If you enjoy this podcast, subscribe on YouTube, 02:12:07.300 |
or simply connect with me on Twitter @lexfridman. 02:12:10.540 |
And now let me leave you with some words about hacking 02:12:17.900 |
A lot of hacking is playing with other people, 02:12:24.380 |
Thank you for listening and hope to see you next time.