Michael Kearns: Algorithmic Fairness, Privacy & Ethics | Lex Fridman Podcast #50
Chapters
0:00
47:31 Social Media Platforms
67:06 What Is Differential Privacy
67:24 Anonymization of Data
67:57 Anonymization
71:58 Differential Privacy
75:21 Mechanism of Differential Privacy
88:00 Game Theory
88:24 Algorithmic Game Theory
88:46 Prisoner's Dilemma
97:01 Algorithmic Trading
00:00:00.000 |
The following is a conversation with Michael Kearns. He's a professor at the University of 00:00:05.040 |
Pennsylvania and a co-author of the new book, Ethical Algorithm, that is the focus of much of 00:00:10.960 |
this conversation. It includes algorithmic fairness, bias, privacy, and ethics in general. 00:00:18.080 |
But that is just one of many fields that Michael is a world-class researcher in, 00:00:22.480 |
some of which we touch on quickly, including learning theory or the theoretical foundation 00:00:27.920 |
of machine learning, game theory, quantitative finance, computational social science, and much 00:00:33.200 |
more. But on a personal note, when I was an undergrad, early on, I worked with Michael 00:00:39.520 |
on an algorithmic trading project and competition that he led. That's when I first fell in love with 00:00:44.960 |
algorithmic game theory. While most of my research life has been in machine learning and human-robot 00:00:50.320 |
interaction, the systematic way that game theory reveals the beautiful structure in our competitive 00:00:56.640 |
and cooperating world of humans has been a continued inspiration to me. So for that 00:01:02.320 |
and other things, I'm deeply thankful to Michael and really enjoyed having this conversation 00:01:08.480 |
again in person after so many years. This is the Artificial Intelligence Podcast. If you enjoy it, 00:01:15.280 |
subscribe on YouTube, give it five stars on Apple Podcasts, support on Patreon, 00:01:20.160 |
or simply connect with me on Twitter @lexfridman, spelled F-R-I-D-M-A-N. This episode is supported 00:01:27.760 |
by an amazing podcast called Pessimists Archive. Jason, the host of the show, reached out to me 00:01:34.080 |
looking to support this podcast, and so I listened to it to check it out. And by listened, I mean I 00:01:40.320 |
went through it Netflix binge-style at least five episodes in a row. It's now one of my favorite 00:01:46.480 |
podcasts, and I think it should be one of the top podcasts in the world, frankly. It's a history show 00:01:52.320 |
about why people resist new things. Each episode looks at a moment in history when something new 00:01:57.760 |
was introduced, something that today we think of as commonplace, like recorded music, umbrellas, 00:02:03.360 |
bicycles, cars, chess, coffee, the elevator, and the show explores why it freaked everyone out. 00:02:09.600 |
The latest episode on mirrors and vanity still stays with me as I think about vanity in the 00:02:15.440 |
modern day of the Twitter world. That's the fascinating thing about this show, is that 00:02:21.200 |
stuff that happened long ago, especially in terms of our fear of new things, repeats itself in the 00:02:26.000 |
modern day and so has many lessons for us to think about in terms of human psychology and the role 00:02:31.600 |
of technology in our society. Anyway, you should subscribe and listen to Pessimists Archive. 00:02:37.760 |
I highly recommend it. And now, here's my conversation with Michael Kearns. 00:02:44.880 |
You mentioned reading Fear and Loathing in Las Vegas in high school and having a bit 00:02:51.280 |
more of a literary mind. So, what books, non-technical, non-computer science, would you 00:02:57.840 |
say had the biggest impact on your life, either intellectually or emotionally? 00:03:05.760 |
Yeah, I think, well, my favorite novel is Infinite Jest by David Foster Wallace, which 00:03:12.720 |
actually, coincidentally, much of it takes place in the halls of buildings right around us here at 00:03:17.280 |
MIT. So, that certainly had a big influence on me. And as you noticed, like when I was in high 00:03:23.360 |
school, I actually even started college as an English major. So, I was very influenced by sort 00:03:29.120 |
of that genre of journalism at the time and thought I wanted to be a writer and then realized that 00:03:33.600 |
an English major teaches you to read, but it doesn't teach you how to write. And then I became 00:03:37.440 |
interested in math and computer science instead. 00:03:40.080 |
Well, in your new book, Ethical Algorithm, you kind of sneak up from an algorithmic perspective 00:03:47.280 |
on these deep, profound philosophical questions of fairness, of privacy. In thinking about these 00:03:57.440 |
topics, how often do you return to that literary mind that you had? 00:04:02.800 |
Yeah, I'd like to claim there was a deeper connection, but I think both Aaron and I kind 00:04:09.520 |
of came at these topics first and foremost from a technical angle. I mean, I kind of consider myself 00:04:15.440 |
primarily and originally a machine learning researcher. And I think as we just watched, 00:04:21.280 |
like the rest of the society, the field technically advance, and then quickly on the heels of that, 00:04:26.000 |
kind of the buzzkill of all of the antisocial behavior by algorithms, just kind of realized 00:04:31.440 |
there was an opportunity for us to do something about it from a research perspective. More to 00:04:37.600 |
the point of your question, I mean, I do have an uncle who is literally a moral philosopher. 00:04:43.680 |
And so in the early days of our technical work on fairness topics, I would occasionally run ideas 00:04:49.760 |
by him. So I mean, I remember an early email I sent to him in which I said like, "Oh, here's a 00:04:54.400 |
specific definition of algorithmic fairness that we think is some sort of variant of Rawlsian 00:05:00.080 |
fairness. What do you think?" And I thought I was asking a yes or no question, and I got back a 00:05:06.720 |
kind of classical philosopher's response. "Well, it depends. If you look at it this way, then you 00:05:11.200 |
might conclude this." And that's when I realized that there was a real kind of rift between the 00:05:18.960 |
ways philosophers and others had thought about things like fairness from sort of a humanitarian 00:05:24.640 |
perspective and the way that you needed to think about it as a computer scientist if you were going 00:05:29.680 |
to kind of implement actual algorithmic solutions. - But I would say the algorithmic solutions take 00:05:38.880 |
care of some of the low-hanging fruit. Sort of the problem is a lot of algorithms, when they don't 00:05:45.120 |
consider fairness, they are just terribly unfair. And when they don't consider privacy, they're 00:05:51.840 |
terribly, they violate privacy. Sort of the algorithmic approach fixes big problems. But 00:05:59.840 |
there is still, when you start pushing into the gray area, that's when you start getting into 00:06:05.120 |
this philosophy of what it means to be fair, starting from Plato, what is justice kind of 00:06:11.360 |
questions. - Yeah, I think that's right. And I mean, I would even not go as far as you went to 00:06:16.720 |
say that sort of the algorithmic work in these areas is solving like the biggest problems. And 00:06:22.880 |
we discuss in the book the fact that really we are, there's a sense in which we're kind of looking 00:06:28.400 |
where the light is in that, for example, if police are racist in who they decide to stop and frisk, 00:06:36.960 |
and that goes into the data, there's sort of no undoing that downstream by kind of clever 00:06:42.640 |
algorithmic methods. And I think, especially in fairness, I mean, I think less so in privacy, 00:06:49.840 |
where we feel like the community kind of really has settled on the right definition, 00:06:54.240 |
which is differential privacy. If you just look at the algorithmic fairness literature already, 00:06:59.280 |
you can see it's gonna be much more of a mess. And you've got these theorems saying, 00:07:03.440 |
here are three entirely reasonable, desirable notions of fairness. And here's a proof that 00:07:11.760 |
you cannot simultaneously have all three of them. So I think we know that algorithmic fairness 00:07:18.160 |
compared to algorithmic privacy is gonna be kind of a harder problem. And it will have to revisit, 00:07:23.680 |
I think, things that have been thought about by many generations of scholars before us. 00:07:29.040 |
So it's very early days for fairness, I think. - So before we get into the details of differential 00:07:35.200 |
privacy and on the fairness side, let me linger on the philosophy a bit. Do you think most people 00:07:40.800 |
are fundamentally good? Or do most of us have both the capacity for good and evil within us? 00:07:48.400 |
- I mean, I'm an optimist. I tend to think that most people are good and want to do right. And 00:07:55.920 |
that deviations from that are kind of usually due to circumstance, not due to people being bad at heart. 00:08:05.520 |
- Are people at the heads of governments, people at the heads of companies, people at the heads of 00:08:11.680 |
maybe, so, financial power markets. Do you think the distribution there is also that most people are 00:08:19.760 |
good and have good intent? - Yeah, I do. I mean, my statement wasn't 00:08:24.080 |
qualified to people not in positions of power. I mean, I think what happens in a lot of the cliche 00:08:31.600 |
about absolute power corrupts absolutely. I mean, I think even short of that, having spent a lot of 00:08:38.800 |
time on Wall Street and also in arenas very, very different from Wall Street, like academia, 00:08:45.120 |
one of the things I think I've benefited from by moving between two very different worlds is you 00:08:52.960 |
become aware that these worlds kind of develop their own social norms and they develop their 00:08:59.280 |
own rationales for behavior, for instance, that might look unusual to outsiders. But when you're 00:09:05.920 |
in that world, it doesn't feel unusual at all. And I think this is true of a lot of professional 00:09:12.640 |
cultures, for instance. And so then your maybe slippery slope is too strong of a word, but you're 00:09:20.240 |
in some world where you're mainly around other people with the same kind of viewpoints and 00:09:24.880 |
training and worldview as you. And I think that's more of a source of abuses of power 00:09:33.280 |
than sort of there being good people and evil people and that somehow the evil people are the 00:09:40.960 |
ones that somehow rise to power. - That's really interesting. So it's 00:09:44.640 |
within the social norms constructed by that particular group of people, you're all trying 00:09:51.280 |
to do good, but because it's a group, you might drift into something that for the broader population 00:09:57.600 |
does not align with the values of society. That's the worry. 00:10:01.600 |
- Yeah, I mean, or not that you drift, but even the things that don't make sense to the outside 00:10:08.160 |
world don't seem unusual to you. So it's not sort of like a good or a bad thing, but, you know, 00:10:14.000 |
like, so for instance, you know, in the world of finance, right, there's a lot of complicated types 00:10:20.160 |
of activity that if you are not immersed in that world, you cannot see why the purpose of that, 00:10:25.760 |
you know, that activity exists at all. It just seems like, you know, completely useless and 00:10:30.800 |
people just like, you know, pushing money around. And when you're in that world, right, and you 00:10:36.080 |
learn more, your view does become more nuanced, right? You realize, okay, there is actually a 00:10:41.200 |
function to this activity. And in some cases you would conclude that actually if magically we could 00:10:47.680 |
eradicate this activity tomorrow, it would come back because it actually is like serving some 00:10:53.280 |
useful purpose. It's just a useful purpose that's very difficult for outsiders to see. And so I 00:10:59.520 |
think, you know, lots of professional work environments or cultures, as I might put it, 00:11:05.280 |
kind of have these social norms that, you know, don't make sense to the outside world. Academia 00:11:10.640 |
is the same, right? I mean, lots of people look at academia and say, you know, what the hell are 00:11:14.800 |
all of you people doing? Why are you paid so much in some cases at taxpayer expenses to do, you know, 00:11:21.840 |
to publish papers that nobody reads? You know, but when you're in that world, you come to see 00:11:26.640 |
the value for it. And, but even though you might not be able to explain it to, you know, the person 00:11:31.360 |
in the street. Right. And in the case of the financial sector, tools like credit might not 00:11:38.160 |
make sense to people. Like it's a good example of something that does seem to pop up and be useful, 00:11:43.120 |
or just the power of markets and just in general capitalism. Yeah. And finance, I think the primary 00:11:48.880 |
example I would give is leverage, right? So being allowed to borrow, to sort of use 10 times as much 00:11:56.640 |
money as you've actually borrowed, right? So that's an example of something that before I had 00:12:00.640 |
any experience in financial markets, I might've looked at and said, well, what is the purpose of 00:12:05.040 |
that? That just seems very dangerous. And it is dangerous and it has proven dangerous. But, you 00:12:10.960 |
know, if the fact of the matter is that, you know, sort of on some particular timescale, you are 00:12:17.280 |
holding positions that are, you know, very unlikely to lose their value, 00:12:23.920 |
you know, like your value at risk or variance is like one or 5%, then it kind of makes sense that 00:12:30.400 |
you would be allowed to use a little bit more than you have, because you have, you know, some 00:12:35.360 |
confidence that you're not going to lose it all in a single day. Now, of course, when that happens, 00:12:41.520 |
we've seen what happens, you know, not too long ago. But, you know, but the idea that it serves 00:12:48.800 |
no useful economic purpose under any circumstances is definitely not true. 00:12:54.720 |
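A back-of-the-envelope sketch of that value-at-risk intuition, with made-up numbers rather than anything Michael cites:

```python
# Illustrative only: why 10x leverage can seem tolerable when the one-day
# value at risk of the position is small.
equity = 1_000_000          # the firm's own capital
leverage = 10               # allowed to hold 10x that in positions
position = equity * leverage

one_day_var_pct = 0.01      # assume the position is "very unlikely" to lose more than 1% in a day
typical_bad_day_loss = position * one_day_var_pct

print(f"Position size:          ${position:,.0f}")
print(f"1-day VaR (1% move):    ${typical_bad_day_loss:,.0f}")
print(f"As a share of equity:   {typical_bad_day_loss / equity:.0%}")
# A typical bad day costs about 10% of equity, which is survivable. A tail move
# several times larger than the VaR estimate, though, can wipe out the equity
# entirely -- the danger acknowledged in the conversation.
```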
We'll return to the other side of the coast, Silicon Valley, and the problems there as we 00:13:00.400 |
talk about privacy, as we talk about fairness. At the high level, and I'll ask some sort of basic 00:13:08.400 |
questions with the hope to get at the fundamental nature of reality. But from a very high level, 00:13:15.120 |
what is an ethical algorithm? So I can say that an algorithm has a running time of, using big O 00:13:22.400 |
notation, n log n. I can say that a machine learning algorithm classified cat versus dog 00:13:29.360 |
with 97% accuracy. Do you think there will one day be a way to measure sort of in the same 00:13:37.760 |
compelling way as the big O notation of this algorithm is 97% ethical? 00:13:44.000 |
First of all, let me riff for a second on your specific N log N example. So because early in 00:13:50.480 |
the book, when we're just kind of trying to describe algorithms, period, we say like, okay, 00:13:55.360 |
what's an example of an algorithm or an algorithmic problem? First of all, it's sorting, 00:14:00.640 |
right? You have a bunch of index cards with numbers on them and you want to sort them. 00:14:04.320 |
And we describe an algorithm that sweeps all the way through, finds the smallest number, 00:14:09.520 |
puts it at the front, then sweeps through again, finds the second smallest number. 00:14:13.280 |
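A minimal Python rendering of that index-card procedure (my sketch, not code from the book):

```python
def slow_sort(cards):
    """Repeatedly scan the remaining cards, find the smallest, move it to the front.
    This takes roughly n^2 comparisons for n cards."""
    cards = list(cards)                      # work on a copy
    for i in range(len(cards)):
        smallest = i
        for j in range(i + 1, len(cards)):
            if cards[j] < cards[smallest]:
                smallest = j
        cards[i], cards[smallest] = cards[smallest], cards[i]
    return cards

print(slow_sort([42, 7, 19, 3, 25]))         # [3, 7, 19, 25, 42]
```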
So we make the point that this is an algorithm, and it's also a bad algorithm in the sense that 00:14:19.280 |
it's quadratic rather than N log N, which we know is kind of optimal for sorting. And we make the 00:14:26.000 |
point that sort of like, so even within the confines of a very precisely specified problem, 00:14:32.400 |
there might be many, many different algorithms for the same problem with different properties. 00:14:38.880 |
Like some might be faster in terms of running time, some might use less memory, some might have 00:14:45.120 |
better distributed implementations. And so the point is, is that already we're used to, 00:14:50.800 |
you know, in computer science, thinking about trade-offs between different types of quantities 00:14:57.040 |
and resources, and there being, you know, better and worse algorithms. And our book is about that 00:15:06.000 |
part of algorithmic ethics that we know how to kind of put on that same kind of quantitative 00:15:13.920 |
footing right now. So, you know, just to say something that our book is not about, our book 00:15:19.440 |
is not about kind of broad, fuzzy notions of fairness. It's about very specific notions of 00:15:26.080 |
fairness. There's more than one of them. There are tensions between them, right? But if you pick one 00:15:33.520 |
of them, you can do something akin to saying that this algorithm is 97% ethical. You can say, 00:15:41.280 |
for instance, the, you know, for this lending model, the false rejection rate on black people 00:15:48.240 |
and white people is within 3%, right? So we might call that a 97% ethical algorithm, and a 100% 00:15:56.800 |
ethical algorithm would mean that that difference is 0%. 00:16:00.160 |
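A toy sketch of the kind of audit just described, on entirely made-up lending records; the "ethical score" at the end is just the loose "97% ethical" shorthand used above, not a standard metric:

```python
# Each record: (group, creditworthy, model_approved). Data invented for illustration.
applicants = [
    ("black", True, False), ("black", True, True), ("black", True, True),
    ("white", True, True),  ("white", True, True), ("white", True, False),
    ("black", False, False), ("white", False, True),
]

def false_rejection_rate(group):
    creditworthy = [a for a in applicants if a[0] == group and a[1]]
    return sum(1 for a in creditworthy if not a[2]) / len(creditworthy)

gap = abs(false_rejection_rate("black") - false_rejection_rate("white"))
print(f"False rejection rate gap: {gap:.0%}")
print(f"'Ethical score' in the loose sense above: {1 - gap:.0%}")   # 100% means a 0% gap
```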
>> In that case, fairness is specified when two groups, however they're defined, are given to you. 00:16:08.320 |
>> So the, and then you can sort of mathematically start describing the algorithm. But 00:16:13.040 |
nevertheless, the part where the two groups are given to you, unlike running time, 00:16:23.040 |
you know, we don't in computer science talk about how fast an algorithm feels like when it runs. 00:16:30.560 |
>> We measure it, and ethical starts getting into feelings. So for example, an algorithm runs, 00:16:37.440 |
you know, if it runs in the background, it doesn't disturb the performance of my system, 00:16:41.920 |
it'll feel nice, I'll be okay with it. But if it overloads the system, it'll feel unpleasant. 00:16:46.800 |
So in that same way, ethics, there's a feeling of how socially acceptable it is, how does it 00:16:52.400 |
represent the moral standards of our society today? So in that sense, and sorry to linger on 00:16:59.120 |
that first high-level philosophical question, is do you have a sense we'll be able to measure how ethical an algorithm is? 00:17:06.640 |
>> First of all, I didn't, certainly didn't mean to give the impression that you can kind of measure, 00:17:11.840 |
you know, memory speed trade-offs, you know, and that there's a complete, you know, 00:17:17.200 |
mapping from that on to kind of fairness, for instance, or ethics and accuracy, for example. 00:17:23.760 |
In the type of fairness definitions that are largely the objects of study today and starting 00:17:30.480 |
to be deployed, you as the user of the definitions, you need to make some hard decisions before you 00:17:36.800 |
even get to the point of designing fair algorithms. One of them, for instance, is deciding 00:17:43.840 |
who it is that you're worried about protecting, who you're worried about being harmed by, 00:17:50.000 |
for instance, some notion of discrimination or unfairness. And then you need to also decide 00:17:55.360 |
what constitutes harm. So for instance, in a lending application, maybe you decide that, 00:18:01.360 |
you know, falsely rejecting a creditworthy individual, you know, sort of a false negative, 00:18:07.600 |
is the real harm, and that false positives, i.e. people that are not creditworthy or are not going 00:18:13.840 |
to repay your loan, that get a loan, you might think of them as lucky. And so that's not a harm, 00:18:19.600 |
although it's not clear that if you don't have the means to repay a loan, that being given a loan 00:18:25.600 |
is not also a harm. So, you know, the literature is sort of so far quite limited in that you sort 00:18:34.080 |
of need to say, who do you want to protect and what would constitute harm to that group? 00:18:38.000 |
And when you ask questions like, will algorithms feel ethical, one way in which they won't under 00:18:44.960 |
the definitions that I'm describing is if, you know, if you are an individual who is falsely 00:18:50.800 |
denied a loan, incorrectly denied a loan, all of these definitions basically say like, well, 00:18:56.720 |
you know, your compensation is the knowledge that we are also falsely denying loans to other people, 00:19:03.520 |
you know, in other groups at the same rate that we're doing it to you. And, you know, 00:19:08.560 |
and so there is actually this interesting, even technical tension in the field right now between 00:19:14.400 |
these sort of group notions of fairness and notions of fairness that might actually feel 00:19:20.240 |
like real fairness to individuals, right? They might really feel like their particular interests 00:19:25.920 |
are being protected or thought about by the algorithm rather than just, you know, the groups 00:19:32.080 |
that they happen to be members of. - Is there parallels to the big O notation of worst case 00:19:38.000 |
analysis? So is it important to, looking at the worst violation of fairness for an individual, 00:19:46.400 |
is it important to minimize that one individual? So like worst case analysis, is that something 00:19:51.760 |
you think about or? - I mean, I think we're not even at the point where we can sensibly 00:19:56.480 |
think about that. So first of all, you know, we're talking here both about fairness applied at the 00:20:03.120 |
group level, which is a relatively weak thing, but it's better than nothing. And also the more 00:20:10.240 |
ambitious thing of trying to give some individual promises. But even that doesn't incorporate, 00:20:17.120 |
I think, something that you're hinting at here is what I might have called subjective fairness, 00:20:20.880 |
right? So a lot of the definitions, I mean, all of the definitions in the algorithmic fairness 00:20:25.920 |
literature are what I would kind of call received wisdom definitions. It's sort of, you know, 00:20:30.400 |
somebody like me sits around and thinks like, okay, you know, I think here's a technical 00:20:34.880 |
definition of fairness that I think people should want, or that they should, you know, 00:20:39.440 |
think of as some notion of fairness, maybe not the only one, maybe not the best one, 00:20:43.280 |
maybe not the last one. But we really actually don't know from a subjective standpoint, 00:20:51.680 |
like what people really think is fair. There's, you know, we just started doing a little bit of 00:20:57.600 |
work in our group at actually doing kind of human subject experiments in which we, you know, 00:21:05.040 |
ask people about, you know, we ask them questions about fairness, we survey them, 00:21:11.760 |
we, you know, we show them pairs of individuals in, let's say, a criminal recidivism prediction 00:21:17.440 |
setting, and we ask them, do you think these two individuals should be treated the same as a matter 00:21:23.600 |
of fairness? And to my knowledge, there's not a large literature in which ordinary people are 00:21:31.040 |
asked about, you know, they have sort of notions of their subjective fairness elicited from them. 00:21:37.280 |
It's mainly, you know, kind of scholars who think about fairness, you know, kind of making up their 00:21:43.840 |
own definitions. And I think this needs to change actually for many social norms, not just for 00:21:49.760 |
fairness, right? So there's a lot of, you know, discussion these days in the AI community about 00:21:55.360 |
interpretable AI or understandable AI. And as far as I can tell, everybody agrees that 00:22:02.080 |
deep learning, or at least the outputs of deep learning, are not very understandable. And people 00:22:09.760 |
might agree that sparse linear models with integer coefficients are more understandable. 00:22:15.520 |
But nobody's really asked people, you know, there's very little literature on, you know, sort 00:22:19.760 |
of showing people models and asking them, do they understand what the model is doing? And I think 00:22:25.760 |
that in all these topics, as these fields mature, we need to start doing more behavioral work. 00:22:33.520 |
Yeah, which is, so one of my deep passions is psychology. And I always thought computer 00:22:39.840 |
scientists will be the best future psychologists, in a sense that data is, especially in this modern 00:22:49.280 |
world, the data is a really powerful way to understand and study human behavior. And you've 00:22:53.760 |
explored that with your game theory side of work as well. 00:22:56.800 |
Yeah, I'd like to think that what you say is true about computer scientists and psychology, 00:23:02.240 |
from my own limited wandering into human subject experiments, we have a great deal to learn. 00:23:09.360 |
Not just computer science, but AI and machine learning more specifically, I kind of think of as 00:23:13.600 |
imperialist research communities in that, you know, kind of like physicists in an earlier 00:23:19.280 |
generation, computer scientists kind of don't think of any scientific topic as off limits to 00:23:25.280 |
them, they will like freely wander into areas that others have been thinking about for decades or 00:23:31.040 |
longer. And, you know, we usually tend to embarrass ourselves in those efforts for some amount of 00:23:38.800 |
time. Like, you know, I think reinforcement learning is a good example, right? So a lot of 00:23:43.920 |
the early work in reinforcement learning, I have complete sympathy for the control theorists that 00:23:50.240 |
looked at this and said, like, okay, you are reinventing stuff that we've known since like 00:23:55.280 |
the 40s, right? But, you know, in my view, eventually, this sort of, you know, computer 00:24:01.280 |
scientists have made significant contributions to that field, even though we kind of embarrassed 00:24:07.280 |
ourselves for the first decade. So I think if computer scientists are going to start engaging 00:24:11.440 |
in kind of psychology, human subjects, type of research, we should expect to be embarrassing 00:24:18.000 |
ourselves for a good 10 years or so, and then hope that it turns out as well as, you know, 00:24:24.160 |
some other areas that we've waded into. >> So you've kind of mentioned this, 00:24:28.160 |
just to linger on the idea of an ethical algorithm, of idea of groups, sort of group thinking and 00:24:34.080 |
individual thinking. And we're struggling that, one of the amazing things about algorithms and 00:24:38.560 |
your book and just this field of study is it gets us to ask, like, forcing machines, 00:24:44.640 |
converting these ideas into algorithms is forcing us to ask questions of ourselves as a human 00:24:50.560 |
civilization. So there's a lot of people now in public discourse doing sort of group thinking, 00:24:57.440 |
thinking like there's particular sets of groups that we don't want to discriminate against and 00:25:02.960 |
so on. And then there's individuals, sort of in the individual life stories, the struggles they 00:25:10.320 |
went through and so on. Now, like in philosophy, it's easier to do group thinking because you don't, 00:25:16.080 |
you know, it's very hard to think about individuals, there's so much variability. 00:25:21.440 |
But with data, you can start to actually say, you know, what group thinking is too crude? You're 00:25:28.320 |
actually doing more discrimination by thinking in terms of groups and individuals. Can you linger on 00:25:33.520 |
that kind of idea of group versus individual and ethics? And is it good to continue thinking in 00:25:40.960 |
terms of groups in algorithms? - So let me start by answering a very good high level question with a 00:25:48.800 |
slightly narrow technical response, which is these group definitions of fairness, like here's a few 00:25:55.440 |
groups, like different racial groups, maybe gender groups, maybe age, what have you. And let's make 00:26:02.160 |
sure that, you know, for none of these groups, do we, you know, have a false negative rate, 00:26:08.000 |
which is much higher than any other one of these groups, okay? So these are kind of classic group 00:26:12.800 |
aggregate notions of fairness. And, you know, but at the end of the day, an individual you can think 00:26:17.920 |
of as a combination of all of their attributes, right? They're a member of a racial group, 00:26:22.080 |
they have a gender, they have an age, you know, and many other, you know, demographic properties 00:26:29.280 |
that are not biological, but that, you know, are still, you know, very strong determinants of 00:26:35.200 |
outcome and personality and the like. So one, I think, useful spectrum is to sort of think about 00:26:43.120 |
that array between the group and the specific individual, and to realize that in some ways, 00:26:49.600 |
asking for fairness at the individual level is to sort of ask for group fairness simultaneously 00:26:56.400 |
for all possible combinations of groups. So in particular, you know, 00:27:01.440 |
if I build a predictive model that meets some definition of fairness by race, by gender, 00:27:10.000 |
by age, by what have you, marginally, to get slightly technical, sort of independently, 00:27:16.800 |
I shouldn't expect that model to not discriminate against disabled Hispanic women over age 55, 00:27:24.560 |
making less than $50,000 a year, even though I might have protected each one of those 00:27:24.560 |
attributes marginally. >> So the optimization, actually, that's a fascinating way to put it. 00:27:35.520 |
So you're just optimizing, so one way to achieve the optimizing fairness for individuals is just 00:27:42.320 |
to add more and more definitions of groups that each individual belongs to. 00:27:46.080 |
>> So, you know, at the end of the day, we could think of all of ourselves as groups of size one, 00:27:50.800 |
because eventually there's some attribute that separates you from me and everybody, 00:27:54.640 |
from everybody else in the world, okay? And so it is possible to put, you know, 00:28:00.960 |
these incredibly coarse ways of thinking about fairness and these very, very individualistic, 00:28:06.080 |
specific ways on a common scale. And, you know, one of the things we've worked on from a research 00:28:12.560 |
perspective is, you know, so we sort of know how to, you know, in relative terms, we know how to 00:28:18.320 |
provide fairness guarantees at the coarsest end of the scale. We don't know how to provide kind of 00:28:24.640 |
sensible, tractable, realistic fairness guarantees at the individual level, but maybe we could start 00:28:30.720 |
creeping towards that by dealing with more, you know, refined subgroups. I mean, we gave a name 00:28:36.640 |
to this phenomenon where, you know, you protect, you enforce some definition of fairness for a 00:28:43.520 |
bunch of marginal attributes or features, but then you find yourself discriminating against 00:28:48.880 |
a combination of them. We call that fairness gerrymandering, because like political gerrymandering, 00:28:54.800 |
you know, you're giving some guarantee at the aggregate level, but that when you kind of look 00:29:00.320 |
in a more granular way at what's going on, you realize that you're achieving that aggregate 00:29:04.880 |
guarantee by sort of favoring some groups and discriminating against other ones. And so there 00:29:11.200 |
are, you know, it's early days, but there are algorithmic approaches that let you start 00:29:11.200 |
creeping towards that, you know, individual end of the spectrum. 00:29:22.160 |
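A small sketch of the fairness gerrymandering phenomenon on invented audit data: the hypothetical model below is constructed so that false negative rates look balanced by race alone and by gender alone, yet one intersectional subgroup is treated much worse.

```python
from itertools import product
import random

random.seed(0)

# Hypothetical audit records: (race, gender, truly_qualified, model_accepted).
records = []
for race, gender in product(["A", "B"], ["F", "M"]):
    for _ in range(2000):
        qualified = random.random() < 0.5
        if qualified:
            # The invented model penalizes two "diagonal" combinations, so each
            # marginal group averages out to roughly the same false negative rate.
            unfair = (race, gender) in {("B", "F"), ("A", "M")}
            accepted = random.random() > (0.40 if unfair else 0.10)
        else:
            accepted = random.random() < 0.10
        records.append((race, gender, qualified, accepted))

def false_negative_rate(rows):
    qualified = [r for r in rows if r[2]]
    return sum(1 for r in qualified if not r[3]) / len(qualified)

# Marginal audits look balanced...
for attr, idx in [("race", 0), ("gender", 1)]:
    for value in sorted({r[idx] for r in records}):
        fnr = false_negative_rate([r for r in records if r[idx] == value])
        print(f"FNR for {attr}={value}: {fnr:.2f}")

# ...but auditing the combinations reveals the gerrymandering.
for race, gender in product(["A", "B"], ["F", "M"]):
    rows = [r for r in records if r[0] == race and r[1] == gender]
    print(f"FNR for race={race}, gender={gender}: {false_negative_rate(rows):.2f}")
```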
Does there need to be human input in the form of weighing the value of the importance of each 00:29:31.120 |
kind of group? So for example, is it like, so gender, say, crudely speaking, male and female, 00:29:42.080 |
and then different races, are we as humans supposed to put value on saying gender is 0.6 00:29:50.880 |
and race is 0.4 in terms of the big optimization of achieving fairness? Is that kind of what's needed? 00:30:00.720 |
I mean, you know, I mean, of course, you know, I don't need to tell you that, of course, 00:30:04.400 |
technically one could incorporate such weights if you wanted to into a definition of fairness. 00:30:10.400 |
You know, fairness is an interesting topic in that having worked in, in the book being about 00:30:18.720 |
both fairness, privacy, and many other social norms, fairness, of course, is a much, much more 00:30:25.200 |
loaded topic. So privacy, I mean, people want privacy, people don't like violations of privacy, 00:30:31.600 |
violations of privacy cause damage, angst, and bad publicity for the companies that are victims 00:30:39.280 |
of them. But sort of everybody agrees more data privacy would be better than less data privacy. 00:30:46.480 |
And you don't have these, somehow the discussions of privacy don't become politicized 00:30:53.280 |
along other dimensions like race and gender. Whereas with fairness, you know, 00:31:00.880 |
you quickly find yourselves kind of revisiting topics that have been kind of unresolved forever, 00:31:10.480 |
like affirmative action, right? Sort of, you know, like, why are you protecting, 00:31:14.720 |
some people will say, why are you protecting this particular racial group? 00:31:19.120 |
And others will say, well, we need to do that as a matter of retribution. Other people 00:31:26.480 |
will say it's a matter of economic opportunity. And I don't know which of, you know, whether any 00:31:33.600 |
of these are the right answers, but you sort of fairness is sort of special in that as soon as 00:31:37.760 |
you start talking about it, you inevitably have to participate in debates about fair to whom, 00:31:44.880 |
at what expense to who else. I mean, even in criminal justice, right? You know, where people 00:31:52.240 |
talk about fairness in criminal sentencing, or, you know, predicting failures to appear or making 00:32:01.840 |
parole decisions or the like, they will, you know, they'll point out that, well, these definitions 00:32:08.160 |
of fairness are all about fairness for the criminals. And what about fairness for the 00:32:14.640 |
victims, right? So when I basically say something like, well, the false incarceration rate for black 00:32:22.480 |
people and white people needs to be roughly the same, you know, there's no mention of potential 00:32:28.240 |
victims of criminals in such a fairness definition. And that's the realm of public discourse. I should 00:32:35.280 |
actually recommend, to people listening, the Intelligence Squared debates. The US 00:32:42.400 |
edition just had a debate. They have this structure where you have Oxford-style, or whatever 00:32:49.040 |
they're called, debates, and it was two versus two, and they talked about affirmative action. 00:32:53.520 |
And it was incredibly interesting that it's still, there's really good points on every side of this 00:33:01.520 |
issue, which is fascinating to listen to. Yeah, yeah, I agree. And so it's interesting to be 00:33:07.040 |
a researcher trying to do, for the most part, technical algorithmic work. But Aaron and I both 00:33:15.200 |
quickly learned you cannot do that and then go out and talk about it and expect people to take 00:33:19.520 |
it seriously if you're unwilling to engage in these broader debates that are entirely extra 00:33:26.320 |
algorithmic, right? They're not about, you know, algorithms and making algorithms better. They're 00:33:31.120 |
sort of, you know, as you said, sort of like, what should society be protecting in the first place? 00:33:35.840 |
When you discuss the fairness, an algorithm that achieves fairness, whether in the constraints 00:33:42.240 |
and the objective function, there's an immediate kind of analysis you can perform, which is saying, 00:33:49.120 |
if you care about fairness in gender, this is the amount that you have to pay for in terms of the 00:33:57.280 |
performance of the system. Is there a role for statements like that in a table in a paper, 00:34:03.440 |
or do you want to really not touch that? No, we want to touch that and we do touch it. So, 00:34:09.840 |
I mean, just again to make sure I'm not promising your viewers more than we know how to provide. 00:34:17.520 |
But if you pick a definition of fairness, like I'm worried about gender discrimination, 00:34:21.680 |
and you pick a notion of harm, like false rejection for a loan, for example, and you give me a model, 00:34:27.920 |
I can definitely, first of all, go audit that model. It's easy for me to go, you know, from 00:34:32.960 |
data to kind of say like, okay, your false rejection rate on women is this much higher 00:34:39.200 |
than it is on men, okay? But, you know, once you also put the fairness into your objective function, 00:34:46.160 |
I mean, I think the table that you're talking about is, you know, what we would call the Pareto 00:34:50.320 |
curve, right? You can literally trace out, and we give examples of such plots on real data sets in 00:34:57.680 |
the book, you have two axes. On the x-axis is your error, on the y-axis is unfairness by whatever, 00:35:05.680 |
you know, if it's like the disparity between false rejection rates between two groups. 00:35:10.080 |
And, you know, your algorithm now has a knob that basically says, how strongly do I want to enforce 00:35:17.760 |
fairness? And the less unfair, you know, if the two axes are error and unfairness, we'd like to be 00:35:24.800 |
at zero, zero. We'd like zero error and zero unfairness simultaneously. Anybody who works 00:35:31.520 |
in machine learning knows that you're generally not going to get to zero error period without any 00:35:37.200 |
fairness constraint whatsoever, so that's not going to happen. But in general, you know, you'll 00:35:41.760 |
get this, you'll get some kind of convex curve that specifies the numerical tradeoff you face, 00:35:49.760 |
you know, if I want to go from 17% error down to 16% error, what will be the increase in unfairness 00:35:58.560 |
that I experience as a result of that? And so this curve kind of specifies the, you know, kind of 00:36:07.120 |
undominated models. Models that are off that curve are, you know, can be strictly improved 00:36:13.120 |
in one or both dimensions. You can, you know, either make the error better or the unfairness 00:36:16.880 |
better or both. And I think our view is that not only are these objects, these Pareto curves, 00:36:25.440 |
you know, efficient frontiers as you might call them, not only are they valuable scientific 00:36:33.920 |
objects, I actually think that they in the near term might need to be the interface 00:36:39.920 |
between researchers working in the field and stakeholders in given problems. So, you know, 00:36:46.880 |
you could really imagine telling a criminal jurisdiction, look, if you're concerned about 00:36:55.520 |
racial fairness, but you're also concerned about accuracy, you want to, you know, you want to 00:37:01.840 |
release on parole people that are not going to recommit a violent crime and you don't want to 00:37:06.640 |
release the ones who are. So, you know, that's accuracy. But if you also care about those, 00:37:12.240 |
you know, the mistakes you make not being disproportionately on one racial group or 00:37:16.080 |
another, you can show this curve. I'm hoping that in the near future, it'll be possible to 00:37:21.840 |
explain these curves to non-technical people that are the ones that have to make the decision, 00:37:28.560 |
where do we want to be on this curve? Like, what are the relative merits or value of having lower 00:37:35.600 |
error versus lower unfairness? You know, that's not something computer scientists 00:37:40.480 |
should be deciding for society, right? That, you know, the people in the field, so to speak, 00:37:46.720 |
the policy makers, the regulators, that's who should be making these decisions. But I think 00:37:52.560 |
and hope that they can be made to understand that these trade-offs generally exist and that 00:37:58.240 |
you need to pick a point and like, and ignoring the trade-off, you know, you're implicitly picking 00:38:04.160 |
a point anyway, right? You just don't know it and you're not admitting it. 00:38:08.400 |
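A rough sketch of how such a Pareto curve can be traced, not the book's algorithm: a plain logistic regression on invented lending data, with a knob lam that weights a smooth penalty on the gap in scores given to creditworthy members of each group. Sweeping lam and recording (error, unfairness) pairs produces the kind of trade-off curve described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic lending data (invented): features, label y (1 = repaid), group g (0 or 1).
n = 2000
g = rng.integers(0, 2, n)
feats = rng.normal(size=(n, 2)) + 0.5 * g[:, None]
y = (feats[:, 0] + 0.5 * feats[:, 1] + rng.normal(size=n) > 0.5).astype(float)
x = np.hstack([feats, np.ones((n, 1))])      # add an intercept column

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(lam, steps=3000, lr=0.5):
    """Minimize logistic loss + lam * (smooth proxy for the false-rejection-rate gap)."""
    w = np.zeros(x.shape[1])
    pos0 = (y == 1) & (g == 0)
    pos1 = (y == 1) & (g == 1)
    for _ in range(steps):
        p = sigmoid(x @ w)
        grad_loss = x.T @ (p - y) / n
        # Penalty: squared gap in mean score given to the truly creditworthy in each group.
        gap = p[pos0].mean() - p[pos1].mean()
        dp = p * (1 - p)
        grad_gap = (x[pos0] * dp[pos0][:, None]).mean(0) - (x[pos1] * dp[pos1][:, None]).mean(0)
        w -= lr * (grad_loss + lam * 2 * gap * grad_gap)
    accept = sigmoid(x @ w) > 0.5
    error = float((accept != (y == 1)).mean())
    unfairness = float(abs(accept[pos0].mean() - accept[pos1].mean()))  # gap in false rejection rates
    return error, unfairness

# Turning the fairness "knob" traces out the error/unfairness trade-off curve.
for lam in [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]:
    err, unf = train(lam)
    print(f"lambda = {lam:5.1f}   error = {err:.3f}   unfairness = {unf:.3f}")
```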
Just to linger on the point of trade-offs, I think that's a really important thing to sort of 00:38:13.040 |
think about. So you think when we start to optimize for fairness, there's almost always 00:38:21.440 |
in most system going to be trade-offs. Can you, like, what's the trade-off between, 00:38:27.680 |
just to clarify, there've been some sort of technical terms thrown around, but 00:38:32.080 |
sort of a perfectly fair world, why is that, why will somebody be upset about that? 00:38:42.480 |
The specific trade-off I talked about just in order to make things very concrete was between 00:38:48.640 |
numerical error and some numerical measure of unfairness. 00:38:56.160 |
Just like, say, predictive error, like, you know, the probability or frequency with which you 00:39:01.280 |
release somebody on parole who then goes on to recommit a violent crime or keep incarcerated 00:39:08.480 |
somebody who would not have recommitted a violent crime. 00:39:10.960 |
So in the case of awarding somebody parole or giving somebody parole or letting them out 00:39:17.920 |
on parole, you don't want them to recommit a crime. So it's your system failed in prediction 00:39:24.400 |
if they happen to do a crime. Okay, so that's the performance, that's one axis. 00:39:31.600 |
So then the fairness axis might be the difference between racial groups in the kind of 00:39:38.240 |
false positive predictions, namely people that I kept incarcerated 00:39:46.720 |
predicting that they would recommit a violent crime when in fact they wouldn't have. 00:39:50.960 |
Right. And the unfairness of that, just to linger it and allow me to 00:39:56.480 |
ineloquently to try to sort of describe why that's unfair, why unfairness is there. 00:40:04.240 |
The unfairness you want to get rid of is that in the judge's mind, the bias of having been 00:40:13.200 |
brought up in this society, the slight racial bias, the racism that exists in the society, you want 00:40:18.480 |
to remove that from the system. Another way that's been debated is sort of equality of opportunity 00:40:27.680 |
versus equality of outcome. And there's a weird dance there that's really difficult to get right. 00:40:34.880 |
And that's the space that affirmative action is exploring. 00:40:40.480 |
Right. And then we, this also quickly bleeds into questions like, well, 00:40:46.080 |
maybe if one group really does recommit crimes at a higher rate, 00:40:50.640 |
the reason for that is that at some earlier point in the pipeline or earlier in their lives, 00:40:57.120 |
they didn't receive the same resources that the other group did. 00:41:01.120 |
And that, and so, there's always in kind of fairness discussions, the possibility 00:41:07.520 |
that the real injustice came earlier, right? Earlier in this individual's life, earlier in 00:41:12.800 |
this group's history, et cetera, et cetera. And so, a lot of the fairness discussion is almost, 00:41:18.800 |
the goal is for it to be a corrective mechanism to account for the injustice earlier in life. 00:41:25.200 |
By some definitions of fairness or some theories of fairness, yeah. Others would say like, look, 00:41:30.560 |
it's not to correct that injustice, it's just to kind of level the playing field right now and not 00:41:37.040 |
incarcerate, falsely incarcerate more people of one group than another group. But I mean, 00:41:42.160 |
do you think just, it might be helpful just to demystify a little bit about 00:41:45.040 |
the many ways in which bias or unfairness can come into algorithms, especially in the machine 00:41:54.880 |
learning era, right? And I think many of your viewers have probably heard these examples before, 00:41:59.920 |
but let's say I'm building a face recognition system, right? And so, I'm kind of gathering 00:42:05.920 |
lots of images of faces and trying to train the system to recognize new faces of those individuals 00:42:14.160 |
from training on a training set of those faces of individuals. And it shouldn't surprise anybody, 00:42:21.120 |
or certainly not anybody in the field of machine learning, if my training dataset 00:42:27.600 |
was primarily white males, and I'm training the model to maximize the overall accuracy on my 00:42:36.880 |
training dataset, that the model can reduce its error most by getting things right on the white 00:42:45.840 |
males that constitute the majority of the dataset, even if that means that on other groups, they will 00:42:51.360 |
be less accurate, okay? Now, there's a bunch of ways you could think about addressing this. One is 00:42:57.920 |
to deliberately put into the objective of the algorithm not to optimize the error at the expense 00:43:06.480 |
of this discrimination, and then you're kind of back in the land of these kind of two-dimensional 00:43:10.000 |
numerical trade-offs. A valid counter argument is to say like, "Well, no, you don't have to... 00:43:16.560 |
There's no... The notion of the tension between accuracy and fairness here is a false one." You could 00:43:23.120 |
instead just go out and get much more data on these other groups that are in the minority 00:43:28.400 |
and equalize your dataset, or you could train a separate model on those subgroups and have 00:43:36.320 |
multiple models. The point I think we would... We tried to make in the book is that those things 00:43:42.880 |
have cost too, right? Going out and gathering more data on groups that are relatively rare 00:43:50.400 |
compared to your plurality or majority group, that it may not cost you in the accuracy of the model, 00:43:55.920 |
but it's going to cost the company developing this model more money to develop that, and it 00:44:02.080 |
also costs more money to build separate predictive models and to implement and deploy them. 00:44:07.280 |
So even if you can find a way to avoid the tension between accuracy and fairness in training a model, 00:44:14.400 |
you might push the cost somewhere else, like money, like development time, research time, 00:44:21.040 |
and the like. There are fundamentally difficult philosophical questions in fairness. 00:44:26.640 |
And we live in a very divisive political climate, outrage culture. There is alt-right folks on 00:44:36.400 |
4chan, trolls. There is social justice warriors on Twitter. There is very divisive, 00:44:44.240 |
outraged folks on all sides of every kind of system. How do you, how do we as engineers 00:44:53.440 |
build ethical algorithms in such divisive culture? Do you think they could be disjoint? The human 00:44:59.840 |
has to inject your values, and then you can optimize over those values. But in our times, 00:45:05.760 |
when you start actually applying these systems, things get a little bit 00:45:09.680 |
challenging for the public discourse. How do you think we can proceed? 00:45:15.200 |
Yeah, I mean, for the most part, in the book, a point that we try to take some pains to make is 00:45:21.600 |
that we don't view ourselves or people like us as being in the position of deciding for society 00:45:30.400 |
what the right social norms are, what the right definitions of fairness are. Our main point is 00:45:35.600 |
to just show that if society or the relevant stakeholders in a particular domain can come 00:45:42.800 |
to agreement on those sorts of things, there's a way of encoding that into algorithms in many 00:45:48.400 |
cases, not in all cases. One other misconception that hopefully we definitely dispel is sometimes 00:45:55.120 |
people read the title of the book and I think not unnaturally fear that what we're suggesting is 00:46:00.560 |
that the algorithms themselves should decide what those social norms are and develop their own 00:46:05.200 |
notions of fairness and privacy or ethics. And we're definitely not suggesting that. 00:46:09.920 |
The title of the book is Ethical Algorithm, by the way, and I didn't think of that interpretation 00:46:15.120 |
Yeah, yeah. I mean, especially these days where people are concerned about the robots becoming 00:46:20.880 |
our overlords, the idea that the robots would also sort of develop their own social norms is 00:46:26.160 |
just one step away from that. But I do think, obviously, despite disclaimer that people 00:46:33.760 |
like us shouldn't be making those decisions for society, we are kind of living in a world where, 00:46:39.120 |
in many ways, computer scientists have made some decisions that have fundamentally changed 00:46:44.080 |
the nature of our society and democracy and sort of civil discourse and deliberation in ways that 00:46:51.360 |
I think most people generally feel are bad these days, right? So- 00:46:55.520 |
But they had to make, so if we look at people at the heads of companies and so on, 00:47:00.720 |
they had to make those decisions, right? There has to be decisions. So there's two options. 00:47:06.400 |
Either you kind of put your head in the sand and don't think about these things and just let the 00:47:12.480 |
algorithm do what it does, or you make decisions about what you value, you know, of injecting 00:47:19.280 |
Look, I never mean to be an apologist for the tech industry, but I think it's a little bit 00:47:27.600 |
too far to sort of say that explicit decisions were made about these things. So let's, for 00:47:31.520 |
instance, take social media platforms, right? So like many inventions in technology and computer 00:47:37.680 |
science, a lot of these platforms that we now use regularly kind of started as curiosities, 00:47:44.800 |
right? I remember when things like Facebook came out and its predecessors like Friendster, 00:47:49.040 |
which nobody even remembers now. People really wonder, like, why would anybody want to spend 00:47:55.600 |
time doing that? I mean, even the web when it first came out, when it wasn't populated with 00:48:00.560 |
much content and it was largely kind of hobbyists building their own kind of ramshackle websites, 00:48:06.800 |
a lot of people looked at this as like, "Well, what is the purpose of this thing? Why is this 00:48:10.320 |
interesting? Who would want to do this?" And so even things like Facebook and Twitter, yes, 00:48:15.360 |
technical decisions were made by engineers, by scientists, by executives in the design of those 00:48:20.720 |
platforms. But I don't think 10 years ago anyone anticipated that those platforms, for instance, 00:48:31.360 |
might kind of acquire undue influence on political discourse or on the outcomes of elections. 00:48:40.880 |
And I think the scrutiny that these companies are getting now is entirely appropriate, 00:48:47.360 |
but I think it's a little too harsh to kind of look at history and sort of say like, "Oh, 00:48:52.880 |
you should have been able to anticipate that this would happen with your platform." 00:48:55.920 |
And in this sort of gaming chapter of the book, one of the points we're making is that 00:48:59.680 |
these platforms, right, they don't operate in isolation. So unlike the other topics we're 00:49:06.640 |
discussing like fairness and privacy, those are really cases where algorithms can operate 00:49:11.600 |
on your data and make decisions about you and you're not even aware of it, okay? 00:49:16.000 |
Things like Facebook and Twitter, these are systems, right? These are social systems. 00:49:21.440 |
And their evolution, even their technical evolution because machine learning is involved, 00:49:27.200 |
is driven in no small part by the behavior of the users themselves and how the users decide 00:49:32.880 |
to adopt them and how to use them. And so I'm kind of like, "Who really knew that until we saw it 00:49:44.000 |
happen? Who knew that these things might be able to influence the outcome of elections? Who knew 00:49:48.720 |
that they might polarize political discourse because of the ability to decide who you interact 00:49:57.360 |
with on the platform and also with the platform naturally using machine learning to optimize for 00:50:03.440 |
your own interests that they would further isolate us from each other and feed us all basically just 00:50:09.680 |
the stuff that we already agreed with?" And so I think we've come to that outcome, I think, 00:50:15.120 |
largely, but I think it's something that we all learned together, including the companies, 00:50:21.920 |
as these things happen. Now, you asked like, "Well, are there algorithmic remedies to these 00:50:28.800 |
kinds of things?" And again, these are big problems that are not going to be solved with 00:50:33.840 |
somebody going in and changing a few lines of code somewhere in a social media platform. 00:50:39.760 |
But I do think in many ways, there are definitely ways of making things better. I mean, like an 00:50:45.440 |
obvious recommendation that we make at some point in the book is like, "Look, to the extent that we 00:50:51.520 |
think that machine learning applied for personalization purposes in things like news feed 00:50:57.600 |
or other platforms has led to polarization and intolerance of opposing viewpoints," 00:51:07.680 |
as you know, these algorithms have models, and they place people in some kind of metric space, 00:51:13.440 |
and they place content in that space, and they know the extent to which I have an affinity for 00:51:19.840 |
a particular type of content. And by the same token, they also probably have... That same model 00:51:25.280 |
probably gives you a good idea of the stuff I'm likely to violently disagree with or be offended 00:51:31.200 |
by. So in this case, there really is some knob you could tune that says like, "Instead of 00:51:37.680 |
showing people only what they like and what they want, let's show them some stuff that we think 00:51:43.440 |
that they don't like or that's a little bit further away." And you could even imagine users 00:51:48.240 |
being able to control this. Just like everybody gets a slider, and that slider says like, "How 00:51:55.680 |
much stuff do you want to see that's kind of you might disagree with or is at least further from 00:52:02.000 |
your interests?" It's almost like an exploration button. - So just to get your intuition: do you think 00:52:08.960 |
engagement... So like you're staying on the platform, you're staying engaged. 00:52:13.760 |
Do you think fairness, ideas of fairness won't emerge? Like how bad is it to just optimize for 00:52:21.760 |
engagement? Do you think we'll run into big trouble if we're just optimizing for how much 00:52:27.920 |
you love the platform? - Well, I mean, optimizing for engagement kind of got us where we are. 00:52:33.920 |
- So do you, one, have faith that it's possible to do better? And two, if it is, how do we do better? 00:52:42.640 |
- I mean, it's definitely possible to do different, right? And again, it's not as if I think that 00:52:49.920 |
doing something different than optimizing for engagement won't cost these companies in real 00:52:54.800 |
ways, including revenue and profitability, potentially. - In the short term, at least. 00:53:00.160 |
- Yeah, in the short term, right. And again, if I worked at these companies, I'm sure that 00:53:06.560 |
it would have seemed like the most natural thing in the world also to want to optimize 00:53:11.680 |
engagement, right? And that's good for users in some sense. You want them to be vested in the 00:53:16.800 |
platform and enjoying it and finding it useful, interesting, and/or productive. But my point is 00:53:22.480 |
that the idea that it's sort of out of their hands, as you said, or that there's nothing to do 00:53:29.120 |
about it, never say never, but that strikes me as implausible as a machine learning person, right? 00:53:34.720 |
I mean, these companies are driven by machine learning, and this optimization of engagement 00:53:38.960 |
is essentially driven by machine learning, right? It's driven by not just machine learning, but 00:53:44.720 |
very, very large-scale A/B experimentation where you kind of tweak some element of the user 00:53:51.200 |
interface or tweak some component of an algorithm or tweak some component or feature of your 00:53:58.000 |
click-through prediction model. And my point is that anytime you know how to optimize for 00:54:04.880 |
something, almost by definition, that solution tells you how not to optimize for it or to do 00:54:10.800 |
something different. - Engagement can be measured. 00:54:16.000 |
So sort of optimizing for sort of minimizing divisiveness or maximizing intellectual growth 00:54:25.440 |
over the lifetime of a human being are very difficult to measure. 00:54:29.120 |
- That's right. So I'm not claiming that doing something different will 00:54:35.280 |
immediately make it apparent that this is a good thing for society. And in particular, 00:54:41.040 |
I mean, I think one way of thinking about where we are on some of these social media platforms 00:54:45.840 |
is it kind of feels a bit like we're in a bad equilibrium, right? That these systems are helping 00:54:52.160 |
us all kind of optimize something myopically and selfishly for ourselves. And of course, 00:54:57.760 |
from an individual standpoint, at any given moment, like why would I want to see things 00:55:02.960 |
in my newsfeed that I found irrelevant, offensive, or the like, okay? But maybe by all of us 00:55:12.320 |
having these platforms myopically optimized in our interests, we have reached a collective outcome as 00:55:19.120 |
a society that we're unhappy with in different ways, let's say with respect to things like 00:55:23.840 |
political discourse and tolerance of opposing viewpoints. - And if Mark Zuckerberg gave you 00:55:32.400 |
a call and said, "I'm thinking of taking a sabbatical, could you run Facebook for me for 00:55:36.560 |
six months?" What would you, how? - I think no thanks would be my first response, but 00:55:41.680 |
there are many aspects of being the head of the entire company that are kind of entirely 00:55:49.440 |
exogenous to many of the things that we're discussing here. And so I don't really think 00:55:54.640 |
I would need to be CEO of Facebook to kind of implement the more limited set of solutions that 00:56:01.440 |
I might imagine. But I think one concrete thing they could do is they could experiment with 00:56:08.160 |
letting people who chose to, to see more stuff in their newsfeed that is not entirely kind of 00:56:15.680 |
chosen to optimize for their particular interests, beliefs, et cetera. - So the kind of thing, 00:56:24.560 |
I could speak to YouTube, but I think Facebook probably does something similar, is they're quite 00:56:31.760 |
effective at automatically finding what sorts of groups you belong to, not based on race or gender 00:56:37.840 |
or so on, but based on the kind of stuff you enjoy watching in the case of YouTube. It's a difficult 00:56:45.920 |
thing for Facebook or YouTube to then say, "Well, you know what? We're gonna show you something from 00:56:51.760 |
a very different cluster, even though we believe algorithmically you're unlikely to enjoy that 00:56:58.080 |
thing." So if that's a weird jump to make, there has to be a human at the very top of that system 00:57:05.440 |
that says, "Well, that will be long-term healthy for you." That's more than an algorithmic decision. 00:57:11.440 |
- Or that same person could say, "That'll be long-term healthy for the platform." 00:57:16.320 |
- For the platform. - Or for the platform's influence on 00:57:19.520 |
society outside of the platform. And it's easy for me to sit here and say these things, 00:57:25.840 |
but conceptually, I do not think that these are, or that they should be, completely alien 00:57:33.600 |
ideas. You could try things like this, and we wouldn't have to invent entirely new science to 00:57:43.040 |
do it, because if we're all already embedded in some metric space and there's a notion of distance 00:57:48.160 |
between you and me and every piece of content, then we know exactly... The same model that 00:57:56.000 |
dictates how to make me really happy also tells how to make me as unhappy as possible as well. 00:58:04.160 |
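To make the knob Kearns describes concrete, here is a minimal sketch, assuming a simple dot-product affinity model in a shared embedding space; the function name, the cosine scoring, and the slider blend are illustrative choices, not any platform's actual recommender.

```python
import numpy as np

def rank_content(user_vec, item_vecs, exploration=0.0):
    """Rank items for a user embedded in the same metric space as the content.

    exploration=0.0 reproduces pure engagement ranking (closest items first);
    exploration=1.0 ranks by dissimilarity, surfacing the farthest-away content.
    Illustrative sketch only.
    """
    # cosine similarity between the user and every item
    sims = item_vecs @ user_vec / (
        np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(user_vec) + 1e-12
    )
    # the "slider": blend affinity with its negation (dissimilarity)
    scores = (1 - exploration) * sims + exploration * (-sims)
    return np.argsort(-scores)  # item indices, best score first

# toy usage: 5 items in a 3-dimensional embedding space
rng = np.random.default_rng(0)
user = rng.normal(size=3)
items = rng.normal(size=(5, 3))
print(rank_content(user, items, exploration=0.0))   # what the model thinks you like
print(rank_content(user, items, exploration=1.0))   # what it thinks you won't
```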
- Right. The focus in your book and algorithmic fairness research today in general is on machine 00:58:10.560 |
learning, like we said, is data. And just even the entire AI field right now is captivated with 00:58:16.880 |
machine learning, with deep learning. Do you think ideas in symbolic AI or totally other kinds of 00:58:23.440 |
approaches are interesting, useful in the space, have some promising ideas in terms of fairness? 00:58:30.320 |
- I haven't thought about that question specifically in the context of fairness. I 00:58:35.680 |
definitely would agree with that statement in the large, right? I mean, I am one of many machine 00:58:42.640 |
learning researchers who do believe that the great successes that have been shown in machine 00:58:48.880 |
learning recently are great successes, but they're on a pretty narrow set of tasks. I mean, I don't 00:58:54.240 |
think we're kind of notably closer to general artificial intelligence now than we were when I 00:59:02.160 |
started my career. I mean, there's been progress. And I do think that we are kind of as a community 00:59:08.320 |
maybe looking a bit where the light is, but the light is shining pretty bright there right now, 00:59:12.160 |
and we're finding a lot of stuff. So I don't want to like argue with the progress that's been made 00:59:16.240 |
in areas like deep learning, for example. - This touches another sort of related thing 00:59:21.520 |
that you mentioned, and that people might misinterpret from the title of your book, 00:59:26.080 |
ethical algorithm. Is it possible for the algorithm to automate some of those decisions, 00:59:30.720 |
sort of higher level decisions of what kind of... - Like what should be fair. 00:59:36.720 |
- What should be fair. - The more you know about a field, 00:59:39.920 |
the more aware you are of its limitations. And so I'm pretty leery of sort of trying... There's so 00:59:47.600 |
much we already don't know in fairness, even when we're the ones picking the fairness definitions 00:59:54.480 |
and comparing alternatives and thinking about the tensions between different definitions, 00:59:59.520 |
that the idea of kind of letting the algorithm start exploring as well, I definitely think, 01:00:07.200 |
this is a much narrower statement. I definitely think that kind of algorithmic auditing for 01:00:11.120 |
different types of unfairness, right? So like in this gerrymandering example, where I might want 01:00:16.880 |
to prevent not just discrimination against very broad categories, but against combinations of 01:00:22.720 |
broad categories, you quickly get to a point where there's a lot of categories, there's a lot of 01:00:28.000 |
combinations of end features. And you can use algorithmic techniques to sort of try to find 01:00:34.880 |
the subgroups on which you're discriminating the most and try to fix that. That's actually kind of 01:00:39.680 |
the form of one of the algorithms we developed for this fairness gerrymandering problem. 01:00:43.920 |
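For a sense of what such an audit looks like, here is a deliberately brute-force sketch: enumerate subgroups defined by conjunctions of a few attributes and report the one whose positive-prediction rate deviates most from the overall rate. The actual fairness-gerrymandering algorithm Kearns refers to uses learning oracles rather than enumeration, so everything below, the names, thresholds, and synthetic data, is a simplification for illustration.

```python
from itertools import combinations, product
import numpy as np

def audit_subgroups(X, y_pred, attr_names, max_conjuncts=2, min_size=30):
    """Report the conjunction-of-attributes subgroup whose positive-prediction
    rate is furthest from the overall rate. Brute-force illustration only;
    practical auditing algorithms avoid this enumeration."""
    overall = y_pred.mean()
    worst_gap, worst_desc = 0.0, None
    for k in range(1, max_conjuncts + 1):
        for attrs in combinations(range(X.shape[1]), k):
            # every observed combination of values for this set of attributes
            for values in product(*[np.unique(X[:, a]) for a in attrs]):
                mask = np.all(X[:, list(attrs)] == np.array(values), axis=1)
                if mask.sum() < min_size:
                    continue  # skip subgroups too small to audit reliably
                gap = abs(y_pred[mask].mean() - overall)
                if gap > worst_gap:
                    worst_gap = gap
                    worst_desc = ", ".join(
                        f"{attr_names[a]}={v}" for a, v in zip(attrs, values))
    return worst_gap, worst_desc

# Hypothetical data: three binary attributes and a model that happens to favor
# one intersectional subgroup. Everything here is synthetic.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(2000, 3))
y_pred = (rng.random(2000) < 0.3 + 0.2 * (X[:, 0] & X[:, 1])).astype(int)
print(audit_subgroups(X, y_pred, ["attr_a", "attr_b", "attr_c"]))
```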
But I'm, you know, partly because of our technology and our sort of our scientific 01:00:50.080 |
ignorance on these topics right now. And also partly just because these topics are so loaded 01:00:56.400 |
emotionally for people that I just don't see the value. I mean, again, never say never, 01:01:02.160 |
but I just don't think we're at a moment where it's a great time for computer scientists to be 01:01:06.000 |
rolling out the idea like, "Hey, you know, not only have we kind of figured fairness out, but, 01:01:11.360 |
you know, we think the algorithms should start deciding what's fair or giving input on that 01:01:16.400 |
decision." I just don't, it's like the cost benefit analysis to the field of kind of going 01:01:21.840 |
there right now just doesn't seem worth it to me. That said, I should say that I think computer 01:01:26.880 |
scientists should be more philosophically, like should enrich their thinking about these kinds 01:01:31.440 |
of things. I think it's been too often used as an excuse for roboticists working on autonomous 01:01:37.360 |
vehicles, for example, to not think about the human factor or psychology or safety. 01:01:43.200 |
In the same way, computer scientists designing algorithms have been sort of using it as 01:01:46.640 |
an excuse. And I think it's time for basically everybody to become computer scientists. 01:01:51.440 |
I was about to agree with everything you said except that last point. I think that 01:01:55.440 |
the other way of looking at it is that I think computer scientists, you know, and many of us are, 01:02:02.080 |
but we need to wade out into the world more, right? I mean, just the influence that computer 01:02:09.840 |
science and therefore computer scientists have had on society at large just like has exponentially 01:02:17.760 |
magnified in the last 10 or 20 years or so. And, you know, before when we were just tinkering 01:02:24.640 |
around amongst ourselves and it didn't matter that much, there was no need for sort of computer 01:02:29.440 |
scientists to be citizens of the world more broadly. And I think those days need to be over 01:02:35.440 |
very, very fast. And I'm not saying everybody needs to do it, but to me, like the right way 01:02:40.240 |
of doing it is to not to sort of think that everybody else is going to become a computer 01:02:43.360 |
scientist. But, you know, I think, you know, people are becoming more sophisticated about 01:02:48.160 |
computer science, even lay people. You know, I think one of the reasons we decided to write 01:02:53.920 |
this book is we thought 10 years ago, I wouldn't have tried this just because I just didn't think 01:02:59.280 |
that sort of people's awareness of algorithms and machine learning, you know, the general population 01:03:05.840 |
would have been high. And I mean, you would have had to first, you know, write one of the many 01:03:09.840 |
books kind of just explicating that topic to a lay audience first. Now, I think we're at the point 01:03:15.600 |
where like lots of people without any technical training at all know enough about algorithms and 01:03:20.400 |
machine learning that you can start getting to these nuances of things like ethical algorithms. 01:03:25.120 |
I think we agree that there needs to be much more mixing. But I think a lot of the onus of 01:03:31.840 |
that mixing needs to be on the computer science community. Yeah. So just to linger on the 01:03:37.920 |
disagreement, because I do disagree with you on the point that I think if you're a biologist, 01:03:45.200 |
if you're a chemist, if you're an MBA business person, all of those things you can, like, 01:03:53.360 |
if you learn to program, and not only program, if you learn to do machine learning, if you learn to 01:03:58.560 |
do data science, you immediately become much more powerful in the kinds of things you can do. 01:04:04.000 |
And therefore, literature, library sciences, and so on. So to what you were saying, 01:04:11.120 |
I think it holds true for the next few years. But long term, 01:04:17.040 |
to me, if you're interested in philosophy, you should learn to program, because then you can 01:04:23.520 |
scrape data, you can study what people are thinking about on Twitter, and then start making 01:04:30.000 |
philosophical conclusions about the meaning of life. Right? I just, I just feel like the access 01:04:36.080 |
to data, the digitization of whatever problem you're trying to solve, it fundamentally changes 01:04:42.800 |
what it means to be a computer scientist. I mean, computer scientists in 20, 30 years will go back 01:04:47.680 |
to being Donald Knuth style theoretical computer scientists, and everybody else will basically be 01:04:54.640 |
exploring the kinds of ideas that you're exploring in your book. It won't be a computer 01:04:58.320 |
science major. Yeah, I mean, I don't think I disagree, but I think that that trend of 01:05:04.240 |
more and more people in more and more disciplines, 01:05:06.880 |
adopting ideas from computer science, learning how to code, I think that that trend seems 01:05:13.600 |
firmly underway. I mean, you know, like, an interesting digressive question along these 01:05:19.040 |
lines is maybe in 50 years, there won't be computer science departments anymore. 01:05:24.000 |
Because the field will just sort of be ambient in all of the different disciplines. And you know, 01:05:30.960 |
people will look back and, you know, having a computer science department will look like having 01:05:35.600 |
an electricity department or something. It's like, you know, everybody uses this, it's just out 01:05:39.920 |
there. I mean, I do think there will always be that kind of Knuth style core to it. But it's not 01:05:45.120 |
an implausible path that we kind of get to the point where the academic discipline of computer 01:05:50.560 |
science becomes somewhat marginalized, because of its very success in kind of infiltrating 01:05:56.160 |
all of science and society and the humanities, etc. What is differential privacy, or more broadly, 01:06:04.080 |
algorithmic privacy? Algorithmic privacy more broadly is just the study or the notion of privacy 01:06:12.880 |
definitions or norms being encoded inside of algorithms. And so, you know, I think we count 01:06:22.160 |
among this body of work, just, you know, the literature and practice of things like data 01:06:29.120 |
anonymization, which we kind of at the beginning of our discussion of privacy, say like, okay, 01:06:35.840 |
this is sort of a notion of algorithmic privacy, it kind of tells you, you know, something to go 01:06:41.040 |
do with data. But, you know, our view is that it's, and I think this is now, you know, quite 01:06:47.760 |
widespread, that it's, you know, despite the fact that those notions of anonymization, kind of 01:06:54.080 |
redacting and coarsening, are the most widely adopted technical solutions for data privacy, 01:07:01.520 |
they are like deeply, fundamentally flawed. And so, you know, to your first question, 01:07:07.040 |
what is differential privacy? Differential privacy seems to be a much, much better notion of privacy 01:07:15.600 |
that kind of avoids a lot of the weaknesses of anonymization notions while still letting us do 01:07:23.200 |
useful stuff with data. What's anonymization of data? So, by anonymization, I'm, you know, 01:07:28.720 |
kind of referring to techniques like I have a database, the rows of that database are, 01:07:35.360 |
let's say, individual people's medical records, okay? And I want to let people use that data, 01:07:43.520 |
maybe I want to let researchers access that data to build predictive models for some disease, 01:07:48.400 |
but I'm worried that that will leak, you know, sensitive information about specific people's 01:07:56.160 |
medical records. So, anonymization broadly refers to the set of techniques where I say, like, okay, 01:08:01.440 |
I'm first going to, like, I'm going to delete the column with people's names. I'm going to not put, 01:08:07.600 |
you know, so that would be like a redaction, right? I'm just redacting that information. 01:08:12.080 |
I am going to take ages, and I'm not going to, like, say your exact age, I'm going to say whether 01:08:17.920 |
you're, you know, zero to 10, 10 to 20, 20 to 30. I might put the first three digits of your zip 01:08:24.640 |
code but not the last two, et cetera, et cetera. And so, the idea is that through some series of 01:08:29.280 |
operations like this on the data, I anonymize it, you know, another term of art that's used is 01:08:35.280 |
removing personally identifiable information. And, you know, this is basically the most common 01:08:41.680 |
way of providing data privacy but that's in a way that still lets people access some variant form 01:08:48.960 |
of the data. >> So, at a slightly broader picture, as you talk about what does anonymization mean 01:08:55.680 |
when you have multiple databases, like with a Netflix prize when you can start combining stuff 01:09:01.280 |
together. >> So, this is exactly the problem with these notions, right, is that notions of 01:09:06.640 |
anonymization, removing personally identifiable information, the kind of fundamental conceptual 01:09:12.400 |
flaw is that, you know, these definitions kind of pretend as if the data set in question is the only 01:09:18.960 |
data set that exists in the world or that ever will exist in the future. And, of course, things 01:09:24.560 |
like the Netflix prize and many, many other examples since the Netflix prize, I think that 01:09:28.560 |
was one of the earliest ones, though, you know, you can reidentify people that were, you know, 01:09:34.640 |
that were anonymized in the data set by taking that anonymized data set and combining it with 01:09:39.760 |
other allegedly anonymized data sets and maybe publicly available information about you. 01:09:44.320 |
>> And for people who don't know, the Netflix prize was being publicly released as data. 01:09:49.360 |
So, the names from those rows were removed but what was released is the preference or the ratings 01:09:56.320 |
of what movies you like and you don't like. And from that combined with other things, 01:10:00.320 |
I think forum posts and so on, you can start to figure out the names. 01:10:03.680 |
>> Yeah, I mean, in that case, it was specifically the internet movie database. 01:10:07.920 |
>> Where lots of Netflix users publicly rate their movie, you know, their movie preferences. 01:10:13.840 |
And so, with the anonymized Netflix data, you know, it's just this phenomenon, I think, 01:10:13.840 |
we've all come to realize in the last decade or so is that just knowing a few apparently 01:10:28.720 |
irrelevant innocuous things about you can often act as a fingerprint. Like if I know, 01:10:33.600 |
you know, what rating you gave to these 10 movies and the date on which you entered these movies, 01:10:40.240 |
this is almost like a fingerprint for you in the sea of all Netflix users. 01:10:44.800 |
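A toy version of that linkage idea, with entirely fabricated names and ratings: join an "anonymized" ratings table against a public profile table on shared (movie, date) pairs and see how few entries it takes to pin someone down.

```python
# Toy linkage attack: join "anonymized" ratings with a public ratings site
# on (movie, date) pairs. All data here is fabricated for illustration.
anonymized = {  # user_id -> set of (movie, date) rating events, names removed
    "user_017": {("Heat", "2006-03-01"), ("Alien", "2006-03-04"), ("Big", "2006-03-09")},
    "user_042": {("Heat", "2006-05-20"), ("Jaws", "2006-05-21")},
}
public_profiles = {  # public site where people rate under their real names
    "Alice": {("Heat", "2006-03-01"), ("Alien", "2006-03-04"), ("Big", "2006-03-09")},
    "Bob":   {("Jaws", "2005-01-02")},
}

def reidentify(anonymized, public_profiles, min_overlap=3):
    """Match anonymized users to named public profiles by overlapping rating events."""
    matches = {}
    for uid, events in anonymized.items():
        for name, pub_events in public_profiles.items():
            if len(events & pub_events) >= min_overlap:
                matches[uid] = name  # a few shared (movie, date) pairs act as a fingerprint
    return matches

print(reidentify(anonymized, public_profiles))  # {'user_017': 'Alice'}
```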
There was just another paper on this in Science or Nature about a month ago, using something like 01:10:50.320 |
18 attributes. I mean, my favorite example of this was actually a paper from several years ago now 01:10:57.280 |
where it was shown that just from your likes on Facebook, just from the, you know, the things on 01:11:03.840 |
which you clicked on the thumbs up button on the platform, not using any information, demographic 01:11:10.000 |
information, nothing about who your friends are, just knowing the content that you had liked 01:11:15.520 |
was enough to, you know, in the aggregate accurately predict things like sexual orientation, 01:11:22.240 |
drug and alcohol use, whether you were the child of divorced parents. So we live in this era where, 01:11:28.720 |
you know, even the apparently irrelevant data that we offer about ourselves on public platforms and 01:11:34.560 |
forums often unbeknownst to us more or less acts as a signature or, you know, fingerprint. 01:11:41.360 |
And that if you can kind of, you know, do a join between that kind of data and allegedly anonymized 01:11:47.680 |
data, you have real trouble. So is there hope for any kind of privacy in a world where a few 01:11:54.080 |
likes can identify you? So there is differential privacy, right? So what is differential privacy? 01:12:01.200 |
So differential privacy basically is a kind of alternate, much stronger notion of privacy than 01:12:07.120 |
these anonymization ideas. And it, you know, it's a technical definition, but like the spirit of it 01:12:15.760 |
is we compare two alternate worlds, okay? So let's suppose I'm a researcher and I want to do, 01:12:23.200 |
you know, there's a database of medical records and one of them is yours. And I want to use that 01:12:29.920 |
database of medical records to build a predictive model for some disease. So based on people's 01:12:34.800 |
symptoms and test results and the like, I want to, you know, build a model predicting the 01:12:40.800 |
probability that people have disease. So, you know, this is the type of scientific research 01:12:44.720 |
that we would like to be allowed to continue. And in differential privacy, you ask a very 01:12:50.480 |
particular counterfactual question. We basically compare two alternatives. One is when I do this, 01:13:00.320 |
I build this model on the database of medical records, including your medical record. 01:13:06.880 |
And the other one is where I do the same exercise with the same database with just your medical 01:13:14.800 |
record removed. So basically, you know, it's two databases, one with N records in it and one with 01:13:21.680 |
N minus one records in it. The N minus one records are the same and the only one that's missing in 01:13:27.440 |
the second case is your medical record. So differential privacy basically says that 01:13:34.960 |
any harms that might come to you from the analysis in which your data was included 01:13:42.720 |
are essentially nearly identical to the harms that would have come to you if the same analysis had 01:13:50.160 |
been done without your medical record included. So in other words, this doesn't say 01:13:55.440 |
that bad things cannot happen to you as a result of data analysis. It just says that these bad 01:14:01.280 |
things were going to happen to you already, even if your data wasn't included. And to give a very 01:14:06.560 |
concrete example, right, you know, like we discussed at some length, the study that, you know, 01:14:14.000 |
in the '50s that was done that established the link between smoking and lung 01:14:18.560 |
cancer. And we make the point that like, well, if your data was used in that analysis and, you know, 01:14:25.200 |
the world kind of knew that you were a smoker because, you know, there was no stigma associated 01:14:29.520 |
with smoking before those findings, real harm might've come to you as a result of that 01:14:35.440 |
study that your data was included in. In particular, your insurer now might have a 01:14:40.080 |
higher posterior belief that you might have lung cancer and raise your premium. So you've 01:14:44.960 |
suffered economic damage. But the point is that if the same analysis had been done 01:14:51.520 |
with all the other N minus one medical records and just yours missing, 01:14:57.040 |
the outcome would have been the same. Your data wasn't idiosyncratically crucial to establishing 01:15:04.080 |
the link between smoking and lung cancer, because the link between smoking and lung cancer 01:15:08.320 |
is like a fact about the world that can be discovered with any sufficiently large 01:15:13.120 |
database of medical records. >> But that's a very low value of harm. Yeah, 01:15:17.440 |
so that's showing that very little harm is done. Great. But what is the mechanism of differential 01:15:23.360 |
privacy? So that's the kind of beautiful statement of it. But what's the mechanism by which privacy 01:15:29.440 |
is preserved? >> Yeah. So it's basically by adding noise to computations, right? So the basic idea 01:15:35.600 |
is that every differentially private algorithm, first of all, or every good differentially 01:15:41.360 |
private algorithm, every useful one is a probabilistic algorithm. So it doesn't, 01:15:46.160 |
on a given input, if you gave the algorithm the same input multiple times, it would give 01:15:51.840 |
different outputs each time from some distribution. And the way you achieve differential privacy 01:15:57.600 |
algorithmically is by kind of carefully and tastefully adding noise to a computation in the 01:16:04.160 |
right places. And to give a very concrete example, if I want to compute the average of a set of 01:16:09.920 |
numbers, right, the non-private way of doing that is to take those numbers and average them and 01:16:15.920 |
release a numerically precise value for the average, okay? In differential privacy, you wouldn't do 01:16:23.280 |
that. You would first compute that average to numerical precision, and then you'd add some 01:16:29.120 |
noise to it, right? You'd add some kind of zero mean, Gaussian or exponential noise to it, so that 01:16:36.640 |
the actual value you output is not the exact mean, but it'll be close to the mean, and 01:16:43.920 |
the noise that you add will sort of guarantee that nobody can kind of reverse engineer 01:16:49.360 |
any particular value that went into the average. >> So noise is the savior. How many algorithms can 01:16:57.200 |
be aided by adding noise? >> Yeah, so I'm a relatively recent 01:17:03.920 |
member of the differential privacy community. My co-author, Aaron Roth, is, you know, 01:17:08.880 |
really one of the founders of the field and has done a great deal of work, and I've learned 01:17:13.600 |
a tremendous amount working with him on it. >> It's a pretty grown-up field already. 01:17:17.120 |
>> Yeah, but now it's pretty mature. But I must admit, the first time I saw the definition of 01:17:20.560 |
differential privacy, my reaction was like, "Well, that is a clever definition, and it's really 01:17:25.600 |
making very strong promises." And my, you know, I first saw the definition in much earlier days, 01:17:32.560 |
and my first reaction was like, "Well, my worry about this definition would be that it's a great 01:17:37.200 |
definition of privacy, but that it'll be so restrictive that we won't really be able to use 01:17:42.080 |
it." Like, you know, we won't be able to compute many things in a differentially private way. 01:17:46.960 |
So that's one of the great successes of the field, I think, is in showing that the opposite is true, 01:17:52.320 |
and that, you know, most things that we know how to compute, absent any privacy considerations, 01:18:00.720 |
can be computed in a differentially private way. So, for example, pretty much all of statistics 01:18:05.840 |
and machine learning can be done differentially privately. So pick your favorite machine learning 01:18:11.360 |
algorithm, back propagation and neural networks, you know, cart for decision trees, support vector 01:18:17.120 |
machines, boosting, you name it, as well as classic hypothesis testing and the like in statistics. 01:18:23.360 |
None of those algorithms are differentially private in their original form. All of them have 01:18:30.800 |
modifications that add noise to the computation in different places in different ways that achieve 01:18:37.360 |
differential privacy. So this really means that to the extent that, you know, we've become a, 01:18:43.040 |
you know, a scientific community very dependent on the use of machine learning and statistical 01:18:49.760 |
modeling and data analysis, we really do have a path to kind of provide privacy guarantees to 01:18:57.440 |
those methods. And so we can still, you know, enjoy the benefits of kind of the data science era 01:19:05.360 |
while providing, you know, rather robust privacy guarantees to individuals. 01:19:09.680 |
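As a minimal sketch of the noisy-average mechanism described above, here is the standard Laplace mechanism for a bounded mean; the clipping range, epsilon, and the toy data are illustrative choices. Informally, the guarantee is that changing any one person's record shifts the output distribution by at most a factor of exp(epsilon), which is exactly the neighboring-databases comparison from the definition.

```python
import numpy as np

def dp_mean(values, lower, upper, epsilon, rng=None):
    """Differentially private mean via the Laplace mechanism.

    Each value is clipped to [lower, upper], so changing any one person's
    value moves the true mean by at most (upper - lower) / n, the sensitivity.
    Laplace noise with scale sensitivity / epsilon then gives epsilon-DP:
    the output distribution differs by at most a factor of exp(epsilon)
    between two databases that differ in a single record.
    """
    rng = rng or np.random.default_rng()
    values = np.clip(np.asarray(values, dtype=float), lower, upper)
    sensitivity = (upper - lower) / len(values)
    return values.mean() + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Toy usage: a private average age with a modest privacy budget.
ages = [34, 29, 51, 47, 38, 62, 45, 41, 55, 30]
print(dp_mean(ages, lower=0, upper=100, epsilon=0.5))  # noisy but close to the true mean
```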
So perhaps a slightly crazy question, but if we take the ideas of differential privacy and 01:19:16.720 |
take it to the nature of truth that's being explored currently, 01:19:20.480 |
so what's your most favorite and least favorite food? 01:19:24.240 |
Hmm, I'm not a real foodie, so I'm a big fan of spaghetti. 01:19:39.200 |
But is one way to protect your preference for spaghetti by having an information campaign, 01:19:46.480 |
bloggers and so on, of bots saying that you like cauliflower. So like this kind of, 01:19:52.720 |
the same kind of noise ideas. I mean, if you think of in our politics today, there's this idea of 01:19:58.720 |
Russia hacking our elections. What's meant there, I believe, is bots spreading different kinds of 01:20:06.160 |
information. Is that a kind of privacy or is that too much of a stretch? 01:20:10.240 |
No, it's not a stretch. I've not seen those ideas, you know, that is not a technique that to my 01:20:18.080 |
knowledge will provide differential privacy. But to give an example, like one very specific example 01:20:24.480 |
about what you're discussing is, there was a very interesting project at NYU, I think led by Helen 01:20:30.800 |
Nissenbaum there, in which they basically built a browser plugin that tried to essentially 01:20:39.360 |
obfuscate your Google searches. So to the extent that you're worried that Google is using your 01:20:44.640 |
searches to build, you know, predictive models about you, to decide what ads to show you, 01:20:50.400 |
which they might very reasonably want to do. But if you object to that, they built this widget you 01:20:55.600 |
could plug in. And basically, whenever you put in a query into Google, it would send that query to 01:21:00.720 |
Google. But in the background, all of the time from your browser, it would just be sending this 01:21:06.240 |
torrent of irrelevant queries to the search engine. So, you know, it's like a weed and chaff 01:21:13.440 |
thing. So, you know, out of every thousand queries, let's say, that Google was receiving from your 01:21:18.960 |
browser, one of them was one that you put in, but the other 999 were not. Okay, so it's the same 01:21:24.560 |
kind of idea, kind of, you know, privacy by obfuscation. So I think that's an interesting 01:21:30.800 |
idea. Doesn't give you differential privacy. It's also, I was actually talking to somebody at one 01:21:37.520 |
of the large tech companies recently about the fact that, you know, just this kind of thing that 01:21:43.280 |
there are some times when the response to my data needs to be very specific to my data, right? Like, 01:21:51.760 |
I type mountain biking into Google, I want results on mountain biking, and I really want Google to 01:21:57.600 |
know that I typed in mountain biking. I don't want noise added to that. And so I think there's 01:22:03.280 |
sort of maybe even interesting technical questions around notions of privacy that are appropriate 01:22:07.680 |
where, you know, it's not that my data is part of some aggregate like medical records and that we're 01:22:13.040 |
trying to discover important correlations and facts about the world at large, but rather, you 01:22:18.800 |
know, there's a service that I really want to, you know, pay attention to my specific data, yet I 01:22:24.240 |
still want some kind of privacy guarantee. And I think these kind of obfuscation ideas are sort of 01:22:28.880 |
one way of getting at that, but maybe there are others as well. So where do you think we'll land 01:22:33.440 |
in this algorithm driven society in terms of privacy? So, sort of, China, like Kai-Fu Lee 01:22:41.120 |
describes, you know, it's collecting a lot of data on its citizens, but in the best form, it's 01:22:48.160 |
actually able to, sort of, protect human rights and provide a lot of amazing services. 01:22:54.480 |
And in its worst forms, it can violate those human rights and limit services. So where do you think 01:23:01.920 |
we'll land? So algorithms are powerful when they use data. So as a society, do you think we'll give 01:23:11.040 |
over more data? Is it possible to protect the privacy of that data? 01:23:15.440 |
So I'm optimistic about the possibility of, you know, balancing the desire for individual privacy 01:23:25.120 |
and individual control of privacy with kind of societally and commercially beneficial uses of 01:23:32.880 |
data, not unrelated to differential privacy or suggestions that say like, well, individuals 01:23:38.320 |
should have control of their data. They should be able to limit the uses of that data. They should 01:23:43.760 |
even, you know, there's, you know, fledgling discussions going on in research circles about 01:23:48.960 |
allowing people selective use of their data and being compensated for it. 01:23:53.600 |
And then you get to sort of very interesting economic questions like pricing, right? And one 01:23:59.680 |
interesting idea is that maybe differential privacy would also, you know, be a conceptual 01:24:05.280 |
framework in which you could talk about the relative value of different people's data, 01:24:09.040 |
like, you know, to demystify this a little bit. If I'm trying to build a predictive model for some 01:24:14.480 |
rare disease and I'm going to use machine learning to do it, it's easy to get negative examples 01:24:21.120 |
because the disease is rare, right? But I really want to have lots of people with the disease in my 01:24:27.360 |
data set, okay? And so somehow those people's data with respect to this application is much more 01:24:34.800 |
valuable to me than just like the background population. And so maybe they should be 01:24:39.200 |
compensated more for it. And so, you know, I think these are kind of very, very fledgling 01:24:47.520 |
conceptual questions that maybe we'll have kind of technical thought on them sometime in the coming 01:24:52.560 |
years. But I do think we'll, you know, to kind of get more directly answer your question, I think 01:24:57.600 |
I'm optimistic at this point from what I've seen that we will land at some, you know, better 01:25:03.600 |
compromise than we're at right now, where again, you know, privacy guarantees are few, far between, 01:25:10.400 |
and weak, and users have very, very little control. And I'm optimistic that we'll land in something 01:25:17.440 |
that, you know, provides better privacy overall and more individual control of data and privacy. 01:25:22.560 |
But, you know, I think to get there, it's again, just like fairness, it's not going to be enough 01:25:27.680 |
to propose algorithmic solutions. There's going to have to be a whole kind of regulatory legal 01:25:32.320 |
process that prods companies and other parties to kind of adopt solutions. 01:25:38.160 |
>> And I think you've mentioned the word control a lot. And I think giving people control, 01:25:42.720 |
that's something that people don't quite have in a lot of these algorithms. And that's a really 01:25:48.160 |
interesting idea of giving them control. Some of that is actually literally an interface design 01:25:53.920 |
question, sort of just enabling, because I think it's good for everybody to give users control. 01:26:00.240 |
It's not, it's not a, it's almost not a trade-off, except that you have to hire people that are good 01:26:05.600 |
at interface design. >> Yeah, I mean, the other thing that has to be said, right, is that, you 01:26:10.800 |
know, it's a cliche, but, you know, we, as the users of many systems, platforms, and apps, you 01:26:19.200 |
know, we are the product. We are not the customer. The customers are advertisers, and our data is the 01:26:25.920 |
product, okay? So it's one thing to kind of suggest more individual control of data and privacy and 01:26:32.640 |
uses, but this, you know, if this happens in sufficient degree, it will upend the entire 01:26:40.560 |
economic model that has supported the internet to date. And so some other economic model will have 01:26:46.960 |
to be, you know, will have to replace it. >> So the idea of markets you mentioned, 01:26:52.000 |
by exposing the economic model to the people, they will then become a market, and therefore— 01:26:57.680 |
>> They could be participants in it. >> Participants in it. 01:26:59.760 |
>> And, you know, this isn't, you know, this is not a weird idea, right? Because 01:27:03.120 |
there are markets for data already. It's just that consumers are not participants in that. 01:27:08.400 |
There's like, you know, there's sort of, you know, publishers and content providers on one side that 01:27:13.280 |
have inventory, and then they're advertising on the others, and, you know, Google and Facebook 01:27:18.320 |
are running, you know, their—pretty much their entire revenue stream is by running two-sided 01:27:24.480 |
markets between those parties, right? And so it's not a crazy idea that there would be like a 01:27:29.920 |
three-sided market or that, you know, that on one side of the market or the other, we would have 01:27:35.040 |
proxies representing our interest. It's not, you know, it's not a crazy idea, but it would—it's 01:27:39.920 |
not a crazy technical idea, but it would have pretty extreme economic consequences. 01:27:47.920 |
>> Speaking of markets, a lot of fascinating aspects of this world arise not from individual 01:27:55.120 |
humans, but from the interaction of human beings. You've done a lot of work in game theory. First, 01:28:02.160 |
can you say what is game theory and how does it help us model and study things? 01:28:07.200 |
>> Yeah, game theory, of course, let us give credit where it's due. You know, it 01:28:11.760 |
comes from the economists first and foremost, but as I've mentioned before, like, you know, 01:28:16.720 |
computer scientists never hesitate to wander into other people's turf, and so there is now this 01:28:22.640 |
20-year-old field called algorithmic game theory. But, you know, game theory, 01:28:27.760 |
first and foremost, is a mathematical framework for reasoning about collective outcomes in systems 01:28:36.960 |
of interacting individuals. You know, so you need at least two people to get started in game theory, 01:28:44.400 |
and many people are probably familiar with Prisoner's Dilemma as kind of a classic example 01:28:49.840 |
of game theory and a classic example where everybody looking out for their own individual 01:28:56.000 |
interests leads to a collective outcome that's kind of worse for everybody than what might be 01:29:02.000 |
possible if they cooperated, for example. But cooperation is not an equilibrium in Prisoner's 01:29:08.240 |
Dilemma. And so my work and the field of algorithmic game theory more generally in these 01:29:14.960 |
areas kind of looks at settings in which the number of actors is potentially extraordinarily 01:29:23.520 |
large and their incentives might be quite complicated and kind of hard to model directly, 01:29:30.720 |
but you still want kind of algorithmic ways of kind of predicting what will happen or influencing 01:29:36.080 |
what will happen in the design of platforms. >> So what to you is the most beautiful idea 01:29:43.760 |
that you've encountered in game theory? >> There's a lot of them. I'm a big fan of the 01:29:48.800 |
field. I mean, you know, I mean, technical answers to that, of course, would include 01:29:54.320 |
Nash's work just establishing that, you know, there's a competitive equilibrium under very, 01:30:01.120 |
very general circumstances, which in many ways kind of put the field on a firm conceptual footing, 01:30:08.400 |
because if you don't have equilibrium, it's kind of hard to ever reason about what might happen 01:30:13.280 |
since, you know, there's just no stability. >> So just the idea that stability can emerge 01:30:18.560 |
when there's multiple... >> Or that, I mean, not that it will necessarily emerge, 01:30:22.000 |
just that it's possible, right? I mean, like the existence of equilibrium doesn't mean that 01:30:26.240 |
sort of natural iterative behavior will necessarily lead to it. 01:30:30.320 |
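To make the Prisoner's Dilemma point concrete, a small check with the usual textbook payoffs (chosen purely for illustration) confirms that mutual defection is the only profile where neither player gains by deviating alone, even though mutual cooperation pays both players more.

```python
# Prisoner's Dilemma payoffs: (row player's payoff, column player's payoff).
# Actions: 0 = cooperate, 1 = defect. Numbers are the usual textbook choice.
payoff = {
    (0, 0): (3, 3),  # both cooperate
    (0, 1): (0, 5),  # row cooperates, column defects
    (1, 0): (5, 0),
    (1, 1): (1, 1),  # both defect
}

def is_nash(a_row, a_col):
    """A profile is a Nash equilibrium if neither player gains by deviating alone."""
    row_ok = all(payoff[(a_row, a_col)][0] >= payoff[(d, a_col)][0] for d in (0, 1))
    col_ok = all(payoff[(a_row, a_col)][1] >= payoff[(a_row, d)][1] for d in (0, 1))
    return row_ok and col_ok

for profile in payoff:
    print(profile, "Nash?", is_nash(*profile))
# Only (1, 1), mutual defection, is an equilibrium, yet (0, 0) pays both players more.
```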
>> In the real world, yes. >> Yeah. Maybe answering slightly 01:30:33.360 |
less personally than you asked the question, I think within the field of algorithmic game theory, 01:30:38.320 |
perhaps the single most important kind of technical contribution that's been made is 01:30:45.120 |
the realization between close connections between machine learning and game theory, 01:30:50.640 |
and in particular between game theory and the branch of machine learning that's known as 01:30:54.640 |
no-regret learning. And this sort of provides a very general framework in which a bunch of 01:31:02.480 |
players interacting in a game or a system, each one kind of doing something that's in their self 01:31:09.120 |
interest will actually kind of reach an equilibrium, and actually reach an equilibrium in a 01:31:14.000 |
pretty, you know, a rather, you know, short amount of steps. 01:31:21.200 |
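A rough sketch of that connection, under simplifying assumptions: two players repeatedly play matching pennies, each running multiplicative weights (one standard no-regret algorithm) against the other's observed play, and their time-averaged strategies drift toward the 50/50 minimax equilibrium of this zero-sum game. The learning rate and horizon are arbitrary illustrative choices.

```python
import numpy as np

# Matching pennies: the row player wins (+1) on a match, the column player on a mismatch.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # row player's payoffs; the column player's are -A

def multiplicative_weights_selfplay(T=5000, eta=0.05):
    """Both players run multiplicative weights; their average play approaches
    the 50/50 equilibrium. Toy illustration with fixed eta and horizon."""
    w_row, w_col = np.ones(2), np.ones(2)
    avg_row, avg_col = np.zeros(2), np.zeros(2)
    for _ in range(T):
        p, q = w_row / w_row.sum(), w_col / w_col.sum()
        avg_row += p
        avg_col += q
        row_payoffs = A @ q          # expected payoff of each row action against q
        col_payoffs = -(p @ A)       # expected payoff of each column action against p
        w_row = w_row * np.exp(eta * row_payoffs)
        w_col = w_col * np.exp(eta * col_payoffs)
        w_row, w_col = w_row / w_row.sum(), w_col / w_col.sum()  # renormalize for stability
    return avg_row / T, avg_col / T

print(multiplicative_weights_selfplay())   # both averages end up near (0.5, 0.5)
```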
>> So you kind of mentioned acting greedily can somehow end up pretty good for everybody. 01:31:28.640 |
>> Or pretty bad. >> Or pretty bad. It'll end up stable. 01:31:33.680 |
>> Yeah, right. And, you know, stability or equilibrium by itself 01:31:38.800 |
is not necessarily either a good thing or a bad thing. 01:31:42.640 |
>> So what's the connection between machine learning and the ideas of equilibrium? 01:31:45.600 |
>> Well, I mean, I think we've kind of talked about these ideas already in kind of a non-technical way, 01:31:50.960 |
which is maybe the more interesting way of understanding them first, which is, you know, 01:31:56.160 |
we have many systems, platforms, and apps these days that work really hard to use our data and 01:32:04.880 |
the data of everybody else on the platform to selfishly optimize on behalf of each user, okay? 01:32:12.560 |
So, you know, let me give, I think, the cleanest example, which is just driving apps, 01:32:17.920 |
navigation apps like, you know, Google Maps and Waze, where, you know, miraculously compared to 01:32:24.080 |
when I was growing up at least, you know, the objective would be the same when you wanted to 01:32:28.800 |
drive from point A to point B, spend the least time driving, not necessarily minimize the distance, 01:32:34.320 |
but minimize the time, right? And when I was growing up, like, the only resources you had 01:32:39.200 |
to do that were, like, maps in the car, which literally just told you what roads were available. 01:32:44.640 |
And then you might have, like, half-hourly traffic reports just about the major freeways, 01:32:50.400 |
but not about side roads. So you were pretty much on your own. And now we've got these apps, 01:32:55.760 |
you pull it out and you say, "I want to go from point A to point B." And in response kind of to 01:33:00.080 |
what everybody else is doing, if you like, what all the other players in this game are doing right 01:33:05.280 |
now, here's the, you know, the route that minimizes your driving time. So it is really 01:33:11.520 |
kind of computing a selfish best response for each of us in response to what all of the rest of us 01:33:17.920 |
are doing at any given moment. And so, you know, I think it's quite fair to think of these apps as 01:33:24.080 |
driving or nudging us all towards the competitive or Nash equilibrium of that game. 01:33:32.320 |
Now you might ask, like, "Well, that sounds great. Why is that a bad thing?" Well, you know, 01:33:37.600 |
it's known both in theory and with some limited studies from actual, like, traffic data 01:33:46.240 |
that all of us being in this competitive equilibrium might cause our collective driving 01:33:53.040 |
time to be higher, maybe significantly higher than it would be under other solutions. 01:33:59.440 |
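One standard worked example of that gap is Pigou's two-road network (not taken from the book's text here, just the usual textbook numbers): one road always takes an hour, the other takes a fraction of an hour equal to the fraction of drivers using it. Selfish routing sends everyone onto the variable road, and the average trip ends up a third longer than under the best coordinated split.

```python
# Pigou's example: a unit mass of drivers travels from A to B over two roads.
# Road 1 always takes 1 hour; road 2 takes x hours if a fraction x of drivers use it.

def average_time(x):
    """Average travel time when a fraction x of drivers take the variable road."""
    return x * x + (1 - x) * 1.0   # x drivers pay x each, the rest pay 1

# Selfish equilibrium: road 2 is never worse than road 1, so everyone takes it
# and x = 1; every driver spends a full hour.
print("equilibrium average time:", average_time(1.0))   # 1.0

# Socially optimal split: minimize x^2 + (1 - x), achieved at x = 0.5.
print("optimal average time:", average_time(0.5))        # 0.75
```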
And then you have to talk about what those other solutions might be and what 01:34:02.480 |
the algorithms to implement them are, which we do discuss in the kind of game theory chapter 01:34:07.440 |
of the book. But similarly, you know, on social media platforms or on Amazon, you know, all these 01:34:15.440 |
algorithms that are essentially trying to optimize on our behalf, they're driving us in a colloquial 01:34:22.000 |
sense towards some kind of competitive equilibrium. And, you know, one of the most important lessons 01:34:26.880 |
of game theory is that just because we're at equilibrium doesn't mean that there's not a 01:34:30.480 |
solution in which some or maybe even all of us might be better off. And then the connection to 01:34:36.320 |
machine learning, of course, is that in all these platforms I've mentioned, the optimization that 01:34:41.520 |
they're doing on our behalf is driven by machine learning, you know, like predicting where the 01:34:45.680 |
traffic will be, predicting what products I'm going to like, predicting what would make me 01:34:49.520 |
happy in my news feed. Now, in terms of the stability and the promise of that, I have to ask, 01:34:55.360 |
just out of curiosity, how stable are these mechanisms that you, game theory is just, 01:35:00.400 |
the economists came up with, and we all know that economists don't live in the real world, 01:35:05.280 |
just kidding. Sort of what's, do you think when we look at the fact that we haven't blown ourselves 01:35:12.800 |
up from a game theoretic concept of mutually assured destruction, what are the odds that we 01:35:20.560 |
destroy ourselves with nuclear weapons as one example of a stable game theoretic system? 01:35:26.880 |
>>Just to prime your viewers a little bit, I mean, I think you're referring to the fact that 01:35:32.160 |
game theory was taken quite seriously back in the '60s as a tool for reasoning about kind of Soviet 01:35:39.200 |
US nuclear armament, disarmament, detente, things like that. I'll be honest, as huge of a fan as I 01:35:49.120 |
am of game theory and its kind of rich history, it still surprises me that you had people at the 01:35:56.000 |
Rand Corporation back in those days kind of drawing up two by two tables where the row 01:36:01.200 |
player is the US and the column player is Russia, and that they were taking it seriously. I'm sure if 01:36:08.240 |
I'd been there at the time, maybe it wouldn't have seemed as naive as it does now. >>It seems to have 01:36:13.200 |
worked, which is why it seems naive and silly. >>Well, we're still here. >>We're still here in 01:36:17.280 |
that sense. >>Yeah. Even though I kind of laugh at those efforts, they were more sensible then 01:36:22.400 |
than they would be now, right? Because there were sort of only two nuclear powers at the time, 01:36:26.560 |
and you didn't have to worry about deterring new entrants and who was developing the capacity. 01:36:32.400 |
And so we have many, we have this, it's definitely a game with more players now and more potential 01:36:39.200 |
entrants. I'm not in general somebody who advocates using kind of simple mathematical models when the 01:36:46.320 |
stakes are as high as things like that, and the complexities are very political and social, 01:36:52.320 |
but we are still here. >>So you've worn many hats, one of which, the one that first caused 01:36:59.600 |
me to become a big fan of your work many years ago is algorithmic trading. So I have to just 01:37:06.000 |
ask a question about this because you have so much fascinating work there. In the 21st century, 01:37:11.120 |
what role do you think algorithms have in the space of trading, investment in the financial sector? 01:37:17.920 |
>>Yeah, it's a good question. I mean, in the time I've spent on Wall Street and in finance, 01:37:26.160 |
I've seen a clear progression, and I think it's a progression that kind of models the use of 01:37:31.680 |
algorithms and automation more generally in society, which is the things that kind of get 01:37:39.200 |
taken over by the algos first are sort of the things that computers are obviously better at 01:37:45.520 |
than people, right? So first of all, there needed to be this era of automation, right, where just 01:37:52.560 |
financial exchanges became largely electronic, which then enabled the possibility of trading 01:38:00.160 |
becoming more algorithmic because once the exchanges are electronic, an algorithm can 01:38:05.520 |
submit an order through an API just as well as a human can do at a monitor. 01:38:09.040 |
>>It can do it really quickly. It can read all the data. 01:38:10.960 |
>>Yeah. And so I think the places where algorithmic trading have had the greatest inroads 01:38:18.640 |
and had the first inroads were in kind of execution problems, kind of optimized execution 01:38:24.080 |
problems. So what I mean by that is at a large brokerage firm, for example, one of the lines of 01:38:30.000 |
business might be on behalf of large institutional clients taking what we might consider difficult 01:38:36.800 |
trades. So it's not like a mom and pop investor saying, "I want to buy 100 shares of Microsoft." 01:38:41.760 |
It's a large hedge fund saying, "I want to buy a very, very large stake in Apple, and I want to 01:38:48.720 |
do it over the span of a day." And it's such a large volume that if you're not clever about how 01:38:54.160 |
you break that trade up, not just over time, but over perhaps multiple different electronic 01:38:59.280 |
exchanges that all let you trade Apple on their platform, you'll push prices around in a way that 01:39:06.560 |
hurts your execution. So this is an optimization problem. This is a control problem. And so 01:39:14.480 |
machines are better. We know how to design algorithms that are better at that kind of 01:39:21.200 |
thing than a person is going to be able to do because we can take volumes of historical and 01:39:26.160 |
real-time data to kind of optimize the schedule with which we trade. And similarly, high-frequency 01:39:32.480 |
trading, which is closely related but not the same as optimized execution, where you're just 01:39:38.960 |
trying to spot very, very temporary mispricings between exchanges or within an asset itself, 01:39:47.360 |
or just predict directional movement of a stock because of the kind of very, very low-level, 01:39:53.200 |
granular buying and selling data in the exchange, machines are good at this kind of stuff. 01:39:59.280 |
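A toy sketch of the optimized-execution idea, with made-up numbers: slice a large parent order across time buckets in proportion to an expected intraday volume profile, capped at a participation rate so no single slice pushes the price around. Real execution schedulers are far more involved; this only illustrates the scheduling shape.

```python
def schedule_order(total_shares, volume_profile, max_participation=0.1):
    """Split a parent order into per-bucket child orders.

    volume_profile: expected market volume in each time bucket (e.g. half hours).
    Each child order is proportional to expected volume, capped at a fixed
    participation rate so we never dominate a bucket. Toy illustration only.
    """
    slices = []
    remaining = total_shares
    total_volume = sum(volume_profile)
    for expected_volume in volume_profile:
        target = total_shares * expected_volume / total_volume   # VWAP-style weight
        child = int(round(min(target, max_participation * expected_volume, remaining)))
        slices.append(child)
        remaining -= child
    if remaining > 0:
        slices[-1] += remaining   # whatever couldn't be placed spills into the last bucket
    return slices

# toy usage: buy 50,000 shares over a day with a U-shaped volume profile
profile = [120_000, 80_000, 60_000, 50_000, 60_000, 90_000, 140_000]
print(schedule_order(50_000, profile))
```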
It's kind of like the mechanics of trading. What about the... Can machines do long-term prediction? 01:40:07.600 |
Yeah. So I think we are in an era where clearly there have been some very successful 01:40:12.480 |
quant hedge funds that are in what we would still traditionally call the stat-arb space. 01:40:23.520 |
Stat-arb referring to statistical arbitrage. But for the purposes of this conversation, 01:40:28.160 |
what it really means is making directional predictions in asset price movement or returns. 01:40:34.400 |
Your prediction about that directional movement is good for... You have a view that it's valid for 01:40:42.320 |
some period of time between a few seconds and a few days. And that's the amount of time that 01:40:48.480 |
you're going to kind of get into the position, hold it, and then hopefully be right about the 01:40:52.160 |
directional movement and buy low and sell high as the cliche goes. So that is kind of a sweet spot, 01:41:00.800 |
I think, for quant trading and investing right now and has been for some time. 01:41:06.000 |
When you really get to kind of more Warren Buffett-style time scales, like my cartoon 01:41:13.520 |
of Warren Buffett is that Warren Buffett sits and thinks what the long-term value of 01:41:18.720 |
Apple really should be. And he doesn't even look at what Apple is doing today. 01:41:23.360 |
He just decides, "I think that this is what its long-term value is, and it's far from that right 01:41:29.600 |
now. And so I'm going to buy some Apple or short some Apple, and I'm going to sit on that for 10 01:41:36.000 |
or 20 years." Okay. So when you're at that kind of time scale or even more than just a few days, 01:41:43.360 |
you raise all kinds of other sources of risk and information. So now you're talking about 01:41:50.640 |
holding things through recessions and economic cycles. Wars can break out. 01:41:55.840 |
So there you have to understand human nature at a level that— 01:41:58.800 |
Yeah. And you need to just be able to ingest many, many more sources of data that are on 01:42:03.920 |
wildly different time scales, right? So if I'm an HFT, I'm a high-frequency trader, 01:42:11.280 |
I really—my main source of data is just the data from the exchanges themselves about the activity 01:42:17.040 |
in the exchanges, right? And maybe I need to pay—I need to keep an eye on the news, right? Because 01:42:22.560 |
that can cause sudden—the CEO gets caught in a scandal or gets run over by a bus or something 01:42:30.240 |
that can cause very sudden changes. But I don't need to understand economic cycles. I don't need 01:42:36.480 |
to understand recessions. I don't need to worry about the political situation or war breaking out 01:42:41.920 |
in this part of the world because all I need to know is as long as that's not going to happen 01:42:46.320 |
in the next 500 milliseconds, then my model is good. When you get to these longer time scales, 01:42:53.840 |
you really have to worry about that kind of stuff. And people in the machine learning community are 01:42:57.360 |
starting to think about this. We held a—we jointly sponsored a workshop at Penn with the 01:43:05.040 |
Federal Reserve Bank of Philadelphia a little more than a year ago on—I think the title was something 01:43:09.920 |
like Machine Learning for Macroeconomic Prediction, macroeconomic referring specifically to these 01:43:16.960 |
longer time scales. And it was an interesting conference, but it left me with greater confidence 01:43:26.560 |
that we have a long way to go to—and so I think that people that—in the grand scheme of things, 01:43:34.160 |
so somebody asked me like, "Well, whose job on Wall Street is safe from the bots?" I think people 01:43:39.200 |
that are at that longer time scale and have that appetite for all the risks involved in long-term 01:43:44.880 |
investing and that really need kind of not just algorithms that can optimize from data, but they 01:43:50.880 |
need views on stuff. They need views on the political landscape, economic cycles and the like. 01:43:56.960 |
And I think they're pretty safe for a while as far as I can tell. 01:44:02.400 |
So Warren Buffett's job is safe for a little while. 01:44:04.400 |
Yeah, I'm not seeing a robo-Warren Buffett anytime soon. 01:44:08.080 |
Should give him comfort. Last question. If you could go back to—if 01:44:13.920 |
there's a day in your life you could relive because it made you truly happy, 01:44:27.200 |
What day would it be? Can you look back, you remember just being profoundly transformed in 01:44:38.080 |
I'll answer a slightly different question, which is like, what's a day in my life or my career 01:44:48.020 |
I went straight from undergrad to doctoral studies, and that's not at all atypical. 01:44:55.600 |
And I'm also from an academic family. Like my dad was a professor, my uncle on his side 01:45:00.560 |
is a professor, both my grandfathers were professors. 01:45:05.440 |
Yeah, they're kind of all over the map, yeah. And I was a grad student here just up the river 01:45:11.120 |
at Harvard and came to study with Les Valiant, which was a wonderful experience. But I remember 01:45:16.640 |
my first year of graduate school, I was generally pretty unhappy. And I was unhappy because at 01:45:23.360 |
Berkeley as an undergraduate, yeah, I studied a lot of math and computer science, but it 01:45:28.240 |
was a huge school, first of all. And I took a lot of other courses, as we discussed, I 01:45:31.920 |
started as an English major and took history courses and art history classes and had friends 01:45:38.960 |
And Harvard's a much smaller institution than Berkeley, and its computer science department, 01:45:44.800 |
especially at that time, was a much smaller place than it is now. And I suddenly just 01:45:49.760 |
felt very, like I'd gone from this very big world to this highly specialized world. 01:45:55.760 |
And now all of the classes I was taking were computer science classes, and I was only in 01:46:01.280 |
classes with math and computer science people. And so I was, I thought often in that first 01:46:08.480 |
year of grad school about whether I really wanted to stick with it or not. And I thought 01:46:13.600 |
like, "Oh, I could stop with a master's, I could go back to the Bay Area and to California, 01:46:18.960 |
and this was in one of the early periods where there was, you could definitely get a relatively 01:46:24.880 |
good job, a paying job at one of what were the big tech companies back then. 01:46:31.200 |
And so I distinctly remember like kind of a late spring day when I was kind of sitting 01:46:36.960 |
in Boston Common and kind of really just kind of chewing over what I wanted to do in my 01:46:40.640 |
life. And then I realized like, "Okay," and I think this is where my academic background 01:46:45.120 |
helped me a great deal. I sort of realized, "Yeah, you're not having a great time 01:46:48.880 |
right now, this feels really narrowing, but you know that you're here for research eventually, 01:46:54.320 |
and to do something original, and to try to carve out a career where you kind of choose 01:47:01.920 |
what you want to think about and have a great deal of independence." 01:47:05.280 |
And so at that point, I really didn't have any real research experience yet. I mean, 01:47:10.800 |
it was trying to think about some problems with very little success, but I knew that 01:47:15.600 |
like I hadn't really tried to do the thing that I knew I'd come to do. And so I thought, 01:47:23.120 |
you know, "I'm going to stick through it for the summer," and that was very formative 01:47:30.080 |
because I went from kind of contemplating quitting to, you know, a year later, it being 01:47:37.120 |
very clear to me I was going to finish because I still had a ways to go, but I kind of started 01:47:42.240 |
doing research, it was going well, it was really interesting, and it was sort of a complete 01:47:46.560 |
transformation. You know, and it's just that transition that I think every doctoral student 01:47:52.080 |
makes at some point, which is to sort of go from being like a student of what's been done before 01:47:59.040 |
to doing, you know, your own thing and figure out what makes you interested and what your 01:48:04.080 |
strengths and weaknesses are as a researcher. And once, you know, I kind of made that decision 01:48:09.200 |
on that particular day at that particular moment in Boston Common, you know, I'm glad 01:48:15.040 |
I made that decision. And also just accepting the painful nature of that journey. Yeah, 01:48:19.520 |
yeah, exactly, exactly. And in that moment said, "I'm gonna stick it out." Yeah, 01:48:24.400 |
I'm gonna stick around for a while. Well, Michael, I've looked up to your work for a 01:48:29.200 |
long time, it's really an honor to talk to you. Thank you so much for doing it. It's 01:48:31.120 |
great to get back in touch with you too and see how great you're doing as well. So thanks a lot,