
Michael Kearns: Algorithmic Fairness, Privacy & Ethics | Lex Fridman Podcast #50


Chapters

0:00
47:31 Social Media Platforms
67:06 What Is Differential Privacy
67:24 Anonymization of Data
67:57 Anonymization
71:58 Differential Privacy
75:21 Mechanism of Differential Privacy
88:00 Game Theory
88:24 Algorithmic Game Theory
88:46 Prisoner's Dilemma
97:01 Algorithmic Trading


00:00:00.000 | The following is a conversation with Michael Kearns. He's a professor at the University of
00:00:05.040 | Pennsylvania and a co-author of the new book, The Ethical Algorithm, that is the focus of much of
00:00:10.960 | this conversation. It includes algorithmic fairness, bias, privacy, and ethics in general.
00:00:18.080 | But that is just one of many fields that Michael is a world-class researcher in,
00:00:22.480 | some of which we touch on quickly, including learning theory or the theoretical foundation
00:00:27.920 | of machine learning, game theory, quantitative finance, computational social science, and much
00:00:33.200 | more. But on a personal note, when I was an undergrad, early on, I worked with Michael
00:00:39.520 | on an algorithmic trading project and competition that he led. That's when I first fell in love with
00:00:44.960 | algorithmic game theory. While most of my research life has been in machine learning and human-robot
00:00:50.320 | interaction, the systematic way that game theory reveals the beautiful structure in our competitive
00:00:56.640 | and cooperating world of humans has been a continued inspiration to me. So for that
00:01:02.320 | and other things, I'm deeply thankful to Michael and really enjoyed having this conversation
00:01:08.480 | again in person after so many years. This is the Artificial Intelligence Podcast. If you enjoy it,
00:01:15.280 | subscribe on YouTube, give it five stars on Apple Podcasts, support on Patreon,
00:01:20.160 | or simply connect with me on Twitter @lexfridman, spelled F-R-I-D-M-A-N. This episode is supported
00:01:27.760 | by an amazing podcast called Pessimists Archive. Jason, the host of the show, reached out to me
00:01:34.080 | looking to support this podcast, and so I listened to it to check it out. And by listened, I mean I
00:01:40.320 | went through it Netflix binge-style at least five episodes in a row. It's now one of my favorite
00:01:46.480 | podcasts, and I think it should be one of the top podcasts in the world, frankly. It's a history show
00:01:52.320 | about why people resist new things. Each episode looks at a moment in history when something new
00:01:57.760 | was introduced, something that today we think of as commonplace, like recorded music, umbrellas,
00:02:03.360 | bicycles, cars, chess, coffee, the elevator, and the show explores why it freaked everyone out.
00:02:09.600 | The latest episode on mirrors and vanity still stays with me as I think about vanity in the
00:02:15.440 | modern day of the Twitter world. That's the fascinating thing about this show, is that
00:02:21.200 | stuff that happened long ago, especially in terms of our fear of new things, repeats itself in the
00:02:26.000 | modern day and so has many lessons for us to think about in terms of human psychology and the role
00:02:31.600 | of technology in our society. Anyway, you should subscribe and listen to Pessimists Archive.
00:02:37.760 | I highly recommend it. And now, here's my conversation with Michael Kearns.
00:02:44.880 | You mentioned reading Fear and Loathing in Las Vegas in high school and having a bit
00:02:51.280 | more of a literary mind. So, what books, non-technical, non-computer science, would you
00:02:57.840 | say had the biggest impact on your life, either intellectually or emotionally?
00:03:03.040 | You've dug deep into my history, I see.
00:03:04.880 | Went deep.
00:03:05.760 | Yeah, I think, well, my favorite novel is Infinite Jest by David Foster Wallace, which
00:03:12.720 | actually, coincidentally, much of it takes place in the halls of buildings right around us here at
00:03:17.280 | MIT. So, that certainly had a big influence on me. And as you noticed, like when I was in high
00:03:23.360 | school, I actually even started college as an English major. So, I was very influenced by sort
00:03:29.120 | of that genre of journalism at the time and thought I wanted to be a writer and then realized that
00:03:33.600 | an English major teaches you to read, but it doesn't teach you how to write. And then I became
00:03:37.440 | interested in math and computer science instead.
00:03:40.080 | Well, in your new book, The Ethical Algorithm, you kind of sneak up from an algorithmic perspective
00:03:47.280 | on these deep, profound philosophical questions of fairness, of privacy. In thinking about these
00:03:57.440 | topics, how often do you return to that literary mind that you had?
00:04:02.800 | Yeah, I'd like to claim there was a deeper connection, but I think both Aaron and I kind
00:04:09.520 | of came at these topics first and foremost from a technical angle. I mean, I kind of consider myself
00:04:15.440 | primarily and originally a machine learning researcher. And I think as we just watched,
00:04:21.280 | like the rest of the society, the field technically advance, and then quickly on the heels of that,
00:04:26.000 | kind of the buzzkill of all of the antisocial behavior by algorithms, just kind of realized
00:04:31.440 | there was an opportunity for us to do something about it from a research perspective. More to
00:04:37.600 | the point of your question, I mean, I do have an uncle who is literally a moral philosopher.
00:04:43.680 | And so in the early days of our technical work on fairness topics, I would occasionally run ideas
00:04:49.760 | by him. So I mean, I remember an early email I sent to him in which I said like, "Oh, here's a
00:04:54.400 | specific definition of algorithmic fairness that we think is some sort of variant of Rawlsian
00:05:00.080 | fairness. What do you think?" And I thought I was asking a yes or no question, and I got back a
00:05:06.720 | kind of classical philosopher's response. "Well, it depends. If you look at it this way, then you
00:05:11.200 | might conclude this." And that's when I realized that there was a real kind of rift between the
00:05:18.960 | ways philosophers and others had thought about things like fairness from sort of a humanitarian
00:05:24.640 | perspective and the way that you needed to think about it as a computer scientist if you were going
00:05:29.680 | to kind of implement actual algorithmic solutions. - But I would say the algorithmic solutions take
00:05:38.880 | care of some of the low-hanging fruit. Sort of the problem is a lot of algorithms, when they don't
00:05:45.120 | consider fairness, they are just terribly unfair. And when they don't consider privacy, they're
00:05:51.840 | terribly, they violate privacy. Sort of the algorithmic approach fixes big problems. But
00:05:59.840 | there is still, when you start pushing into the gray area, that's when you start getting into
00:06:05.120 | this philosophy of what it means to be fair, starting from Plato, what is justice kind of
00:06:11.360 | questions. - Yeah, I think that's right. And I mean, I would even not go as far as you went to
00:06:16.720 | say that sort of the algorithmic work in these areas is solving like the biggest problems. And
00:06:22.880 | we discuss in the book the fact that really we are, there's a sense in which we're kind of looking
00:06:28.400 | where the light is in that, for example, if police are racist in who they decide to stop and frisk,
00:06:36.960 | and that goes into the data, there's sort of no undoing that downstream by kind of clever
00:06:42.640 | algorithmic methods. And I think, especially in fairness, I mean, I think less so in privacy,
00:06:49.840 | where we feel like the community kind of really has settled on the right definition,
00:06:54.240 | which is differential privacy. If you just look at the algorithmic fairness literature already,
00:06:59.280 | you can see it's gonna be much more of a mess. And you've got these theorems saying,
00:07:03.440 | here are three entirely reasonable, desirable notions of fairness. And here's a proof that
00:07:11.760 | you cannot simultaneously have all three of them. So I think we know that algorithmic fairness
00:07:18.160 | compared to algorithmic privacy is gonna be kind of a harder problem. And it will have to revisit,
00:07:23.680 | I think, things that have been thought about by many generations of scholars before us.
00:07:29.040 | So it's very early days for fairness, I think. - So before we get into the details of differential
00:07:35.200 | privacy and on the fairness side, let me linger on the philosophy a bit. Do you think most people
00:07:40.800 | are fundamentally good? Or do most of us have both the capacity for good and evil within us?
00:07:48.400 | - I mean, I'm an optimist. I tend to think that most people are good and want to do right. And
00:07:55.920 | that deviations from that are kind of usually due to circumstance, not due to people being bad at
00:08:02.640 | heart. - With people with power,
00:08:05.520 | are people at the heads of governments, people at the heads of companies, people at the heads of
00:08:11.680 | maybe, financial power, markets... Do you think the distribution there is also most people are
00:08:19.760 | good and have good intent? - Yeah, I do. I mean, my statement wasn't
00:08:24.080 | qualified to people not in positions of power. I mean, I think, you know, there's the cliche
00:08:31.600 | about absolute power corrupting absolutely. I mean, I think even short of that, having spent a lot of
00:08:38.800 | time on Wall Street and also in arenas very, very different from Wall Street, like academia,
00:08:45.120 | one of the things I think I've benefited from by moving between two very different worlds is you
00:08:52.960 | become aware that these worlds kind of develop their own social norms and they develop their
00:08:59.280 | own rationales for behavior, for instance, that might look unusual to outsiders. But when you're
00:09:05.920 | in that world, it doesn't feel unusual at all. And I think this is true of a lot of professional
00:09:05.920 | cultures, for instance. And so then, maybe slippery slope is too strong of a word, but you're
00:09:20.240 | in some world where you're mainly around other people with the same kind of viewpoints and
00:09:24.880 | training and worldview as you. And I think that's more of a source of abuses of power
00:09:33.280 | than sort of there being good people and evil people and that somehow the evil people are the
00:09:40.960 | ones that somehow rise to power. - That's really interesting. So it's
00:09:44.640 | within the social norms constructed by that particular group of people, you're all trying
00:09:51.280 | to do good, but because it's a group, you might drift into something that, for the broader population,
00:09:57.600 | does not align with the values of society. That's the worry.
00:10:01.600 | - Yeah, I mean, or not that you drift, but even the things that don't make sense to the outside
00:10:08.160 | world don't seem unusual to you. So it's not sort of like a good or a bad thing, but, you know,
00:10:14.000 | like, so for instance, you know, in the world of finance, right, there's a lot of complicated types
00:10:20.160 | of activity that, if you are not immersed in that world, you cannot see what the purpose of
00:10:25.760 | that activity is at all. It just seems like, you know, completely useless and
00:10:30.800 | people just like, you know, pushing money around. And when you're in that world, right, and you
00:10:36.080 | learn more, your view does become more nuanced, right? You realize, okay, there is actually a
00:10:41.200 | function to this activity. And in some cases you would conclude that actually if magically we could
00:10:47.680 | eradicate this activity tomorrow, it would come back because it actually is like serving some
00:10:53.280 | useful purpose. It's just a useful purpose that's very difficult for outsiders to see. And so I
00:10:59.520 | think, you know, lots of professional work environments or cultures, as I might put it,
00:11:05.280 | kind of have these social norms that, you know, don't make sense to the outside world. Academia
00:11:10.640 | is the same, right? I mean, lots of people look at academia and say, you know, what the hell are
00:11:14.800 | all of you people doing? Why are you paid so much in some cases at taxpayer expenses to do, you know,
00:11:21.840 | to publish papers that nobody reads? You know, but when you're in that world, you come to see
00:11:26.640 | the value for it. And, but even though you might not be able to explain it to, you know, the person
00:11:31.360 | in the street. Right. And in the case of the financial sector, tools like credit might not
00:11:38.160 | make sense to people. Like it's a good example of something that does seem to pop up and be useful,
00:11:43.120 | or just the power of markets and capitalism in general. Yeah. And in finance, I think the primary
00:11:48.880 | example I would give is leverage, right? So being allowed to borrow, to sort of use 10 times as much
00:11:56.640 | money as you actually have, right? So that's an example of something that, before I had
00:12:00.640 | any experience in financial markets, I might've looked at and said, well, what is the purpose of
00:12:05.040 | that? That just seems very dangerous. And it is dangerous and it has proven dangerous. But, you
00:12:10.960 | know, if the fact of the matter is that, you know, sort of on some particular timescale, you are
00:12:17.280 | holding positions that are, you know, very unlikely to lose their value,
00:12:23.920 | like your value at risk or variance is like one or 5%, then it kind of makes sense that
00:12:30.400 | you would be allowed to use a little bit more than you have, because you have, you know, some
00:12:35.360 | confidence that you're not going to lose it all in a single day. Now, of course, when that happens,
00:12:41.520 | we've seen what happens, you know, not too long ago. But, you know, but the idea that it serves
00:12:48.800 | no useful economic purpose under any circumstances is definitely not true.
00:12:54.720 | We'll return to the other side of the coast, Silicon Valley, and the problems there as we
00:13:00.400 | talk about privacy, as we talk about fairness. At the high level, and I'll ask some sort of basic
00:13:08.400 | questions with the hope to get at the fundamental nature of reality. But from a very high level,
00:13:15.120 | what is an ethical algorithm? So I can say that an algorithm has a running time of, using big O
00:13:22.400 | notation, N log N. I can say that a machine learning algorithm classified cat versus dog
00:13:29.360 | with 97% accuracy. Do you think there will one day be a way to measure sort of in the same
00:13:44.000 | compelling way as the big O notation, saying this algorithm is 97% ethical?
00:13:44.000 | First of all, let me riff for a second on your specific N log N example. So because early in
00:13:50.480 | the book, when we're just kind of trying to describe algorithms, period, we say like, okay,
00:13:55.360 | what's an example of an algorithm or an algorithmic problem? First of all, it's sorting,
00:14:00.640 | right? You have a bunch of index cards with numbers on them and you want to sort them.
00:14:04.320 | And we describe an algorithm that sweeps all the way through, finds the smallest number,
00:14:09.520 | puts it at the front, then sweeps through again, finds the second smallest number.
00:14:13.280 | So we make the point that this is an algorithm, and it's also a bad algorithm in the sense that
00:14:19.280 | it's quadratic rather than N log N, which we know is kind of optimal for sorting. And we make the
00:14:26.000 | point that sort of like, so even within the confines of a very precisely specified problem,
00:14:32.400 | there might be many, many different algorithms for the same problem with different properties.
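
To make the contrast concrete, here is a minimal Python sketch (not code from the book) of the quadratic "sweep for the smallest card" algorithm Kearns describes, next to Python's built-in O(n log n) sort:

```python
# Sketch of the quadratic algorithm described above: repeatedly sweep the
# unsorted remainder, find the smallest element, and move it to the front.
def selection_sort(cards):
    cards = list(cards)                      # work on a copy
    for i in range(len(cards)):
        smallest = i
        for j in range(i + 1, len(cards)):   # sweep the rest of the deck
            if cards[j] < cards[smallest]:
                smallest = j
        cards[i], cards[smallest] = cards[smallest], cards[i]
    return cards                             # O(n^2) comparisons overall

print(selection_sort([42, 7, 19, 3]))        # [3, 7, 19, 42]
print(sorted([42, 7, 19, 3]))                # built-in O(n log n) sort, same answer
```
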
00:14:38.880 | Like some might be faster in terms of running time, some might use less memory, some might have
00:14:45.120 | better distributed implementations. And so the point is that already we're used to,
00:14:50.800 | you know, in computer science, thinking about trade-offs between different types of quantities
00:14:57.040 | and resources, and there being, you know, better and worse algorithms. And our book is about that
00:15:06.000 | part of algorithmic ethics that we know how to kind of put on that same kind of quantitative
00:15:13.920 | footing right now. So, you know, just to say something that our book is not about, our book
00:15:19.440 | is not about kind of broad, fuzzy notions of fairness. It's about very specific notions of
00:15:26.080 | fairness. There's more than one of them. There are tensions between them, right? But if you pick one
00:15:33.520 | of them, you can do something akin to saying that this algorithm is 97% ethical. You can say,
00:15:41.280 | for instance, the, you know, for this lending model, the false rejection rate on black people
00:15:48.240 | and white people is within 3%, right? So we might call that a 97% ethical algorithm, and a 100%
00:15:56.800 | ethical algorithm would mean that that difference is 0%.
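
As a toy illustration of the kind of audit this implies (the data, group labels, and helper names below are invented for illustration, not taken from the book), one could compute the gap in false rejection rates between two groups:

```python
import numpy as np

def false_rejection_rate(y_true, y_pred):
    # Fraction of truly creditworthy applicants (y_true == 1) who were denied (y_pred == 0).
    creditworthy = y_true == 1
    return np.mean(y_pred[creditworthy] == 0)

def fairness_gap(y_true, y_pred, group):
    # Largest difference in false rejection rates across the given groups.
    rates = [false_rejection_rate(y_true[group == g], y_pred[group == g])
             for g in np.unique(group)]
    return max(rates) - min(rates)

# Invented toy data: 1 = creditworthy / approved, 0 = not.
y_true = np.array([1, 1, 1, 1, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 0, 0, 0, 1])
group  = np.array(["A", "A", "A", "B", "B", "B", "A", "B"])

# ~0.33 on this toy data; a gap of 0.03 would correspond to the loose
# "97% ethical" framing above, and 0.0 to the "100% ethical" case.
print(fairness_gap(y_true, y_pred, group))
```
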
00:16:00.160 | >> In that case, fairness is specified when two groups, however they're defined, are given to you.
00:16:07.120 | >> That's right.
00:16:08.320 | >> So then you can sort of mathematically start describing the algorithm. But
00:16:13.040 | nevertheless, the part where the two groups are given to you is unlike running time.
00:16:23.040 | You know, we don't in computer science talk about how fast an algorithm feels when it runs.
00:16:29.840 | >> True.
00:16:30.560 | >> We measure it, and ethical starts getting into feelings. So for example, an algorithm runs,
00:16:37.440 | you know, if it runs in the background, it doesn't disturb the performance of my system,
00:16:41.920 | it'll feel nice, I'll be okay with it. But if it overloads the system, it'll feel unpleasant.
00:16:46.800 | So in that same way, ethics, there's a feeling of how socially acceptable it is, how does it
00:16:52.400 | represent the moral standards of our society today? So in that sense, and sorry to linger on
00:16:59.120 | that first high-level philosophical question, is do you have a sense we'll be able to measure how
00:17:04.880 | ethical an algorithm is?
00:17:06.640 | >> First of all, I didn't, certainly didn't mean to give the impression that you can kind of measure,
00:17:11.840 | you know, memory speed trade-offs, you know, and that there's a complete, you know,
00:17:17.200 | mapping from that on to kind of fairness, for instance, or ethics and accuracy, for example.
00:17:23.760 | In the type of fairness definitions that are largely the objects of study today and starting
00:17:30.480 | to be deployed, you as the user of the definitions, you need to make some hard decisions before you
00:17:36.800 | even get to the point of designing fair algorithms. One of them, for instance, is deciding
00:17:43.840 | who it is that you're worried about protecting, who you're worried about being harmed by,
00:17:50.000 | for instance, some notion of discrimination or unfairness. And then you need to also decide
00:17:55.360 | what constitutes harm. So for instance, in a lending application, maybe you decide that,
00:18:01.360 | you know, falsely rejecting a creditworthy individual, you know, sort of a false negative,
00:18:07.600 | is the real harm, and that false positives, i.e. people that are not creditworthy or are not going
00:18:13.840 | to repay your loan, that get a loan, you might think of them as lucky. And so that's not a harm,
00:18:19.600 | although it's not clear that if you don't have the means to repay a loan, that being given a loan
00:18:25.600 | is not also a harm. So, you know, the literature is sort of so far quite limited in that you sort
00:18:34.080 | of need to say, who do you want to protect and what would constitute harm to that group?
00:18:38.000 | And when you ask questions like, will algorithms feel ethical, one way in which they won't under
00:18:44.960 | the definitions that I'm describing is if, you know, if you are an individual who is falsely
00:18:50.800 | denied a loan, incorrectly denied a loan, all of these definitions basically say like, well,
00:18:56.720 | you know, your compensation is the knowledge that we are also falsely denying loans to other people,
00:19:03.520 | you know, in other groups at the same rate that we're doing it to you. And, you know,
00:19:08.560 | and so there is actually this interesting, even technical tension in the field right now between
00:19:14.400 | these sort of group notions of fairness and notions of fairness that might actually feel
00:19:20.240 | like real fairness to individuals, right? They might really feel like their particular interests
00:19:25.920 | are being protected or thought about by the algorithm rather than just, you know, the groups
00:19:32.080 | that they happen to be members of. - Is there parallels to the big O notation of worst case
00:19:38.000 | analysis? So is it important to, looking at the worst violation of fairness for an individual,
00:19:46.400 | is it important to minimize that one individual? So like worst case analysis, is that something
00:19:51.760 | you think about or? - I mean, I think we're not even at the point where we can sensibly
00:19:56.480 | think about that. So first of all, you know, we're talking here both about fairness applied at the
00:20:03.120 | group level, which is a relatively weak thing, but it's better than nothing. And also the more
00:20:10.240 | ambitious thing of trying to give some individual promises. But even that doesn't incorporate,
00:20:17.120 | I think, something that you're hinting at here is what I might have called subjective fairness,
00:20:20.880 | right? So a lot of the definitions, I mean, all of the definitions in the algorithmic fairness
00:20:25.920 | literature are what I would kind of call received wisdom definitions. It's sort of, you know,
00:20:30.400 | somebody like me sits around and thinks like, okay, you know, I think here's a technical
00:20:34.880 | definition of fairness that I think people should want, or that they should, you know,
00:20:39.440 | think of as some notion of fairness, maybe not the only one, maybe not the best one,
00:20:43.280 | maybe not the last one. But we really actually don't know from a subjective standpoint,
00:20:51.680 | like what people really think is fair. There's, you know, we just started doing a little bit of
00:20:57.600 | work in our group at actually doing kind of human subject experiments in which we, you know,
00:21:05.040 | ask people about, you know, we ask them questions about fairness, we survey them,
00:21:11.760 | we, you know, we show them pairs of individuals in, let's say, a criminal recidivism prediction
00:21:17.440 | setting, and we ask them, do you think these two individuals should be treated the same as a matter
00:21:23.600 | of fairness? And to my knowledge, there's not a large literature in which ordinary people are
00:21:31.040 | asked about, you know, they have sort of notions of their subjective fairness elicited from them.
00:21:37.280 | It's mainly, you know, kind of scholars who think about fairness, you know, kind of making up their
00:21:43.840 | own definitions. And I think this needs to change actually for many social norms, not just for
00:21:49.760 | fairness, right? So there's a lot of, you know, discussion these days in the AI community about
00:21:55.360 | interpretable AI or understandable AI. And as far as I can tell, everybody agrees that
00:22:02.080 | deep learning, or at least the outputs of deep learning, are not very understandable. And people
00:22:09.760 | might agree that sparse linear models with integer coefficients are more understandable.
00:22:15.520 | But nobody's really asked people, you know, there's very little literature on, you know, sort
00:22:19.760 | of showing people models and asking them, do they understand what the model is doing? And I think
00:22:25.760 | that in all these topics, as these fields mature, we need to start doing more behavioral work.
00:22:33.520 | Yeah, which is, so one of my deep passions is psychology. And I always thought computer
00:22:39.840 | scientists will be the best future psychologists, in a sense that data is, especially in this modern
00:22:49.280 | world, the data is a really powerful way to understand and study human behavior. And you've
00:22:53.760 | explored that with your game theory side of work as well.
00:22:56.800 | Yeah, I'd like to think that what you say is true about computer scientists and psychology,
00:23:02.240 | from my own limited wandering into human subject experiments, we have a great deal to learn.
00:23:09.360 | Not just computer science, but AI and machine learning more specifically, I kind of think of as
00:23:13.600 | imperialist research communities in that, you know, kind of like physicists in an earlier
00:23:19.280 | generation, computer scientists kind of don't think of any scientific topic as off limits to
00:23:25.280 | them, they will like freely wander into areas that others have been thinking about for decades or
00:23:31.040 | longer. And, you know, we usually tend to embarrass ourselves in those efforts for some amount of
00:23:38.800 | time. Like, you know, I think reinforcement learning is a good example, right? So a lot of
00:23:43.920 | the early work in reinforcement learning, I have complete sympathy for the control theorists that
00:23:50.240 | looked at this and said, like, okay, you are reinventing stuff that we've known since like
00:23:55.280 | the 40s, right? But, you know, in my view, eventually, this sort of, you know, computer
00:24:01.280 | scientists have made significant contributions to that field, even though we kind of embarrassed
00:24:07.280 | ourselves for the first decade. So I think if computer scientists are going to start engaging
00:24:11.440 | in kind of psychology, human subjects, type of research, we should expect to be embarrassing
00:24:18.000 | ourselves for a good 10 years or so, and then hope that it turns out as well as, you know,
00:24:24.160 | some other areas that we've waded into. >> So you've kind of mentioned this,
00:24:28.160 | just to linger on the idea of an ethical algorithm, of idea of groups, sort of group thinking and
00:24:34.080 | individual thinking. And we're struggling with that. One of the amazing things about algorithms and
00:24:38.560 | your book and just this field of study is it gets us to ask, like, forcing machines,
00:24:44.640 | converting these ideas into algorithms is forcing us to ask questions of ourselves as a human
00:24:50.560 | civilization. So there's a lot of people now in public discourse doing sort of group thinking,
00:24:57.440 | thinking like there's particular sets of groups that we don't want to discriminate against and
00:25:02.960 | so on. And then there's individuals, sort of in the individual life stories, the struggles they
00:25:10.320 | went through and so on. Now, like in philosophy, it's easier to do group thinking because you don't,
00:25:16.080 | you know, it's very hard to think about individuals, there's so much variability.
00:25:21.440 | But with data, you can start to actually say, you know, maybe group thinking is too crude. You're
00:25:28.320 | actually doing more discrimination by thinking in terms of groups rather than individuals. Can you linger on
00:25:33.520 | that kind of idea of group versus individual and ethics? And is it good to continue thinking in
00:25:40.960 | terms of groups in algorithms? - So let me start by answering a very good high level question with a
00:25:48.800 | slightly narrow technical response, which is these group definitions of fairness, like here's a few
00:25:55.440 | groups, like different racial groups, maybe gender groups, maybe age, what have you. And let's make
00:26:02.160 | sure that, you know, for none of these groups, do we, you know, have a false negative rate,
00:26:08.000 | which is much higher than any other one of these groups, okay? So these are kind of classic group
00:26:12.800 | aggregate notions of fairness. And, you know, but at the end of the day, an individual you can think
00:26:17.920 | of as a combination of all of their attributes, right? They're a member of a racial group,
00:26:22.080 | they have a gender, they have an age, you know, and many other, you know, demographic properties
00:26:29.280 | that are not biological, but that, you know, are still, you know, very strong determinants of
00:26:35.200 | outcome and personality and the like. So one, I think, useful spectrum is to sort of think about
00:26:43.120 | that array between the group and the specific individual, and to realize that in some ways,
00:26:49.600 | asking for fairness at the individual level is to sort of ask for group fairness simultaneously
00:26:56.400 | for all possible combinations of groups. So in particular, you know,
00:27:01.440 | if I build a predictive model that meets some definition of fairness by race, by gender,
00:27:10.000 | by age, by what have you, marginally, to get slightly technical, sort of independently,
00:27:16.800 | I shouldn't expect that model to not discriminate against disabled Hispanic women over age 55,
00:27:24.560 | making less than $50,000 a year, even though I might have protected each one of those
00:27:29.520 | attributes marginally. >> So the optimization, actually, that's a fascinating way to put it.
00:27:35.520 | So you're just optimizing... So one way to achieve optimized fairness for individuals is just
00:27:42.320 | to add more and more definitions of groups that each individual belongs to.
00:27:46.080 | >> So, you know, at the end of the day, we could think of all of ourselves as groups of size one,
00:27:50.800 | because eventually there's some attribute that separates you from me and everybody,
00:27:54.640 | from everybody else in the world, okay? And so it is possible to put, you know,
00:28:00.960 | these incredibly coarse ways of thinking about fairness and these very, very individualistic,
00:28:06.080 | specific ways on a common scale. And, you know, one of the things we've worked on from a research
00:28:12.560 | perspective is, you know, so we sort of know how to, you know, in relative terms, we know how to
00:28:18.320 | provide fairness guarantees at the coarsest end of the scale. We don't know how to provide kind of
00:28:24.640 | sensible, tractable, realistic fairness guarantees at the individual level, but maybe we could start
00:28:30.720 | creeping towards that by dealing with more, you know, refined subgroups. I mean, we gave a name
00:28:36.640 | to this phenomenon where, you know, you protect, you enforce some definition of fairness for a
00:28:43.520 | bunch of marginal attributes or features, but then you find yourself discriminating against
00:28:48.880 | a combination of them. We call that fairness gerrymandering, because like political gerrymandering,
00:28:54.800 | you know, you're giving some guarantee at the aggregate level, but that when you kind of look
00:29:00.320 | in a more granular way at what's going on, you realize that you're achieving that aggregate
00:29:04.880 | guarantee by sort of favoring some groups and discriminating against other ones. And so there
00:29:11.200 | are, you know, it's early days, but there are algorithmic approaches that let you start
00:29:16.960 | creeping towards that, you know, individual end of the spectrum.
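
To make the fairness gerrymandering idea concrete, here is a small hypothetical Python sketch on invented toy data: every marginal group (each race, each gender) has the same false negative rate, yet particular intersections of those attributes are treated very differently:

```python
import itertools
import numpy as np

def fnr(y_true, y_pred, mask):
    # False negative rate restricted to the rows selected by mask.
    positives = mask & (y_true == 1)
    return np.mean(y_pred[positives] == 0) if positives.any() else 0.0

# Invented toy data: everyone is truly positive, to keep the example simple.
race   = np.array(["black", "black", "black", "black", "white", "white", "white", "white"])
gender = np.array(["F", "F", "M", "M", "F", "F", "M", "M"])
y_true = np.ones(8, dtype=int)
y_pred = np.array([0, 0, 1, 1, 1, 1, 0, 0])   # a deliberately "gerrymandered" predictor

# Marginal check: each race and each gender separately all come out at 0.5.
for attr in (race, gender):
    for value in np.unique(attr):
        print(value, fnr(y_true, y_pred, attr == value))

# Intersectional check: black women and white men get a 100% false negative
# rate while the other two subgroups get 0%, even though the marginals match.
for r, g in itertools.product(np.unique(race), np.unique(gender)):
    print(r, g, fnr(y_true, y_pred, (race == r) & (gender == g)))
```
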
00:29:22.160 | Does there need to be human input in the form of weighing the value of the importance of each
00:29:31.120 | kind of group? So for example, is it like, so gender, say, crudely speaking, male and female,
00:29:42.080 | and then different races, are we as humans supposed to put value on saying gender is 0.6
00:29:50.880 | and race is 0.4 in terms of in the big optimization of achieving fairness? Is that kind of what
00:29:59.920 | humans-
00:30:00.720 | I mean, you know, I mean, of course, you know, I don't need to tell you that, of course,
00:30:04.400 | technically one could incorporate such weights if you wanted to into a definition of fairness.
00:30:10.400 | You know, fairness is an interesting topic in that having worked in, in the book being about
00:30:18.720 | both fairness, privacy, and many other social norms, fairness, of course, is a much, much more
00:30:25.200 | loaded topic. So privacy, I mean, people want privacy, people don't like violations of privacy,
00:30:31.600 | violations of privacy cause damage, angst, and bad publicity for the companies that are victims
00:30:39.280 | of them. But sort of everybody agrees more data privacy would be better than less data privacy.
00:30:46.480 | And you don't have these... Somehow the discussions of fairness do become politicized
00:30:53.280 | along other dimensions, like race and gender, and, you know,
00:31:00.880 | you quickly find yourselves kind of revisiting topics that have been kind of unresolved forever,
00:31:10.480 | like affirmative action, right? Sort of, you know, like, why are you protecting,
00:31:14.720 | some people will say, why are you protecting this particular racial group?
00:31:19.120 | And others will say, well, we need to do that as a matter of retribution. Other people
00:31:26.480 | will say it's a matter of economic opportunity. And I don't know which of, you know, whether any
00:31:33.600 | of these are the right answers. But fairness is sort of special in that as soon as
00:31:37.760 | you start talking about it, you inevitably have to participate in debates about fair to whom,
00:31:44.880 | at what expense to who else. I mean, even in criminal justice, right? You know, where people
00:31:52.240 | talk about fairness in criminal sentencing, or, you know, predicting failures to appear or making
00:32:01.840 | parole decisions or the like, they will, you know, they'll point out that, well, these definitions
00:32:08.160 | of fairness are all about fairness for the criminals. And what about fairness for the
00:32:14.640 | victims, right? So when I basically say something like, well, the false incarceration rate for black
00:32:22.480 | people and white people needs to be roughly the same, you know, there's no mention of potential
00:32:28.240 | victims of criminals in such a fairness definition. And that's the realm of public discourse. I should
00:32:35.280 | actually recommend, for people listening, the Intelligence Squared debates. The US
00:32:42.400 | edition just had a debate. They have this structure where you have Oxford style debates, or whatever
00:32:49.040 | they're called, and it was two versus two, and they talked about affirmative action.
00:32:53.520 | And it was incredibly interesting that still, there are really good points on every side of this
00:33:01.520 | issue, which is fascinating to listen to. Yeah, yeah, I agree. And so it's interesting to be
00:33:07.040 | a researcher trying to do, for the most part, technical algorithmic work. But Aaron and I both
00:33:15.200 | quickly learned you cannot do that and then go out and talk about it and expect people to take
00:33:19.520 | it seriously if you're unwilling to engage in these broader debates that are entirely extra
00:33:26.320 | algorithmic, right? They're not about, you know, algorithms and making algorithms better. They're
00:33:31.120 | sort of, you know, as you said, sort of like, what should society be protecting in the first place?
00:33:35.840 | When you discuss the fairness, an algorithm that achieves fairness, whether in the constraints
00:33:42.240 | and the objective function, there's an immediate kind of analysis you can perform, which is saying,
00:33:49.120 | if you care about fairness in gender, this is the amount that you have to pay for in terms of the
00:33:57.280 | performance of the system. Is there a role for statements like that in a table in a paper,
00:34:03.440 | or do you want to really not touch that? No, we want to touch that and we do touch it. So,
00:34:09.840 | I mean, just again to make sure I'm not promising your viewers more than we know how to provide.
00:34:17.520 | But if you pick a definition of fairness, like I'm worried about gender discrimination,
00:34:21.680 | and you pick a notion of harm, like false rejection for a loan, for example, and you give me a model,
00:34:27.920 | I can definitely, first of all, go audit that model. It's easy for me to go, you know, from
00:34:32.960 | data to kind of say like, okay, your false rejection rate on women is this much higher
00:34:39.200 | than it is on men, okay? But, you know, once you also put the fairness into your objective function,
00:34:46.160 | I mean, I think the table that you're talking about is, you know, what we would call the Pareto
00:34:50.320 | curve, right? You can literally trace out, and we give examples of such plots on real data sets in
00:34:57.680 | the book, you have two axes. On the x-axis is your error, on the y-axis is unfairness by whatever,
00:35:05.680 | you know, if it's like the disparity between false rejection rates between two groups.
00:35:10.080 | And, you know, your algorithm now has a knob that basically says, how strongly do I want to enforce
00:35:17.760 | fairness? And, you know, if the two axes are error and unfairness, we'd like to be
00:35:24.800 | at zero, zero. We'd like zero error and zero unfairness simultaneously. Anybody who works
00:35:31.520 | in machine learning knows that you're generally not going to get to zero error period without any
00:35:37.200 | fairness constraint whatsoever, so that's not going to happen. But in general, you know, you'll
00:35:41.760 | get this, you'll get some kind of convex curve that specifies the numerical tradeoff you face,
00:35:49.760 | you know, if I want to go from 17% error down to 16% error, what will be the increase in unfairness
00:35:58.560 | that I experience as a result of that? And so this curve kind of specifies the, you know, kind of
00:36:07.120 | undominated models. Models that are off that curve, you know, can be strictly improved
00:36:13.120 | in one or both dimensions. You can, you know, either make the error better or the unfairness
00:36:16.880 | better or both. And I think our view is that not only are these objects, these Pareto curves,
00:36:25.440 | you know, efficient frontiers as you might call them, not only are they valuable scientific
00:36:33.920 | objects, I actually think that they in the near term might need to be the interface
00:36:39.920 | between researchers working in the field and stakeholders in given problems. So, you know,
00:36:46.880 | you could really imagine telling a criminal jurisdiction, look, if you're concerned about
00:36:55.520 | racial fairness, but you're also concerned about accuracy, you want to, you know, you want to
00:37:01.840 | release on parole people that are not going to recommit a violent crime and you don't want to
00:37:06.640 | release the ones who are. So, you know, that's accuracy. But if you also care about those,
00:37:12.240 | you know, the mistakes you make not being disproportionately on one racial group or
00:37:16.080 | another, you can show this curve. I'm hoping that in the near future, it'll be possible to
00:37:21.840 | explain these curves to non-technical people that are the ones that have to make the decision,
00:37:28.560 | where do we want to be on this curve? Like, what are the relative merits or value of having lower
00:37:35.600 | error versus lower unfairness? You know, that's not something computer scientists
00:37:40.480 | should be deciding for society, right? That, you know, the people in the field, so to speak,
00:37:46.720 | the policy makers, the regulators, that's who should be making these decisions. But I think
00:37:52.560 | and hope that they can be made to understand that these trade-offs generally exist and that
00:37:58.240 | you need to pick a point and like, and ignoring the trade-off, you know, you're implicitly picking
00:38:04.160 | a point anyway, right? You just don't know it and you're not admitting it.
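
One hedged sketch of how such a Pareto curve might be traced, using a generic penalized objective rather than the specific algorithms from the book: sweep a fairness "knob" lambda, retrain, and record the (error, unfairness) pair at each setting. All data and parameter choices below are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Invented synthetic data: two groups whose feature distributions differ slightly.
n = 400
group = rng.integers(0, 2, size=n)
X = rng.normal(loc=group[:, None] * 0.5, size=(n, 3))
y = (X @ np.array([1.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(w, lam):
    p = sigmoid(X @ w)
    log_loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    # Simple surrogate for unfairness: gap in average predicted score between the groups.
    gap = abs(p[group == 0].mean() - p[group == 1].mean())
    return log_loss + lam * gap

def evaluate(w):
    pred = (sigmoid(X @ w) > 0.5).astype(float)
    error = np.mean(pred != y)
    # Report unfairness as the gap in false negative rates between the two groups.
    fnr = [np.mean(pred[(group == g) & (y == 1)] == 0) for g in (0, 1)]
    return error, abs(fnr[0] - fnr[1])

# Sweep the fairness knob: each lambda gives one point on the error/unfairness trade-off curve.
for lam in [0.0, 0.5, 1.0, 2.0, 5.0]:
    w = minimize(objective, x0=np.zeros(3), args=(lam,)).x
    error, unfairness = evaluate(w)
    print(f"lambda={lam:4.1f}  error={error:.3f}  unfairness={unfairness:.3f}")
```
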
00:38:08.400 | Just to linger on the point of trade-offs, I think that's a really important thing to sort of
00:38:13.040 | think about. So you think when we start to optimize for fairness, there's almost always
00:38:21.440 | in most systems, going to be trade-offs. Can you, like, what's the trade-off between,
00:38:27.680 | just to clarify, there've been some sort of technical terms thrown around, but
00:38:32.080 | sort of a perfectly fair world, why is that, why will somebody be upset about that?
00:38:42.480 | The specific trade-off I talked about just in order to make things very concrete was between
00:38:48.640 | numerical error and some numerical measure of unfairness.
00:38:52.240 | What is numerical error in the case of...
00:38:56.160 | Just like, say, predictive error, like, you know, the probability or frequency with which you
00:39:01.280 | release somebody on parole who then goes on to recommit a violent crime or keep incarcerated
00:39:08.480 | somebody who would not have recommitted a violent crime.
00:39:10.960 | So in the case of awarding somebody parole or giving somebody parole or letting them out
00:39:17.920 | on parole, you don't want them to recommit a crime. So it's your system failed in prediction
00:39:24.400 | if they happen to do a crime. Okay, so that's the performance, that's one axis.
00:39:29.840 | Right.
00:39:30.080 | And what's the fairness axis?
00:39:31.600 | So then the fairness axis might be the difference between racial groups in the kind of
00:39:38.240 | false positive predictions, namely people that I kept incarcerated
00:39:46.720 | predicting that they would recommit a violent crime when in fact they wouldn't have.
00:39:50.960 | Right. And the unfairness of that, just to linger on it and allow me to
00:39:56.480 | ineloquently try to sort of describe why that's unfair, why the unfairness is there.
00:40:04.240 | The unfairness you want to get rid of is that, in the judge's mind, the bias of having been
00:40:13.200 | brought up in society, the slight racial bias, the racism that exists in the society, you want
00:40:18.480 | to remove that from the system. Another way that's been debated is sort of equality of opportunity
00:40:27.680 | versus equality of outcome. And there's a weird dance there that's really difficult to get right.
00:40:34.880 | And we don't... It's that space that affirmative action is exploring.
00:40:40.480 | Right. And then we, this also quickly bleeds into questions like, well,
00:40:46.080 | maybe if one group really does recommit crimes at a higher rate,
00:40:50.640 | the reason for that is that at some earlier point in the pipeline or earlier in their lives,
00:40:57.120 | they didn't receive the same resources that the other group did.
00:41:00.960 | Right.
00:41:01.120 | And so, there's always, in kind of fairness discussions, the possibility
00:41:07.520 | that the real injustice came earlier, right? Earlier in this individual's life, earlier in
00:41:12.800 | this group's history, et cetera, et cetera. And so, a lot of the fairness discussion is almost,
00:41:18.800 | the goal is for it to be a corrective mechanism to account for the injustice earlier in life.
00:41:25.200 | By some definitions of fairness or some theories of fairness, yeah. Others would say like, look,
00:41:30.560 | it's not to correct that injustice, it's just to kind of level the playing field right now and not
00:41:37.040 | incarcerate, falsely incarcerate more people of one group than another group. But I mean,
00:41:42.160 | do you think it might be helpful just to demystify a little bit
00:41:45.040 | the many ways in which bias or unfairness can come into algorithms, especially in the machine
00:41:54.880 | learning era, right? And I think many of your viewers have probably heard these examples before,
00:41:59.920 | but let's say I'm building a face recognition system, right? And so, I'm kind of gathering
00:42:05.920 | lots of images of faces and trying to train the system to recognize new faces of those individuals
00:42:14.160 | from training on a training set of those faces of individuals. And it shouldn't surprise anybody,
00:42:21.120 | or certainly not anybody in the field of machine learning, if my training dataset
00:42:27.600 | was primarily white males, and I'm training the model to maximize the overall accuracy on my
00:42:36.880 | training dataset, that the model can reduce its error most by getting things right on the white
00:42:45.840 | males that constitute the majority of the dataset, even if that means that on other groups, they will
00:42:51.360 | be less accurate, okay? Now, there's a bunch of ways you could think about addressing this. One is
00:42:57.920 | to deliberately put into the objective of the algorithm not to optimize the error at the expense
00:43:06.480 | of this discrimination, and then you're kind of back in the land of these kind of two-dimensional
00:43:10.000 | numerical trade-offs. A valid counter argument is to say like, "Well, no, you don't have to...
00:43:16.560 | There's no... The notion of the tension between fairness and accuracy here is a false one." You could
00:43:23.120 | instead just go out and get much more data on these other groups that are in the minority
00:43:28.400 | and equalize your dataset, or you could train a separate model on those subgroups and have
00:43:36.320 | multiple models. The point I think we would... We tried to make in the book is that those things
00:43:42.880 | have cost too, right? Going out and gathering more data on groups that are relatively rare
00:43:50.400 | compared to your plurality or majority group, that it may not cost you in the accuracy of the model,
00:43:55.920 | but it's going to cost the company developing this model more money to develop that, and it
00:44:02.080 | also costs more money to build separate predictive models and to implement and deploy them.
00:44:07.280 | So even if you can find a way to avoid the tension between fairness and accuracy in training a model,
00:44:14.400 | you might push the cost somewhere else, like money, like development time, research time,
00:44:21.040 | and the like.
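
As a loose illustration of the "equalize your dataset" option, the sketch below reweights existing examples by group frequency; this is only a cheap stand-in for actually gathering more data on the underrepresented group, and the data and model (a simple logistic regression rather than a face recognizer) are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Invented imbalanced data: group 0 is a ~90% majority, group 1 a ~10% minority
# drawn from a shifted distribution with a different labeling rule.
n = 1000
group = (rng.random(n) < 0.10).astype(int)
X = rng.normal(loc=group[:, None] * 1.5, size=(n, 4))
y = (X.sum(axis=1) + rng.normal(scale=1.0, size=n) > group * 3.0).astype(int)

# Baseline: plain training, which mostly optimizes accuracy on the majority group.
baseline = LogisticRegression(max_iter=1000).fit(X, y)

# "Equalized" training: weight minority examples up in proportion to how
# underrepresented they are, a rough proxy for collecting more of their data.
weights = np.where(group == 1, (group == 0).mean() / (group == 1).mean(), 1.0)
reweighted = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=weights)

for name, model in [("baseline", baseline), ("reweighted", reweighted)]:
    for g in (0, 1):
        print(f"{name:10s} group {g}: accuracy {model.score(X[group == g], y[group == g]):.3f}")
```
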
00:44:26.640 | There are fundamentally difficult philosophical questions in fairness. And we live in a very divisive political climate, outrage culture. There are alt-right folks on
00:44:36.400 | 4chan, trolls. There are social justice warriors on Twitter. There are very divisive,
00:44:44.240 | outraged folks on all sides of every kind of system. How do you, how do we as engineers
00:44:53.440 | build ethical algorithms in such a divisive culture? Do you think they could be disjoint? The human
00:44:59.840 | has to inject your values, and then you can optimize over those values. But in our times,
00:45:05.760 | when you start actually applying these systems, things get a little bit
00:45:09.680 | challenging for the public discourse. How do you think we can proceed?
00:45:15.200 | Yeah, I mean, for the most part, in the book, a point that we try to take some pains to make is
00:45:21.600 | that we don't view ourselves or people like us as being in the position of deciding for society
00:45:30.400 | what the right social norms are, what the right definitions of fairness are. Our main point is
00:45:35.600 | to just show that if society or the relevant stakeholders in a particular domain can come
00:45:42.800 | to agreement on those sorts of things, there's a way of encoding that into algorithms in many
00:45:48.400 | cases, not in all cases. One other misconception that hopefully we definitely dispel is sometimes
00:45:55.120 | people read the title of the book and I think not unnaturally fear that what we're suggesting is
00:46:00.560 | that the algorithms themselves should decide what those social norms are and develop their own
00:46:05.200 | notions of fairness and privacy or ethics. And we're definitely not suggesting that.
00:46:09.920 | The title of the book is The Ethical Algorithm, by the way, and I didn't think of that interpretation
00:46:13.760 | of the title. That's interesting.
00:46:15.120 | Yeah, yeah. I mean, especially these days where people are concerned about the robots becoming
00:46:20.880 | our overlords, the idea that the robots would also sort of develop their own social norms is
00:46:26.160 | just one step away from that. But I do think, obviously, despite disclaimer that people
00:46:33.760 | like us shouldn't be making those decisions for society, we are kind of living in a world where,
00:46:39.120 | in many ways, computer scientists have made some decisions that have fundamentally changed
00:46:44.080 | the nature of our society and democracy and sort of civil discourse and deliberation in ways that
00:46:51.360 | I think most people generally feel are bad these days, right? So-
00:46:55.520 | But they had to make, so if we look at people at the heads of companies and so on,
00:47:00.720 | they had to make those decisions, right? There has to be decisions. So there's two options.
00:47:06.400 | Either you kind of put your head in the sand and don't think about these things and just let the
00:47:12.480 | algorithm do what it does, or you make decisions about what you value, you know, of injecting
00:47:17.760 | moral values into the algorithm.
00:47:19.280 | Look, I never mean to be an apologist for the tech industry, but I think it's a little bit
00:47:27.600 | too far to sort of say that explicit decisions were made about these things. So let's, for
00:47:31.520 | instance, take social media platforms, right? So like many inventions in technology and computer
00:47:37.680 | science, a lot of these platforms that we now use regularly kind of started as curiosities,
00:47:44.800 | right? I remember when things like Facebook came out and its predecessors like Friendster,
00:47:49.040 | which nobody even remembers now. People really wonder, like, why would anybody want to spend
00:47:55.600 | time doing that? I mean, even the web when it first came out, when it wasn't populated with
00:48:00.560 | much content and it was largely kind of hobbyists building their own kind of ramshackle websites,
00:48:06.800 | a lot of people looked at this as like, "Well, what is the purpose of this thing? Why is this
00:48:10.320 | interesting? Who would want to do this?" And so even things like Facebook and Twitter, yes,
00:48:15.360 | technical decisions were made by engineers, by scientists, by executives in the design of those
00:48:20.720 | platforms. But I don't think 10 years ago anyone anticipated that those platforms, for instance,
00:48:31.360 | might kind of acquire undue influence on political discourse or on the outcomes of elections.
00:48:40.880 | And I think the scrutiny that these companies are getting now is entirely appropriate,
00:48:47.360 | but I think it's a little too harsh to kind of look at history and sort of say like, "Oh,
00:48:52.880 | you should have been able to anticipate that this would happen with your platform."
00:48:55.920 | And in this sort of gaming chapter of the book, one of the points we're making is that
00:48:59.680 | these platforms, right, they don't operate in isolation. So unlike the other topics we're
00:49:06.640 | discussing like fairness and privacy, those are really cases where algorithms can operate
00:49:11.600 | on your data and make decisions about you and you're not even aware of it, okay?
00:49:16.000 | Things like Facebook and Twitter, these are systems, right? These are social systems.
00:49:21.440 | And their evolution, even their technical evolution because machine learning is involved,
00:49:27.200 | is driven in no small part by the behavior of the users themselves and how the users decide
00:49:32.880 | to adopt them and how to use them. And so I'm kind of like, "Who really knew that until we saw it
00:49:44.000 | happen? Who knew that these things might be able to influence the outcome of elections? Who knew
00:49:48.720 | that they might polarize political discourse because of the ability to decide who you interact
00:49:57.360 | with on the platform and also with the platform naturally using machine learning to optimize for
00:50:03.440 | your own interests that they would further isolate us from each other and feed us all basically just
00:50:09.680 | the stuff that we already agreed with?" And so I think we've come to that outcome, I think,
00:50:15.120 | largely, but I think it's something that we all learned together, including the companies,
00:50:21.920 | as these things happen. Now, you asked like, "Well, are there algorithmic remedies to these
00:50:28.800 | kinds of things?" And again, these are big problems that are not going to be solved with
00:50:33.840 | somebody going in and changing a few lines of code somewhere in a social media platform.
00:50:39.760 | But I do think in many ways, there are definitely ways of making things better. I mean, like an
00:50:45.440 | obvious recommendation that we make at some point in the book is like, "Look, to the extent that we
00:50:51.520 | think that machine learning applied for personalization purposes in things like news feed
00:50:57.600 | or other platforms has led to polarization and intolerance of opposing viewpoints,"
00:51:07.680 | as you know, these algorithms have models, and they place people in some kind of metric space,
00:51:13.440 | and they place content in that space, and they know the extent to which I have an affinity for
00:51:19.840 | a particular type of content. And by the same token, they also probably have... That same model
00:51:25.280 | probably gives you a good idea of the stuff I'm likely to violently disagree with or be offended
00:51:31.200 | by. So in this case, there really is some knob you could tune that says like, "Instead of
00:51:37.680 | showing people only what they like and what they want, let's show them some stuff that we think
00:51:43.440 | that they don't like or that's a little bit further away." And you could even imagine users
00:51:48.240 | being able to control this. Just like everybody gets a slider, and that slider says like, "How
00:51:55.680 | much stuff do you want to see that's kind of you might disagree with or is at least further from
00:52:02.000 | your interests?" It's almost like an exploration button.
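
A hypothetical sketch of what such a user-controlled knob could look like on top of an embedding model; the function, the data, and the simple cosine-similarity blend are invented for illustration and are not how any real platform's ranking actually works:

```python
import numpy as np

def rank_items(user_vec, item_vecs, exploration=0.0):
    # Cosine similarity between the user's embedding and each candidate item.
    sims = item_vecs @ user_vec / (
        np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(user_vec) + 1e-9)
    dissim = 1.0 - sims
    # exploration = 0 -> pure personalization; exploration = 1 -> favor distant content.
    scores = (1.0 - exploration) * sims + exploration * dissim
    return np.argsort(-scores)                 # item indices, best first

rng = np.random.default_rng(2)
user = rng.normal(size=16)                     # the user's position in the model's metric space
items = rng.normal(size=(10, 16))              # candidate content in the same space

print(rank_items(user, items, exploration=0.0))   # "show me what I already like"
print(rank_items(user, items, exploration=0.7))   # lean toward content further from my interests
```
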
00:52:08.960 | - So just to get your intuition: do you think engagement... So like you're staying on the platform, you're staying engaged.
00:52:13.760 | Do you think fairness, ideas of fairness won't emerge? Like how bad is it to just optimize for
00:52:21.760 | engagement? Do you think we'll run into big trouble if we're just optimizing for how much
00:52:27.920 | you love the platform? - Well, I mean, optimizing for engagement kind of got us where we are.
00:52:33.920 | - So do you, one, have faith that it's possible to do better? And two, if it is, how do we do better?
00:52:42.640 | - I mean, it's definitely possible to do different, right? And again, it's not as if I think that
00:52:49.920 | doing something different than optimizing for engagement won't cost these companies in real
00:52:54.800 | ways, including revenue and profitability, potentially. - In the short term, at least.
00:53:00.160 | - Yeah, in the short term, right. And again, if I worked at these companies, I'm sure that
00:53:06.560 | it would have seemed like the most natural thing in the world also to want to optimize
00:53:11.680 | engagement, right? And that's good for users in some sense. You want them to be vested in the
00:53:16.800 | platform and enjoying it and finding it useful, interesting, and/or productive. But my point is
00:53:22.480 | that the idea that it's sort of out of their hands, as you said, or that there's nothing to do
00:53:29.120 | about it, never say never, but that strikes me as implausible as a machine learning person, right?
00:53:34.720 | I mean, these companies are driven by machine learning, and this optimization of engagement
00:53:38.960 | is essentially driven by machine learning, right? It's driven by not just machine learning, but
00:53:44.720 | very, very large-scale A/B experimentation where you kind of tweak some element of the user
00:53:51.200 | interface or tweak some component of an algorithm or tweak some component or feature of your
00:53:58.000 | click-through prediction model. And my point is that anytime you know how to optimize for
00:54:04.880 | something, almost by definition, that solution tells you how not to optimize for it or to do
00:54:10.800 | something different. - Engagement can be measured.
00:54:16.000 | But things like minimizing divisiveness or maximizing intellectual growth
00:54:25.440 | over the lifetime of a human being are very difficult to measure.
00:54:29.120 | - That's right. So I'm not claiming that doing something different will
00:54:35.280 | immediately make it apparent that this is a good thing for society. And in particular,
00:54:41.040 | I mean, I think one way of thinking about where we are on some of these social media platforms
00:54:45.840 | is that it kind of feels a bit like we're in a bad equilibrium, right? That these systems are helping
00:54:52.160 | us all kind of optimize something myopically and selfishly for ourselves. And of course,
00:54:57.760 | from an individual standpoint, at any given moment, like why would I want to see things
00:55:02.960 | in my newsfeed that I found irrelevant, offensive, or the like, okay? But maybe by all of us
00:55:12.320 | having these platforms myopically optimized in our interests, we have reached a collective outcome as
00:55:19.120 | a society that we're unhappy with in different ways, let's say with respect to things like
00:55:23.840 | political discourse and tolerance of opposing viewpoints. - And if Mark Zuckerberg gave you
00:55:32.400 | a call and said, "I'm thinking of taking a sabbatical, could you run Facebook for me for
00:55:36.560 | six months?" What would you do? How? - I think no thanks would be my first response, but
00:55:41.680 | there are many aspects of being the head of the entire company that are kind of entirely
00:55:49.440 | exogenous to many of the things that we're discussing here. And so I don't really think
00:55:54.640 | I would need to be CEO of Facebook to kind of implement the more limited set of solutions that
00:56:01.440 | I might imagine. But I think one concrete thing they could do is they could experiment with
00:56:08.160 | letting people who chose to, to see more stuff in their newsfeed that is not entirely kind of
00:56:15.680 | chosen to optimize for their particular interests, beliefs, et cetera. - So the kind of thing,
00:56:24.560 | I could speak to YouTube, but I think Facebook probably does something similar, is they're quite
00:56:31.760 | effective at automatically finding what sorts of groups you belong to, not based on race or gender
00:56:37.840 | or so on, but based on the kind of stuff you enjoy watching in the case of YouTube. It's a difficult
00:56:45.920 | thing for Facebook or YouTube to then say, "Well, you know what? We're gonna show you something from
00:56:51.760 | a very different cluster, even though we believe algorithmically you're unlikely to enjoy that
00:56:58.080 | thing." So if that's a weird jump to make, there has to be a human at the very top of that system
00:57:05.440 | that says, "Well, that will be long-term healthy for you." That's more than an algorithmic decision.
00:57:11.440 | - Or that same person could say, "That'll be long-term healthy for the platform."
00:57:16.320 | - For the platform. - Or for the platform's influence on
00:57:19.520 | society outside of the platform. And it's easy for me to sit here and say these things,
00:57:25.840 | but conceptually, I do not think that these are, or that they should be, completely alien
00:57:33.600 | ideas. You could try things like this, and we wouldn't have to invent entirely new science to
00:57:43.040 | do it, because if we're all already embedded in some metric space and there's a notion of distance
00:57:48.160 | between you and me and every piece of content, then we know exactly... The same model that
00:57:56.000 | dictates how to make me really happy also tells how to make me as unhappy as possible as well.
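To make the metric-space picture concrete, here is a minimal sketch, not drawn from the book or from any real platform, of what an "exploration slider" over learned embeddings could look like. The embeddings, the scoring rule, and the `diversity` parameter are all illustrative assumptions: items are scored by similarity to the user's embedding, and the slider blends in items that sit farther away.

```python
import numpy as np

def rank_items(user_vec, item_vecs, diversity=0.0):
    """Rank items for a user embedded in the same metric space as the content.

    diversity = 0.0 is pure personalization (closest items first);
    diversity = 1.0 ranks the farthest items first.
    """
    # Cosine similarity between the user and every item.
    sims = item_vecs @ user_vec / (
        np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(user_vec)
    )
    # The "slider": blend affinity with its opposite.
    scores = (1.0 - diversity) * sims + diversity * (1.0 - sims)
    return np.argsort(-scores)  # item indices, best first

# Toy example with made-up embeddings.
rng = np.random.default_rng(0)
user = rng.normal(size=8)
items = rng.normal(size=(20, 8))
print(rank_items(user, items, diversity=0.0)[:5])  # "what you like"
print(rank_items(user, items, diversity=0.8)[:5])  # nudged toward the unfamiliar
```

Setting the slider to zero reproduces the usual behavior; raising it surfaces content the same model predicts is farther from your tastes, which is exactly the knob being described.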
00:58:04.160 | - Right. The focus in your book and algorithmic fairness research today in general is on machine
00:58:10.560 | learning, like we said, is data. And just even the entire AI field right now is captivated with
00:58:16.880 | machine learning, with deep learning. Do you think ideas in symbolic AI or totally other kinds of
00:58:23.440 | approaches are interesting, useful in the space, have some promising ideas in terms of fairness?
00:58:30.320 | - I haven't thought about that question specifically in the context of fairness. I
00:58:35.680 | definitely would agree with that statement in the large, right? I mean, I am one of many machine
00:58:42.640 | learning researchers who do believe that the great successes that have been shown in machine
00:58:48.880 | learning recently are great successes, but they're on a pretty narrow set of tasks. I mean, I don't
00:58:54.240 | think we're kind of notably closer to general artificial intelligence now than we were when I
00:59:02.160 | started my career. I mean, there's been progress. And I do think that we are kind of as a community
00:59:08.320 | maybe looking a bit where the light is, but the light is shining pretty bright there right now,
00:59:12.160 | and we're finding a lot of stuff. So I don't want to like argue with the progress that's been made
00:59:16.240 | in areas like deep learning, for example. - This touches another sort of related thing
00:59:21.520 | that you mentioned, and that people might misinterpret from the title of your book,
00:59:26.080 | ethical algorithm. Is it possible for the algorithm to automate some of those decisions,
00:59:30.720 | sort of higher level decisions of what kind of... - Like what should be fair.
00:59:36.720 | - What should be fair. - The more you know about a field,
00:59:39.920 | the more aware you are of its limitations. And so I'm pretty leery of sort of trying... There's so
00:59:47.600 | much we already don't know in fairness, even when we're the ones picking the fairness definitions
00:59:54.480 | and comparing alternatives and thinking about the tensions between different definitions,
00:59:59.520 | that the idea of kind of letting the algorithm start exploring as well, I definitely think,
01:00:07.200 | this is a much narrower statement. I definitely think that kind of algorithmic auditing for
01:00:11.120 | different types of unfairness, right? So like in this gerrymandering example, where I might want
01:00:16.880 | to prevent not just discrimination against very broad categories, but against combinations of
01:00:22.720 | broad categories, you quickly get to a point where there's a lot of categories, there's a lot of
01:00:28.000 | combinations of n features. And you can use algorithmic techniques to sort of try to find
01:00:34.880 | the subgroups on which you're discriminating the most and try to fix that. That's actually kind of
01:00:39.680 | the form of one of the algorithms we developed for this fairness gerrymandering problem.
01:00:43.920 | But, you know, partly because of our technological and sort of scientific
01:00:50.080 | ignorance on these topics right now, and also partly just because these topics are so loaded
01:00:56.400 | emotionally for people, I just don't see the value. I mean, again, never say never,
01:01:02.160 | but I just don't think we're at a moment where it's a great time for computer scientists to be
01:01:06.000 | rolling out the idea like, "Hey, you know, not only have we kind of figured fairness out, but,
01:01:11.360 | you know, we think the algorithms should start deciding what's fair or giving input on that
01:01:16.400 | decision." I just don't, it's like the cost benefit analysis to the field of kind of going
01:01:21.840 | there right now just doesn't seem worth it to me. That said, I should say that I think computer
01:01:26.880 | scientists should be more philosophically, like should enrich their thinking about these kinds
01:01:31.440 | of things. I think it's been too often used as an excuse for roboticists working on autonomous
01:01:37.360 | vehicles, for example, to not think about the human factor or psychology or safety.
01:01:43.200 | In the same way, computer scientists designing algorithms have been sort of using it as
01:01:46.640 | an excuse. And I think it's time for basically everybody to become computer scientists.
01:01:51.440 | I was about to agree with everything you said except that last point. I think that
01:01:55.440 | the other way of looking at it is that I think computer scientists, you know, and many of us are,
01:02:02.080 | but we need to wade out into the world more, right? I mean, just the influence that computer
01:02:09.840 | science and therefore computer scientists have had on society at large just like has exponentially
01:02:17.760 | magnified in the last 10 or 20 years or so. And, you know, before when we were just tinkering
01:02:24.640 | around amongst ourselves and it didn't matter that much, there was no need for sort of computer
01:02:29.440 | scientists to be citizens of the world more broadly. And I think those days need to be over
01:02:35.440 | very, very fast. And I'm not saying everybody needs to do it, but to me, like the right way
01:02:40.240 | of doing it is to not to sort of think that everybody else is going to become a computer
01:02:43.360 | scientist. But, you know, I think, you know, people are becoming more sophisticated about
01:02:48.160 | computer science, even lay people. You know, I think one of the reasons we decided to write
01:02:53.920 | this book is that 10 years ago, I wouldn't have tried this just because I just didn't think
01:02:59.280 | that sort of people's awareness of algorithms and machine learning, you know, the general population
01:03:05.840 | would have been high. And I mean, you would have had to first, you know, write one of the many
01:03:09.840 | books kind of just explicating that topic to a lay audience first. Now, I think we're at the point
01:03:15.600 | where like lots of people without any technical training at all know enough about algorithms and
01:03:20.400 | machine learning that you can start getting to these nuances of things like ethical algorithms.
01:03:25.120 | I think we agree that there needs to be much more mixing. But I think a lot of the onus of
01:03:31.840 | that mixing needs to be on the computer science community. Yeah. So just to linger on the
01:03:37.920 | disagreement, because I do disagree with you on the point that I think if you're a biologist,
01:03:45.200 | if you're a chemist, if you're an MBA business person, all of those things you can, like,
01:03:53.360 | if you learn to program, and not only program, if you learn to do machine learning, if you learn to
01:03:58.560 | do data science, you immediately become much more powerful in the kinds of things you can do.
01:04:04.000 | And the same goes for literature, for library sciences. So, what you're saying, I think,
01:04:11.120 | holds true for the next few years. But long term, if you're interested
01:04:17.040 | to me, if you're interested in philosophy, you should learn to program, because then you can
01:04:23.520 | scrape data, you can study what people are thinking about on Twitter, and then start making
01:04:30.000 | philosophical conclusions about the meaning of life. Right? I just, I just feel like the access
01:04:36.080 | to data, the digitization of whatever problem you're trying to solve, it fundamentally changes
01:04:42.800 | what it means to be a computer scientist. I mean, computer scientists in 20, 30 years will go back
01:04:47.680 | to being Donald Knuth-style theoretical computer scientists, and everybody else would basically be
01:04:54.640 | exploring the kinds of ideas that you're exploring in your book. It won't be a computer
01:04:58.320 | science major. Yeah, I mean, I don't think I disagree, but I think that that trend of
01:05:04.240 | more and more people in more and more disciplines,
01:05:06.880 | adopting ideas from computer science, learning how to code, I think that that trend seems
01:05:13.600 | firmly underway. I mean, you know, like, an interesting digressive question along these
01:05:19.040 | lines is maybe in 50 years, there won't be computer science departments anymore.
01:05:24.000 | Because the field will just sort of be ambient in all of the different disciplines. And you know,
01:05:30.960 | people will look back and, you know, having a computer science department will look like having
01:05:35.600 | an electricity department or something. It's like, you know, everybody uses this, it's just out
01:05:39.920 | there. I mean, I do think there will always be that kind of Knuth style core to it. But it's not
01:05:45.120 | an implausible path that we kind of get to the point where the academic discipline of computer
01:05:50.560 | science becomes somewhat marginalized, because of its very success in kind of infiltrating
01:05:56.160 | all of science and society and the humanities, etc. What is differential privacy, or more broadly,
01:06:04.080 | algorithmic privacy? Algorithmic privacy more broadly is just the study or the notion of privacy
01:06:12.880 | definitions or norms being encoded inside of algorithms. And so, you know, I think we count
01:06:22.160 | among this body of work, just, you know, the literature and practice of things like data
01:06:29.120 | anonymization, which we kind of at the beginning of our discussion of privacy, say like, okay,
01:06:35.840 | this is sort of a notion of algorithmic privacy, it kind of tells you, you know, something to go
01:06:41.040 | do with data. But, you know, our view is that it's, and I think this is now, you know, quite
01:06:47.760 | widespread, that it's, you know, despite the fact that those notions of anonymization, kind of
01:06:54.080 | redacting and coarsening, are the most widely adopted technical solutions for data privacy,
01:07:01.520 | they are like deeply, fundamentally flawed. And so, you know, to your first question,
01:07:07.040 | what is differential privacy? Differential privacy seems to be a much, much better notion of privacy
01:07:15.600 | that kind of avoids a lot of the weaknesses of anonymization notions while still letting us do
01:07:23.200 | useful stuff with data. What's anonymization of data? So, by anonymization, I'm, you know,
01:07:28.720 | kind of referring to techniques like I have a database, the rows of that database are,
01:07:35.360 | let's say, individual people's medical records, okay? And I want to let people use that data,
01:07:43.520 | maybe I want to let researchers access that data to build predictive models for some disease,
01:07:48.400 | but I'm worried that that will leak, you know, sensitive information about specific people's
01:07:56.160 | medical records. So, anonymization broadly refers to the set of techniques where I say, like, okay,
01:08:01.440 | I'm first going to, like, I'm going to delete the column with people's names. I'm going to not put,
01:08:07.600 | you know, so that would be like a redaction, right? I'm just redacting that information.
01:08:12.080 | I am going to take ages, and I'm not going to, like, say your exact age, I'm going to say whether
01:08:17.920 | you're, you know, zero to 10, 10 to 20, 20 to 30. I might put the first three digits of your zip
01:08:24.640 | code but not the last two, et cetera, et cetera. And so, the idea is that through some series of
01:08:29.280 | operations like this on the data, I anonymize it, you know, another term of art that's used is
01:08:35.280 | removing personally identifiable information. And, you know, this is basically the most common
01:08:41.680 | way of providing data privacy but that's in a way that still lets people access some variant form
01:08:48.960 | of the data. >> So, at a slightly broader picture, as you talk about what does anonymization mean
01:08:55.680 | when you have multiple databases, like with a Netflix prize when you can start combining stuff
01:09:01.280 | together. >> So, this is exactly the problem with these notions, right, is that notions of
01:09:06.640 | anonymization, removing personally identifiable information, the kind of fundamental conceptual
01:09:12.400 | flaw is that, you know, these definitions kind of pretend as if the data set in question is the only
01:09:18.960 | data set that exists in the world or that ever will exist in the future. And, of course, things
01:09:24.560 | like the Netflix prize and many, many other examples since the Netflix prize, I think that
01:09:28.560 | was one of the earliest ones, though, you know, you can reidentify people that were, you know,
01:09:34.640 | that were anonymized in the data set by taking that anonymized data set and combining it with
01:09:39.760 | other allegedly anonymized data sets and maybe publicly available information about you.
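To make the "combining with other data sets" point concrete, here is a tiny hypothetical linkage attack: the released table has names redacted and attributes coarsened, but joining it against a public, identified table on a few quasi-identifiers re-attaches names. All records and column names are invented for the example.

```python
import pandas as pd

# "Anonymized" release: names redacted, ages bucketed, zip truncated.
released = pd.DataFrame({
    "zip3": ["191", "191", "941"],
    "age_bucket": ["20-30", "30-40", "20-30"],
    "diagnosis": ["flu", "diabetes", "asthma"],
})

# Public, identified data (e.g., a voter roll or a public profile).
public = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "zip3": ["191", "191", "941"],
    "age_bucket": ["20-30", "30-40", "20-30"],
})

# Joining on the quasi-identifiers re-attaches names to "anonymous" records.
reidentified = released.merge(public, on=["zip3", "age_bucket"])
print(reidentified[["name", "diagnosis"]])
```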
01:09:44.320 | >> And for people who don't know, the Netflix prize involved publicly releasing data.
01:09:49.360 | So, the names from those rows were removed but what was released is the preference or the ratings
01:09:56.320 | of what movies you like and you don't like. And from that combined with other things,
01:10:00.320 | I think forum posts and so on, you can start to figure out the names.
01:10:03.680 | >> Yeah, I mean, in that case, it was specifically the internet movie database.
01:10:07.920 | >> The IMDb data.
01:10:07.920 | >> Where lots of Netflix users publicly rate their movie, you know, their movie preferences.
01:10:13.840 | And so, the anonymized data in Netflix, when combined with that... you know, it's just this phenomenon, I think,
01:10:20.800 | we've all come to realize in the last decade or so is that just knowing a few apparently
01:10:28.720 | irrelevant innocuous things about you can often act as a fingerprint. Like if I know,
01:10:33.600 | you know, what rating you gave to these 10 movies and the date on which you entered these movies,
01:10:40.240 | this is almost like a fingerprint for you in the sea of all Netflix users.
01:10:44.800 | There was just another paper on this in Science or Nature about a month ago involving, you know, some
01:10:50.320 | 18 attributes. I mean, my favorite example of this was actually a paper from several years ago now
01:10:57.280 | where it was shown that just from your likes on Facebook, just from the, you know, the things on
01:11:03.840 | which you clicked on the thumbs up button on the platform, not using any information, demographic
01:11:10.000 | information, nothing about who your friends are, just knowing the content that you had liked
01:11:15.520 | was enough to, you know, in the aggregate accurately predict things like sexual orientation,
01:11:22.240 | drug and alcohol use, whether you were the child of divorced parents. So we live in this era where,
01:11:28.720 | you know, even the apparently irrelevant data that we offer about ourselves on public platforms and
01:11:34.560 | forums often unbeknownst to us more or less acts as a signature or, you know, fingerprint.
01:11:41.360 | And that if you can kind of, you know, do a join between that kind of data and allegedly anonymized
01:11:47.680 | data, you have real trouble. So is there hope for any kind of privacy in a world where a few
01:11:54.080 | likes can identify you? So there is differential privacy, right? So what is differential privacy?
01:12:01.200 | So differential privacy basically is a kind of alternate, much stronger notion of privacy than
01:12:07.120 | these anonymization ideas. And it, you know, it's a technical definition, but like the spirit of it
01:12:15.760 | is we compare two alternate worlds, okay? So let's suppose I'm a researcher and I want to do,
01:12:23.200 | you know, there's a database of medical records and one of them is yours. And I want to use that
01:12:29.920 | database of medical records to build a predictive model for some disease. So based on people's
01:12:34.800 | symptoms and test results and the like, I want to, you know, build a model predicting the
01:12:40.800 | probability that people have disease. So, you know, this is the type of scientific research
01:12:44.720 | that we would like to be allowed to continue. And in differential privacy, you ask a very
01:12:50.480 | particular counterfactual question. We basically compare two alternatives. One is when I do this,
01:13:00.320 | I build this model on the database of medical records, including your medical record.
01:13:06.880 | And the other one is where I do the same exercise with the same database with just your medical
01:13:14.800 | record removed. So basically, you know, it's two databases, one with N records in it and one with
01:13:21.680 | N minus one records in it. The N minus one records are the same and the only one that's missing in
01:13:27.440 | the second case is your medical record. So differential privacy basically says that
01:13:34.960 | any harms that might come to you from the analysis in which your data was included
01:13:42.720 | are essentially nearly identical to the harms that would have come to you if the same analysis had
01:13:55.440 | been done without your medical record included. So in other words, this doesn't say
01:13:55.440 | that bad things cannot happen to you as a result of data analysis. It just says that these bad
01:14:01.280 | things were going to happen to you already, even if your data wasn't included. And to give a very
01:14:06.560 | concrete example, right, you know, like we discussed at some length, the study that, you know,
01:14:14.000 | in the '50s that was done that established the link between smoking and lung
01:14:18.560 | cancer. And we make the point that like, well, if your data was used in that analysis and, you know,
01:14:25.200 | the world kind of knew that you were a smoker because, you know, there was no stigma associated
01:14:29.520 | with smoking before that, those findings, real harm might've come to you as a result of that
01:14:35.440 | study that your data was included in. In particular, your insurer now might have a
01:14:40.080 | higher posterior belief that you might have lung cancer and raise your premium. So you've
01:14:44.960 | suffered economic damage. But the point is that if the same analysis had been done
01:14:51.520 | with all the other N minus one medical records and just yours missing,
01:14:57.040 | the outcome would have been the same. Your data wasn't idiosyncratically crucial to establishing
01:15:04.080 | the link between smoking and lung cancer, because the link between smoking and lung cancer
01:15:08.320 | is like a fact about the world that can be discovered with any sufficiently large
01:15:13.120 | database of medical records. >> But that's a very low value of harm. Yeah,
01:15:17.440 | so that's showing that very little harm is done. Great. But how, what is the mechanism of differential
01:15:23.360 | privacy? So that's the kind of beautiful statement of it. But what's the mechanism by which privacy
01:15:29.440 | is preserved? >> Yeah. So it's basically by adding noise to computations, right? So the basic idea
01:15:35.600 | is that every differentially private algorithm, first of all, or every good differentially
01:15:41.360 | private algorithm, every useful one is a probabilistic algorithm. So it doesn't,
01:15:46.160 | on a given input, if you gave the algorithm the same input multiple times, it would give
01:15:51.840 | different outputs each time from some distribution. And the way you achieve differential privacy
01:15:57.600 | algorithmically is by kind of carefully and tastefully adding noise to a computation in the
01:16:04.160 | right places. And to give a very concrete example, if I want to compute the average of a set of
01:16:09.920 | numbers, right, the non-private way of doing that is to take those numbers and average them and
01:16:15.920 | release a numerically precise value for the average, okay? In differential privacy, you wouldn't do
01:16:23.280 | that. You would first compute that average to numerical precisions, and then you'd add some
01:16:29.120 | noise to it, right? You'd add some kind of zero mean, Gaussian or exponential noise to it, so that
01:16:36.640 | the actual value you output is not the exact mean, but it'll be close to the mean, but it'll be close,
01:16:43.920 | the noise that you add will sort of ensure that nobody can kind of reverse engineer
01:16:49.360 | any particular value that went into the average. >> So noise is the savior. How many algorithms can
01:16:57.200 | be aided by adding noise? >> Yeah, so I'm a relatively recent
01:17:03.920 | member of the differential privacy community. My co-author, Aaron Roth, is, you know,
01:17:08.880 | really one of the founders of the field and has done a great deal of work, and I've learned
01:17:13.600 | a tremendous amount working with him on it. >> It's a pretty grown-up field already.
01:17:17.120 | >> Yeah, but now it's pretty mature. But I must admit, the first time I saw the definition of
01:17:20.560 | differential privacy, my reaction was like, "Well, that is a clever definition, and it's really
01:17:25.600 | making very strong promises." And my, you know, I first saw the definition in much earlier days,
01:17:32.560 | and my first reaction was like, "Well, my worry about this definition would be that it's a great
01:17:37.200 | definition of privacy, but that it'll be so restrictive that we won't really be able to use
01:17:42.080 | it." Like, you know, we won't be able to compute many things in a differentially private way.
01:17:46.960 | So that's one of the great successes of the field, I think, is in showing that the opposite is true,
01:17:52.320 | and that, you know, most things that we know how to compute, absent any privacy considerations,
01:18:00.720 | can be computed in a differentially private way. So, for example, pretty much all of statistics
01:18:05.840 | and machine learning can be done differentially privately. So pick your favorite machine learning
01:18:11.360 | algorithm, back propagation in neural networks, you know, CART for decision trees, support vector
01:18:17.120 | machines, boosting, you name it, as well as classic hypothesis testing and the like in statistics.
01:18:23.360 | None of those algorithms are differentially private in their original form. All of them have
01:18:30.800 | modifications that add noise to the computation in different places in different ways that achieve
01:18:37.360 | differential privacy. So this really means that to the extent that, you know, we've become a,
01:18:43.040 | you know, a scientific community very dependent on the use of machine learning and statistical
01:18:49.760 | modeling and data analysis, we really do have a path to kind of provide privacy guarantees to
01:18:57.440 | those methods. And so we can still, you know, enjoy the benefits of kind of the data science era
01:19:05.360 | while providing, you know, rather robust privacy guarantees to individuals.
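Formally, a randomized algorithm M is epsilon-differentially private if, for any two databases D and D' differing in one record and any set S of possible outputs, Pr[M(D) in S] <= e^epsilon * Pr[M(D') in S]. As a minimal sketch of the noise-addition mechanism described above, here is the standard Laplace mechanism for releasing the mean of bounded values; the bounds, the epsilon value, and the toy data are illustrative assumptions.

```python
import numpy as np

def private_mean(values, lower, upper, epsilon, rng=None):
    """Release the mean of `values` with epsilon-differential privacy
    via the Laplace mechanism, assuming every value lies in [lower, upper]."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(np.asarray(values, dtype=float), lower, upper)
    # Changing one record moves the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(x)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return x.mean() + noise

# Toy example: ages of 1,000 hypothetical patients, epsilon = 0.5.
ages = np.random.default_rng(1).integers(18, 90, size=1000)
print(ages.mean(), private_mean(ages, lower=0, upper=100, epsilon=0.5))
```

With a thousand records the added noise is tiny relative to the true mean, which is the point: aggregate facts survive, while any single record's contribution is drowned out.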
01:19:09.680 | So perhaps a slightly crazy question, but if we take the ideas of differential privacy and
01:19:16.720 | take it to the nature of truth that's being explored currently,
01:19:20.480 | so what's your most favorite and least favorite food?
01:19:24.240 | Hmm, I'm not a real foodie, so I'm a big fan of spaghetti.
01:19:29.840 | Spaghetti?
01:19:30.320 | Yeah.
01:19:30.480 | And what do you really don't like?
01:19:32.800 | Um, I really don't like cauliflower.
01:19:37.280 | Wow, I love cauliflower.
01:19:38.960 | Okay.
01:19:39.200 | But is one way to protect your preference for spaghetti by having an information campaign,
01:19:46.480 | bloggers and so on, of bots saying that you like cauliflower? So like this kind of,
01:19:52.720 | the same kind of noise ideas. I mean, if you think of in our politics today, there's this idea of
01:19:58.720 | Russia hacking our elections. What's meant there, I believe, is bots spreading different kinds of
01:20:06.160 | information. Is that a kind of privacy or is that too much of a stretch?
01:20:10.240 | No, it's not a stretch. I've not seen those ideas, you know, that is not a technique that to my
01:20:18.080 | knowledge will provide differential privacy. But to give an example, like one very specific example
01:20:24.480 | about what you're discussing is, there was a very interesting project at NYU, I think led by Helen
01:20:30.800 | Nissenbaum there, in which they basically built a browser plugin that tried to essentially
01:20:39.360 | obfuscate your Google searches. So to the extent that you're worried that Google is using your
01:20:44.640 | searches to build, you know, predictive models about you, to decide what ads to show you,
01:20:50.400 | which they might very reasonably want to do. But if you object to that, they built this widget you
01:20:55.600 | could plug in. And basically, whenever you put in a query into Google, it would send that query to
01:21:00.720 | Google. But in the background, all of the time from your browser, it would just be sending this
01:21:06.240 | torrent of irrelevant queries to the search engine. So, you know, it's like a wheat and chaff
01:21:13.440 | thing. So, you know, out of every thousand queries, let's say, that Google was receiving from your
01:21:18.960 | browser, one of them was one that you put in, but the other 999 were not. Okay, so it's the same
01:21:24.560 | kind of idea, kind of, you know, privacy by obfuscation. So I think that's an interesting
01:21:30.800 | idea. Doesn't give you differential privacy. It's also, I was actually talking to somebody at one
01:21:37.520 | of the large tech companies recently about the fact that, you know, just this kind of thing that
01:21:43.280 | there are some times when the response to my data needs to be very specific to my data, right? Like,
01:21:51.760 | I type mountain biking into Google, I want results on mountain biking, and I really want Google to
01:21:57.600 | know that I typed in mountain biking. I don't want noise added to that. And so I think there's
01:22:03.280 | sort of maybe even interesting technical questions around notions of privacy that are appropriate
01:22:07.680 | where, you know, it's not that my data is part of some aggregate like medical records and that we're
01:22:13.040 | trying to discover important correlations and facts about the world at large, but rather, you
01:22:18.800 | know, there's a service that I really want to, you know, pay attention to my specific data, yet I
01:22:24.240 | still want some kind of privacy guarantee. And I think these kind of obfuscation ideas are sort of
01:22:28.880 | one way of getting at that, but maybe there are others as well. So where do you think we'll land
01:22:33.440 | in this algorithm driven society in terms of privacy? So, sort of, China, like Kai-Fu Lee
01:22:41.120 | describes, you know, it's collecting a lot of data on its citizens, but in the best form, it's
01:22:48.160 | actually able to, sort of, protect human rights and provide a lot of amazing services.
01:22:54.480 | And in its worst forms, it can violate those human rights and limit services. So where do you think
01:23:01.920 | we'll land? So algorithms are powerful when they use data. So as a society, do you think we'll give
01:23:11.040 | over more data? Is it possible to protect the privacy of that data?
01:23:15.440 | So I'm optimistic about the possibility of, you know, balancing the desire for individual privacy
01:23:25.120 | and individual control of privacy with kind of societally and commercially beneficial uses of
01:23:32.880 | data, not unrelated to differential privacy or suggestions that say like, well, individuals
01:23:38.320 | should have control of their data. They should be able to limit the uses of that data. They should
01:23:43.760 | even, you know, there's, you know, fledgling discussions going on in research circles about
01:23:48.960 | allowing people selective use of their data and being compensated for it.
01:23:53.600 | And then you get to sort of very interesting economic questions like pricing, right? And one
01:23:59.680 | interesting idea is that maybe differential privacy would also, you know, be a conceptual
01:24:05.280 | framework in which you could talk about the relative value of different people's data,
01:24:09.040 | like, you know, to demystify this a little bit. If I'm trying to build a predictive model for some
01:24:14.480 | rare disease and I'm going to use machine learning to do it, it's easy to get negative examples
01:24:21.120 | because the disease is rare, right? But I really want to have lots of people with the disease in my
01:24:27.360 | data set, okay? And so somehow those people's data with respect to this application is much more
01:24:34.800 | valuable to me than just like the background population. And so maybe they should be
01:24:39.200 | compensated more for it. And so, you know, I think these are kind of very, very fledgling
01:24:47.520 | conceptual questions that maybe we'll have kind of technical thought on them sometime in the coming
01:24:52.560 | years. But I do think we'll, you know, to kind of get more directly answer your question, I think
01:24:57.600 | I'm optimistic at this point from what I've seen that we will land at some, you know, better
01:25:03.600 | compromise than we're at right now, where again, you know, privacy guarantees are few, far between,
01:25:10.400 | and weak, and users have very, very little control. And I'm optimistic that we'll land in something
01:25:17.440 | that, you know, provides better privacy overall and more individual control of data and privacy.
01:25:22.560 | But, you know, I think to get there, it's again, just like fairness, it's not going to be enough
01:25:27.680 | to propose algorithmic solutions. There's going to have to be a whole kind of regulatory legal
01:25:32.320 | process that prods companies and other parties to kind of adopt solutions.
01:25:38.160 | >> And I think you've mentioned the word control a lot. And I think giving people control,
01:25:42.720 | that's something that people don't quite have in a lot of these algorithms. And that's a really
01:25:48.160 | interesting idea of giving them control. Some of that is actually literally an interface design
01:25:53.920 | question, sort of just enabling, because I think it's good for everybody to give users control.
01:26:00.240 | It's not, it's not a, it's almost not a trade-off, except that you have to hire people that are good
01:26:05.600 | at interface design. >> Yeah, I mean, the other thing that has to be said, right, is that, you
01:26:10.800 | know, it's a cliche, but, you know, we, as the users of many systems, platforms, and apps, you
01:26:19.200 | know, we are the product. We are not the customer. The customer are advertisers, and our data is the
01:26:25.920 | product, okay? So it's one thing to kind of suggest more individual control of data and privacy and
01:26:32.640 | uses, but this, you know, if this happens in sufficient degree, it will upend the entire
01:26:40.560 | economic model that has supported the internet to date. And so some other economic model will have
01:26:46.960 | to be, you know, will have to replace it. >> So the idea of markets you mentioned,
01:26:52.000 | by exposing the economic model to the people, they will then become a market, and therefore—
01:26:57.680 | >> They could be participants in it. >> Participants in it.
01:26:59.760 | >> And, you know, this isn't, you know, this is not a weird idea, right? Because
01:27:03.120 | there are markets for data already. It's just that consumers are not participants in that.
01:27:08.400 | There's like, you know, there's sort of, you know, publishers and content providers on one side that
01:27:13.280 | have inventory, and then they're advertising on the others, and, you know, Google and Facebook
01:27:18.320 | are running, you know, their—pretty much their entire revenue stream is by running two-sided
01:27:24.480 | markets between those parties, right? And so it's not a crazy idea that there would be like a
01:27:29.920 | three-sided market or that, you know, that on one side of the market or the other, we would have
01:27:35.040 | proxies representing our interest. It's not, you know, it's not a crazy idea, but it would—it's
01:27:39.920 | not a crazy technical idea, but it would have pretty extreme economic consequences.
01:27:47.920 | >> Speaking of markets, a lot of fascinating aspects of this world arise not from individual
01:27:55.120 | humans, but from the interaction of human beings. You've done a lot of work in game theory. First,
01:28:02.160 | can you say what is game theory and how does it help us model and study things?
01:28:07.200 | >> Yeah, game theory, of course, let us give credit where it's due. You know, it
01:28:11.760 | comes from the economists first and foremost, but as I've mentioned before, like, you know,
01:28:16.720 | computer scientists never hesitate to wander into other people's turf, and so there is now this
01:28:22.640 | 20-year-old field called algorithmic game theory. But, you know, game theory,
01:28:27.760 | first and foremost, is a mathematical framework for reasoning about collective outcomes in systems
01:28:36.960 | of interacting individuals. You know, so you need at least two people to get started in game theory,
01:28:44.400 | and many people are probably familiar with Prisoner's Dilemma as kind of a classic example
01:28:49.840 | of game theory and a classic example where everybody looking out for their own individual
01:28:56.000 | interests leads to a collective outcome that's kind of worse for everybody than what might be
01:29:02.000 | possible if they cooperated, for example. But cooperation is not an equilibrium in Prisoner's
01:29:08.240 | Dilemma. And so my work and the field of algorithmic game theory more generally in these
01:29:14.960 | areas kind of looks at settings in which the number of actors is potentially extraordinarily
01:29:23.520 | large and their incentives might be quite complicated and kind of hard to model directly,
01:29:30.720 | but you still want kind of algorithmic ways of kind of predicting what will happen or influencing
01:29:36.080 | what will happen in the design of platforms. >> So what to you is the most beautiful idea
01:29:43.760 | that you've encountered in game theory? >> There's a lot of them. I'm a big fan of the
01:29:48.800 | field. I mean, you know, I mean, technical answers to that, of course, would include
01:29:54.320 | Nash's work just establishing that, you know, there's a competitive equilibrium under very,
01:30:01.120 | very general circumstances, which in many ways kind of put the field on a firm conceptual footing,
01:30:08.400 | because if you don't have equilibrium, it's kind of hard to ever reason about what might happen
01:30:13.280 | since, you know, there's just no stability. >> So just the idea that stability can emerge
01:30:18.560 | when there's multiple... >> Or that, I mean, not that it will necessarily emerge,
01:30:22.000 | just that it's possible, right? I mean, like the existence of equilibrium doesn't mean that
01:30:26.240 | sort of natural iterative behavior will necessarily lead to it.
01:30:30.320 | >> In the real world, yes. >> Yeah. Maybe answering slightly
01:30:33.360 | less personally than you asked the question, I think within the field of algorithmic game theory,
01:30:38.320 | perhaps the single most important kind of technical contribution that's been made is
01:30:45.120 | the realization between close connections between machine learning and game theory,
01:30:50.640 | and in particular between game theory and the branch of machine learning that's known as
01:30:54.640 | no-regret learning. And this sort of provides a very general framework in which a bunch of
01:31:02.480 | players interacting in a game or a system, each one kind of doing something that's in their self
01:31:09.120 | interest will actually kind of reach an equilibrium, and reach an equilibrium in a
01:31:14.000 | rather small number of steps.
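As a small illustration of that connection, here is a sketch, using an invented toy setup, of two players repeatedly playing matching pennies with multiplicative weights, a classic no-regret rule; the time-averaged strategies drift toward the game's unique equilibrium of 50/50. The starting weights, learning rate, and horizon are arbitrary choices for the demo.

```python
import numpy as np

# Row player's payoff matrix for matching pennies (the column player gets -A).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

eta = 0.05                      # learning rate
w_row = np.array([1.0, 3.0])    # deliberately asymmetric starting weights
w_col = np.array([2.0, 1.0])
avg_row = np.zeros(2)
avg_col = np.zeros(2)
T = 20000

for _ in range(T):
    p = w_row / w_row.sum()     # current mixed strategies
    q = w_col / w_col.sum()
    avg_row += p
    avg_col += q
    row_payoffs = A @ q         # expected payoff of each row action vs. q
    col_payoffs = -(p @ A)      # expected payoff of each column action vs. p
    # Multiplicative-weights (no-regret) updates, renormalized for stability.
    w_row *= np.exp(eta * row_payoffs)
    w_col *= np.exp(eta * col_payoffs)
    w_row /= w_row.sum()
    w_col /= w_col.sum()

# Time-averaged play approaches (approximately) the 50/50 equilibrium for both players.
print(avg_row / T, avg_col / T)
```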
01:31:21.200 | >> So you kind of mentioned acting greedily can somehow end up pretty good for everybody.
01:31:28.640 | >> Or pretty bad. >> Or pretty bad. It'll end up stable.
01:31:33.680 | >> Yeah, right. And, you know, stability or equilibrium by itself
01:31:38.800 | is not necessarily either a good thing or a bad thing.
01:31:42.640 | >> So what's the connection between machine learning and the ideas of equilibrium?
01:31:45.600 | >> Well, I mean, I think we've kind of talked about these ideas already in kind of a non-technical way,
01:31:50.960 | which is maybe the more interesting way of understanding them first, which is, you know,
01:31:56.160 | we have many systems, platforms, and apps these days that work really hard to use our data and
01:32:04.880 | the data of everybody else on the platform to selfishly optimize on behalf of each user, okay?
01:32:12.560 | So, you know, let me give, I think, the cleanest example, which is just driving apps,
01:32:17.920 | navigation apps like, you know, Google Maps and Waze, where, you know, miraculously compared to
01:32:24.080 | when I was growing up at least, you know, the objective would be the same when you wanted to
01:32:28.800 | drive from point A to point B, spend the least time driving, not necessarily minimize the distance,
01:32:34.320 | but minimize the time, right? And when I was growing up, like, the only resources you had
01:32:39.200 | to do that were, like, maps in the car, which literally just told you what roads were available.
01:32:44.640 | And then you might have, like, half-hourly traffic reports just about the major freeways,
01:32:50.400 | but not about side roads. So you were pretty much on your own. And now we've got these apps,
01:32:55.760 | you pull it out and you say, "I want to go from point A to point B." And in response kind of to
01:33:00.080 | what everybody else is doing, if you like, what all the other players in this game are doing right
01:33:05.280 | now, here's the, you know, the route that minimizes your driving time. So it is really
01:33:11.520 | kind of computing a selfish best response for each of us in response to what all of the rest of us
01:33:17.920 | are doing at any given moment. And so, you know, I think it's quite fair to think of these apps as
01:33:24.080 | driving or nudging us all towards the competitive or Nash equilibrium of that game.
01:33:32.320 | Now you might ask, like, "Well, that sounds great. Why is that a bad thing?" Well, you know,
01:33:37.600 | it's known both in theory and with some limited studies from actual, like, traffic data
01:33:46.240 | that all of us being in this competitive equilibrium might cause our collective driving
01:33:53.040 | time to be higher, maybe significantly higher than it would be under other solutions.
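The standard toy illustration of this gap is Pigou's example, used here purely for exposition rather than as the book's specific treatment: one road from A to B always takes one hour, while a second road takes x hours when a fraction x of the drivers use it. At the selfish equilibrium everyone crowds onto the second road; splitting traffic would be better on average. A few lines make the numbers explicit.

```python
# Pigou's example: two parallel roads from A to B.
# Road 1: constant travel time of 1.0 hours.
# Road 2: travel time equal to x, the fraction of drivers using it.

def avg_time(x):
    """Average travel time when a fraction x of drivers take road 2."""
    return (1 - x) * 1.0 + x * x

# Nash equilibrium: road 2 is never worse than road 1, so everyone takes it.
x_eq = 1.0
# Social optimum minimizes avg_time: derivative -1 + 2x = 0 gives x = 0.5.
x_opt = 0.5

print(avg_time(x_eq))   # 1.0  (everyone spends the full hour)
print(avg_time(x_opt))  # 0.75 (a 25% saving, but not an equilibrium)
```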
01:33:59.440 | And then you have to talk about what those other solutions might be and what
01:34:02.480 | the algorithms to implement them are, which we do discuss in the kind of game theory chapter
01:34:07.440 | of the book. But similarly, you know, on social media platforms or on Amazon, you know, all these
01:34:15.440 | algorithms that are essentially trying to optimize on our behalf, they're driving us in a colloquial
01:34:22.000 | sense towards some kind of competitive equilibrium. And, you know, one of the most important lessons
01:34:26.880 | of game theory is that just because we're at equilibrium doesn't mean that there's not a
01:34:30.480 | solution in which some or maybe even all of us might be better off. And then the connection to
01:34:36.320 | machine learning, of course, is that in all these platforms I've mentioned, the optimization that
01:34:41.520 | they're doing on our behalf is driven by machine learning, you know, like predicting where the
01:34:45.680 | traffic will be, predicting what products I'm going to like, predicting what would make me
01:34:49.520 | happy in my news feed. Now, in terms of the stability and the promise of that, I have to ask,
01:34:55.360 | just out of curiosity, how stable are these mechanisms that you, game theory is just,
01:35:00.400 | the economists came up with, and we all know that economists don't live in the real world,
01:35:05.280 | just kidding. Sort of what's, do you think when we look at the fact that we haven't blown ourselves
01:35:12.800 | up from a game theoretic concept of mutually assured destruction, what are the odds that we
01:35:20.560 | destroy ourselves with nuclear weapons as one example of a stable game theoretic system?
01:35:26.880 | >>Just to prime your viewers a little bit, I mean, I think you're referring to the fact that
01:35:32.160 | game theory was taken quite seriously back in the '60s as a tool for reasoning about kind of Soviet
01:35:39.200 | US nuclear armament, disarmament, detente, things like that. I'll be honest, as huge of a fan as I
01:35:49.120 | am of game theory and its kind of rich history, it still surprises me that you had people at the
01:36:01.200 | Rand Corporation back in those days kind of drawing up two by two tables where the row
01:36:08.240 | player is the US and the column player is Russia, and that they were taking this seriously. I'm sure if
01:36:13.200 | I was there at the time, maybe it wouldn't have seemed as naive as it does now. >>It seems to have
01:36:13.200 | worked, which is why it seems naive and silly. >>Well, we're still here. >>We're still here in
01:36:17.280 | that sense. >>Yeah. Even though I kind of laugh at those efforts, they were more sensible then
01:36:22.400 | than they would be now, right? Because there were sort of only two nuclear powers at the time,
01:36:26.560 | and you didn't have to worry about deterring new entrants and who was developing the capacity.
01:36:32.400 | And so it's definitely a game with more players now and more potential
01:36:39.200 | entrants. I'm not in general somebody who advocates using kind of simple mathematical models when the
01:36:46.320 | stakes are as high as things like that, and the complexities are very political and social,
01:36:52.320 | but we are still here. >>So you've worn many hats, one of which, the one that first caused
01:36:59.600 | me to become a big fan of your work many years ago is algorithmic trading. So I have to just
01:37:06.000 | ask a question about this because you have so much fascinating work there. In the 21st century,
01:37:11.120 | what role do you think algorithms have in the space of trading, investment in the financial sector?
01:37:17.920 | >>Yeah, it's a good question. I mean, in the time I've spent on Wall Street and in finance,
01:37:26.160 | I've seen a clear progression, and I think it's a progression that kind of models the use of
01:37:31.680 | algorithms and automation more generally in society, which is the things that kind of get
01:37:39.200 | taken over by the algos first are sort of the things that computers are obviously better at
01:37:45.520 | than people, right? So first of all, there needed to be this era of automation, right, where just
01:37:52.560 | financial exchanges became largely electronic, which then enabled the possibility of trading
01:38:00.160 | becoming more algorithmic because once the exchanges are electronic, an algorithm can
01:38:05.520 | submit an order through an API just as well as a human can do at a monitor.
01:38:09.040 | >>It can do it really quickly. It can read all the data.
01:38:10.960 | >>Yeah. And so I think the places where algorithmic trading have had the greatest inroads
01:38:18.640 | and had the first inroads were in kind of execution problems, kind of optimized execution
01:38:24.080 | problems. So what I mean by that is at a large brokerage firm, for example, one of the lines of
01:38:30.000 | business might be on behalf of large institutional clients taking what we might consider difficult
01:38:36.800 | trades. So it's not like a mom and pop investor saying, "I want to buy 100 shares of Microsoft."
01:38:41.760 | It's a large hedge fund saying, "I want to buy a very, very large stake in Apple, and I want to
01:38:48.720 | do it over the span of a day." And it's such a large volume that if you're not clever about how
01:38:54.160 | you break that trade up, not just over time, but over perhaps multiple different electronic
01:38:59.280 | exchanges that all let you trade Apple on their platform, you'll push prices around in a way that
01:39:06.560 | hurts your execution. So this is an optimization problem. This is a control problem. And so
01:39:14.480 | machines are better. We know how to design algorithms that are better at that kind of
01:39:21.200 | thing than a person is going to be able to do because we can take volumes of historical and
01:39:26.160 | real-time data to kind of optimize the schedule with which we trade. And similarly, high-frequency
01:39:32.480 | trading, which is closely related but not the same as optimized execution, where you're just
01:39:38.960 | trying to spot very, very temporary mispricings between exchanges or within an asset itself,
01:39:47.360 | or just predict directional movement of a stock because of the kind of very, very low-level,
01:39:53.200 | granular buying and selling data in the exchange, machines are good at this kind of stuff.
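As a cartoon of the optimized-execution idea, nothing resembling a production algorithm, here is a sketch that slices a hypothetical parent order across the trading day in proportion to an assumed intraday volume profile, so that no single child order is large enough to push the price. The order size and the volume curve are made up.

```python
# Hypothetical: schedule a 1,000,000-share buy over the trading day,
# proportional to an assumed intraday volume profile (a "VWAP-style" slice).

parent_order = 1_000_000  # shares to buy

# Assumed share of daily volume in each half-hour bucket (U-shaped, made up).
volume_profile = [0.12, 0.09, 0.07, 0.06, 0.05, 0.05, 0.05, 0.05,
                  0.05, 0.06, 0.07, 0.09, 0.19]
assert abs(sum(volume_profile) - 1.0) < 1e-9

child_orders = [round(parent_order * w) for w in volume_profile]
# Fix rounding so the slices add back up to the parent order.
child_orders[-1] += parent_order - sum(child_orders)

for bucket, qty in enumerate(child_orders):
    print(f"half-hour {bucket:2d}: buy {qty:,} shares")
```

A real execution engine would also adapt to live prices and volumes rather than follow a fixed schedule, but the basic idea of spreading a difficult trade over time and venues is the one being described.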
01:39:59.280 | It's kind of like the mechanics of trading. What about the... Can machines do long-term
01:40:06.480 | sort of prediction?
01:40:07.600 | Yeah. So I think we are in an era where clearly there have been some very successful
01:40:12.480 | quant hedge funds that are in what we would traditionally call still in the stat-arb
01:40:21.280 | regime. So...
01:40:22.240 | What's that?
01:40:23.520 | Stat-arb referring to statistical arbitrage. But for the purposes of this conversation,
01:40:28.160 | what it really means is making directional predictions in asset price movement or returns.
01:40:34.400 | Your prediction about that directional movement is good for... You have a view that it's valid for
01:40:42.320 | some period of time between a few seconds and a few days. And that's the amount of time that
01:40:48.480 | you're going to kind of get into the position, hold it, and then hopefully be right about the
01:40:52.160 | directional movement and buy low and sell high as the cliche goes. So that is kind of a sweet spot,
01:41:00.800 | I think, for quant trading and investing right now and has been for some time.
01:41:06.000 | When you really get to kind of more Warren Buffett-style time scales, like my cartoon
01:41:13.520 | of Warren Buffett is that Warren Buffett sits and thinks what the long-term value of
01:41:18.720 | Apple really should be. And he doesn't even look at what Apple is doing today.
01:41:23.360 | He just decides, "I think that this is what its long-term value is, and it's far from that right
01:41:29.600 | now. And so I'm going to buy some Apple or short some Apple, and I'm going to sit on that for 10
01:41:36.000 | or 20 years." Okay. So when you're at that kind of time scale or even more than just a few days,
01:41:43.360 | you raise all kinds of other sources of risk and information. So now you're talking about
01:41:50.640 | holding things through recessions and economic cycles. Wars can break out.
01:41:55.840 | So there you have to understand human nature at a level that—
01:41:58.800 | Yeah. And you need to just be able to ingest many, many more sources of data that are on
01:42:03.920 | wildly different time scales, right? So if I'm an HFT, I'm a high-frequency trader,
01:42:11.280 | I really—my main source of data is just the data from the exchanges themselves about the activity
01:42:17.040 | in the exchanges, right? And maybe I need to pay—I need to keep an eye on the news, right? Because
01:42:22.560 | that can cause sudden—the CEO gets caught in a scandal or gets run over by a bus or something
01:42:30.240 | that can cause very sudden changes. But I don't need to understand economic cycles. I don't need
01:42:36.480 | to understand recessions. I don't need to worry about the political situation or war breaking out
01:42:41.920 | in this part of the world because all I need to know is as long as that's not going to happen
01:42:46.320 | in the next 500 milliseconds, then my model is good. When you get to these longer time scales,
01:42:53.840 | you really have to worry about that kind of stuff. And people in the machine learning community are
01:42:57.360 | starting to think about this. We held a—we jointly sponsored a workshop at Penn with the
01:43:05.040 | Federal Reserve Bank of Philadelphia a little more than a year ago on—I think the title was something
01:43:09.920 | like Machine Learning for Macroeconomic Prediction, macroeconomic referring specifically to these
01:43:16.960 | longer time scales. And it was an interesting conference, but it left me with greater confidence
01:43:26.560 | that we have a long way to go to—and so I think that people that—in the grand scheme of things,
01:43:34.160 | so somebody asked me like, "Well, whose job on Wall Street is safe from the bots?" I think people
01:43:39.200 | that are at that longer time scale and have that appetite for all the risks involved in long-term
01:43:44.880 | investing and that really need kind of not just algorithms that can optimize from data, but they
01:43:50.880 | need views on stuff. They need views on the political landscape, economic cycles and the like.
01:43:56.960 | And I think they're pretty safe for a while as far as I can tell.
01:44:02.400 | So Warren Buffett's job is safe for a little while.
01:44:04.400 | Yeah, I'm not seeing a robo-Warren Buffett anytime soon.
01:44:08.080 | Should give him comfort. Last question. If you could go back to—if
01:44:13.920 | there's a day in your life you could relive because it made you truly happy,
01:44:19.200 | maybe outside of your family.
01:44:22.800 | Yeah, otherwise we'd be out here.
01:44:27.200 | What day would it be? Can you look back, you remember just being profoundly transformed in
01:44:34.720 | some way or blissful?
01:44:38.080 | I'll answer a slightly different question, which is like, what's a day in my life or my career
01:44:45.840 | that was kind of a watershed moment?
01:44:48.020 | I went straight from undergrad to doctoral studies, and that's not at all atypical.
01:44:55.600 | And I'm also from an academic family. Like my dad was a professor, my uncle on his side
01:45:00.560 | is a professor, both my grandfathers were professors.
01:45:03.280 | All kinds of majors too, philosophy, I saw.
01:45:05.440 | Yeah, they're kind of all over the map, yeah. And I was a grad student here just up the river
01:45:11.120 | at Harvard and came to study with Les Valiant, which was a wonderful experience. But I remember
01:45:16.640 | my first year of graduate school, I was generally pretty unhappy. And I was unhappy because at
01:45:23.360 | Berkeley as an undergraduate, yeah, I studied a lot of math and computer science, but it
01:45:28.240 | was a huge school, first of all. And I took a lot of other courses, as we discussed, I
01:45:31.920 | started as an English major and took history courses and art history classes and had friends
01:45:37.440 | that did all kinds of different things.
01:45:38.960 | And Harvard's a much smaller institution than Berkeley, and its computer science department,
01:45:44.800 | especially at that time, was a much smaller place than it is now. And I suddenly just
01:45:49.760 | felt very, like I'd gone from this very big world to this highly specialized world.
01:45:55.760 | And now all of the classes I was taking were computer science classes, and I was only in
01:46:01.280 | classes with math and computer science people. And so I was, I thought often in that first
01:46:08.480 | year of grad school about whether I really wanted to stick with it or not. And I thought
01:46:13.600 | like, "Oh, I could stop with a master's, I could go back to the Bay Area and to California,
01:46:18.960 | and this was in one of the early periods where you could definitely get a relatively
01:46:24.880 | good, paying job at one of the companies that were the big tech companies back
01:46:30.640 | then."
01:46:31.200 | And so I distinctly remember like kind of a late spring day when I was kind of sitting
01:46:36.960 | in Boston Common and kind of really just kind of chewing over what I wanted to do in my
01:46:40.640 | life. And then I realized like, "Okay," and I think this is where my academic background
01:46:45.120 | helped me a great deal. I sort of realized, "Yeah, you're not having a great time
01:46:48.880 | right now, this feels really narrowing, but you know that you're here for research eventually,
01:46:54.320 | and to do something original, and to try to carve out a career where you kind of choose
01:47:01.920 | what you want to think about and have a great deal of independence."
01:47:05.280 | And so at that point, I really didn't have any real research experience yet. I mean,
01:47:10.800 | it was trying to think about some problems with very little success, but I knew that
01:47:15.600 | like I hadn't really tried to do the thing that I knew I'd come to do. And so I thought,
01:47:23.120 | you know, "I'm going to stick through it for the summer," and that was very formative
01:47:30.080 | because I went from kind of contemplating quitting to, you know, a year later, it being
01:47:37.120 | very clear to me I was going to finish because I still had a ways to go, but I kind of started
01:47:42.240 | doing research, it was going well, it was really interesting, and it was sort of a complete
01:47:46.560 | transformation. You know, and it's just that transition that I think every doctoral student
01:47:52.080 | makes at some point, which is to sort of go from being like a student of what's been done before
01:47:59.040 | to doing, you know, your own thing and figure out what makes you interested and what your
01:48:04.080 | strengths and weaknesses are as a researcher. And once, you know, I kind of made that decision
01:48:09.200 | on that particular day at that particular moment in Boston Common, you know, I'm glad
01:48:15.040 | I made that decision. And also just accepting the painful nature of that journey. Yeah,
01:48:19.520 | yeah, exactly, exactly. And in that moment said, "I'm gonna stick it out." Yeah,
01:48:24.400 | I'm gonna stick around for a while. Well, Michael, I've looked up to your work for a
01:48:29.200 | long time, it's really an honor to talk to you. Thank you so much for doing it. It's
01:48:31.120 | great to get back in touch with you too and see how great you're doing as well. So thanks a lot,
01:48:34.560 | appreciate it. Thanks, Michael.
01:48:35.360 | Thanks.