
Richard Karp: Algorithms and Computational Complexity | Lex Fridman Podcast #111


Chapters

0:00 Introduction
3:50 Geometry
9:46 Visualizing an algorithm
13:00 A beautiful algorithm
18:06 Don Knuth and geeks
22:06 Early days of computers
25:53 Turing Test
30:05 Consciousness
33:22 Combinatorial algorithms
37:42 Edmonds-Karp algorithm
40:22 Algorithmic complexity
50:25 P=NP
54:25 NP-Complete problems
1:10:29 Proving P=NP
1:12:57 Stable marriage problem
1:20:32 Randomized algorithms
1:33:23 Can a hard problem be easy in practice?
1:43:57 Open problems in theoretical computer science
1:46:21 A strange idea in complexity theory
1:50:49 Machine learning
1:56:26 Bioinformatics
2:00:37 Memory of Richard's father

Whisper Transcript

00:00:00.000 | The following is a conversation with Richard Karp,
00:00:02.920 | a professor at Berkeley and one of the most important figures
00:00:06.320 | in the history of theoretical computer science.
00:00:09.480 | In 1985, he received the Turing Award
00:00:12.680 | for his research in the theory of algorithms,
00:00:15.140 | including the development of the Edmonds-Karp algorithm
00:00:18.420 | for solving the max flow problem on networks,
00:00:21.800 | Hopcroft-Karp algorithm for finding
00:00:24.380 | maximum cardinality matchings in bipartite graphs,
00:00:27.980 | and his landmark paper in complexity theory
00:00:31.000 | called "Reducibility Among Combinatorial Problems,"
00:00:35.320 | in which he proved 21 problems to be NP-complete.
00:00:39.040 | This paper was probably the most important catalyst
00:00:41.880 | in the explosion of interest in the study of NP-completeness
00:00:45.400 | and the P versus NP problem in general.
00:00:48.800 | Quick summary of the ads.
00:00:50.220 | Two sponsors, 8Sleep Mattress and Cash App.
00:00:53.600 | Please consider supporting this podcast
00:00:55.680 | by going to 8sleep.com/lex
00:00:58.820 | and downloading Cash App and using code LEXPODCAST.
00:01:03.040 | Click the links, buy the stuff.
00:01:04.600 | It really is the best way to support this podcast.
00:01:08.080 | If you enjoy this thing, subscribe on YouTube,
00:01:10.280 | review it with five stars on Apple Podcasts,
00:01:12.480 | support it on Patreon,
00:01:13.840 | or connect with me on Twitter @lexfriedman.
00:01:16.840 | As usual, I'll do a few minutes of ads now
00:01:18.840 | and never any ads in the middle
00:01:20.120 | that can break the flow of the conversation.
00:01:23.000 | This show is sponsored by 8Sleep and its Pod Pro mattress
00:01:27.380 | that you can check out at 8sleep.com/lex to get $200 off.
00:01:32.380 | It controls temperature with an app.
00:01:35.920 | It can cool down to as low as 55 degrees
00:01:38.360 | on each side of the bed separately.
00:01:41.040 | Research shows that temperature has a big impact
00:01:43.560 | on the quality of our sleep.
00:01:45.020 | Anecdotally, it's been a game changer for me.
00:01:47.800 | I love it.
00:01:48.800 | It's been a couple of weeks now.
00:01:50.180 | I've just been really enjoying it,
00:01:52.520 | both in the fact that I'm getting better sleep
00:01:54.680 | and that it's a smart mattress, essentially.
00:01:58.280 | I kind of imagine this being the early days
00:02:00.080 | of artificial intelligence being a part
00:02:02.840 | of every aspect of our lives.
00:02:04.360 | And certainly infusing AI
00:02:06.440 | in one of the most important aspects of life,
00:02:08.400 | which is sleep, I think has a lot of potential
00:02:11.360 | for being beneficial.
00:02:13.160 | The Pod Pro is packed with sensors
00:02:15.320 | that track heart rate, heart rate variability,
00:02:17.760 | and respiratory rate, showing it all in their app.
00:02:21.800 | The app's health metrics are amazing,
00:02:24.360 | but the cooling alone is honestly worth the money.
00:02:27.160 | I don't always sleep, but when I do,
00:02:29.320 | I choose the 8Sleep Pod Pro mattress.
00:02:32.000 | Check it out at 8sleep.com/lex to get $200 off.
00:02:37.000 | And remember, just visiting the site
00:02:39.080 | and considering the purchase helps convince the folks
00:02:42.040 | at 8Sleep that this silly old podcast
00:02:44.880 | is worth sponsoring in the future.
00:02:46.580 | This show is also presented by the great
00:02:50.580 | and powerful Cash App, the number one finance app
00:02:54.380 | in the App Store.
00:02:55.360 | When you get it, use code LEXPODCAST.
00:02:58.200 | Cash App lets you send money to friends,
00:03:00.160 | buy Bitcoin, and invest in the stock market
00:03:02.560 | with as little as $1.
00:03:04.680 | It's one of the best designed interfaces
00:03:06.760 | of an app that I've ever used.
00:03:08.520 | To me, good design is when everything is easy and natural.
00:03:12.360 | Bad design is when the app gets in the way,
00:03:15.120 | either because it's buggy
00:03:16.600 | or because it tries too hard to be helpful.
00:03:19.300 | I'm looking at you, Clippy, from Microsoft,
00:03:21.580 | even though I love you.
00:03:23.160 | Anyway, there's a big part of my brain and heart
00:03:25.500 | that loves to design things
00:03:27.400 | and also to appreciate great design by others.
00:03:30.000 | So again, if you get Cash App
00:03:31.740 | from the App Store or Google Play
00:03:33.280 | and use the code LEXPODCAST, you get $10.
00:03:36.840 | Cash App will also donate $10 to FIRST,
00:03:39.600 | an organization that is helping to advance robotics
00:03:42.320 | and STEM education for young people around the world.
00:03:45.720 | And now, here's my conversation with Richard Karp.
00:03:49.600 | You wrote that at the age of 13,
00:03:52.820 | you were first exposed to plane geometry
00:03:55.220 | and was wonderstruck by the power
00:03:57.620 | and elegance of formal proofs.
00:04:00.260 | Are there problems, proofs, properties, ideas
00:04:02.640 | in plane geometry that, from that time,
00:04:04.940 | that you remember being mesmerized by
00:04:07.860 | or just enjoying to go through to prove various aspects?
00:04:13.060 | - So Michael Rabin told me this story
00:04:15.200 | about an experience he had when he was a young student
00:04:19.480 | who was tossed out of his classroom for bad behavior
00:04:25.400 | and was wandering through the corridors of his school
00:04:28.240 | and came upon two older students
00:04:31.800 | who were studying the problem
00:04:35.240 | of finding the shortest distance
00:04:37.240 | between two non-overlapping circles.
00:04:41.680 | And Michael thought about it and said,
00:04:45.860 | "You take the straight line between the two centers
00:04:52.520 | "and the segment between the two circles is the shortest
00:04:56.240 | "because a straight line is the shortest distance
00:04:58.760 | "between the two centers.
00:05:00.760 | "And any other line connecting the circles
00:05:03.820 | "would be on a longer line."
00:05:07.880 | And I thought, and he thought,
00:05:09.920 | and I agreed that this was just elegant,
00:05:13.000 | that pure reasoning could come up with such a result.
00:05:17.800 | - Certainly the shortest distance
00:05:21.240 | from the two centers of the circles is a straight line.
00:05:24.820 | Could you once again say what's the next step in that proof?
00:05:29.920 | - Well, any segment joining the two circles,
00:05:36.840 | if you extend it by taking the radius on each side,
00:05:41.640 | you get a path with three edges
00:05:46.640 | which connects the two centers.
00:05:48.560 | And this has to be at least as long as the shortest path,
00:05:52.720 | which is the straight line.
00:05:53.920 | - The straight line, yeah.
00:05:54.960 | Wow, yeah, that's quite simple.
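[The argument above can be checked numerically: the shortest distance between two non-overlapping circles is the center-to-center distance minus the two radii. A minimal sketch; the coordinates and the helper name are illustrative.]

```python
import math

def circle_gap(center1, r1, center2, r2):
    """Shortest distance between two non-overlapping circles:
    the straight-line distance between the centers, minus both radii
    (the two radius segments that extend any connecting segment
    into a three-edge path between the centers)."""
    return math.dist(center1, center2) - r1 - r2

# Centers 10 apart, radii 2 and 3: the gap is 10 - 2 - 3 = 5.
print(circle_gap((0, 0), 2, (10, 0), 3))  # → 5.0
```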
00:05:58.680 | So what is it about that elegance
00:06:00.940 | that you just find compelling?
00:06:04.760 | - Well, just that you could establish a fact
00:06:09.760 | about geometry beyond dispute by pure reasoning.
00:06:15.080 | I also enjoy the challenge
00:06:20.640 | of solving puzzles in plane geometry.
00:06:23.520 | It was much more fun than the earlier mathematics courses
00:06:27.600 | which were mostly about arithmetic operations
00:06:31.160 | and manipulating them.
00:06:33.000 | - Was there something about geometry itself,
00:06:36.000 | the slightly visual component of it,
00:06:38.280 | that you can visualize? - Oh, yes, absolutely.
00:06:40.320 | Although I lacked three-dimensional vision.
00:06:44.280 | I wasn't very good at three-dimensional vision.
00:06:47.440 | - You mean being able to visualize three-dimensional objects?
00:06:49.840 | - Three-dimensional objects or surfaces,
00:06:54.440 | hyperplanes and so on.
00:06:55.980 | So there I didn't have an intuition,
00:07:01.880 | but for example, the fact that the sum of the angles
00:07:06.880 | of a triangle is 180 degrees is proved convincingly,
00:07:12.160 | and it comes as a surprise that that can be done.
00:07:19.720 | - Why is that surprising?
00:07:22.960 | - Well-- - Well, it is a surprising idea,
00:07:29.560 | I suppose.
00:07:30.640 | - Why is that proved difficult?
00:07:32.440 | - It's not.
00:07:33.320 | That's the point.
00:07:34.240 | It's so easy, and yet it's so convincing.
00:07:36.500 | - Do you remember what is the proof
00:07:39.200 | that it adds up to 180?
00:07:41.920 | - You start at a corner and draw a line
00:07:48.360 | parallel to the opposite side,
00:07:56.200 | and that line sort of trisects the angle
00:08:00.720 | between the other two sides,
00:08:04.360 | and you get a half-plane which has to add up
00:08:10.440 | to 180 degrees, and it consists,
00:08:15.760 | and the angles, by the equality of alternate angles,
00:08:20.760 | what's it called?
00:08:24.180 | You get a correspondence between the angles
00:08:27.220 | created along the side of the triangle
00:08:31.940 | and the three angles of the triangle.
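[The theorem being discussed can be verified numerically for any concrete triangle by computing each interior angle independently with the law of cosines and summing. A small sketch; the example triangle is illustrative.]

```python
import math

def triangle_angles(a, b, c):
    """Interior angles (in degrees) of triangle abc, each computed
    independently via the law of cosines."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    A = math.degrees(math.acos((ab**2 + ca**2 - bc**2) / (2 * ab * ca)))
    B = math.degrees(math.acos((ab**2 + bc**2 - ca**2) / (2 * ab * bc)))
    C = math.degrees(math.acos((bc**2 + ca**2 - ab**2) / (2 * bc * ca)))
    return A, B, C

# The three angles always sum to 180 degrees (up to floating-point error).
print(sum(triangle_angles((0, 0), (4, 0), (1, 3))))
```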
00:08:34.700 | - Has geometry had an impact on,
00:08:37.780 | when you look into the future of your work
00:08:39.700 | with combinatorial algorithms,
00:08:41.340 | has it had some kind of impact in terms of,
00:08:45.000 | yeah, being able, the puzzles, the visual aspects
00:08:48.720 | that were first so compelling to you?
00:08:51.420 | - Not Euclidean geometry particularly.
00:08:54.500 | I think I use tools like linear programming
00:08:59.500 | and integer programming a lot,
00:09:01.860 | but those require high-dimensional visualization,
00:09:08.020 | and so I tend to go by the algebraic properties.
00:09:12.180 | - Right, you go by the linear algebra
00:09:16.660 | and not by the visualization.
00:09:19.580 | - Well, the interpretation in terms of, for example,
00:09:23.700 | finding the highest point on a polyhedron,
00:09:26.380 | as in linear programming, is motivating,
00:09:31.140 | but again, I don't have the high-dimensional intuition
00:09:37.580 | that would particularly inform me,
00:09:40.140 | so I sort of lean on the algebra.
00:09:43.860 | - So to linger on that point,
00:09:46.120 | what kind of visualization do you do
00:09:51.000 | when you're trying to think about,
00:09:53.280 | we'll get to combinatorial algorithms,
00:09:55.400 | but just algorithms in general.
00:09:57.280 | - Yeah.
00:09:58.120 | - What's inside your mind when you're thinking
00:10:00.640 | about designing algorithms?
00:10:02.260 | Or even just tackling any mathematical problem?
00:10:07.480 | - Well, I think that usually an algorithm
00:10:13.840 | involves a repetition of some inner loop,
00:10:18.840 | and so I can sort of visualize the distance
00:10:24.240 | from the desired solution as iteratively reducing
00:10:29.680 | until you finally hit the exact solution.
00:10:33.360 | - And try to take steps that get you closer to the--
00:10:35.640 | - Try to take steps that get closer
00:10:38.160 | and having the certainty of converging.
00:10:41.280 | So it's basically the mechanics of the algorithm
00:10:46.280 | is often very simple,
00:10:48.080 | but especially when you're trying something out
00:10:52.480 | on the computer.
00:10:53.320 | So for example, I did some work
00:10:57.080 | on the traveling salesman problem,
00:10:58.960 | and I could see there was a particular function
00:11:03.040 | that had to be minimized,
00:11:04.680 | and it was fascinating to see the successive approaches
00:11:08.600 | to the optimum.
00:11:11.020 | - You mean, so first of all,
00:11:13.020 | traveling salesman problem is where you have to visit
00:11:16.340 | every city without ever, only once.
00:11:21.300 | - Yeah, that's right.
00:11:22.420 | Find the shortest path through a set of cities.
00:11:25.100 | - Yeah, which is sort of a canonical,
00:11:27.900 | a standard, a really nice problem
00:11:29.540 | that's really hard in computer science.
00:11:31.300 | - Exactly, yes.
00:11:32.580 | - So can you say again what was nice
00:11:34.460 | about being able to think about the objective function there
00:11:38.380 | and maximizing it or minimizing it?
00:11:41.540 | - Well, it's just that as the algorithm proceeded,
00:11:45.220 | you were making progress, continual progress,
00:11:49.700 | and eventually getting to the optimum point.
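[The iterative improvement Karp describes, watching the objective shrink step by step, can be illustrated with 2-opt, a standard local-improvement heuristic for the traveling salesman problem. This is a generic sketch, not necessarily the method Karp himself used; the example points are illustrative.]

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    """Repeatedly reverse a segment of the tour whenever that shortens it;
    each accepted move strictly decreases the objective, so the loop
    terminates at a local optimum."""
    improved = True
    while improved:
        improved = False
        n = len(tour)
        for i in range(n - 1):
            for j in range(i + 2, n):
                candidate = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
                if tour_length(points, candidate) < tour_length(points, tour):
                    tour, improved = candidate, True
    return tour

# A unit square: the crossing tour [0, 2, 1, 3] is improved to the
# perimeter tour of length 4.
pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(tour_length(pts, two_opt(pts, [0, 2, 1, 3])))
```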
00:11:53.940 | - So there's two parts, maybe.
00:11:57.060 | Maybe you can correct me,
00:11:58.060 | but first is like getting an intuition
00:12:00.300 | about what the solution would look like,
00:12:02.860 | and or even maybe coming up with a solution,
00:12:05.540 | and two is proving that this thing
00:12:07.260 | is actually going to be pretty good.
00:12:09.860 | What part is harder for you?
00:12:13.580 | Where does the magic happen?
00:12:14.940 | Is it in the first sets of intuitions,
00:12:17.740 | or is it in the messy details of actually showing
00:12:22.100 | that it is going to get to the exact solution
00:12:25.420 | and it's gonna run at a certain complexity?
00:12:32.360 | - Well, the magic is just the fact
00:12:34.800 | that the gap from the optimum decreases monotonically,
00:12:39.800 | and you can see it happening,
00:12:44.520 | and various metrics of what's going on
00:12:48.720 | are improving all along until finally you hit the optimum.
00:12:53.680 | Perhaps later we'll talk about the assignment problem,
00:12:56.200 | and I can illustrate.
00:12:58.480 | - Illustrate a little better.
00:12:59.840 | - Yeah.
00:13:00.720 | - Now zooming out again, as you write,
00:13:03.120 | Don Knuth has called attention to a breed of people
00:13:06.880 | who derive great aesthetic pleasure
00:13:10.520 | from contemplating the structure of computational processes.
00:13:13.920 | So Don calls these folks geeks,
00:13:16.600 | and you write that you remember the moment
00:13:18.600 | you realized you were such a person,
00:13:20.680 | you were shown the Hungarian algorithm
00:13:23.120 | to solve the assignment problem.
00:13:25.720 | So perhaps you can explain what the assignment problem is
00:13:29.120 | and what the Hungarian algorithm is.
00:13:32.000 | - So in the assignment problem,
00:13:35.480 | you have n boys and n girls,
00:13:40.320 | and you are given the desirability of,
00:13:45.320 | or the cost of matching the i-th boy
00:13:50.240 | with the j-th girl for all i and j.
00:13:52.680 | You're given a matrix of numbers,
00:13:55.640 | and you want to find the one-to-one matching
00:14:00.640 | of the boys with the girls,
00:14:04.320 | such that the sum of the associated costs will be minimized.
00:14:08.760 | So the best way to match the boys with the girls
00:14:13.280 | or men with jobs or any two sets.
00:14:16.480 | - Now any possible matching is possible?
00:14:20.640 | Or there's constraints?
00:14:21.840 | - All one-to-one correspondences are permissible.
00:14:26.840 | If there is a connection that is not allowed,
00:14:29.480 | then you can think of it as having an infinite cost.
00:14:32.120 | - I see, yeah.
00:14:32.960 | - So what you do is to depend on the observation
00:14:39.360 | that the identity of the optimal assignment,
00:14:50.360 | or as we call it, the optimal permutation,
00:14:53.160 | is not changed if you subtract a constant
00:14:58.080 | from any row or column of the matrix.
00:15:04.920 | You can see that the comparison
00:15:06.680 | between the different assignments is not changed by that.
00:15:09.720 | Because if you decrease a particular row,
00:15:15.560 | all the elements of a row by some constant,
00:15:18.720 | all solutions decrease by the cost of that,
00:15:21.840 | by an amount equal to that constant.
00:15:24.100 | So the idea of the algorithm is to start
00:15:27.520 | with a matrix of non-negative numbers
00:15:30.280 | and keep subtracting from rows or entire columns
00:15:36.280 | in such a way that you subtract the same constant
00:15:44.800 | from all the elements of that row or column,
00:15:48.440 | while maintaining the property
00:15:50.760 | that all the elements are non-negative.
00:15:55.760 | - Simple.
00:15:59.280 | - Yeah, and so what you have to do
00:16:04.280 | is find small moves which will decrease the total cost
00:16:09.680 | while subtracting constants from rows or columns.
00:16:17.560 | And there's a particular way of doing that
00:16:19.480 | by computing the kind of shortest path
00:16:22.200 | through the elements in the matrix.
00:16:24.000 | And you just keep going in this way
00:16:28.520 | until you finally get a full permutation of zeros
00:16:33.240 | while the matrix is non-negative,
00:16:34.920 | and then you know that that has to be the cheapest.
00:16:37.440 | - Is that as simple as it sounds?
00:16:42.560 | So the shortest path through the matrix part?
00:16:45.560 | - Yeah, the simplicity lies in how you find,
00:16:48.880 | I oversimplified slightly,
00:16:52.240 | you will end up subtracting a constant
00:16:56.440 | from some rows or columns
00:16:59.120 | and adding the same constant back to other rows and columns.
00:17:04.120 | So as not to reduce any of the zero elements,
00:17:09.200 | you leave them unchanged.
00:17:13.000 | But each individual step modifies
00:17:18.000 | several rows and columns by the same amount,
00:17:24.080 | but overall decreases the cost.
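[The key observation Karp uses, that subtracting a constant from a row or column leaves the optimal permutation unchanged, can be demonstrated directly. This sketch brute-forces the assignment problem rather than implementing the Hungarian algorithm itself (which achieves the same answer in polynomial time); the cost matrix is illustrative.]

```python
from itertools import permutations

def best_assignment(cost):
    """Brute-force the assignment problem: the permutation p minimizing
    sum(cost[i][p[i]]). The Hungarian algorithm finds the same answer
    in O(n^3) instead of O(n!)."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
print(best_assignment(cost))  # → (1, 0, 2)

# The invariance: subtracting a constant from an entire row shifts the
# cost of every permutation by that same constant, so the argmin is unchanged.
reduced = [row[:] for row in cost]
reduced[0] = [x - 1 for x in reduced[0]]
assert best_assignment(reduced) == best_assignment(cost)
```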
00:17:26.600 | - So there's something about that elegance
00:17:29.800 | that made you go, "Aha, this is a beautiful,"
00:17:32.240 | like it's amazing that something like this,
00:17:35.680 | something so simple can solve a problem like this.
00:17:38.080 | - Yeah, it's really cool.
00:17:39.760 | If I had mechanical ability,
00:17:42.320 | I would probably like to do woodworking
00:17:44.800 | or other activities where you sort of shape something
00:17:49.080 | into something beautiful and orderly.
00:17:55.480 | And there's something about the orderly,
00:17:58.360 | systematic nature of that iterative algorithm
00:18:03.360 | that is pleasing to me.
00:18:05.560 | - So what do you think about this idea of geeks,
00:18:09.000 | as Don Knuth calls them?
00:18:11.360 | What do you think, is it something specific
00:18:15.560 | to a mindset that allows you to discover the elegance
00:18:19.880 | in computational processes, or is this all of us?
00:18:23.160 | Can all of us discover this beauty?
00:18:25.280 | Were you born this way? (laughs)
00:18:28.480 | - I think so, I always like to play with numbers.
00:18:30.920 | I used to amuse myself
00:18:35.640 | by multiplying four-digit decimal numbers in my head.
00:18:40.280 | And putting myself to sleep by starting with one
00:18:44.400 | and doubling the number as long as I could go.
00:18:48.200 | And testing my memory,
00:18:50.060 | my ability to retain the information.
00:18:52.680 | - And I also read somewhere that you wrote
00:18:55.880 | that you enjoyed showing off to your friends
00:18:59.840 | by, I believe, multiplying four-digit numbers.
00:19:03.320 | - Right.
00:19:05.400 | - A couple of four-digit numbers.
00:19:07.040 | - Yeah, I had a summer job at a beach resort
00:19:10.760 | outside of Boston.
00:19:12.680 | And the other employee,
00:19:16.760 | I was the barker at a skee-ball game.
00:19:20.200 | - Yeah.
00:19:21.040 | - I used to sit at a microphone saying,
00:19:25.680 | "Come one, come all, come in and play skee-ball.
00:19:28.160 | "Five cents to play, a nickel to win," and so on.
00:19:31.080 | - That's what a barker, I wasn't sure if I should know,
00:19:33.860 | but barker, you're the charming, outgoing person
00:19:38.860 | that's getting people to come in.
00:19:41.240 | - Yeah, well, I wasn't particularly charming,
00:19:43.600 | but I could be very repetitious and loud.
00:19:46.180 | And the other employees were sort of juvenile delinquents
00:19:52.140 | who had no academic bent,
00:19:57.360 | but somehow I found that I could impress them
00:20:00.520 | by performing this mental arithmetic.
00:20:05.520 | - Yeah, there's something to that.
00:20:08.340 | Some of the most popular videos on the internet is,
00:20:14.580 | there's a YouTube channel called Numberphile
00:20:18.000 | that shows off different mathematical ideas.
00:20:20.440 | - I see.
00:20:21.280 | - There's still something really profoundly interesting
00:20:23.360 | to people about math, the beauty of it.
00:20:27.160 | Something, even if they don't understand
00:20:30.400 | the basic concept even being discussed,
00:20:33.640 | there's something compelling to it.
00:20:35.040 | What do you think that is?
00:20:36.400 | Any lessons you drew from your early teen years
00:20:40.540 | when you were showing off to your friends with the numbers?
00:20:45.540 | What is it that attracts us
00:20:47.640 | to the beauty of mathematics, do you think?
00:20:50.480 | The general population, not just the computer scientists
00:20:54.560 | and mathematicians.
00:20:56.560 | - I think that you can do amazing things.
00:20:59.600 | You can test whether large numbers are prime.
00:21:03.300 | You can solve little puzzles
00:21:09.440 | about cannibals and missionaries.
00:21:12.340 | (laughing)
00:21:13.180 | - Yeah.
00:21:14.000 | - And there's a kind of achievement, it's puzzle solving.
00:21:19.000 | And at a higher level, the fact that you can do
00:21:22.720 | this reasoning, that you can prove
00:21:24.600 | in an absolutely ironclad way that some of the angles
00:21:29.000 | of a triangle is 180 degrees.
00:21:31.540 | - Yeah, it's a nice escape from the messiness
00:21:35.320 | of the real world where nothing can be proved.
00:21:38.120 | So, and we'll talk about it, but sometimes the ability
00:21:41.360 | to map the real world into such problems
00:21:43.560 | where you can prove things is a powerful step.
00:21:47.120 | - Yeah.
00:21:47.960 | - It's amazing that we can do it.
00:21:48.780 | - Of course, another attribute of geeks
00:21:50.160 | is they're not necessarily endowed
00:21:53.640 | with emotional intelligence.
00:21:55.480 | And so they can live in a world of abstractions
00:21:59.240 | without having to master the complexities
00:22:03.280 | of dealing with people.
00:22:05.160 | - So just to link on the historical note,
00:22:09.440 | as a PhD student in 1955, you joined
00:22:12.120 | the computational lab at Harvard
00:22:14.280 | where Howard Aiken had built the Mark I
00:22:17.040 | and the Mark IV computers.
00:22:19.760 | Just to take a step back into that history,
00:22:22.000 | what were those computers like?
00:22:23.880 | - The Mark IV filled a large room,
00:22:30.300 | much bigger than this large office
00:22:33.660 | that we're talking in now.
00:22:36.840 | And you could walk around inside it.
00:22:39.080 | There were rows of relays.
00:22:43.300 | You could just walk around the interior
00:22:45.360 | and the machine would sometimes fail
00:22:50.360 | because of bugs, which literally meant
00:22:55.120 | flying creatures landing on the switches.
00:22:57.960 | So I never used that machine for any practical purpose.
00:23:04.840 | The lab eventually acquired
00:23:09.360 | one of the earlier commercial computers.
00:23:14.640 | - This is already in the '60s.
00:23:16.680 | - No, in the mid '50s.
00:23:17.880 | - The mid '50s?
00:23:18.720 | - Or the late '50s.
00:23:19.840 | - There was already commercial computers in the--
00:23:21.800 | - Yeah, we had a UNIVAC with 2,000 words of storage.
00:23:26.240 | And so you had to work hard to allocate
00:23:30.220 | the memory properly. Also, the access time
00:23:34.920 | from one word to another depended on the number
00:23:38.520 | of the particular words.
00:23:41.240 | And so there was an art to sort of arranging
00:23:44.880 | the storage allocation to make fetching data rapid.
00:23:49.880 | - Were you attracted to this actual physical world
00:23:55.000 | implementation of mathematics?
00:23:58.180 | So it's a mathematical machine that's actually
00:24:00.480 | doing the math physically for you?
00:24:03.160 | - No, not at all.
00:24:04.840 | I think I was attracted to the underlying algorithms.
00:24:10.920 | - But did you draw any inspiration?
00:24:12.880 | So could you have imagined,
00:24:15.280 | like what did you imagine was the future
00:24:18.080 | of these giant computers?
00:24:20.120 | Could you have imagined that 60 years later
00:24:22.040 | we'd have billions of these computers all over the world?
00:24:25.800 | - I couldn't imagine that,
00:24:27.220 | but there was a sense in the laboratory
00:24:32.840 | that this was the wave of the future.
00:24:36.120 | In fact, my mother influenced me.
00:24:38.480 | She told me that data processing was gonna be really big
00:24:42.280 | and I should get into it.
00:24:43.580 | - She's a smart woman.
00:24:47.240 | - Yeah, she was a smart woman.
00:24:49.040 | And there was just a feeling that this was going
00:24:53.040 | to change the world, but I didn't think of it
00:24:55.640 | in terms of personal computing.
00:24:57.120 | I had no anticipation that we would be walking around
00:25:02.120 | with computers in our pockets or anything like that.
00:25:06.080 | - Did you see computers as tools,
00:25:11.000 | as mathematical mechanisms to analyze
00:25:13.840 | sort of theoretical computer science,
00:25:16.560 | or as the AI folks, which is an entire other community
00:25:21.020 | of dreamers, as something that could one day
00:25:24.480 | have human level intelligence?
00:25:26.840 | - Well, AI wasn't very much on my radar.
00:25:29.560 | I did read Turing's paper about the
00:25:33.960 | - The Turing test, "Computing Machinery and Intelligence"?
00:25:38.120 | - Yeah, the Turing test.
00:25:39.540 | - What'd you think about that paper?
00:25:41.880 | Was that just like science fiction?
00:25:43.780 | - I thought that it wasn't a very good test
00:25:48.320 | because it was too subjective.
00:25:50.480 | So I didn't feel that the Turing test
00:25:55.320 | was really the right way to calibrate
00:25:58.440 | how intelligent an algorithm could be.
00:26:01.040 | - But to linger on that, do you think it's part,
00:26:03.220 | because you've come up with some incredible tests
00:26:06.440 | later on, tests on algorithms, right?
00:26:09.400 | That are strong, reliable, robust
00:26:13.360 | across a bunch of different classes of algorithms.
00:26:16.520 | But returning to this emotional mess that is intelligence,
00:26:21.240 | do you think it's possible to come up with a test
00:26:24.560 | that's as ironclad as some of the
00:26:28.360 | computational complexity work?
00:26:31.240 | - Well, I think the greater question
00:26:32.800 | is whether it's possible to achieve
00:26:35.160 | human level intelligence.
00:26:38.720 | - Right, so that's, so first of all,
00:26:40.120 | let me, at the philosophical level,
00:26:42.080 | do you think it's possible to create algorithms
00:26:46.320 | that reason and would seem to us
00:26:51.320 | to have the same kind of intelligence as human beings?
00:26:54.080 | - It's an open question.
00:26:58.140 | It seems to me that most of the achievements
00:27:03.140 | have operated within a very limited set of ground rules
00:27:09.060 | and for a very limited, precise task,
00:27:15.360 | which is a quite different situation
00:27:17.560 | from the processes that go on in the minds of humans,
00:27:22.300 | which, where they have to sort of function
00:27:25.020 | in changing environments.
00:27:27.740 | They have emotions, they have physical attributes
00:27:32.220 | for exploring their environment.
00:27:38.980 | They have intuition, they have desires, emotions.
00:27:44.620 | And I don't see anything in the current achievements
00:27:52.100 | of what's called AI that come close to that capability.
00:27:56.940 | I don't think there's any computer program
00:28:01.940 | which surpasses a six-month-old child
00:28:05.740 | in terms of comprehension of the world.
00:28:09.040 | - Do you think this complexity of human intelligence,
00:28:15.580 | all the cognitive abilities we have, all the emotion,
00:28:18.180 | do you think that could be reduced one day
00:28:21.380 | or just fundamentally, can it be reduced
00:28:23.940 | to a set of algorithms or an algorithm?
00:28:27.260 | So can a Turing machine achieve human-level intelligence?
00:28:31.880 | - I am doubtful about that.
00:28:35.940 | I guess the argument in favor of it
00:28:38.620 | is that the human brain seems to achieve
00:28:43.620 | what we call intelligence,
00:28:47.940 | cognitive abilities of different kinds.
00:28:50.760 | And if you buy the premise that the human brain
00:28:54.300 | is just an enormous interconnected set of switches,
00:28:58.620 | so to speak, then in principle,
00:29:02.220 | you should be able to diagnose
00:29:04.460 | what that interconnection structure is like,
00:29:07.420 | characterize the individual switches,
00:29:09.380 | and build a simulation outside.
00:29:12.920 | But while that may be true in principle,
00:29:17.660 | that cannot be the way we're eventually
00:29:19.900 | gonna tackle this problem.
00:29:21.660 | That's, you know, that does not seem
00:29:26.660 | like a feasible way to go about it.
00:29:29.360 | So there is, however, an existence proof that
00:29:32.460 | if you believe that the brain is just a network
00:29:40.380 | of neurons operating by rules,
00:29:44.280 | I guess you could say that that's an existence proof
00:29:46.660 | of the ability to build,
00:29:49.220 | the capabilities of a mechanism.
00:29:51.240 | But it would be almost impossible
00:29:55.100 | to acquire the information
00:29:57.300 | unless we got enough insight
00:30:00.500 | into the operation of the brain.
00:30:02.780 | - But there's so much mystery there.
00:30:04.300 | Do you think, what do you make of consciousness,
00:30:06.700 | for example?
00:30:08.220 | There's something, as an example,
00:30:10.660 | something we completely have no clue about,
00:30:13.180 | the fact that we have this subjective experience.
00:30:15.380 | - Right.
00:30:16.220 | - Is it possible that this network of,
00:30:19.820 | this circuit of switches is able to create
00:30:22.940 | something like consciousness?
00:30:24.980 | - To know its own identity.
00:30:26.860 | - Yeah, to know the algorithm, to know itself.
00:30:30.300 | - To know itself.
00:30:32.140 | I think if you try to define that rigorously,
00:30:35.380 | you'd have a lot of trouble.
00:30:36.660 | - Yeah, that seems to.
00:30:37.780 | - So I know that there are many who
00:30:45.100 | believe that general intelligence can be achieved,
00:30:50.100 | and there are even some who feel certain
00:30:55.780 | that the singularity will come
00:30:59.180 | and we will be surpassed by the machines,
00:31:01.740 | which will then learn more and more about themselves
00:31:04.380 | and reduce humans to an inferior breed.
00:31:08.640 | I am doubtful that this will ever be achieved.
00:31:12.700 | - Just for the fun of it,
00:31:14.000 | could you linger on what's your intuition,
00:31:18.020 | why you're doubtful?
00:31:19.060 | So there are quite a few people
00:31:20.900 | that are extremely worried about this existential threat
00:31:25.220 | of artificial intelligence, of us being left behind
00:31:29.220 | by the super intelligent new species.
00:31:32.340 | What's your intuition why that's not quite likely?
00:31:36.900 | - Just because none of the achievements
00:31:41.420 | in speech or robotics or natural language processing
00:31:46.420 | or creation of flexible computer assistants
00:31:52.660 | or any of that comes anywhere near close
00:31:56.460 | to that level of cognition.
00:32:00.300 | - What do you think about ideas of, sort of,
00:32:02.300 | if we look at Moore's law, an exponential improvement
00:32:05.300 | that would allow us, that would surprise us?
00:32:09.300 | Sort of, does our intuition fall apart
00:32:11.180 | with exponential improvement?
00:32:14.340 | Because, I mean, we're not able to,
00:32:16.060 | we kind of think in linear improvement.
00:32:18.020 | - Yeah.
00:32:18.860 | - We're not able to imagine a world
00:32:20.220 | that goes from the Mark I computer to an iPhone 10.
00:32:25.220 | - Yeah.
00:32:27.540 | - So do you think we could be really surprised
00:32:31.260 | by the exponential growth?
00:32:33.420 | Or on the flip side, is it possible
00:32:37.540 | that also intelligence is actually way, way, way, way harder
00:32:42.220 | even with exponential improvement to be able to crack?
00:32:47.100 | - I don't think any constant factor improvement
00:32:50.260 | could change things.
00:32:53.980 | I mean, given our current comprehension
00:32:57.500 | of what cognition requires,
00:33:04.900 | it seems to me that multiplying the speed of the switches
00:33:08.740 | by a factor of a thousand or a million
00:33:10.940 | will not be useful until we really understand
00:33:16.580 | the organizational principle behind the network of switches.
00:33:21.260 | - Well, let's jump into the network of switches
00:33:25.420 | and talk about combinatorial algorithms if we could.
00:33:28.020 | Let's step back for the very basics.
00:33:31.660 | What are combinatorial algorithms?
00:33:33.660 | And what are some major examples
00:33:35.060 | of problems they aim to solve?
00:33:36.980 | - A combinatorial algorithm is one which deals
00:33:43.140 | with a system of discrete objects
00:33:48.380 | that can occupy various states
00:33:52.340 | or take on various values from a discrete set of values
00:33:59.740 | and need to be arranged or selected in such a way
00:34:04.740 | as to achieve some, to minimize some cost function
00:34:12.260 | or to prove the existence
00:34:16.260 | of some combinatorial configuration.
00:34:19.980 | So an example would be coloring the vertices of a graph.
00:34:24.980 | - What's a graph?
00:34:25.980 | Let's step back.
00:34:28.140 | So what, it's fun to ask one of the greatest
00:34:33.140 | computer scientists of all time
00:34:36.140 | the most basic questions in the beginning of most books.
00:34:39.260 | But for people who might not know,
00:34:41.380 | but in general how you think about it, what is a graph?
00:34:44.460 | - A graph, that's simple.
00:34:47.220 | It's a set of points, certain pairs of which
00:34:51.180 | are joined by lines called edges.
00:34:55.060 | And they sort of represent,
00:34:57.820 | in different applications, the interconnections
00:35:02.300 | between discrete objects.
00:35:05.860 | So they could be the interactions,
00:35:07.700 | interconnections between switches in a digital circuit
00:35:12.460 | or interconnections indicating the communication patterns
00:35:16.380 | of a human community.
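The definition just given, a set of points with certain pairs joined by edges, can be made concrete in a few lines. This is a generic illustrative sketch, not anything from the conversation; the vertex names are made up.

```python
# A minimal undirected graph as an adjacency structure:
# each vertex maps to the set of its neighbors.
from collections import defaultdict

def build_graph(edges):
    """Build an undirected graph from a list of (u, v) edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

# Interconnections in a small made-up communication network.
g = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

A directed graph would simply skip the symmetric `adj[v].add(u)` line, and edge weights could be stored by mapping each neighbor to a cost instead of using a set.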
00:35:17.940 | - And they could be directed or undirected.
00:35:21.420 | And then as you've mentioned before,
00:35:23.180 | you might have costs.
00:35:25.580 | - Right, they can be directed or undirected.
00:35:27.980 | They can be, you can think of them as,
00:35:30.660 | if you think if a graph were representing
00:35:34.300 | a communication network,
00:35:36.660 | then the edge could be undirected,
00:35:38.740 | meaning that information could flow along it
00:35:41.820 | in both directions, or it could be directed
00:35:44.580 | with only one way communication.
00:35:47.020 | A road system is another example of a graph
00:35:49.780 | with weights on the edges.
00:35:52.260 | And then a lot of problems of optimizing
00:35:57.260 | the efficiency of such networks or learning about
00:36:02.780 | the performance of such networks
00:36:05.860 | are the object of combinatorial algorithms.
00:36:11.340 | So it could be scheduling classes at a school
00:36:15.580 | where the vertices, the nodes of the network
00:36:20.780 | are the individual classes and the edges indicate
00:36:25.780 | the constraints which say that certain classes
00:36:30.660 | cannot take place at the same time,
00:36:32.940 | or certain teachers are available only
00:36:35.740 | for certain classes, et cetera.
00:36:38.460 | Or I talked earlier about the assignment problem
00:36:43.140 | of matching the boys with the girls,
00:36:47.940 | where you have a graph with an edge
00:36:52.220 | from each boy to each girl,
00:36:54.020 | with a weight indicating the cost.
00:36:58.060 | Or in logical design of computers,
00:37:02.900 | you might want to find a set of so-called gates,
00:37:07.900 | switches that perform logical functions,
00:37:11.940 | which can be interconnected to realize some function.
00:37:15.140 | So you might ask how many gates you need
00:37:20.140 | for a circuit to give a yes output
00:37:27.020 | if at least a given number of its inputs are ones
00:37:37.780 | and a no output if fewer are present.
00:37:42.980 | - My favorite is probably all the work with network flows.
00:37:46.620 | So anytime you have, I don't know why it's so compelling,
00:37:50.780 | but there's something just beautiful about it.
00:37:52.420 | It seems like there's so many applications
00:37:54.340 | and communication networks and traffic flow
00:37:59.340 | that you can map into these.
00:38:02.580 | And then you can think of pipes and water
00:38:04.620 | going through pipes and you can optimize it
00:38:06.580 | in different ways.
00:38:07.420 | There's something always visually
00:38:09.540 | and intellectually compelling to me about it.
00:38:12.340 | And of course you've done work there.
00:38:14.660 | - Yeah, so there the edges represent channels
00:38:20.580 | along which some commodity can flow.
00:38:24.540 | It might be gas, it might be water,
00:38:28.080 | it might be information.
00:38:29.900 | - Maybe supply chain as well, like products being--
00:38:33.860 | - Products flowing from one operation to another.
00:38:36.740 | - Yeah.
00:38:37.700 | - And the edges have a capacity,
00:38:40.280 | which is the rate at which the commodity can flow.
00:38:43.700 | And a central problem is to determine,
00:38:48.700 | given a network of these channels,
00:38:51.660 | in this case, the edges are communication channels.
00:38:54.560 | The challenge is to find the maximum rate
00:39:01.300 | at which the information can flow along these channels
00:39:05.540 | to get from a source to a destination.
00:39:09.180 | And that's a fundamental combinatorial problem
00:39:12.700 | that I've worked on.
00:39:14.240 | Jointly with the scientist, Jack Edmonds,
00:39:19.100 | we, I think we're the first to give a formal proof
00:39:22.460 | that this maximum flow problem through a network
00:39:27.460 | can be solved in polynomial time.
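The max flow idea described here, repeatedly finding an augmenting path from source to sink, can be sketched compactly. This is an illustrative implementation of the Edmonds-Karp approach (shortest augmenting paths via BFS, which yields the polynomial bound); the graph format and data are made up for the example.

```python
# Edmonds-Karp max flow: repeatedly augment along a shortest path
# (fewest edges) found by BFS in the residual graph.
from collections import deque

def max_flow(capacity, source, sink):
    """capacity: dict mapping u -> {v: capacity}. Returns the max flow value."""
    # Residual capacities, including zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path from source to sink.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path remains
        # Bottleneck capacity along the path found.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow along the path, updating residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
```

Because BFS always picks a path with the fewest edges, the number of augmentations is polynomially bounded, which is what makes the overall running time polynomial.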
00:39:30.740 | - Which I remember the first time I learned that,
00:39:33.940 | just learning that in maybe even grad school,
00:39:39.780 | I don't think it was even undergrad, no.
00:39:42.080 | Algorithm, yeah.
00:39:43.400 | Do network flows get taught in basic algorithms courses?
00:39:48.400 | - Yes, probably.
00:39:51.140 | - Okay, so yeah, I remember being very surprised
00:39:53.780 | that max flow is a polynomial time algorithm.
00:39:56.580 | - Yeah.
00:39:57.420 | - That there's a nice, fast algorithm that solves max flow.
00:40:00.140 | But, so there is an algorithm named after you and Edmonds.
00:40:05.140 | The Edmonds-Karp algorithm for max flow.
00:40:08.380 | So what was it like tackling that problem
00:40:12.580 | and trying to arrive at a polynomial time solution?
00:40:15.700 | And maybe you can describe the algorithm,
00:40:17.180 | maybe you can describe what's the running time complexity
00:40:19.900 | that you showed.
00:40:20.740 | - Yeah, well, first of all,
00:40:23.340 | what is a polynomial time algorithm?
00:40:25.380 | - Yeah.
00:40:26.220 | - Perhaps we could discuss that.
00:40:28.580 | - So yeah, let's actually just even, yeah.
00:40:31.300 | That's what is algorithmic complexity?
00:40:34.180 | What are the major classes of algorithm complexity?
00:40:38.220 | So we, in a problem like the assignment problem
00:40:41.860 | or scheduling schools or any of these applications,
00:40:46.860 | you have a set of input data,
00:40:51.460 | which might, for example, be a set of vertices
00:40:57.380 | connected by edges with,
00:41:01.900 | you're given for each edge, the capacity of the edge.
00:41:07.340 | And you have algorithms,
00:41:10.980 | which are, think of them as computer programs
00:41:14.740 | with operations such as addition, subtraction,
00:41:18.540 | multiplication, division, comparison of numbers and so on.
00:41:22.000 | And you're trying to construct an algorithm
00:41:26.940 | based on those operations,
00:41:32.560 | which will determine in a minimum number
00:41:35.020 | of computational steps, the answer to the problem.
00:41:38.440 | In this case, the computational step
00:41:41.060 | is one of those operations.
00:41:43.340 | And the answer to the problem is, let's say,
00:41:46.140 | the configuration of the network
00:41:50.360 | that carries the maximum amount of flow.
00:41:52.460 | And an algorithm is said to run in polynomial time
00:41:58.300 | if as a function of the size of the input,
00:42:04.980 | the number of vertices, the number of edges and so on.
00:42:07.900 | The number of basic computational steps
00:42:12.020 | grows only as some fixed power of that size.
00:42:15.200 | A linear algorithm would execute a number of steps
00:42:21.900 | linearly proportional to the size.
00:42:24.700 | Quadratic algorithm would be steps proportional
00:42:27.740 | to the square of the size and so on.
00:42:29.680 | And algorithms whose running time is bounded
00:42:34.680 | by some fixed power of the size
00:42:37.260 | are called polynomial algorithms.
00:42:39.900 | - And that's supposed to be
00:42:42.140 | relatively fast class of algorithms.
00:42:44.820 | - That's right.
00:42:45.740 | Theoreticians take that to be the definition
00:42:49.500 | of an algorithm being efficient.
00:42:53.260 | And we're interested in which problems can be solved
00:42:57.900 | by such efficient algorithms.
00:43:02.280 | One can argue whether that's the right definition
00:43:04.920 | of efficient because you could have an algorithm
00:43:08.080 | whose running time is the 10,000th power
00:43:11.380 | of the size of the input,
00:43:12.540 | and that wouldn't be really efficient.
00:43:15.800 | - And in practice, it's oftentimes reducing
00:43:19.160 | from an N squared algorithm to an N log N
00:43:22.560 | or a linear time is practically the jump
00:43:26.900 | that you wanna make to allow a real world system
00:43:30.200 | to solve a problem.
00:43:31.160 | - Yeah, that's also true because especially
00:43:33.240 | as we get very large networks,
00:43:35.840 | the size can be in the millions
00:43:38.060 | and then anything above N log N,
00:43:43.060 | where N is the size,
00:43:45.380 | would be too much for a practical solution.
00:43:50.000 | - Okay, so that's polynomial time algorithms.
00:43:52.560 | What other classes of algorithms are there?
00:43:57.360 | So that usually they designate polynomials
00:44:01.120 | as a letter P.
00:44:02.400 | - Yeah.
00:44:03.240 | - There's also NP, NP-complete, NP-hard.
00:44:06.200 | - Yeah.
00:44:07.200 | - So can you try to disentangle those
00:44:09.720 | by trying to define them simply?
00:44:14.280 | - Right, so a polynomial time algorithm
00:44:16.960 | is one whose running time is bounded
00:44:20.480 | by a polynomial in the size of the input.
00:44:24.560 | Then the class of such algorithms is called P.
00:44:29.400 | - In the worst case, by the way, we should say, right?
00:44:31.440 | - Yeah.
00:44:32.280 | - So for every case of the problem.
00:44:33.100 | - Yes, that's right, and that's very important
00:44:34.320 | that in this theory,
00:44:36.560 | when we measure the complexity of an algorithm,
00:44:40.880 | we really measure the number of steps,
00:44:45.040 | the growth of the number of steps in the worst case.
00:44:48.960 | So you may have an algorithm
00:44:50.280 | that runs very rapidly in most cases,
00:44:55.040 | but if there's any case
00:44:56.880 | where it gets into a very long computation,
00:45:00.160 | that would increase the computational complexity
00:45:03.480 | by this measure.
00:45:04.500 | And that's a very important issue
00:45:07.440 | because there, as we may discuss later,
00:45:10.600 | there are some very important algorithms
00:45:13.320 | which don't have a good standing
00:45:15.600 | from the point of view of their worst case performance
00:45:18.200 | and yet are very effective.
00:45:19.780 | So theoreticians are interested in P,
00:45:24.360 | the class of problems solvable in polynomial time.
00:45:27.580 | Then there's NP, which is the class of problems
00:45:34.000 | which may be hard to solve,
00:45:38.880 | but where the, where when confronted with a solution,
00:45:43.880 | you can check it in polynomial time.
00:45:46.660 | Let me give you an example there.
00:45:49.200 | So if we look at the assignment problem,
00:45:52.460 | so you have n boys, you have n girls,
00:45:56.000 | the number of numbers that you need to write down
00:45:58.780 | to specify the problem instances, n squared.
00:46:02.920 | And the question is
00:46:06.240 | how many steps are needed to solve it.
00:46:11.560 | And Jack Edmonds and I were the first to show
00:46:15.680 | that it could be done in time n cubed.
00:46:18.460 | Earlier algorithms required n to the fourth.
00:46:23.900 | So as a polynomial function of the size of the input,
00:46:27.320 | this is a fast algorithm.
00:46:29.180 | Now to illustrate the class NP,
00:46:33.280 | the question is how long would it take to verify
00:46:37.280 | that a solution is optimal?
00:46:42.680 | So for example, if the input was a graph,
00:46:47.680 | we might want to find the largest clique in the graph,
00:46:53.160 | or a clique is a set of vertices such that any vertex,
00:46:58.520 | each vertex in the set is adjacent to each of the others.
00:47:03.880 | So the clique is a complete subgraph.
00:47:08.960 | - Yeah, so if it's a Facebook social network,
00:47:11.920 | everybody's friends with everybody else,
00:47:14.120 | close clique of friends.
00:47:14.960 | - No, that would be what's called a complete graph,
00:47:17.280 | it would be.
00:47:18.240 | - No, I mean within that clique.
00:47:20.640 | - Within that clique, yeah.
00:47:22.000 | They're all friends.
00:47:25.640 | - So a complete graph is when--
00:47:27.820 | - Everybody is friendly.
00:47:28.840 | - Is everybody is friends with everybody, yeah.
00:47:31.360 | - So the problem might be to determine
00:47:35.480 | whether in a given graph there exists a clique
00:47:39.600 | of a certain size.
00:47:41.920 | - Well, that turns out to be a very hard problem,
00:47:45.760 | but if somebody hands you a clique
00:47:49.440 | and asks you to check whether it is,
00:47:52.040 | hands you a set of vertices and asks you to check
00:47:56.160 | whether it's a clique, you could do that simply
00:48:00.480 | by exhaustively looking at all of the edges
00:48:02.920 | between the vertices in the clique
00:48:05.200 | and verifying that they're all there.
00:48:08.040 | - And that's a polynomial time algorithm.
00:48:10.480 | - That's polynomial, so the problem of finding the clique
00:48:15.440 | appears to be extremely hard,
00:48:19.280 | but the problem of verifying a clique,
00:48:21.680 | to see if it reaches a target number of vertices,
00:48:26.480 | is easy.
00:48:31.640 | So finding the clique is hard, checking it is easy.
00:48:35.720 | Problems of that nature are called
00:48:39.000 | non-deterministic polynomial time problems,
00:48:42.440 | and that's the class NP.
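The easy "checking" direction described here fits in a few lines: given a candidate set of vertices, exhaustively look at all pairs and confirm each edge is present. The graph data below is illustrative.

```python
# Polynomial-time verification that a set of vertices forms a clique:
# every pair in the set must be joined by an edge, O(k^2) pair checks.
from itertools import combinations

def is_clique(adj, vertices):
    """adj: dict mapping vertex -> set of neighbors (undirected graph)."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))

# Small made-up graph where {1, 2, 3} form a clique but {1, 2, 4} do not.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
```

Finding a largest clique, by contrast, has no known polynomial-time algorithm; the naive search examines a number of vertex subsets that grows exponentially.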
00:48:44.280 | - And what about NP complete and NP hard?
00:48:48.320 | - Okay, let's talk about problems
00:48:50.880 | where you're getting a yes or no answer
00:48:53.760 | rather than a numerical value.
00:48:55.400 | So either there is a perfect matching
00:48:58.760 | of the boys with the girls or there isn't.
00:49:04.140 | It's clear that every problem in P is also in NP.
00:49:09.140 | If you can solve the problem exactly,
00:49:12.540 | then you can certainly verify the solution.
00:49:17.500 | On the other hand, there are problems in the class NP.
00:49:22.500 | This is the class of problems that are easy to check,
00:49:27.100 | although they may be hard to solve.
00:49:29.660 | It's not at all clear that problems in NP lie in P.
00:49:33.740 | So for example, if we're looking at scheduling classes
00:49:37.420 | at a school, the fact that you can verify
00:49:42.420 | when handed a schedule for the school,
00:49:45.780 | whether it meets all the requirements,
00:49:47.980 | that doesn't mean that you can find the schedule rapidly.
00:49:51.380 | So intuitively, NP, non-deterministic polynomial,
00:49:55.300 | is about checking rather than finding.
00:50:06.140 | Checking is easier, and therefore the class of problems
00:50:08.820 | that can be checked appears to be much larger
00:50:11.940 | than the class of problems that can be solved.
00:50:14.620 | - And then you keep adding appears to
00:50:17.380 | and sort of these additional words
00:50:21.500 | that designate that we don't know for sure yet.
00:50:24.060 | - We don't know for sure.
00:50:25.180 | So the theoretical question,
00:50:27.020 | which is considered to be the most central problem
00:50:30.420 | in theoretical computer science,
00:50:32.640 | or at least computational complexity theory,
00:50:35.500 | combinatorial algorithm theory,
00:50:38.560 | question is whether P is equal to NP.
00:50:42.720 | If P were equal to NP, it would be amazing.
00:50:46.300 | It would mean that every problem
00:50:54.240 | where a solution can be rapidly checked
00:50:56.920 | can actually be solved in polynomial time.
00:51:00.880 | We don't really believe that's true.
00:51:03.400 | If you're scheduling classes at a school,
00:51:05.780 | we expect that if somebody hands you a satisfying schedule,
00:51:13.040 | you can verify that it works.
00:51:15.640 | That doesn't mean that you should be able
00:51:17.140 | to find such a schedule.
00:51:18.960 | So intuitively, NP encompasses a lot more problems than P.
00:51:24.160 | - So can we take a small tangent
00:51:28.040 | and break apart that intuition?
00:51:30.480 | So do you, first of all, think that the biggest
00:51:34.560 | sort of open problem in computer science,
00:51:36.200 | maybe mathematics, is whether P equals NP?
00:51:40.000 | Do you think P equals NP?
00:51:43.240 | Or do you think P is not equal to NP?
00:51:46.320 | If you had to bet all your money on it.
00:51:48.840 | - I would bet that P is unequal to NP,
00:51:52.040 | simply because there are problems
00:51:54.320 | that have been around for centuries
00:51:55.960 | and have been studied intensively in mathematics,
00:51:59.020 | and even more so in the last 50 years
00:52:02.120 | since the P versus NP was stated.
00:52:05.600 | And no polynomial time algorithms have been found
00:52:10.160 | for these easy to check problems.
00:52:13.720 | So one example is a problem that goes back
00:52:17.640 | to the mathematician Gauss,
00:52:19.880 | who was interested in factoring large numbers.
00:52:24.480 | So we know a number is prime
00:52:27.960 | if it cannot be written as the product
00:52:31.480 | of two or more numbers unequal to one.
00:52:35.120 | So if we can factor a number like 91, it's seven times 13.
00:52:41.360 | But if I give you 20 digit or 30 digit numbers,
00:52:49.880 | you're probably going to be at a loss
00:52:52.560 | to have any idea whether they can be factored.
00:52:55.460 | So the problem of factoring very large numbers
00:53:00.080 | does not appear to have an efficient solution.
00:53:05.080 | But once you have found the factors,
00:53:09.600 | expressed the number as a product of two smaller numbers,
00:53:16.560 | you can quickly verify that they are factors of the number.
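The asymmetry described here, slow to find factors but fast to check them, can be shown with a toy example. Trial division below is only a naive illustrative search, not a serious factoring method.

```python
# Checking factors is a single multiplication; finding them naively
# takes time that grows with the size of the number being factored.

def verify_factors(n, p, q):
    """Fast, polynomial-time check that p and q are nontrivial factors of n."""
    return p * q == n and 1 < p < n and 1 < q < n

def trial_factor(n):
    """Naive factor search by trial division; exponential in the
    number of digits of n. Returns None when n is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None
```

For a 30-digit number the trial-division loop is hopeless, but `verify_factors` would still answer instantly, which is exactly the NP-style gap between solving and checking.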
00:53:19.960 | - And your intuition is a lot of people finding,
00:53:22.480 | you know, a lot of brilliant people
00:53:24.680 | have tried to find algorithms.
00:53:25.720 | This one particular problem, there's many others like it
00:53:28.600 | that are really well studied,
00:53:30.560 | and it would be great to find an efficient algorithm for.
00:53:33.920 | - Right, and in fact, we have some results
00:53:38.920 | that I was instrumental in obtaining,
00:53:43.460 | following up on work by the mathematician Stephen Cook,
00:53:46.940 | to show that within the class NP,
00:53:52.920 | of easy to check problems,
00:53:55.820 | there's a huge number that are equivalent in the sense
00:53:59.380 | that either all of them or none of them lie in P.
00:54:03.220 | And this happens only if P is equal to NP.
00:54:06.420 | So if P is unequal to NP, we would also know
00:54:10.900 | that virtually all the standard combinatorial problems
00:54:22.460 | cannot be solved in polynomial time.
00:54:25.860 | - Can you explain how that's possible
00:54:28.500 | to tie together so many problems in a nice bunch
00:54:32.240 | that if one is proven to be efficient, then all are?
00:54:36.540 | - The first and most important stage of progress
00:54:40.580 | was a result by Stephen Cook,
00:54:44.500 | who showed that a certain problem
00:54:49.660 | called the satisfiability problem of propositional logic
00:55:00.140 | is as hard as any problem in the class NP.
00:55:00.140 | So the propositional logic problem is expressed
00:55:05.140 | in terms of expressions involving the logical operations
00:55:11.020 | and or and not operating on variables
00:55:16.020 | that can be either true or false.
00:55:19.200 | So an instance of the problem would be some formula
00:55:24.200 | involving and or and not.
00:55:26.740 | And the question would be whether there is an assignment
00:55:31.140 | of truth values to the variables in the problem
00:55:34.860 | that would make the formula true.
00:55:37.360 | So for example, if I take the formula A or B
00:55:42.360 | and A or not B and not A or B and not A or not B
00:55:47.860 | and take the conjunction of all four
00:55:51.220 | of those so-called expressions,
00:55:54.580 | you can determine that no assignment of truth values
00:55:59.020 | to the variables A and B will allow that conjunction
00:56:03.860 | of what are called clauses to be true.
00:56:08.860 | So that's an example of a formula in propositional logic
00:56:14.160 | involving expressions based on the operations and or and not.
00:56:21.900 | That's an example of a problem which is not satisfiable.
00:56:27.240 | There is no solution that satisfies
00:56:29.240 | all of those constraints.
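The four-clause example just given can be checked mechanically by trying every truth assignment, which also shows why brute force blows up: the loop below is exponential in the number of variables. The clause encoding is an illustrative choice, not a standard library.

```python
# Brute-force satisfiability check of the example above:
# (A or B) and (A or not B) and (not A or B) and (not A or not B).
from itertools import product

def satisfiable(clauses, variables):
    """Each clause is a list of (variable, is_positive) literals.
    Tries all 2^n truth assignments."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # The formula holds if every clause has at least one true literal.
        if all(any(assignment[v] == pos for v, pos in clause)
               for clause in clauses):
            return True
    return False

clauses = [[("A", True), ("B", True)],    # A or B
           [("A", True), ("B", False)],   # A or not B
           [("A", False), ("B", True)],   # not A or B
           [("A", False), ("B", False)]]  # not A or not B
```

All four assignments of A and B falsify some clause, so the conjunction is unsatisfiable, matching the conclusion in the conversation.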
00:56:31.160 | - And that's like one of the cleanest
00:56:32.880 | and fundamental problems in computer science.
00:56:35.300 | It's like a nice statement of a really hard problem.
00:56:37.680 | - It's a nice statement of a really hard problem.
00:56:39.960 | And what Cook showed is that every problem in NP
00:56:44.960 | can be re-expressed as an instance
00:56:53.560 | of the satisfiability problem.
00:56:56.080 | So to do that, he used the observation
00:57:02.080 | that a very simple abstract machine
00:57:04.160 | called the Turing machine can be used
00:57:08.360 | to describe any algorithm.
00:57:12.400 | An algorithm for any realistic computer
00:57:17.800 | can be translated into an equivalent algorithm
00:57:22.800 | on one of these Turing machines,
00:57:25.840 | which are extremely simple.
00:57:27.960 | - So Turing machine, there's a tape
00:57:29.760 | and you can walk along that tape.
00:57:32.400 | - Yeah, you have data on a tape
00:57:33.360 | and you have basic instructions,
00:57:35.680 | a finite list of instructions,
00:57:37.800 | which say if you're reading a particular symbol on the tape
00:57:42.240 | and you're in a particular state,
00:57:45.480 | then you can move to a different state
00:57:49.800 | and change the state of the number,
00:57:52.200 | the element that you were looking at,
00:57:53.760 | the cell of the tape that you were looking at.
00:57:55.720 | - And that was like a metaphor and a mathematical construct
00:57:58.520 | that Turing put together
00:57:59.840 | to represent all possible computation.
00:58:02.160 | - All possible computation.
00:58:03.560 | Now, one of these so-called Turing machines
00:58:06.200 | is too simple to be useful in practice,
00:58:09.260 | but for theoretical purposes,
00:58:11.320 | we can depend on the fact that an algorithm
00:58:15.820 | for any computer can be translated
00:58:18.800 | into one that would run on a Turing machine.
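The machine model described here, a tape, a head, a finite table of (state, symbol) rules, is simple enough to simulate in a few lines. The machine below is an illustrative toy (it just flips bits), not anything from Cook's construction.

```python
# A minimal Turing machine simulator. The transition table maps
# (state, symbol) -> (new state, symbol to write, head move L/R).
def run_tm(transitions, tape, state="start", halt="halt", max_steps=1000):
    cells = dict(enumerate(tape))  # sparse tape; blanks read as "_"
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, "_")
        state, cells[head], move = transitions[(state, symbol)]
        head += 1 if move == "R" else -1
    # Return the visited portion of the tape as a string.
    return "".join(cells[i] for i in sorted(cells))

# Illustrative machine: flip every bit, halting at the first blank.
flip = {("start", "0"): ("start", "1", "R"),
        ("start", "1"): ("start", "0", "R"),
        ("start", "_"): ("halt", "_", "R")}
```

The point of the model is exactly what is said above: despite being far too simple for practical use, a table of rules like `flip` can in principle express any algorithm, which is what lets Cook encode an arbitrary NP computation as a logical formula.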
00:58:21.280 | And then using that fact,
00:58:26.260 | he could sort of describe
00:58:29.220 | any possible non-deterministic polynomial time algorithm,
00:58:35.660 | any algorithm for a problem in NP
00:58:40.020 | could be expressed as a sequence of moves
00:58:44.520 | of the Turing machine described
00:58:47.360 | in terms of reading a symbol on the tape
00:58:51.520 | while you're in a given state
00:58:55.560 | and moving to a new state
00:58:56.840 | and leaving behind a new symbol.
00:58:59.960 | And given that fact that any
00:59:04.800 | non-deterministic polynomial time algorithm
00:59:07.640 | can be described by a list of such instructions,
00:59:12.640 | you could translate the problem
00:59:15.840 | into the language of the satisfiability problem.
00:59:19.080 | - Is that amazing to you, by the way,
00:59:20.360 | if you take yourself back when you were first thinking
00:59:22.300 | about this space of problems?
00:59:23.520 | How amazing is that?
00:59:26.760 | - It's astonishing.
00:59:27.920 | When you look at Cook's proof,
00:59:30.320 | it's not too difficult to sort of figure out
00:59:34.520 | why this is so,
00:59:38.520 | but the implications are staggering.
00:59:40.760 | It tells us that this, of all the problems in NP,
00:59:46.280 | all the problems where solutions are easy to check,
00:59:51.120 | they can all be rewritten in terms of
00:59:54.600 | the satisfiability problem.
00:59:57.760 | - Yeah, it's adding so much more weight
01:00:04.040 | to the P equals NP question,
01:00:06.840 | because all it takes is to show that one--
01:00:09.320 | - That's right.
01:00:10.160 | - One algorithm in this class--
01:00:11.240 | - So the P versus NP can be re-expressed
01:00:13.720 | as simply asking whether the satisfiability problem
01:00:17.560 | of propositional logic is solvable in polynomial time.
01:00:21.460 | But there's more.
01:00:25.120 | I encountered Cook's paper
01:00:30.320 | when he published it in a conference in 1971.
01:00:34.560 | Yeah, so when I saw Cook's paper
01:00:37.720 | and saw this reduction of each of the problems in NP
01:00:44.360 | by a uniform method to the satisfiability problem
01:00:49.120 | of propositional logic,
01:00:50.820 | that meant that the satisfiability problem
01:00:54.560 | was a universal combinatorial problem.
01:00:57.640 | And it occurred to me through experience I had had
01:01:04.080 | in trying to solve other combinatorial problems
01:01:07.720 | that there were many other problems
01:01:10.380 | which seemed to have that universal structure.
01:01:14.620 | And so I began looking for reductions
01:01:19.780 | from the satisfiability to other problems.
01:01:26.060 | One of the other problems would be
01:01:32.460 | the so-called integer programming problem
01:01:35.660 | of determining whether there's a solution
01:01:40.260 | to a set of linear inequalities
01:01:45.100 | involving integer variables.
01:01:47.260 | - Just like linear programming,
01:01:49.540 | but there's a constraint
01:01:50.900 | that the variables must remain integers.
01:01:53.620 | - Integers, in fact, must be either zero or one.
01:01:56.380 | It could only take on those values.
01:01:58.500 | - And that makes the problem much harder.
01:02:00.820 | - Yes, that makes the problem much harder.
01:02:03.540 | And it was not difficult to show
01:02:07.340 | that the satisfiability problem can be restated
01:02:11.660 | as an integer programming problem.
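One standard textbook way to restate a clause as a 0-1 linear inequality, sketched here as an illustration rather than the exact encoding discussed: represent a variable as x in {0, 1} and its negation as (1 - x), and require each clause's literals to sum to at least 1.

```python
# The clause (A or not B or C) becomes the 0-1 inequality
#     a + (1 - b) + c >= 1,   with a, b, c each in {0, 1}.
def clause_holds(a, b, c):
    """0-1 inequality encoding of the clause (A or not B or C)."""
    return a + (1 - b) + c >= 1
```

The inequality holds for exactly the same 0-1 assignments that make the clause true, so a system of such inequalities, one per clause, is satisfiable as an integer program if and only if the original formula is satisfiable.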
01:02:13.860 | - So can you pause on that?
01:02:15.220 | Was that one of the first problem mappings
01:02:17.220 | that you tried to do?
01:02:19.100 | And how hard is that map?
01:02:20.380 | And you said it wasn't hard to show,
01:02:21.760 | but that's a big leap.
01:02:26.760 | - It is a big leap, yeah.
01:02:29.340 | Well, let me give you another example.
01:02:32.980 | Another problem in NP
01:02:35.180 | is whether a graph contains a clique of a given size.
01:02:39.620 | And now the question is,
01:02:46.700 | can we reduce the propositional logic problem
01:02:51.220 | to the problem of whether there's a clique
01:02:55.500 | of a certain size?
01:02:56.720 | Well, if you look at the propositional logic problem,
01:03:01.300 | it can be expressed as a number of clauses,
01:03:05.460 | each of which is of the form A or B or C,
01:03:10.460 | where A is either one of the variables in the problem
01:03:18.380 | or the negation of one of the variables.
01:03:20.860 | And an instance of the propositional logic problem
01:03:30.220 | can be rewritten using operations of Boolean logic.
01:03:35.220 | Can be rewritten as the conjunction of a set of clauses,
01:03:41.300 | the and of a set of ors,
01:03:43.700 | where each clause is a disjunction,
01:03:48.020 | an or of variables or negated variables.
01:03:51.520 | So the question of,
01:03:58.940 | in the satisfiability problem,
01:04:01.220 | is whether those clauses can be simultaneously satisfied.
01:04:05.980 | Now to satisfy all those clauses,
01:04:09.100 | you have to find one of the terms in each clause,
01:04:13.100 | which is going to be true in your truth assignment,
01:04:18.940 | but you can't make the same variable both true and false.
01:04:24.740 | So if you have the variable A in one clause
01:04:29.580 | and you want to satisfy that clause by making A true,
01:04:34.260 | you can't also make the complement of A true
01:04:38.380 | in some other clause.
01:04:39.700 | - And so the goal is to make every single clause true
01:04:43.060 | if it's possible to satisfy this.
01:04:45.220 | And the way you make it true is at least...
01:04:47.940 | - One term in the clause must be true.
01:04:52.460 | - Got it.
01:04:58.420 | So now we convert this problem
01:05:01.180 | to something called the independent set problem,
01:05:01.180 | where you're just sort of asking for a set of vertices
01:05:06.180 | in a graph such that no two of them are adjacent,
01:05:08.860 | sort of the opposite of the clique problem.
01:05:11.060 | So we've seen that we can now express that as...
01:05:19.980 | Finding a set of terms, one in each clause,
01:05:24.980 | without picking both the variable
01:05:36.220 | and the negation of that variable.
01:05:39.020 | Because if the variable is assigned the truth value,
01:05:42.860 | the negated variable has to have the opposite truth value.
01:05:46.300 | - Right.
01:05:47.140 | - And so we can construct a graph
01:05:50.420 | where the vertices are the terms in all of the clauses.
01:05:55.420 | And you have an edge between two occurrences of terms.
01:06:14.740 | Either if they're both in the same clause,
01:06:17.020 | because you're only picking one element from each clause.
01:06:20.740 | And also an edge between them
01:06:23.380 | if they represent opposite values of the same variable,
01:06:27.300 | because you can't make a variable both true and false.
01:06:30.660 | And so you get a graph
01:06:32.460 | where you have all of these occurrences of variables.
01:06:35.460 | You have edges, which mean that you're not allowed
01:06:38.780 | to choose both of the variables.
01:06:42.380 | Both ends of the edge,
01:06:44.860 | either because they're in the same clause
01:06:46.940 | or they're negations of one another.
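The reduction Karp describes here can be sketched in a few lines of Python. This is an illustrative reconstruction by the editor, not code from the conversation; the clause encoding and function name are assumptions:

```python
# A clause is a list of literals; a literal is (variable_name, is_positive).
# Build the graph Karp describes: one vertex per literal occurrence,
# an edge between any two literals in the same clause, and an edge
# between any two occurrences of complementary literals.

def sat_to_independent_set(clauses):
    vertices = []  # (clause_index, variable, is_positive)
    for i, clause in enumerate(clauses):
        for (var, pos) in clause:
            vertices.append((i, var, pos))
    edges = set()
    for a in range(len(vertices)):
        for b in range(a + 1, len(vertices)):
            (ci, va, pa), (cj, vb, pb) = vertices[a], vertices[b]
            same_clause = ci == cj               # pick only one literal per clause
            complementary = va == vb and pa != pb  # can't make a var true and false
            if same_clause or complementary:
                edges.add((a, b))
    return vertices, edges
```

The formula is then satisfiable exactly when the resulting graph has an independent set with one vertex per clause.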
01:06:50.580 | - Right, and that's a, first of all,
01:06:53.180 | sort of to zoom out, that's a really powerful idea
01:06:56.300 | that you could take a graph
01:06:59.100 | and connect it to a logic equation somehow.
01:07:03.940 | And do that mapping for all possible formulations
01:07:08.100 | of a particular problem on a graph.
01:07:10.180 | - Yeah.
01:07:11.020 | - I mean, that still is hard for me to believe
01:07:15.460 | that that's possible.
01:07:17.640 | Like, what do you make of that,
01:07:20.820 | that there's such a union of,
01:07:24.860 | there's such a friendship among all these problems across
01:07:28.780 | that somehow are akin to combinatorial algorithms,
01:07:33.780 | that they're all somehow related.
01:07:35.940 | I know it can be proven, but what do you make of it,
01:07:39.980 | that that's true?
01:07:41.740 | - Well, that they just have the same expressive power.
01:07:46.780 | You can take any one of them
01:07:49.580 | and translate it into the terms of the other.
01:07:53.460 | - The fact that they have the same expressive power
01:07:55.580 | also somehow means that they can be translatable.
01:07:59.020 | - Right.
01:08:00.180 | And what I did in the 1971 paper
01:08:03.540 | was to take 21 fundamental problems,
01:08:09.500 | commonly occurring problems of packing,
01:08:12.380 | covering, matching, and so forth,
01:08:14.420 | lying in the class NP,
01:08:19.220 | and show that the satisfiability problem
01:08:21.860 | can be re-expressed as any of those,
01:08:24.300 | that any of those have the same expressive power.
01:08:30.340 | - And that was like throwing down the gauntlet
01:08:31.980 | of saying there's probably many more problems like this.
01:08:35.900 | - Right.
01:08:36.740 | - But that's just saying that,
01:08:37.660 | look, that they're all the same.
01:08:39.660 | - They're all the same, but not exactly.
01:08:43.140 | Yeah, they're all the same in terms of
01:08:45.180 | whether they are rich enough to express any of the others.
01:08:50.180 | But that doesn't mean that they have
01:08:55.260 | the same computational complexity.
01:08:57.780 | But what we can say is that either all of these problems
01:09:02.220 | or none of them are solvable in polynomial time.
01:09:05.820 | - Yeah, so where does NP completeness and NP hard
01:09:10.300 | as classes fit?
01:09:11.140 | - Oh, that's just a small technicality.
01:09:14.060 | So when we're talking about decision problems,
01:09:17.300 | that means that the answer is just yes or no.
01:09:20.940 | There is a clique of size 15
01:09:23.300 | or there's not a clique of size 15.
01:09:25.540 | On the other hand, an optimization problem would be asking,
01:09:29.980 | find the largest clique.
01:09:33.180 | The answer would not be yes or no,
01:09:34.980 | it would be 15.
01:09:36.500 | So when you're asking for the,
01:09:41.300 | when you're putting a valuation on the different solutions
01:09:46.660 | and you're asking for the one with the highest valuation,
01:09:49.060 | that's an optimization problem.
01:09:51.380 | And there's a very close affinity
01:09:52.900 | between the two kinds of problems.
01:09:55.420 | But the counterpart of being the hardest decision problem,
01:10:00.420 | the hardest yes/no problem,
01:10:04.220 | the counterpart of that is to minimize
01:10:09.220 | or maximize an objective function.
01:10:13.380 | And so a problem that's hardest in the class
01:10:16.380 | when viewed in terms of optimization,
01:10:19.900 | those are called NP hard rather than NP complete.
01:10:24.420 | - And NP complete is for decision problems.
01:10:26.260 | - And NP complete is for decision problems.
01:10:29.900 | - So if somebody shows that P equals NP,
01:10:34.900 | what do you think that proof will look like
01:10:39.420 | if you were to put on yourself,
01:10:41.500 | if it's possible to show that as a proof
01:10:45.300 | or to demonstrate an algorithm?
01:10:47.260 | - All I can say is that it will involve concepts
01:10:52.220 | that we do not now have and approaches that we don't have.
01:10:56.380 | - Do you think those concepts are out there
01:10:58.580 | in terms of inside complexity theory,
01:11:01.980 | inside of computational analysis of algorithms?
01:11:04.740 | Do you think there's concepts
01:11:05.820 | that are totally outside of the box
01:11:07.900 | that we haven't considered yet?
01:11:09.140 | - I think that if there is a proof that P is equal to NP
01:11:13.180 | or that P is not equal to NP,
01:11:15.040 | it'll depend on concepts that are now outside the box.
01:11:21.320 | - Now if that's shown, either way, P equals NP or P not,
01:11:25.900 | well, actually P equals NP,
01:11:28.240 | what impact, you kind of mentioned a little bit,
01:11:32.260 | but can you linger on it?
01:11:34.140 | What kind of impact would it have
01:11:36.760 | on theoretical computer science
01:11:38.340 | and perhaps software systems in general?
01:11:42.180 | - Well, I think it would have enormous impact
01:11:44.700 | on the world in either case.
01:11:49.160 | If P is unequal to NP, which is what we expect,
01:11:52.240 | then we know that for the great majority
01:11:56.860 | of the combinatorial problems that come up,
01:11:59.500 | since they're known to be NP complete,
01:12:02.100 | we're not going to be able to solve them
01:12:05.060 | by efficient algorithms.
01:12:07.780 | However, there's a little bit of hope
01:12:11.620 | in that it may be that we can solve most instances.
01:12:16.580 | All we know is that if a problem is not in P,
01:12:19.860 | then it can't be solved efficiently on all instances.
01:12:25.540 | But basically, it will,
01:12:27.500 | if we find that P is unequal to NP,
01:12:32.700 | it will mean that we can't expect always
01:12:35.300 | to get the optimal solutions to these problems.
01:12:38.640 | And we have to depend on heuristics
01:12:41.020 | that perhaps work most of the time
01:12:43.140 | or give us good approximate solutions.
01:12:45.820 | - So we would turn our eye towards the heuristics
01:12:52.220 | with a little bit more acceptance and comfort on our hearts.
01:12:56.420 | - Exactly.
01:12:57.580 | - Okay, so let me ask a romanticized question.
01:13:01.060 | What to you is one of the most
01:13:04.460 | or the most beautiful combinatorial algorithm
01:13:08.180 | in your own life or just in general
01:13:10.180 | in the field that you've ever come across
01:13:12.500 | or have developed yourself?
01:13:14.100 | - I like the stable matching problem
01:13:17.820 | or the stable marriage problem very much.
01:13:22.780 | - What's the stable matching problem?
01:13:25.180 | - Yeah.
01:13:26.460 | Imagine that you want to marry off N boys
01:13:31.460 | with N girls.
01:13:34.140 | And each boy has an ordered list
01:13:39.940 | of his preferences among the girls,
01:13:42.340 | his first choice, his second choice,
01:13:44.780 | through his Nth choice.
01:13:47.400 | And each girl also has an ordering of the boys,
01:13:52.400 | her first choice, second choice, and so on.
01:13:57.580 | And we'll say that a matching,
01:14:03.460 | a one-to-one matching of the boys with the girls is stable
01:14:07.580 | if there are no two couples in the matching
01:14:15.140 | such that the boy in the first couple
01:14:18.660 | prefers the girl in the second couple to her mate
01:14:23.220 | and she prefers the boy to her current mate.
01:14:27.060 | In other words, if there is, the matching is stable
01:14:31.300 | if there is no pair who want to run away with each other,
01:14:35.460 | leaving their partners behind.
01:14:37.340 | - Gotcha. (laughs)
01:14:41.220 | Yeah.
01:14:44.940 | - Actually, this is relevant to matching residents
01:14:49.100 | with hospitals and some other real-life problems,
01:14:52.300 | although not quite in the form that I described.
01:14:55.580 | So it turns out that there is,
01:14:59.780 | that for any set of preferences, a stable matching exists.
01:15:04.780 | And moreover, it can be computed by a simple algorithm
01:15:14.020 | in which each boy starts making proposals to girls.
01:15:19.020 | And if a girl receives a proposal,
01:15:23.940 | she accepts it tentatively,
01:15:27.780 | but she can drop it later
01:15:32.780 | if she gets a better proposal from her point of view.
01:15:35.740 | And the boys start going down their list,
01:15:39.000 | proposing to their first, second, third choices
01:15:41.980 | until stopping when a proposal is accepted.
01:15:46.980 | But the girls, meanwhile, are watching the proposals
01:15:53.380 | that are coming into them,
01:15:55.380 | and the girl will drop her current partner
01:15:58.700 | if she gets a better proposal.
01:16:03.500 | - And the boys never go back through the list?
01:16:06.300 | - They never go back, yeah.
01:16:07.580 | So once they've been denied.
01:16:09.700 | (both laugh)
01:16:11.740 | - They don't try again?
01:16:12.820 | - They don't try again,
01:16:14.640 | because the girls are always improving their status
01:16:19.220 | as they receive better and better proposals.
01:16:22.780 | The boys are going down their list,
01:16:25.100 | starting with their top preferences.
01:16:28.580 | And one can prove that the process will come to an end
01:16:39.540 | where everybody will get matched with somebody,
01:16:43.460 | and you won't have any pair that want to abscond
01:16:48.460 | from each other.
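The proposal process just described is the Gale-Shapley algorithm. A minimal Python sketch of it (the data representation and names are the editor's, for illustration):

```python
from collections import deque

def stable_matching(boy_prefs, girl_prefs):
    """Gale-Shapley: boys propose down their lists; girls accept
    tentatively and trade up. Returns a stable matching {boy: girl}."""
    # rank[g][b] = position of boy b on girl g's list (lower is better)
    rank = {g: {b: i for i, b in enumerate(prefs)}
            for g, prefs in girl_prefs.items()}
    next_choice = {b: 0 for b in boy_prefs}  # index of next girl to propose to
    engaged_to = {}                          # girl -> boy (tentative)
    free = deque(boy_prefs)
    while free:
        b = free.popleft()
        g = boy_prefs[b][next_choice[b]]
        next_choice[b] += 1
        if g not in engaged_to:
            engaged_to[g] = b
        elif rank[g][b] < rank[g][engaged_to[g]]:
            free.append(engaged_to[g])       # she trades up; her old partner is freed
            engaged_to[g] = b
        else:
            free.append(b)                   # rejected; he moves down his list
    return {b: g for g, b in engaged_to.items()}
```

As in the conversation, the boys never revisit a girl who turned them down, because the girls' situations only improve as proposals arrive.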
01:16:50.420 | - Do you find the proof or the algorithm itself beautiful,
01:16:54.140 | or is it the fact that with the simplicity
01:16:56.700 | of just the two marching,
01:16:59.540 | I mean, the simplicity of the underlying rule
01:17:01.780 | of the algorithm, is that the beautiful part?
01:17:04.820 | - Both, I would say.
01:17:07.540 | And you also have the observation that you might ask,
01:17:11.740 | who is better off, the boys who are doing the proposing
01:17:14.700 | or the girls who are reacting to proposals?
01:17:17.740 | And it turns out that it's the boys who are doing the best,
01:17:22.740 | as each boy is doing at least as well
01:17:25.900 | as he could do in any other stable matching.
01:17:30.520 | So there's a sort of lesson for the boys
01:17:33.220 | that you should go out and be proactive
01:17:36.100 | and make those proposals, go for broke.
01:17:39.680 | - I don't know if this is directly mappable
01:17:43.460 | philosophically to our society,
01:17:45.120 | but certainly seems like a compelling notion.
01:17:48.140 | And like you said, there's probably a lot
01:17:51.340 | of actual real-world problems that this could be mapped to.
01:17:54.580 | - Yeah, well, you get complications.
01:17:58.620 | For example, what happens when a husband and wife
01:18:01.500 | want to be assigned to the same hospital?
01:18:03.740 | So you have to take those constraints into account.
01:18:08.740 | And then the problem becomes NP-hard.
01:18:13.100 | - Why is it a problem for the husband and wife
01:18:18.020 | to be assigned to the same hospital?
01:18:19.980 | - No, it's desirable.
01:18:21.660 | - Desirable.
01:18:22.500 | - Or at least go to the same city.
01:18:24.060 | So you can't, if you're--
01:18:26.540 | - I see.
01:18:27.380 | - If you're assigning residents to hospitals.
01:18:29.580 | - And then you have some preferences
01:18:32.100 | for the husband and wife or for the hospitals.
01:18:34.620 | - The residents have their own preferences.
01:18:37.100 | Preferences. Residents, both male and female,
01:18:41.660 | have their own preferences.
01:18:43.220 | The hospitals have their preferences.
01:18:47.700 | But if resident A, the boy, is going to Philadelphia,
01:18:52.700 | then you'd like his wife also to be assigned to a hospital.
01:19:01.220 | - To be assigned to a hospital in Philadelphia.
01:19:04.420 | - Which step makes it an NP-hard problem that you mentioned?
01:19:08.020 | - The fact that you have this additional constraint.
01:19:11.140 | That it's not just the preferences of individuals,
01:19:15.020 | but the fact that the two partners to a marriage
01:19:19.820 | have to be assigned to the same place.
01:19:22.860 | - I'm being a little dense.
01:19:29.300 | The perfect matching?
01:19:30.780 | No, not the perfect, stable matching is what you refer to.
01:19:33.860 | That's when two partners are trying to--
01:19:36.060 | - Okay, what's confusing you is that in the first
01:19:39.260 | interpretation of the problem,
01:19:40.740 | I had boys matching with girls.
01:19:42.860 | - Yes.
01:19:44.180 | - In the second interpretation,
01:19:46.500 | you have humans matching with institutions.
01:19:49.620 | - With institutions.
01:19:50.460 | And there's a coupling between within the,
01:19:54.340 | gotcha, within the humans.
01:19:56.780 | - Any added little constraint will make it
01:19:59.180 | an NP-hard problem.
01:20:00.380 | - Well, yeah.
01:20:01.500 | - Okay.
01:20:04.220 | By the way, the algorithm you mentioned,
01:20:06.180 | was it one of yours?
01:20:07.780 | - No, no, that was due to Gale and Shapley.
01:20:11.500 | My friend David Gale passed away before he could get part
01:20:16.860 | of the Nobel Prize, but his partner Shapley shared
01:20:21.780 | in the Nobel Prize with somebody else for--
01:20:24.340 | - Economics?
01:20:25.180 | - For economics, for ideas stemming
01:20:29.700 | from the stable matching idea.
01:20:31.540 | - So you've also developed yourself some elegant,
01:20:36.060 | beautiful algorithms.
01:20:37.340 | Again, picking your children, so the Rabin-Karp algorithm
01:20:42.340 | for string searching, pattern matching, the Edmonds-Karp
01:20:45.060 | algorithm for max flows we mentioned,
01:20:46.860 | the Hopcroft-Karp algorithm for finding maximum cardinality
01:20:50.260 | matchings in bipartite graphs.
01:20:52.180 | Is there ones that stand out to you,
01:20:55.460 | ones you're most proud of, or just whether it's beauty,
01:21:00.460 | elegance, or just being the right discovery,
01:21:05.740 | development in your life that you're especially proud of?
01:21:08.740 | - I like the Rabin-Karp algorithm because it illustrates
01:21:13.140 | the power of randomization.
01:21:17.500 | So the problem there is to decide whether a given long string
01:21:22.500 | of symbols from some alphabet contains a given word,
01:21:38.220 | whether a particular word occurs within some very much
01:21:42.940 | longer word, and so the idea of the algorithm
01:21:47.940 | is to associate with the word that we're looking for,
01:21:57.180 | a fingerprint, some number or some combinatorial object
01:22:02.180 | that describes that word, and then to look for an occurrence
01:22:12.260 | of that same fingerprint as you slide along the longer word.
01:22:15.660 | And what we do is we associate with each word a number.
01:22:23.500 | So we, first of all, we think of the letters that occur
01:22:29.660 | in a word as the digits of, let's say, decimal
01:22:33.620 | or whatever base here, whatever number of different symbols
01:22:38.620 | there are in the alphabet.
01:22:42.020 | - That's the base of the numbers, yeah.
01:22:44.340 | - Right, so every word can then be thought of as a number
01:22:48.180 | with the letters being the digits of that number.
01:22:52.380 | And then we pick a random prime number in a certain range
01:22:58.380 | and we take that word viewed as a number
01:23:03.260 | and take the remainder on dividing that number by the prime.
01:23:09.260 | - So coming up with a nice hash function.
01:23:14.260 | - It's a kind of hash function.
01:23:16.620 | - Yeah, it gives you a little shortcut
01:23:20.820 | for that particular word.
01:23:22.420 | - Yeah, so that's the--
01:23:26.380 | - It's very different than other algorithms of its kind
01:23:31.060 | that were trying to do search, string matching.
01:23:35.540 | - Yeah, which usually are combinatorial
01:23:38.060 | and don't involve the idea of taking a random fingerprint.
01:23:42.660 | - Yes.
01:23:43.620 | - And doing the fingerprinting has two advantages.
01:23:48.060 | One is that as we slide along the long word,
01:23:51.580 | digit by digit, we keep a window of a certain size,
01:23:56.580 | the size of the word we're looking for.
01:24:00.660 | And we compute the fingerprint of every stretch
01:24:05.780 | of that length and it turns out that just a couple
01:24:09.500 | of arithmetical operations will take you
01:24:11.900 | from the fingerprint of one part to what you get
01:24:16.380 | when you slide over by one position.
01:24:18.740 | So the computation of all the fingerprints is simple.
01:24:24.540 | And secondly, it's unlikely if the prime is chosen randomly
01:24:32.780 | from a certain range that you will get two of the segments
01:24:37.540 | in question having the same fingerprint.
01:24:40.020 | And so there's a small probability of error
01:24:43.980 | which can be checked after the fact
01:24:46.500 | and also the ease of doing the computation
01:24:48.740 | because you're working with these fingerprints
01:24:51.620 | which are remainders modulo some big prime.
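The fingerprinting and sliding-window idea can be sketched as follows. This is the editor's illustrative version: for simplicity the prime is fixed rather than chosen at random, which is where the Rabin-Karp probabilistic guarantee actually comes from:

```python
def rabin_karp(text, pattern, prime=1_000_000_007, base=256):
    """Find the first occurrence of pattern in text via rolling
    fingerprints (remainders modulo a large prime). Returns the
    starting index, or -1 if the pattern does not occur."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    # A word is read as a base-`base` number; its fingerprint is
    # that number modulo the prime.
    target = window = 0
    for i in range(m):
        target = (target * base + ord(pattern[i])) % prime
        window = (window * base + ord(text[i])) % prime
    high = pow(base, m - 1, prime)  # weight of the digit leaving the window
    for i in range(n - m + 1):
        # Fingerprints can collide, so a match is verified directly.
        if window == target and text[i:i + m] == pattern:
            return i
        if i + m < n:
            # Slide by one position with a couple of arithmetic operations:
            # drop the leading digit, shift, and append the next digit.
            window = ((window - ord(text[i]) * high) * base
                      + ord(text[i + m])) % prime
    return -1
```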
01:24:54.640 | - So that's the magical thing about randomized algorithms
01:24:58.040 | is that if you add a little bit of randomness,
01:25:02.460 | it somehow allows you to take a pretty naive approach,
01:25:05.420 | a simple looking approach
01:25:07.260 | and allow it to run extremely well.
01:25:10.700 | So can you maybe take a step back and say,
01:25:14.140 | like what is a randomized algorithm,
01:25:16.020 | this category of algorithms?
01:25:18.340 | - Well, it's just the ability to draw a random number
01:25:22.460 | from some range or to associate a random number
01:25:30.480 | with some object or to draw at random from some set.
01:25:35.280 | So another example is very simple
01:25:40.280 | if we're conducting a presidential election
01:25:45.360 | and we would like to pick the winner.
01:25:50.400 | In principle, we could draw a random sample
01:25:57.320 | of all of the voters in the country.
01:26:00.160 | And if it was of substantial size, say a few thousand
01:26:05.160 | then the most popular candidate in that group
01:26:08.920 | would be very likely to be the correct choice
01:26:12.280 | that would come out of counting all the millions of votes.
01:26:15.840 | And of course we can't do this
01:26:17.480 | because first of all,
01:26:18.440 | everybody has to feel that his or her vote counted.
01:26:21.920 | And secondly, we can't really do a purely random sample
01:26:25.280 | from that population.
01:26:28.000 | And I guess thirdly, there could be a tie in which case
01:26:31.080 | we wouldn't have a significant difference
01:26:34.080 | between two candidates.
01:26:36.360 | - But those things aside,
01:26:37.560 | if you didn't have all that messiness of human beings,
01:26:40.480 | you could prove that that kind of random picking would--
01:26:43.320 | - You guess that random picking would solve the problem
01:26:48.000 | with a very low probability of error.
01:26:51.360 | Another example is testing whether a number is prime.
01:26:55.520 | So if I want to test whether 17 is prime,
01:27:00.280 | I could pick any number between one and 16
01:27:06.760 | and raise it to the 16th power modulo 17,
01:27:12.320 | and I should get back 1.
01:27:15.040 | That's a famous formula due to Fermat
01:27:18.800 | about it's called Fermat's little theorem
01:27:21.240 | that if you take any number A in the range
01:27:26.240 | one through N minus one
01:27:31.000 | and raise it to the N minus one power modulo N,
01:27:37.080 | you'll get back 1 if N is prime.
01:27:43.280 | So if you don't get back 1,
01:27:45.920 | that's a proof that the number is not prime.
01:27:49.840 | - Wow. (laughs)
01:27:52.360 | - And you can show that, with a suitably defined random choice,
01:27:57.360 | the probability that you will get
01:28:05.280 | a violation of Fermat's result is very high
01:28:14.320 | and so this gives you a way of rapidly proving
01:28:18.600 | that a number is not prime.
01:28:20.040 | It's a little more complicated than that
01:28:22.840 | because there are certain values of N
01:28:26.320 | where something a little more elaborate has to be done
01:28:28.640 | but that's the basic idea.
01:28:30.220 | Taking an identity that holds for primes
01:28:34.920 | and therefore if it ever fails on any instance
01:28:39.500 | for a non-prime, you know that the number is not prime.
01:28:43.480 | It's a quick choice, a fast choice,
01:28:45.720 | fast proof that a number is not prime.
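The test just described rests on Fermat's little theorem: if n is prime, then a^(n-1) ≡ 1 (mod n) for every a in 1..n-1. A hedged Python sketch; as Karp notes, certain composites (Carmichael numbers) need something more elaborate, which in practice is the Miller-Rabin test:

```python
import random

def fermat_probably_prime(n, trials=20):
    """Fermat test: if pow(a, n-1, n) != 1 for some a, then n is
    certainly composite; otherwise n is probably prime. Carmichael
    numbers can fool this test, so it is a sketch, not a full
    primality test."""
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:
            return False  # a is a witness: a fast proof n is not prime
    return True           # no witness found: probably prime
```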
01:28:48.760 | - Can you maybe elaborate a little bit more
01:28:50.920 | of what's your intuition why randomness works so well
01:28:54.200 | and results in such simple algorithms?
01:28:56.460 | - Well, the example of conducting an election
01:29:00.840 | where you could take in theory, you could take a sample
01:29:04.360 | and depend on the validity of the sample
01:29:07.060 | to really represent the whole,
01:29:09.200 | is it just the basic fact of statistics
01:29:12.040 | which gives a lot of opportunities.
01:29:17.760 | And I actually exploited that sort of random sampling idea
01:29:22.760 | in designing an algorithm for counting the number
01:29:28.120 | of solutions that satisfy a particular formula
01:30:33.560 | in propositional logic.
01:29:36.880 | - A particular, so some version
01:29:41.280 | of the satisfiability problem or?
01:29:44.360 | - A version of the satisfiability problem.
01:29:47.480 | - Is there some interesting insight
01:29:49.400 | that you wanna elaborate on?
01:29:50.480 | Like what some aspect of that algorithm
01:29:53.320 | that might be useful to describe?
01:29:57.480 | - So you have a collection of formulas
01:30:02.480 | and you want to count the number of solutions
01:30:14.400 | that satisfy at least one of the formulas.
01:30:18.920 | And you can count the number of solutions
01:30:23.480 | that satisfy any particular one of the formulas,
01:30:27.320 | but you have to account for the fact
01:30:29.920 | that that solution might be counted many times
01:30:33.700 | if it solves more than one of the formulas.
01:30:40.840 | And so what you do is you sample from the formulas
01:30:45.840 | according to the number of solutions
01:30:49.440 | that satisfy each individual one.
01:30:52.300 | And that way you draw a random solution,
01:30:55.640 | but then you correct by looking at the number of formulas
01:31:00.200 | that satisfy that random solution and don't double count.
01:31:08.840 | So you can think of it this way.
01:31:11.600 | So you have a matrix of zeros and ones,
01:31:15.160 | and you wanna know how many columns
01:31:18.160 | of that matrix contain at least one one.
01:31:20.600 | And you can count in each row how many ones there are.
01:31:26.000 | So what you can do is draw from the rows
01:31:29.440 | according to the number of ones.
01:31:31.680 | If a row has more ones, it gets drawn more frequently.
01:31:35.980 | But then if you draw from that row,
01:31:39.060 | you have to go up the column and look at
01:31:42.580 | where that same one is repeated in different rows
01:31:45.940 | and only count it as a success or a hit
01:31:51.260 | if it's the earliest row that contains the one.
01:31:54.840 | - Right.
01:31:55.680 | - And that gives you a robust statistical estimate
01:32:00.300 | of the total number of columns
01:32:02.020 | that contain at least one of the ones.
01:32:04.540 | So that is an example of the same principle
01:32:09.020 | that was used in studying random sampling.
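The row-sampling estimator Karp describes can be sketched directly on a 0/1 matrix. This is the editor's reconstruction of the idea (drawing ones uniformly is equivalent to drawing rows weighted by their number of ones, then a one within the row):

```python
import random

def estimate_covered_columns(matrix, samples=10_000):
    """Estimate how many columns of a 0/1 matrix contain at least
    one 1. Draw a random 1 (rows weighted by their counts of ones)
    and count a hit only when it lies in the earliest row having a 1
    in that column, so no column is double-counted."""
    ones = [(r, c) for r, row in enumerate(matrix)
                   for c, v in enumerate(row) if v]
    total = len(ones)
    hits = 0
    for _ in range(samples):
        r, c = random.choice(ones)  # uniform over ones == row-weighted draw
        first = min(i for i, row in enumerate(matrix) if row[c])
        if r == first:
            hits += 1
    # total * P(hit) is an unbiased estimate of the number of
    # columns containing at least one 1.
    return total * hits / samples
```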
01:32:13.340 | Another viewpoint is that if you have a phenomenon
01:32:18.340 | that occurs almost all the time,
01:33:27.580 | then if you're looking for an occurrence,
01:33:32.620 | a random sample is likely to find one.
01:32:34.820 | So that comes up in solving identities,
01:32:39.420 | solving algebraic identities.
01:32:42.660 | You get two formulas that may look very different.
01:32:46.460 | You wanna know if they're really identical.
01:32:48.960 | What you can do is just pick a random value
01:32:52.860 | and evaluate the formulas at that value
01:32:55.980 | and seeing if they agree.
01:32:58.820 | And you depend on the fact
01:33:01.980 | that if the formulas are distinct,
01:33:04.300 | then they're gonna disagree a lot.
01:33:06.780 | And so therefore a random choice
01:33:08.460 | will exhibit the disagreement.
01:33:10.600 | And if there are many ways for the two to disagree,
01:33:15.200 | and you only need to find one disagreement,
01:33:18.540 | then random choice is likely to yield it.
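The random-evaluation identity test can be sketched for the single-variable case, with the formulas represented as Python callables (names and ranges are the editor's; the underlying fact is that distinct polynomials agree at only a small fraction of random points):

```python
import random

def probably_identical(f, g, trials=30, lo=-10**6, hi=10**6):
    """Randomized identity check: evaluate both formulas at random
    points. A single disagreement is a certificate that they differ;
    repeated agreement is strong evidence that they are the same."""
    for _ in range(trials):
        x = random.randint(lo, hi)
        if f(x) != g(x):
            return False  # found a disagreement: definitely distinct
    return True           # very likely the same formula
```

For example, `(x + 1)**2` and `x**2 + 2*x + 1` look different but agree everywhere, while `(x + 1)**2` and `x**2 + 1` disagree at almost every point.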
01:33:22.500 | - And in general,
01:33:23.540 | so we've just talked about randomized algorithms,
01:33:26.020 | but we can look at the probabilistic analysis
01:33:28.620 | of algorithms.
01:33:29.700 | And that gives us an opportunity to step back.
01:33:32.060 | And as we've said,
01:33:34.260 | everything we've been talking about is worst case analysis.
01:33:37.220 | - Right.
01:33:38.060 | - Could you maybe comment on the usefulness
01:33:43.060 | and the power of worst case analysis
01:33:45.420 | versus best case analysis, average case, probabilistic?
01:33:50.420 | How do we think about the future
01:33:52.760 | of theoretical computer science, computer science,
01:33:56.620 | in the kind of analysis we do of algorithms?
01:33:59.100 | Does worst case analysis still have a place,
01:34:01.620 | an important place,
01:34:02.780 | or do we want to try to move forward
01:34:04.600 | towards kind of average case analysis?
01:34:06.700 | - Yeah.
01:34:07.540 | - And what are the challenges there?
01:34:09.340 | - So if worst case analysis shows
01:34:11.580 | that an algorithm is always good, that's fine.
01:34:16.140 | If worst case analysis is used to show
01:34:23.580 | that the problem, that the solution is not always good,
01:34:28.580 | then you have to step back and do something else to ask,
01:34:32.660 | how often will you get a good solution?
01:34:34.960 | - Just to pause on that for a second,
01:34:38.060 | that's so beautifully put,
01:34:40.180 | because I think we tend to judge algorithms.
01:34:43.300 | We throw them in the trash the moment
01:34:46.340 | their worst case is shown to be bad.
01:34:48.260 | - Right.
01:34:49.100 | And that's unfortunate.
01:34:50.700 | I think a good example is going back
01:34:55.700 | to the satisfiability problem.
01:34:58.860 | There are very powerful programs called SAT solvers,
01:35:04.500 | which in practice fairly reliably solve instances
01:35:09.500 | with many millions of variables that arise
01:35:12.480 | in a digital design or in proving programs correct
01:35:16.180 | in other applications.
01:35:20.220 | And so in many application areas,
01:35:24.460 | even though satisfiability,
01:35:25.980 | as we've already discussed, is NP-complete,
01:35:29.220 | the SAT solvers will work so well
01:35:34.820 | that the people in that discipline tend to think
01:35:37.780 | of satisfiability as an easy problem.
01:35:40.040 | So in other words, just for some reason
01:35:45.220 | that we don't entirely understand,
01:35:47.700 | the instances that people formulate
01:35:50.540 | in designing digital circuits or other applications
01:35:54.380 | are such that satisfiability is not hard to check.
01:35:59.380 | And even searching for a satisfying solution
01:36:07.400 | can be done efficiently in practice.
01:36:10.300 | And there are many examples.
01:36:13.280 | For example, we talked about
01:36:15.620 | the traveling salesman problem.
01:36:17.340 | So just to refresh our memories,
01:36:21.420 | the problem is you've got a set of cities,
01:36:23.580 | you have pairwise distances between cities,
01:36:27.040 | and you want to find a tour through all the cities
01:36:31.460 | that minimizes the total cost of all the edges traversed,
01:36:36.460 | all the trips between cities.
01:36:38.980 | The problem is NP-hard,
01:36:42.040 | but people using integer programming codes
01:36:46.160 | together with some other mathematical tricks
01:36:50.480 | can solve geometric instances of the problem
01:36:56.600 | where the cities are, let's say, points in the plane,
01:36:59.900 | and get optimal solutions to problems
01:37:03.320 | with tens of thousands of cities.
01:37:05.300 | Actually, it'll take a few computer months
01:37:08.460 | to solve a problem of that size,
01:37:10.500 | but for problems of size 1,000 or 2,000,
01:37:13.400 | it'll rapidly get optimal solutions,
01:37:16.520 | provably optimal solutions,
01:37:19.200 | even though, again, we know that it's unlikely
01:37:23.240 | that the traveling salesman problem
01:37:25.960 | can be solved in polynomial time.
01:37:28.640 | - Are there methodologies,
01:37:31.040 | like rigorous systematic methodologies for,
01:37:35.000 | you said in practice.
01:37:38.360 | In practice, this algorithm is pretty good.
01:37:40.060 | Are there systematic ways of saying,
01:37:42.200 | in practice, this one is pretty good?
01:37:43.840 | So in other words, average case analysis,
01:37:46.120 | or you've also mentioned that average case
01:37:49.080 | kind of requires you to understand
01:37:50.920 | what the typical case is, typical instances,
01:37:53.760 | and that might be really difficult.
01:37:55.520 | - That's very difficult.
01:37:56.600 | So after I did my original work
01:37:59.760 | on showing all these problems to be NP-complete,
01:38:06.560 | I looked around for a way to get some,
01:38:09.740 | shed some positive light on combinatorial algorithms,
01:38:13.780 | and what I tried to do was to study
01:38:16.140 | problems' behavior on the average or with high probability,
01:38:23.380 | but I had to make some assumptions
01:38:26.140 | about what's the probability space,
01:38:29.680 | what's the sample space,
01:38:30.820 | what do we mean by typical problems?
01:38:33.820 | That's very hard to say,
01:38:35.260 | so I took the easy way out
01:38:37.680 | and made some very simplistic assumptions.
01:38:40.500 | So I assumed, for example,
01:38:42.000 | that if we were generating a graph
01:38:44.420 | with a certain number of vertices and edges,
01:38:47.440 | then we would generate the graph
01:38:48.920 | by simply choosing one edge at a time at random
01:38:53.840 | until we got the right number of edges.
01:38:56.820 | That's a particular model of random graphs
01:38:59.800 | that has been studied mathematically a lot,
01:39:02.880 | and within that model,
01:39:05.120 | I could prove all kinds of wonderful things,
01:39:07.560 | I and others who also worked on this.
01:39:10.640 | So we could show that we know exactly how many edges
01:39:15.640 | there have to be in order for
01:39:17.760 | there to be a so-called Hamiltonian circuit,
01:39:24.060 | that's a cycle that visits each vertex exactly once.
01:39:31.560 | We know that if the number of edges
01:39:35.240 | is a little bit more than n log n,
01:39:37.520 | where n is the number of vertices,
01:39:39.160 | then such a cycle is very likely to exist,
01:39:44.000 | and we can give a heuristic
01:39:45.680 | that'll find it with high probability.
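The edge-at-a-time random-graph model described here can be sketched in a few lines. This is a hypothetical illustration, not code from the conversation; checking minimum degree is only a cheap necessary condition for a Hamiltonian circuit, not the heuristic Karp mentions.

```python
import itertools
import math
import random

def random_graph(n, m, seed=None):
    """Sample a graph on n labeled vertices in the G(n, m) model
    Karp describes: choose distinct edges uniformly at random
    until exactly m edges have been chosen."""
    rng = random.Random(seed)
    all_edges = list(itertools.combinations(range(n), 2))
    return set(rng.sample(all_edges, m))

def min_degree(n, edges):
    """Minimum vertex degree: a cheap necessary condition, since a
    Hamiltonian circuit requires every vertex to have degree >= 2."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return min(deg)

n = 200
m = int(1.1 * n * math.log(n))  # "a little bit more than n log n"
g = random_graph(n, m, seed=0)
print(min_degree(n, g) >= 2)  # very likely True at this density (probabilistic)
```

At this edge density the classical results say a Hamiltonian cycle exists with high probability, which is exactly the kind of statement that is provable in the model but tied to its simplistic assumptions.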
01:39:47.880 | And the community in which I was working
01:39:53.880 | got a lot of results along these lines,
01:39:58.540 | but the field tended to be rather lukewarm
01:40:03.540 | about accepting these results as meaningful
01:40:07.340 | because we were making such a simplistic assumption
01:40:09.900 | about the kinds of graphs that we would be dealing with.
01:40:13.980 | So we could show all kinds of wonderful things,
01:40:16.020 | it was a great playground, I enjoyed doing it,
01:40:18.900 | but after a while, I concluded that
01:40:27.220 | it didn't have a lot of bite
01:40:29.060 | in terms of the practical application.
01:40:31.320 | - Okay, so it strays too much into the world of toy problems.
01:40:35.300 | - Yeah.
01:40:36.140 | - Okay, but all right, is there a way to find
01:40:41.140 | nice, representative, real-world, impactful instances
01:40:45.100 | of a problem on which to demonstrate that an algorithm is good?
01:40:48.840 | So this is kind of like the machine learning world,
01:40:51.380 | that's kind of what it, at its best, tries to do:
01:40:54.220 | find a dataset from the real world
01:40:57.940 | and show the performance,
01:40:59.380 | and the conferences are all focused
01:41:02.780 | on beating the performance on that real-world dataset.
01:41:07.180 | Is there an equivalent in the complexity analysis?
01:41:10.740 | - Not really.
01:41:12.840 | Don Knuth started to collect examples of graphs
01:41:19.380 | coming from various places,
01:41:21.620 | so he would have a whole zoo of different graphs
01:41:26.140 | that he could choose from
01:41:27.260 | and he could study the performance of algorithms
01:41:30.100 | on different types of graphs.
01:41:31.660 | - But there it's really important and compelling
01:41:37.300 | to be able to define a class of graphs.
01:41:39.980 | The actual act of defining a class of graphs
01:41:44.060 | that you're interested in,
01:41:44.900 | it seems to be a non-trivial step
01:41:47.740 | if we're talking about instances
01:41:49.460 | that we should care about in the real world.
01:41:51.540 | - Yeah, there's nothing available there
01:41:55.860 | that would be analogous to the training set
01:41:58.780 | for supervised learning,
01:42:00.180 | where you sort of assume that the world
01:42:03.820 | has given you a bunch of examples to work with.
01:42:08.820 | We don't really have that for problems,
01:42:14.580 | for combinatorial problems on graphs and networks.
01:42:18.220 | - You know, there's been a huge growth,
01:42:20.980 | a big growth of data sets available.
01:42:23.980 | Do you think some aspect of theoretical computer science,
01:42:28.180 | I might be contradicting my own question while saying it,
01:42:30.620 | but will there be some aspect,
01:42:33.400 | an empirical aspect of theoretical computer science
01:42:36.900 | that, given the fact that these data sets are huge,
01:42:41.100 | will start using them for analysis?
01:42:45.660 | - If you want to say something about a graph algorithm,
01:42:48.980 | you might take a social network like Facebook
01:42:53.860 | and look at subgraphs of that
01:42:55.540 | and prove something about the Facebook graph
01:42:58.740 | and at the same time be respected
01:43:01.860 | in the theoretical computer science community.
01:43:03.820 | - That hasn't been achieved yet, I'm afraid.
01:43:06.260 | - Is that P equals NP, is that impossible?
01:43:10.280 | Is it impossible to publish a successful paper
01:43:14.580 | in the theoretical computer science community
01:43:17.060 | that shows some performance on a real world data set?
01:43:22.060 | Or is that really just those are two different worlds?
01:43:26.340 | - They haven't really come together.
01:43:27.840 | I would say that there is a field
01:43:31.820 | of experimental algorithmics where
01:43:35.300 | sometimes people are given some family of examples.
01:43:39.940 | Sometimes they just generate them at random
01:43:42.620 | and they report on performance,
01:43:44.860 | but there's no convincing evidence
01:43:51.580 | that the sample is representative of anything at all.
01:43:56.700 | - So let me ask in terms of breakthroughs and open problems,
01:44:02.500 | what are the most compelling open problems to you
01:44:05.620 | and what possible breakthroughs do you see in the near term
01:44:08.260 | in terms of theoretical computer science?
01:44:12.900 | - Well, there are all kinds of relationships
01:44:15.460 | among complexity classes that can be studied.
01:44:18.920 | Just to mention one thing,
01:44:21.820 | I wrote a paper with Richard Lipton in 1979
01:44:26.260 | where we asked the following question.
01:44:30.940 | If you take a problem, a combinatorial problem in NP,
01:44:40.620 | let's say,
01:44:45.620 | and you pick the size of the problem,
01:44:51.540 | say it's a traveling salesman problem, but of size 52.
01:44:58.060 | And you ask, could you get an efficient,
01:45:03.460 | small Boolean circuit tailored for that size, 52,
01:45:09.860 | where you could feed the edges of the graph
01:45:12.500 | in as Boolean inputs and get as an output,
01:45:16.660 | the question of whether or not
01:45:17.940 | there's a tour of a certain length.
01:45:20.040 | In other words,
01:45:23.220 | what you would say in that case
01:45:25.900 | is that the problem has small circuits,
01:45:28.660 | polynomial size circuits.
01:45:30.420 | Now we know that if P is equal to NP,
01:45:35.380 | then in fact, these problems will have small circuits,
01:45:39.700 | but what about the converse?
01:45:41.340 | Could a problem have small circuits,
01:45:43.220 | meaning that an algorithm tailored
01:45:45.940 | to any particular size could work well
01:45:48.540 | and yet not be a polynomial time algorithm?
01:45:52.220 | That is, you couldn't write it
01:45:53.420 | as a single uniform algorithm good for all sizes.
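A toy illustration of a circuit tailored to one input size (my own sketch, not from the conversation): on exactly three vertices, a Hamiltonian circuit exists if and only if all three possible edges are present, so the size-3 "circuit" is a single AND gate over the three Boolean edge inputs.

```python
def ham_circuit_size3(e01, e02, e12):
    """A circuit tailored to n = 3: a Hamiltonian circuit on three
    vertices is exactly a triangle, so the whole circuit is one
    AND gate over the three Boolean edge indicators."""
    return e01 and e02 and e12

print(ham_circuit_size3(True, True, True))   # True
print(ham_circuit_size3(True, True, False))  # False
```

For larger n the tailored circuits would of course grow; whether they can stay polynomial in size even if P is not equal to NP is exactly the question being posed.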
01:45:56.940 | - Just to clarify, small circuits
01:45:59.780 | for a problem of particular size
01:46:02.180 | or even further constraint,
01:46:03.980 | small circuit for a particular--
01:46:06.180 | - No, for all the inputs of that size.
01:46:09.180 | - Of that size.
01:46:10.100 | Is that a trivial problem for a particular instance?
01:46:13.620 | So coming up, an automated way of coming up with a circuit,
01:46:17.900 | I guess that's just--
01:46:18.740 | - That would be hard, yeah.
01:46:21.140 | But there's the existential question.
01:46:25.340 | Everybody talks nowadays about existential questions,
01:46:28.940 | existential challenges.
01:46:33.260 | You could ask the question,
01:46:38.860 | does the Hamiltonian circuit problem
01:46:43.860 | have a small circuit for every size?
01:46:48.900 | For each size, a different small circuit?
01:46:51.780 | In other words, could you tailor solutions
01:46:55.620 | depending on the size and get polynomial-size circuits?
01:47:00.620 | - Even if P is not equal to NP.
01:47:02.660 | - Right.
01:47:03.500 | - That would be fascinating if that's true.
01:47:08.660 | - Yeah, what we proved is that if that were possible,
01:47:13.660 | then something strange would happen in complexity theory.
01:47:18.140 | Some high-level class, which I could briefly describe,
01:47:23.980 | something strange would happen.
01:47:28.420 | So I'll take a stab at describing what I mean by that.
01:47:33.100 | - Let's go there.
01:47:33.980 | - So we have to define this hierarchy
01:47:37.740 | in which the first level of the hierarchy is P
01:47:41.500 | and the second level is NP.
01:47:44.140 | And what is NP?
01:47:45.340 | NP involves statements of the form,
01:47:48.300 | there exists a something such that something holds.
01:47:52.380 | So for example,
01:47:56.740 | there exists a coloring such that a graph can be colored
01:48:01.940 | with only that number of colors.
01:48:06.700 | Or there exists a Hamiltonian circuit.
01:48:09.180 | - There's a statement about this graph.
01:48:10.860 | - Yeah.
01:48:11.700 | And NP deals with statements of that kind,
01:48:22.940 | that there exists a solution.
01:48:24.660 | Now you could imagine a more complicated expression,
01:48:32.700 | which says for all X, there exists a Y
01:48:37.700 | such that some proposition holds involving both X and Y.
01:48:43.700 | So that would say, for example, in game theory,
01:48:50.140 | for all strategies for the first player,
01:48:55.020 | there exists a strategy for the second player
01:48:57.820 | such that the first player wins.
01:48:59.660 | That would be at the second level of the hierarchy.
01:49:03.460 | The third level would be, there exists an A
01:49:06.100 | such that for all B, there exists a C such that something holds.
01:49:09.340 | And you can imagine going higher and higher in the hierarchy
01:49:12.820 | and you'd expect that the complexity classes
01:49:17.620 | that correspond to those different cases
01:49:22.540 | would get bigger and bigger.
01:49:25.380 | - What do you mean by bigger and bigger?
01:49:28.940 | - Sorry, they'd get harder and harder to solve.
01:49:31.220 | - Harder and harder, right.
01:49:32.060 | - Harder and harder to solve.
01:49:34.180 | And what Lipton and I showed was that
01:49:38.100 | if NP had small circuits,
01:49:41.700 | then this hierarchy would collapse down to the second level.
01:49:45.500 | In other words, you wouldn't get any more mileage
01:49:48.500 | by complicating your expressions with three quantifiers
01:49:51.660 | or four quantifiers or any number.
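In standard notation, the levels of this hierarchy (the polynomial hierarchy) and the result being described, now known as the Karp-Lipton theorem, can be written as:

```latex
\Sigma_0^p = \mathrm{P}, \qquad
\Sigma_{k+1}^p = \mathrm{NP}^{\Sigma_k^p}, \qquad
\mathrm{PH} = \bigcup_{k \ge 0} \Sigma_k^p,
```

```latex
\mathrm{NP} \subseteq \mathrm{P/poly}
\;\Longrightarrow\;
\mathrm{PH} = \Sigma_2^p .
```

Here P/poly is the class of problems with polynomial-size circuits, one circuit per input size, and the conclusion is the collapse to the second level that Karp describes.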
01:49:53.620 | - I'm not sure what to make of that exactly.
01:49:57.860 | - Well, I think it would be evidence
01:49:59.420 | that NP doesn't have small circuits
01:50:03.060 | 'cause something so bizarre would happen.
01:50:07.220 | But again, it's only evidence, not proof.
01:50:09.140 | - Well, yeah, that's not even evidence
01:50:12.660 | because you're saying P's not equal to NP
01:50:17.100 | because something bizarre has to happen.
01:50:19.700 | I mean, that's proof by the lack of bizarreness
01:50:25.260 | in our science, but it seems like
01:50:27.980 | just the very notion of P equals NP would be bizarre.
01:50:33.200 | So any way you arrive at, there's no way.
01:50:36.340 | You have to fight the dragon at some point.
01:50:38.580 | - Yeah, okay, well, anyway.
01:50:40.880 | For whatever it's worth, that's what we proved.
01:50:43.420 | - Awesome, so that's a potential space
01:50:47.660 | of open, interesting problems.
01:50:49.500 | Let me ask you about this other world
01:50:54.780 | of machine learning, of deep learning.
01:50:57.380 | What's your thoughts on the history
01:50:59.780 | and the current progress of machine learning field
01:51:02.740 | that's often progressed sort of separately
01:51:05.860 | as a space of ideas and space of people
01:51:08.940 | than the theoretical computer science
01:51:11.000 | or just even computer science world?
01:51:12.740 | - Yeah, it's really very different
01:51:15.780 | from the theoretical computer science world
01:51:17.860 | because the results about it,
01:51:22.380 | algorithmic performance tend to be empirical.
01:51:25.500 | It's more akin to the world of SAT solvers
01:51:28.980 | where we observe that for formulas arising in practice,
01:51:33.980 | the solver does well.
01:51:36.020 | So it's of that type.
01:51:37.780 | It's where we're moving into the empirical
01:51:42.460 | evaluation of algorithms.
01:51:45.420 | Now, it's clear that there've been huge successes
01:51:47.900 | in image processing, robotics,
01:51:52.860 | natural language processing, a little less so,
01:51:55.500 | but across the spectrum of game playing is another one.
01:52:00.500 | There've been great successes.
01:52:02.780 | And one of those effects is that it's not too hard
01:52:09.220 | to become a millionaire if you can get a reputation
01:52:11.860 | in machine learning and there'll be all kinds
01:52:13.700 | of companies that will be willing to offer you the moon
01:52:16.700 | because they think that if they have AI at their disposal,
01:52:21.700 | then they can solve all kinds of problems.
01:52:25.920 | But there are limitations.
01:52:30.640 | One is that the solutions that you get
01:52:38.140 | to supervised learning problems
01:52:46.100 | through convolutional neural networks
01:52:50.380 | seem to perform amazingly well,
01:52:56.220 | even for inputs that are outside the training set.
01:52:59.220 | But we don't have any theoretical understanding
01:53:06.340 | of why that's true.
01:53:07.580 | Secondly, the solutions, the networks that you get
01:53:14.620 | are very hard to understand.
01:53:16.620 | And so very little insight comes out.
01:53:18.900 | So yeah, yeah, they may seem to work on your training set
01:53:24.020 | and you may be able to discover whether your photos occur
01:53:29.020 | in a different sample of inputs or not,
01:53:34.020 | but we don't really know what's going on.
01:53:37.560 | The features that distinguish the photographs
01:53:41.700 | or the objects are not easy to characterize.
01:53:46.700 | - Well, it's interesting 'cause you mentioned
01:53:51.180 | coming up with a small circuit
01:53:53.820 | to solve a particular size problem.
01:53:56.420 | It seems that neural networks are kind of small circuits.
01:53:59.940 | - In a way, yeah.
01:54:01.420 | - But they're not programs.
01:54:02.860 | Sort of like the things you've designed are algorithms,
01:54:05.820 | programs.
01:54:08.980 | Neural networks aren't able to develop algorithms
01:54:12.620 | to solve a problem.
01:54:13.820 | It's more of a function.
01:54:15.860 | - They are algorithms, it's just that they're--
01:54:18.460 | - But sort of, yeah, it could be a semantic question,
01:54:25.220 | but there's not an algorithmic-style
01:54:30.020 | manipulation of the input.
01:54:32.060 | Perhaps you could argue there is.
01:54:35.300 | - Yeah, well--
01:54:37.120 | - It feels a lot more like a function of the input.
01:54:40.520 | - It's a function, it's a computable function.
01:54:43.220 | Once you have the network, you can simulate it
01:54:48.400 | on a given input and figure out the output.
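Simulating a network on a given input, as described, is just straightforward function evaluation. Here is a minimal sketch with made-up placeholder weights, not any network from the discussion:

```python
import math

def forward(x, layers):
    """Evaluate a small feed-forward network: each layer is a
    (weights, biases) pair; apply the affine map, then tanh."""
    for weights, biases in layers:
        x = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

# Placeholder weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.3], [0.8, 0.1]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
print(forward([1.0, 2.0], layers))
```

The output is computable, but as Karp points out, nothing in this evaluation exposes which features of the input actually drive the answer.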
01:54:51.280 | But if you're trying to recognize images,
01:54:56.280 | then you don't know what features of the image
01:55:00.860 | are really determinant
01:55:06.580 | of what the circuit is doing.
01:55:09.420 | The circuit is sort of very intricate,
01:55:13.980 | and it's not clear that the simple characteristics
01:55:18.980 | that you're looking for, the edges of the objects
01:55:23.940 | or whatever they may be, they're not emerging
01:55:27.140 | from the structure of the circuit.
01:55:29.580 | - Well, it's not clear to us humans,
01:55:31.060 | but it's clear to the circuit.
01:55:33.000 | - Yeah, well, right.
01:55:34.860 | I mean, it's not clear to sort of the elephant
01:55:39.860 | how the human brain works, but it's clear to us humans,
01:55:46.860 | we can explain to each other our reasoning,
01:55:49.180 | and that's why the cognitive science
01:55:50.740 | and psychology field exists.
01:55:52.740 | Maybe the whole thing of being explainable to humans
01:55:56.280 | is a little bit overrated.
01:55:57.700 | - Oh, maybe, yeah.
01:55:59.740 | I guess you can say the same thing about our brain,
01:56:02.460 | that when we perform acts of cognition,
01:56:06.140 | we have no idea how we do it, really.
01:56:07.940 | - That's true.
01:56:08.780 | - We do, though.
01:56:09.620 | I mean, we, at least for the visual system,
01:56:13.700 | the auditory system, and so on,
01:56:15.220 | we do get some understanding of the principles
01:56:19.260 | that they operate under, but for many deeper
01:56:23.500 | cognitive tasks, we don't have that.
01:56:25.620 | - That's right.
01:56:26.660 | Let me ask.
01:56:27.580 | - Yeah.
01:56:28.580 | - You've also been doing work on bioinformatics.
01:56:33.020 | Does it amaze you that the fundamental building blocks,
01:56:37.020 | so if we take a step back and look at us humans,
01:56:39.860 | the building blocks used by evolution
01:56:41.820 | to build us intelligent human beings
01:56:44.660 | is all contained there in our DNA?
01:56:47.200 | - It's amazing, and what's really amazing
01:56:51.900 | is that we are beginning to learn how to edit,
01:56:57.420 | how to edit DNA, which is very, very fascinating,
01:57:02.420 | this ability to take a sequence,
01:57:10.580 | find it in the genome, and do something to it.
01:57:18.980 | - I mean, that's really taking our biological system
01:57:21.260 | towards the worlds of algorithms.
01:57:24.500 | - Yeah, but it raises a lot of questions.
01:57:27.140 | You have to distinguish between doing it on an individual
01:57:33.900 | or doing it on somebody's germline,
01:57:35.740 | which means that all of their descendants will be affected.
01:57:38.840 | - So that's like an ethical...
01:57:42.180 | - Yeah, so it raises very severe ethical questions,
01:57:46.060 | and even doing it on individuals.
01:57:53.760 | There's a lot of hubris involved
01:57:58.760 | in assuming that knocking out a particular gene
01:58:04.200 | is going to be beneficial because you don't know
01:58:06.180 | what the side effects are going to be.
01:58:08.120 | So we have this wonderful new world of gene editing,
01:58:20.240 | which is very, very impressive,
01:58:23.240 | and it could be used in agriculture,
01:58:27.320 | it could be used in medicine in various ways,
01:58:31.340 | but very serious ethical problems arise.
01:58:35.560 | - What are, to you, the most interesting places for algorithms here?
01:58:39.920 | Sort of, the ethical side
01:58:44.600 | is an exceptionally challenging thing
01:58:46.060 | that I think we're going to have to tackle
01:58:48.080 | with all of genetic engineering.
01:58:51.840 | But on the algorithmic side,
01:58:53.760 | there's a lot of benefit that's possible.
01:58:55.560 | So are there areas where you see exciting possibilities
01:59:00.280 | for algorithms to help model,
01:59:02.640 | optimize, study biological systems?
01:59:05.120 | - Yeah, I mean, we can certainly analyze genomic data
01:59:11.720 | to figure out which genes are operative in the cell
01:59:17.480 | and under what conditions
01:59:18.800 | and which proteins affect one another,
01:59:21.300 | which proteins physically interact.
01:59:26.100 | We can sequence proteins and modify them.
01:59:30.660 | - Is there some aspect of that
01:59:33.840 | that's a computer science problem,
01:59:35.920 | or is that still fundamentally a biology problem?
01:59:39.840 | - Well, it's a big data,
01:59:41.600 | it's a statistical big data problem for sure.
01:59:45.700 | So, you know, the biological data sets
01:59:48.480 | are increasing our ability to study our ancestry,
01:59:53.480 | to study the tendencies towards disease,
01:59:59.940 | to personalize treatment according to what's in our genomes
02:00:04.940 | and what tendencies for disease we have,
02:00:09.320 | to be able to predict what troubles might come upon us
02:00:13.920 | in the future and anticipate them,
02:00:16.100 | to understand whether you,
02:00:21.100 | for a woman, whether her proclivity for breast cancer
02:00:29.340 | is strong enough that she would want to take action
02:00:34.840 | to avoid it.
02:00:36.640 | - You dedicate your 1985 Turing Award lecture
02:00:41.060 | to the memory of your father.
02:00:42.840 | - Mm-hmm.
02:00:43.680 | - What's your fondest memory of your dad?
02:00:47.140 | - Seeing him standing in front of a class
02:00:56.160 | at the blackboard, drawing perfect circles by hand,
02:01:00.840 | and showing his ability to attract the interest
02:01:11.420 | of the motley collection of eighth grade students
02:01:14.760 | that he was teaching.
02:01:15.820 | - When did you get a chance to see him
02:01:21.640 | draw the perfect circles?
02:01:23.080 | - On rare occasions, I would get a chance
02:01:27.480 | to sneak into his classroom and observe it.
02:01:32.480 | And I think he was at his best in the classroom.
02:01:36.320 | I think he really came to life
02:01:40.820 | and had fun not only teaching,
02:01:43.880 | but engaging in chit-chat with the students
02:01:48.880 | and ingratiating himself with the students.
02:01:53.800 | And what I inherited from that
02:01:57.120 | is a great desire to be a teacher.
02:02:01.920 | I retired recently and a lot of my former students came,
02:02:08.360 | students with whom I had done research
02:02:11.240 | or who had read my papers or who had been in my classes.
02:02:14.800 | And when they talked about me,
02:02:19.860 | they talked not about my 1979 paper or my 1992 paper,
02:02:27.560 | but about what they came away with from my classes.
02:02:33.600 | And not just the details, but just the approach
02:02:36.400 | and the manner of teaching.
02:02:39.400 | And so I sort of take pride in the fact that,
02:02:43.600 | at least in my early years as a faculty member at Berkeley,
02:02:47.680 | I was exemplary in preparing my lectures
02:02:51.760 | and I always came in prepared to the teeth
02:02:56.760 | and able therefore to deviate
02:02:58.960 | according to what happened in the class
02:03:01.240 | and to really provide a model
02:03:06.840 | for the students.
02:03:08.760 | - So is there advice you can give
02:03:12.120 | to others on how to be a good teacher?
02:03:16.480 | So preparation is one thing you've mentioned,
02:03:19.000 | being exceptionally well prepared,
02:03:20.440 | but are there other things,
02:03:21.520 | pieces of advice that you can impart?
02:03:24.460 | - Well, the top three would be preparation,
02:03:26.600 | preparation, and preparation.
02:03:28.160 | (laughing)
02:03:29.720 | - Why is preparation so important, I guess?
02:03:31.920 | - It's because it gives you the ease
02:03:34.400 | to deal with any situation that comes up in the classroom.
02:03:38.680 | And if you discover that you're not getting through
02:03:43.680 | one way, you can do it another way.
02:03:45.840 | If the students have questions,
02:03:47.280 | you can handle the questions.
02:03:48.960 | - Ultimately, you're also reading the crowd,
02:03:53.960 | the students: what they're struggling with,
02:03:57.080 | what they're picking up,
02:03:57.940 | just looking at them through the questions,
02:03:59.740 | but even just through their eyes.
02:04:01.440 | - Yeah, that's right.
02:04:02.280 | - So the preparation, you can dance.
02:04:05.440 | - You can dance, you can say it another way,
02:04:09.880 | give it another angle.
02:04:11.680 | - Are there, in particular, ideas and algorithms
02:04:14.840 | of computer science that you find
02:04:17.120 | were big aha moments for students,
02:04:19.960 | where they, for some reason, once they got it,
02:04:22.800 | it clicked for them and they fell in love
02:04:24.760 | with computer science?
02:04:26.680 | Or is it individual, is it different for everybody?
02:04:29.360 | - It's different for everybody.
02:04:30.880 | You have to work differently with students.
02:04:33.080 | Some of them just don't need much influence.
02:04:38.080 | They're just running with what they're doing
02:04:42.400 | and they just need an ear now and then.
02:04:44.880 | Others need a little prodding.
02:04:47.280 | Others need to be persuaded to collaborate
02:04:49.960 | among themselves rather than working alone.
02:04:52.500 | They have their personal ups and downs.
02:04:57.200 | So you have to deal with each student as a human being
02:05:02.200 | and bring out the best.
02:05:06.640 | - Humans are complicated.
02:05:08.160 | - Yeah.
02:05:09.240 | - Perhaps a silly question.
02:05:11.240 | If you could relive a moment in your life outside of family
02:05:15.400 | because it made you truly happy,
02:05:17.440 | or perhaps because it changed the direction of your life
02:05:19.920 | in a profound way, what moment would you pick?
02:05:24.640 | - I was kind of a lazy student as an undergraduate
02:05:28.280 | and even in my first year in graduate school.
02:05:32.500 | And I think it was when I started doing research.
02:05:37.360 | I had a couple of summer jobs where I was able to contribute
02:05:41.560 | and I had an idea.
02:05:45.080 | And then there was one particular course
02:05:47.800 | on mathematical methods and operations research
02:05:50.560 | where I just gobbled up the material
02:05:53.640 | and I scored 20 points higher than anybody else in the class
02:05:57.600 | and then came to the attention of the faculty.
02:06:00.720 | And it made me realize that I had some ability
02:06:04.640 | that was going somewhere.
02:06:06.160 | - You realize you're pretty good at this thing.
02:06:11.560 | I don't think there's a better way to end it, Richard.
02:06:14.360 | It was a huge honor.
02:06:15.280 | Thank you for decades of incredible work.
02:06:18.280 | Thank you for talking to me.
02:06:19.120 | - Thank you, it's been a great pleasure.
02:06:21.120 | You're a superb interviewer.
02:06:23.760 | (laughing)
02:06:25.040 | - I'll stop it.
02:06:25.880 | Thanks for listening to this conversation with Richard Karp
02:06:29.640 | and thank you to our sponsors, Eight Sleep and Cash App.
02:06:34.160 | Please consider supporting this podcast
02:06:35.920 | by going to eightsleep.com/lex
02:06:39.160 | to check out their awesome mattress
02:06:41.960 | and downloading Cash App and using code LEXPODCAST.
02:06:46.080 | Click the links, buy the stuff, even just visiting the site
02:06:49.840 | but also considering the purchase helps them know
02:06:52.440 | that this podcast is worth supporting in the future.
02:06:55.880 | It really is the best way to support this journey I'm on.
02:06:59.680 | If you enjoy this thing, subscribe on YouTube,
02:07:02.080 | review it with Five Stars and Apple Podcast,
02:07:04.120 | support it on Patreon, connect with me on Twitter
02:07:07.360 | at Lex Friedman if you can figure out how to spell that.
02:07:11.320 | And now let me leave you with some words from Isaac Asimov.
02:07:16.000 | I do not fear computers.
02:07:18.200 | I fear lack of them.
02:07:19.820 | Thank you for listening and hope to see you next time.
02:07:23.720 | (upbeat music)