
Judea Pearl: Correlation and Causation | AI Podcast Clips


Chapters

0:00 What is correlation
1:10 What is conditional probability
2:30 Causation vs correlation
4:30 Research Question
6:00 Daniel's Experiment

Whisper Transcript

00:00:00.000 | - What is correlation?
00:00:02.940 | What is it, so probability of something happening
00:00:06.760 | is something, but then there's a bunch of things happening.
00:00:10.080 | And sometimes they happen together, sometimes not.
00:00:13.140 | They're independent or not.
00:00:14.380 | So how do you think about correlation of things?
00:00:17.260 | - Correlation occurs when two things vary together
00:00:19.900 | over a very long time.
00:00:21.460 | That's one way of measuring it.
00:00:23.340 | Or when you have a bunch of variables
00:00:25.460 | that all vary cohesively.
00:00:29.500 | Then we say we have a correlation here.
00:00:32.220 | And usually when we think about correlation,
00:00:35.380 | we really think causally.
00:00:38.060 | Things cannot be correlated unless there is a reason
00:00:41.580 | for them to vary together.
00:00:43.940 | Why should they vary together?
00:00:45.700 | If they don't see each other,
00:00:47.020 | why should they vary together?
00:00:49.220 | - So underlying it somewhere is causation.
00:00:51.900 | - Yes.
00:00:52.860 | Hidden in our intuition, there is a notion of causation
00:00:56.820 | because we cannot grasp any other logic except causation.
00:01:01.820 | - And how does conditional probability differ
00:01:06.740 | from causation?
00:01:08.140 | So what is conditional probability?
00:01:11.540 | - Conditional probability is how things vary
00:01:14.180 | when one of them stays the same.
00:01:18.640 | Now staying the same means that I have chosen
00:01:22.900 | to look only at those incidents
00:01:25.300 | where the guy has the same value as the previous one.
00:01:29.740 | It's my choice as an experimenter.
00:01:32.900 | So things that are not correlated before
00:01:35.860 | could become correlated.
00:01:37.860 | Like for instance, if I have two coins
00:01:40.460 | which are uncorrelated, okay,
00:01:42.820 | and I choose only those flipping experiments
00:01:47.360 | in which a bell rings, and the bell rings
00:01:50.300 | when at least one of them is a tail, okay,
00:01:54.380 | then suddenly I see correlation between the two coins
00:01:57.980 | because I only look at the cases where the bell rang.
00:02:02.020 | You see, it's my design, with my ignorance essentially,
00:02:07.260 | with my audacity to ignore certain incidents,
00:02:12.260 | I suddenly create a correlation
00:02:18.340 | where it doesn't exist physically.
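
A minimal sketch (not from the episode) of the two-coin example Pearl describes: the coins are flipped independently, and a hypothetical bell rings whenever at least one lands tails. Restricting attention to the flips where the bell rang induces a correlation between two physically independent coins.

```python
# Sketch of the two-coin example above: the coins are independent,
# but selecting only the flips where the bell rang (at least one tail)
# makes them look correlated.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
coin1 = rng.integers(0, 2, n)        # 1 = tails, 0 = heads
coin2 = rng.integers(0, 2, n)
bell = (coin1 == 1) | (coin2 == 1)   # the bell rings if at least one is a tail

print("corr over all flips:     ", np.corrcoef(coin1, coin2)[0, 1])                # ~0
print("corr where the bell rang:", np.corrcoef(coin1[bell], coin2[bell])[0, 1])    # ~-0.5
```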
00:02:20.360 | - Right, so that's, you just outlined
00:02:23.940 | one of the flaws of observing the world
00:02:26.620 | and trying to infer something from the math about the world
00:02:29.700 | from looking at the correlation.
00:02:31.100 | - I don't look at it as a flaw, the world works like that.
00:02:34.020 | But the flaw comes if we try to impose
00:02:38.580 | causal logic on correlation, it doesn't work too well.
00:02:46.540 | - I mean, but that's exactly what we do.
00:02:49.940 | That's what, that has been the majority of science,
00:02:53.620 | is you-- - The majority of naive science.
00:02:56.300 | Statisticians know it, statisticians know
00:02:59.740 | that if you condition on a third variable,
00:03:03.100 | then you can destroy or create correlations
00:03:07.380 | among two other variables.
00:03:09.260 | They know it, it's in the data.
00:03:10.980 | It's nothing surprising, that's why they all dismiss
00:03:14.740 | Simpson's paradox, ah, we know it.
00:03:17.740 | They don't know anything about it.
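
A companion sketch, with invented variables, of the point about conditioning on a third variable: here Z drives both X and Y, so they are correlated in the pooled data, but the correlation disappears once Z is held fixed. The reverse construction (as in the coin example above) can just as easily create a correlation.

```python
# Illustrative only: Z drives both X and Y, so X and Y are correlated
# marginally, but conditioning on (stratifying by) Z destroys the correlation.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.integers(0, 2, n)             # a binary third variable
x = z + rng.normal(0, 0.5, n)         # X depends only on Z plus noise
y = z + rng.normal(0, 0.5, n)         # Y depends only on Z plus noise

print("corr(X, Y), pooled:   ", np.corrcoef(x, y)[0, 1])                   # ~0.5
print("corr(X, Y) given Z=0: ", np.corrcoef(x[z == 0], y[z == 0])[0, 1])   # ~0
print("corr(X, Y) given Z=1: ", np.corrcoef(x[z == 1], y[z == 1])[0, 1])   # ~0
```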
00:03:19.440 | - Well, there's disciplines like psychology
00:03:23.300 | where all the variables are hard to account for.
00:03:26.500 | And so, oftentimes, there's a leap
00:03:28.820 | between correlation to causation.
00:03:31.140 | You're imposing-- - What do you mean, a leap?
00:03:33.620 | Who is trying to get causation from correlation?
00:03:38.100 | No one. - Not, you're not proving
00:03:40.940 | causation, but you're sort of discussing it,
00:03:45.300 | implying, sort of hypothesizing with our ability to--
00:03:48.940 | - Which discipline you have in mind?
00:03:50.660 | I'll tell you if they are obsolete,
00:03:54.020 | or if they are outdated, or they're about to get outdated.
00:03:57.820 | - Yes, yes. - Tell me which one
00:03:59.860 | you have in mind. - Oh, psychology.
00:04:01.780 | - Psychology, what, is it SEM, Structural Equations?
00:04:04.380 | - No, no, I was thinking of applied psychology studying,
00:04:07.900 | for example, we work with human behavior
00:04:10.820 | in semi-autonomous vehicles, how people behave.
00:04:13.980 | And you have to conduct these studies
00:04:16.180 | of people driving cars. - Everything starts
00:04:18.380 | with a question, what is the research question?
00:04:21.420 | - What is the research question?
00:04:23.060 | The research question, do people fall asleep
00:04:27.940 | when the car is driving itself?
00:04:31.180 | - Do they fall asleep, or do they tend to fall asleep
00:04:35.860 | more frequently-- - More frequently.
00:04:37.420 | - Than with a car not driving itself?
00:04:39.420 | - Not driving itself. - That's a good question, okay.
00:04:42.340 | - And so, you measure, you put people in the car,
00:04:46.100 | because it's real world, you can't conduct
00:04:48.140 | an experiment where you control everything.
00:04:49.900 | - Why can't you-- - You could.
00:04:52.020 | - Turn the automatic module on and off?
00:04:57.020 | - Because it's on-road public, I mean,
00:05:01.740 | there are aspects to it that are unethical,
00:05:06.300 | because it's testing on public roads.
00:05:08.540 | So you can only use vehicles where the drivers
00:05:11.180 | themselves have to make that choice,
00:05:14.380 | and so they regulate that.
00:05:18.020 | So you just observe when they drive it autonomously
00:05:22.620 | and when they don't.
00:05:23.940 | - But maybe they turn it off when they're very tired.
00:05:26.740 | - Yeah, that kind of thing, but you don't know those.
00:05:30.260 | - So you have now uncontrolled experiment.
00:05:32.980 | - Uncontrolled experiment.
00:05:34.300 | - We call it observational study,
00:05:36.820 | and from the correlation detected,
00:05:40.780 | we have to infer a causal relationship,
00:05:43.980 | whether it was the automatic piece
00:05:47.060 | that caused them to fall asleep, or,
00:05:49.580 | so that is an issue that is about 120 years old.
00:05:54.580 | I should say only 100 years old, okay?
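
A hedged simulation of the observational-study problem just described, with made-up numbers: tiredness is a confounder that makes drivers both switch the autopilot off and fall asleep more often, so the raw association between autopilot use and falling asleep can even carry the wrong sign relative to the effect recovered by stratifying on tiredness.

```python
# Invented effect sizes, for illustration only: tiredness confounds the
# relationship between autopilot use and falling asleep.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
tired = rng.random(n) < 0.3                             # 30% of drives start tired
autopilot = rng.random(n) < np.where(tired, 0.2, 0.7)   # tired drivers switch it off
p_sleep = 0.05 + 0.30 * tired + 0.10 * autopilot        # true effect of autopilot: +0.10
asleep = rng.random(n) < p_sleep

naive = asleep[autopilot].mean() - asleep[~autopilot].mean()
adjusted = sum(                                         # back-door adjustment over tiredness
    (tired == t).mean() *
    (asleep[autopilot & (tired == t)].mean() - asleep[~autopilot & (tired == t)].mean())
    for t in (True, False)
)
print("naive difference:   ", round(naive, 3))     # comes out near zero or negative
print("adjusted difference:", round(adjusted, 3))  # recovers ~ +0.10
```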
00:06:02.240 | Oh, maybe it's not, actually I should say it's 2000 years old
00:06:08.820 | because we have this experiment by Daniel,
00:06:12.100 | about the Babylonian king who wanted the exiles,
00:06:17.100 | the people from Israel that were taken in exile
00:06:26.020 | to Babylon, to serve the king.
00:06:28.300 | He wanted to serve them the king's food,
00:06:30.860 | which was meat, and Daniel, as a good Jew,
00:06:33.980 | couldn't eat non-kosher food,
00:06:36.380 | so he asked them to eat vegetarian food,
00:06:40.260 | but the king's overseer said, "I'm sorry,
00:06:42.860 | "but if the king sees that your performance
00:06:46.600 | "falls below that of other kids,
00:06:51.020 | "he's going to kill me."
00:06:52.980 | Daniel said, "Let's make an experiment.
00:06:55.140 | "Let's take four of us from Jerusalem,
00:06:57.900 | "give us vegetarian food,
00:06:59.960 | "let's take the other guys to eat the king's food,
00:07:03.820 | "and in about a week's time, we'll test our performance."
00:07:07.700 | And you know the answer,
00:07:09.060 | of course he did the experiment,
00:07:11.420 | and they were so much better than the others,
00:07:15.740 | and the king nominated them to a superior position in his court.
00:07:20.740 | So it was the first experiment, yes.
00:07:23.740 | So it was a very simple experiment,
00:07:26.340 | and it's also the same research question.
00:07:29.120 | We want to know if vegetarian food
00:07:31.020 | assists or obstructs your mental ability.
00:07:37.100 | And okay, so the question is very old.
00:07:42.100 | Even Democritus said,
00:07:46.560 | if I could discover one cause of things,
00:07:52.860 | I would rather discover one cause
00:07:55.060 | than be a king of Persia.
00:07:56.480 | The task of discovering causes
00:08:01.980 | was in the mind of ancient people
00:08:04.500 | from many, many years ago.
00:08:07.100 | But the mathematics of doing that
00:08:10.980 | was only developed in the 1920s.
00:08:14.080 | So science has left us orphaned.
00:08:17.320 | Science has not provided us with the mathematics
00:08:21.920 | to capture the idea of X causes Y,
00:08:25.600 | and Y does not cause X.
00:08:27.940 | Because all the questions of physics
00:08:30.140 | are symmetrical, algebraic.
00:08:32.220 | The equality sign goes both ways.
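
A minimal sketch, with invented coefficients, of the asymmetry Pearl is pointing at: in a structural causal model the equation for Y is an assignment, not a symmetric algebraic equality, so intervening on X moves Y while intervening on Y leaves X alone.

```python
# Structural model X -> Y with Y := 2*X + noise. The assignment is
# directional: do(X=1) changes Y, but do(Y=1) does not change X.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

def sample(do_x=None, do_y=None):
    """Draw from the model, optionally forcing X or Y by intervention."""
    x = rng.normal(0, 1, n) if do_x is None else np.full(n, do_x)
    y = 2 * x + rng.normal(0, 1, n) if do_y is None else np.full(n, do_y)
    return x, y

x0, y0 = sample()
_, y_dox = sample(do_x=1.0)     # intervene on X: Y responds
x_doy, _ = sample(do_y=1.0)     # intervene on Y: X does not respond
print("E[Y] baseline:", round(y0.mean(), 2), " E[Y | do(X=1)]:", round(y_dox.mean(), 2))  # ~0 vs ~2
print("E[X] baseline:", round(x0.mean(), 2), " E[X | do(Y=1)]:", round(x_doy.mean(), 2))  # ~0 vs ~0
```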