David Ferrucci: IBM Watson, Jeopardy & Deep Conversations with AI | Lex Fridman Podcast #44
Chapters
0:01 David Ferrucci
1:16 Difference between Biological Systems and Computer Systems
8:06 What Is Intelligence
36:55 Fundamental Pattern Matching
39:02 Time Travel
The following is a conversation with David Ferrucci. 00:00:14.960 |
not only for abstract understanding of intelligence, 00:00:17.740 |
but for engineering it to solve real world problems 00:00:21.240 |
under real world deadlines and resource constraints. 00:00:26.520 |
is where brilliant, simple ingenuity emerges. 00:01:01.400 |
And now, here's my conversation with David Ferrucci. 00:01:11.320 |
before you went on for the PhD in computer science. 00:01:16.840 |
What is the difference between biological systems 00:01:50.580 |
to understand, to process information the way we do, 00:01:57.960 |
even if that process were the intelligence process itself, 00:02:11.660 |
I mean, you can go in the direction of spirituality, 00:02:22.040 |
from an intellectual and physical perspective, 00:02:27.500 |
Clearly, there are different implementations, 00:02:33.240 |
is a biological information processing system 00:02:38.460 |
than one we might be able to build out of silicon 00:02:46.560 |
- How distant do you think is the biological implementation? 00:02:50.600 |
So fundamentally, they may have the same capabilities, 00:02:58.300 |
where a huge number of breakthroughs are needed 00:03:12.840 |
So your question presupposes that there's this goal 00:03:16.560 |
to recreate what we perceive as biological intelligence. 00:03:20.920 |
I'm not sure that's how I would state the goal. 00:03:38.560 |
for us to be able to diagnose and treat issues 00:03:44.760 |
for us to understand our own strengths and weaknesses, 00:03:48.240 |
both intellectual, psychological, and physical. 00:03:55.000 |
from that perspective, there's a clear goal there. 00:04:06.440 |
Human intelligence certainly has a lot of things we envy. 00:04:12.880 |
So I think we're capable of sort of stepping back 00:04:16.680 |
and saying, what do we want out of an intelligence? 00:04:21.680 |
How do we wanna communicate with that intelligence? 00:04:27.460 |
Now, of course, it's somewhat of an interesting argument 00:04:30.360 |
because I'm sitting here as a human with a biological brain 00:04:33.940 |
and I'm critiquing the strengths and weaknesses 00:04:36.440 |
of human intelligence and saying that we have 00:04:42.560 |
gee, what is intelligence and what do we really want 00:04:45.120 |
out of it, and that in and of itself suggests 00:04:48.100 |
that human intelligence is something quite enviable, 00:05:01.160 |
- Yeah, I think the flaws that human intelligence has 00:05:10.480 |
- Do you think those are, sorry to interrupt, 00:05:12.040 |
do you think those are features or are those bugs? 00:05:14.400 |
Do you think the prejudice, the forgetfulness, the fear, 00:05:33.440 |
- Well, again, if you go back and you define intelligence 00:05:36.200 |
as being able to sort of accurately, precisely, 00:05:43.840 |
and justify those answers in an objective way, 00:05:46.640 |
yeah, then human intelligence has these flaws 00:05:59.740 |
meaning it takes past data, uses that to predict the future, 00:06:06.000 |
but fundamentally biased and prejudicial in other cases, 00:06:10.760 |
by its priors, whether they're right or wrong 00:06:17.420 |
you're gonna favor them because those are the decisions 00:06:20.520 |
or those are the paths that succeeded in the past. 00:06:24.240 |
And I think that mode of intelligence makes a lot of sense 00:06:42.080 |
and make more objective and reasoned decisions. 00:06:48.400 |
They do sort of one more naturally than they do the other, 00:06:58.460 |
to not be eaten by the predators in the world. 00:07:02.760 |
- For example, but then we've learned to reason 00:07:13.020 |
I think that's harder for the individual to do. 00:07:21.000 |
I think we are, the human mind certainly is capable of it, 00:07:25.320 |
And then there are other weaknesses, if you will, 00:07:27.720 |
as you mentioned earlier, just memory capacity 00:07:30.660 |
and how many chains of inference can you actually go through 00:07:47.260 |
but I think you're the perfect person to talk about this 00:07:55.700 |
That's one of the most incredible accomplishments in AI, 00:08:19.820 |
being able to reason your way through solving that problem? 00:08:25.460 |
- So I think of intelligence primarily two ways. 00:08:37.620 |
whether it's to predict the answer of a question 00:08:40.900 |
or to say, look, I'm looking at all the market dynamics 00:08:43.900 |
and I'm gonna tell you what's gonna happen next, 00:08:49.380 |
and you're gonna predict what they're gonna do next 00:08:52.900 |
- In a highly dynamic environment full of uncertainty, 00:09:09.880 |
what's gonna happen next accurately and consistently? 00:09:18.320 |
You need to have an understanding of the way the world works 00:09:22.880 |
in order to be able to unroll it into the future. 00:09:35.140 |
machine learning does, is if you give me enough prior data 00:09:39.000 |
and you tell me what the output variable is that matters, 00:09:41.980 |
I'm gonna sit there and be able to predict it. 00:09:47.300 |
so that I can get it right more often than not, I'm smart. 00:09:50.980 |
If I can do that with less data and less training time, 00:09:58.020 |
If I can figure out what's even worth predicting, 00:10:08.500 |
- Well, that's interesting about picking a goal. 00:10:30.820 |
that you need to run away from the ferocious tiger, 00:10:33.640 |
but we survive in a social context as an example. 00:10:38.640 |
So understanding the subtleties of social dynamics 00:10:42.300 |
becomes something that's important for surviving, 00:10:49.360 |
with complex sets of variables, complex constraints, 00:11:00.700 |
In other words, represent those patterns efficiently 00:11:06.060 |
That doesn't really require anything specific 00:11:18.460 |
But then when we say, well, do we understand each other? 00:11:23.300 |
In other words, would you perceive me as intelligent 00:11:31.040 |
So now I can predict, but I can't really articulate 00:11:41.240 |
And I can't get you to understand what I'm doing 00:11:50.800 |
the right pattern-matching machinery that I did. 00:11:59.120 |
but I'm sort of an alien intelligence relative to you. 00:12:02.660 |
- You're intelligent, but nobody knows about it. 00:12:08.660 |
- So you're saying, let's sort of separate the two things. 00:12:29.640 |
- Well, it's not impressing you that I'm intelligent. 00:12:39.760 |
- When you pass, I say, wow, you're right more times 00:12:43.940 |
than I am, you're doing something interesting. 00:12:49.160 |
But then what happens is, if I say, how are you doing that? 00:13:00.760 |
I may say, well, you're doing something weird, 00:13:12.400 |
Because now this is, you're in this weird place 00:13:15.160 |
where for you to be recognized as intelligent 00:13:33.520 |
my ability to relate to you starts to change. 00:13:36.800 |
So now you're not an alien intelligence anymore, 00:13:43.920 |
And so I think when we look at animals, for example, 00:13:48.120 |
animals can do things we can't quite comprehend, 00:13:54.420 |
They can't put what they're going through in our terms. 00:14:01.520 |
and they're not really worth necessarily what we're worth, 00:14:03.600 |
we don't treat them the same way as a result of that. 00:14:06.360 |
But it's hard, because who knows what's going on. 00:14:17.960 |
explaining the reasoning that went into the prediction 00:14:27.080 |
If we look at humans, look at political debates 00:14:30.240 |
and discourse on Twitter, it's mostly just telling stories. 00:14:35.240 |
So your task is, sorry, your task is not to tell 00:15:04.440 |
- Yeah, there have been several proofs out there 00:15:06.060 |
where mathematicians would study for a long time 00:15:12.200 |
until the community of mathematicians decided that it did. 00:15:20.920 |
Is that ultimately, this notion of understanding us, 00:15:24.600 |
understanding something is ultimately a social concept. 00:15:28.200 |
In other words, I have to convince enough people 00:15:33.720 |
I did this in a way that other people can understand 00:15:36.360 |
and replicate and that it makes sense to them. 00:15:39.800 |
So, human intelligence is bound together in that way. 00:15:55.840 |
- Did you think the general question of intelligence 00:16:08.600 |
The answer will ultimately be a socially constructed concept. 00:16:21.360 |
here's this data, I wanna predict this type of thing, 00:16:25.740 |
learn this function, and then if you get it right, 00:16:38.640 |
It could be solving a problem we can't otherwise solve 00:16:56.120 |
Can we relate to the process that you're going through? 00:17:01.160 |
whether you're a machine or another human, frankly, 00:17:08.680 |
how it is that you're arriving at that answer 00:17:13.880 |
or a judge of people to decide whether or not 00:17:17.680 |
And by the way, that happens with humans as well. 00:17:20.240 |
You're sitting down with your staff, for example, 00:17:22.080 |
and you ask for suggestions about what to do next, 00:17:25.560 |
and someone says, oh, I think you should buy, 00:17:33.200 |
or I think you should launch the product today or tomorrow, 00:17:37.140 |
whatever the decision may be, and you ask why, 00:17:39.860 |
and the person says, I just have a good feeling about it. 00:17:54.180 |
Can you explain to me why I should believe this? 00:17:56.820 |
- And that explanation may have nothing to do 00:18:07.880 |
And that's why I'm saying we're bound together. 00:18:12.200 |
Our intelligences are bound together in that sense. 00:18:15.400 |
And if, for example, you're giving me an explanation, 00:18:33.560 |
and being objective and following logical paths 00:18:41.440 |
and sort of computing probabilities across those paths, 00:18:53.200 |
So I think we'll talk quite a bit about the first 00:18:58.000 |
on a specific objective metric benchmark performing well. 00:19:03.000 |
But being able to explain the steps, the reasoning, 00:19:18.180 |
- The thing that's hard for humans, as you know, 00:19:24.460 |
So, sorry, so how hard is that problem for computers? 00:19:38.360 |
and we say we wanna design computers to do that, 00:19:58.420 |
and what judgments we use to learn that well. 00:20:05.460 |
if you look at the entire enterprise of science, 00:20:13.700 |
So we think about, gee, who's the most intelligent person 00:20:20.500 |
Do we think about the savants who can close their eyes 00:20:29.500 |
who kinda work through the details and write the papers 00:20:33.820 |
and come up with the thoughtful, logical proofs 00:20:40.680 |
And my point is that, how do you train someone to do that? 00:20:47.600 |
What's the process of training people to do that well? 00:20:56.040 |
to get other people to understand our thinking 00:21:07.500 |
we can persuade them through emotional means, 00:21:24.200 |
we try to do it as even artists in many forms, 00:21:29.760 |
We go through a fairly significant training process 00:21:37.920 |
But it's hard, and for humans, it takes a lot of work. 00:22:01.040 |
which is being able to explain something through reason. 00:22:05.100 |
But if you look at algorithms that recommend things 00:22:14.600 |
you know, their goal is to convince you to buy things 00:22:27.180 |
is showing you things that you really do need 00:22:32.020 |
But it could also be through emotional manipulation. 00:22:35.700 |
The algorithm that describes why a certain reason, 00:22:43.800 |
how hard is it to do it through emotional manipulation? 00:22:56.920 |
really showing in a clear way why something is good. 00:23:09.920 |
in the reasoning aspect and the emotional manipulation? 00:23:17.320 |
but more objectively, it's essentially saying, 00:23:24.400 |
I mean, it kind of gives you more of that stuff. 00:23:28.240 |
- Yeah, I mean, I'm not saying it's right or wrong. 00:23:46.060 |
because the objective is to get you to click on it 00:24:00.380 |
- I guess this seems to be very useful for convincing, 00:24:16.980 |
I think there's a more optimistic view of that, too, 00:24:26.140 |
And these algorithms are saying, look, that's up to you 00:24:33.640 |
You may have an unhealthy addiction to this stuff, 00:24:36.900 |
or you may have a reasoned and thoughtful explanation 00:24:44.500 |
and the algorithms are saying, hey, that's whatever. 00:24:51.900 |
Could be a bad reason, could be a good reason. 00:25:04.920 |
which is saying, you seem to be interested in this. 00:25:09.340 |
And I think we're seeing this not just in buying stuff, 00:25:16.940 |
I'm just saying, I'm gonna show you other stuff 00:25:31.920 |
So one, the bar of performance is extremely high, 00:25:34.940 |
and yet we also ask them to, in the case of social media, 00:25:57.980 |
We're not building, the system's not building a theory 00:26:01.820 |
that is consumable and understandable by other humans 00:26:06.360 |
And so on one hand to say, oh, AI is doing this. 00:26:16.280 |
And it's interesting to think about why it's harder. 00:26:26.260 |
In other words, understandings of what's important 00:26:35.380 |
What's sensible, what's not sensible, what's good, 00:26:37.440 |
what's bad, what's moral, what's valuable, what isn't? 00:26:43.240 |
So when I see you clicking on a bunch of stuff 00:26:46.640 |
and I look at these simple features, the raw features, 00:27:07.620 |
or what the category is, and stuff like that. 00:27:11.560 |
That's very different than kind of getting in there 00:27:16.000 |
The stuff you're reading, like why are you reading it? 00:27:20.380 |
What assumptions are you bringing to the table? 00:27:29.020 |
Does it lead you to thoughtful, good conclusions? 00:27:34.020 |
Again, there's interpretation and judgment involved 00:27:37.460 |
in that process that isn't really happening in the AI today. 00:27:42.460 |
That's harder because you have to start getting 00:27:52.040 |
You have to get at how humans interpret the content 00:28:02.140 |
is not just some kind of deep, timeless, semantic thing 00:28:15.240 |
So it's again, even meaning is a social construct. 00:28:19.120 |
So you have to try to predict how most people 00:28:31.840 |
If I show you a painting, it's a bunch of colors on a canvas, 00:28:37.140 |
And it may mean different things to different people 00:28:46.460 |
As we try to get more rigorous with our communication, 00:28:53.280 |
So we go from abstract art to precise mathematics, 00:28:58.280 |
precise engineering drawings and things like that. 00:29:03.440 |
I wanna narrow that space of possible interpretations 00:29:17.920 |
and I think that's why this becomes really hard, 00:29:35.080 |
lots of different ways at many, many different levels. 00:29:37.840 |
But when I wanna align our understanding of that, 00:29:48.400 |
that's actually not directly in the artifact. 00:29:52.320 |
Now I have to say, well, how are you interpreting 00:29:57.280 |
And what about the colors and what do they mean to you? 00:29:59.400 |
What perspective are you bringing to the table? 00:30:02.600 |
What are your prior experiences with those artifacts? 00:30:05.640 |
What are your fundamental assumptions and values? 00:30:14.520 |
well, if this is the case, then I would conclude this. 00:30:16.560 |
If that's the case, then I would conclude that. 00:30:19.120 |
So your reasoning processes and how they work, 00:30:27.200 |
all those things now come together into the interpretation. 00:30:33.800 |
- And yet humans are able to intuit some of that 00:30:46.600 |
- We have the shared experience and we have similar brains. 00:31:02.080 |
We have similar, what we like to call prior models 00:31:07.400 |
think of it as a wide collection of interrelated variables 00:31:17.560 |
But as humans, we have a lot of shared experience. 00:31:31.400 |
how is biological and computer information systems 00:31:38.000 |
Well, one is humans come with a lot of pre-programmed stuff, 00:31:52.700 |
if we can maybe escape the hardware question, 00:32:01.200 |
the history, the many centuries of wars and so on 00:32:15.840 |
Can you speak to how hard is it to encode that knowledge 00:32:19.360 |
systematically in a way that could be used by a computer? 00:32:22.800 |
- So I think it is possible to learn for a machine, 00:32:26.340 |
to program a machine to acquire that knowledge 00:32:31.440 |
In other words, a similar interpretive foundation 00:33:05.520 |
they have goals, goals are largely built around survival 00:33:20.640 |
because you brought up like historical events, 00:33:23.560 |
they start interpreting situations like that. 00:33:25.480 |
They apply a lot of this fundamental framework 00:33:35.000 |
How much power or influence did they have over the other? 00:33:37.000 |
Like this fundamental substrate, if you will, 00:33:42.760 |
So I think it is possible to imbue a computer 00:33:46.880 |
with that stuff that humans like take for granted 00:33:50.600 |
when they go and sit down and try to interpret things. 00:34:02.800 |
are then able to interpret it with regard to that framework. 00:34:05.680 |
And then given that interpretation, they can do what? 00:34:22.320 |
Now you can find humans that come and interpret events 00:34:26.320 |
because they're like using a different framework. 00:34:32.480 |
where they decided humans were really just batteries, 00:34:36.360 |
and that's how they interpreted the value of humans 00:34:54.120 |
It comes from, again, the fact that we're similar beings 00:35:04.960 |
- So how much knowledge is there, do you think? 00:35:17.600 |
of unique situations and unique configurations 00:35:27.600 |
that you need for interpreting them, I don't think. 00:35:31.480 |
- You think the frameworks are more important 00:35:39.200 |
is they give you now the ability to interpret and reason, 00:35:41.560 |
and to interpret and reason over the specifics 00:36:07.720 |
Just being able to sort of manipulate objects, 00:36:16.120 |
in robotics or AI, it seems to be like an onion. 00:36:35.680 |
Do they have to be learned through experience? 00:36:39.200 |
- So I think when, like if you're talking about 00:36:41.320 |
sort of the physics, the basic physics around us, 00:36:48.120 |
Yeah, I mean, I think there's a combination of things going, 00:36:52.200 |
I think there's a combination of things going on. 00:36:54.600 |
I think there is like fundamental pattern matching, 00:37:03.840 |
and with similar input, I'm gonna predict similar outputs. 00:37:10.120 |
You may learn very quickly that when you let something go, 00:37:30.760 |
- But that seems to be, that's exactly what I mean, 00:37:57.400 |
It seems like you have to have a lot of different knowledge 00:38:01.380 |
to be able to integrate that into the framework, 00:38:10.360 |
and start to reason about sociopolitical discourse. 00:38:18.600 |
and the high level reasoning decision-making. 00:38:22.560 |
I guess my question is how hard is this problem? 00:38:33.000 |
is take on a problem that's much more constrained 00:39:16.440 |
first of all, it's about getting machines to learn. 00:39:21.360 |
And I think we're already in a place that we understand, 00:39:24.400 |
for example, how machines can learn in various ways. 00:39:28.560 |
Right now, our learning stuff is sort of primitive 00:39:46.360 |
all the data in the world with the frameworks 00:39:48.920 |
that are inherent or underlying our understanding. 00:39:56.140 |
So if we wanna be able to reason over the data 00:40:03.720 |
or at least we need to program the computer to acquire, 00:40:07.560 |
to have access to and acquire, learn the frameworks as well 00:40:18.400 |
I think we can start, I think machine learning, 00:40:28.880 |
Will they relate them necessarily to gravity? 00:40:32.200 |
Not unless they can also acquire those theories as well 00:40:40.880 |
and connect it back to the theoretical knowledge. 00:40:43.360 |
I think if we think in terms of these class of architectures 00:40:47.160 |
that are designed to both learn the specifics, 00:40:51.000 |
find the patterns, but also acquire the frameworks 00:40:56.300 |
if we think in terms of robust architectures like this, 00:40:59.660 |
I think there is a path toward getting there. 00:41:03.380 |
- In terms of encoding architectures like that, 00:41:06.180 |
do you think systems that are able to do this 00:41:12.020 |
or representing, if you look back to the '80s and '90s, 00:41:18.680 |
so more like graphs, systems that are based in logic, 00:41:26.500 |
where the challenge was the automated acquisition 00:41:37.260 |
- Yeah, so I mean, I think asking the question 00:41:39.340 |
do they look like neural networks is a bit of a red herring. 00:41:41.260 |
I mean, I think that they will certainly do inductive 00:41:46.740 |
And I've already experimented with architectures 00:41:52.740 |
and neural networks to learn certain classes of knowledge, 00:41:57.340 |
in order for it to make good inductive guesses, 00:42:01.540 |
but then ultimately to try to take those learnings 00:42:05.300 |
and marry them, in other words, connect them to frameworks 00:42:13.660 |
So for example, at Elemental Cognition, we do both. 00:42:16.140 |
We have architectures that do both, but both those things, 00:42:21.700 |
for acquiring the frameworks themselves and saying, 00:42:27.260 |
"I need to interpret it in the form of these frameworks 00:42:30.900 |
So there is a fundamental knowledge representation, 00:42:33.380 |
like what you're saying, like these graphs of logic, 00:42:49.240 |
- Yeah, so it seems like the idea of frameworks 00:42:52.600 |
requires some kind of collaboration with humans. 00:42:56.360 |
- So do you think of that collaboration as direct? 00:43:12.060 |
in the terms of frameworks that help them understand things. 00:43:15.980 |
So to be really clear, you can independently create 00:43:28.500 |
that does a better job than you with some things, 00:43:33.540 |
That doesn't mean it might be better than you at the thing. 00:43:36.760 |
It might be that you cannot comprehend the framework 00:43:45.320 |
- But you're more interested in a case where you can. 00:43:55.920 |
I want machines to be able to ultimately communicate 00:44:01.440 |
I want them to be able to acquire and communicate, 00:44:10.340 |
inductive machine learning techniques are good at, 00:44:16.860 |
whether it be in language or whether it be in images 00:44:19.340 |
or videos or whatever, to acquire these patterns, 00:44:24.340 |
to induce the generalizations from those patterns, 00:44:31.300 |
to connect them to frameworks, interpretations, if you will, 00:44:36.720 |
Of course, the machine is gonna have the strength 00:44:44.340 |
reasoning abilities, the deeper reasoning abilities. 00:44:49.220 |
complementary relationship between the human and the machine. 00:44:53.220 |
- Do you think that ultimately needs explainability 00:45:06.460 |
the human and machine are working together there, 00:45:10.340 |
and the human is responsible for their own life 00:45:27.900 |
That's a moment of interaction where, you know, 00:45:33.900 |
it has a failure, somehow the failure's communicated, 00:45:38.740 |
the human is now filling in the mistake, if you will, 00:45:57.820 |
is where the machine's literally talking to you 00:46:02.700 |
"I know that like the next word might be this or that, 00:46:18.180 |
the next time it's reading to try to understand something. 00:46:24.780 |
I mean, I remember when my daughter was in first grade 00:46:27.500 |
and she had a reading assignment about electricity 00:46:35.540 |
"And electricity is produced by water flowing over turbines," 00:46:45.140 |
"I mean, I could, you know, created and produced 00:46:57.640 |
water flowing over turbines and what electricity even is. 00:47:00.400 |
I mean, I can get the answer right by matching the text, 00:47:04.020 |
but I don't have any framework for understanding 00:47:07.900 |
- And framework really is, I mean, it's a set of, 00:47:14.180 |
that you bring to the table and interpreting stuff 00:47:20.460 |
that there's a shared understanding of what they are. 00:47:27.800 |
Do you have a sense that humans on Earth in general 00:47:32.060 |
share a set of, like how many frameworks are there? 00:47:36.500 |
- I mean, it depends on how you bound them, right? 00:47:44.220 |
I think the way I think about it is kind of in a layer. 00:47:47.620 |
I think of the architectures as being layered in that. 00:47:53.580 |
that allow you the foundation to build frameworks. 00:48:03.060 |
I mean, one of the most compelling ways of thinking 00:48:05.340 |
about this is a reasoning by analogy where I can say, 00:48:07.820 |
oh, wow, I've learned something very similar. 00:48:21.020 |
and I have guards and I have this and I have that, 00:48:23.520 |
like where are the similarities and where are the differences 00:48:45.580 |
- Right, I mean, they're talking about political 00:48:47.100 |
and social ways of interpreting the world around them. 00:48:52.700 |
I think they differ in maybe what some fundamental 00:49:04.120 |
The implications of different fundamental values 00:49:06.520 |
or fundamental assumptions in those frameworks 00:49:18.360 |
I just followed where my assumptions took me. 00:49:21.580 |
- Yeah, the process itself will look similar, 00:49:33.700 |
I mean, having a Democrat and a Republican framework 00:49:44.160 |
will be totally different from an AI perspective 00:49:49.420 |
is to be able to tell you that this perspective, 00:49:55.540 |
another set of assumptions is gonna lead you there. 00:50:05.060 |
You know, I have this fundamental belief about that, 00:50:20.140 |
to understand a statement, depending on the framework. 00:50:23.460 |
and the broader the content, the richer it is. 00:50:28.460 |
And so, you and I can have very different experiences 00:50:37.460 |
And if we're committed to understanding each other, 00:50:41.360 |
we start, and that's the other important point, 00:50:45.300 |
if we're committed to understanding each other, 00:50:47.780 |
we start decomposing and breaking down our interpretation 00:51:12.300 |
as really complementing and helping human intelligence 00:51:16.040 |
to overcome some of its biases and its predisposition 00:51:31.420 |
and someone labeled this as a Democratic point of view, 00:51:35.420 |
And if the machine can help us break that argument down 00:51:47.540 |
- We're gonna have to sit and think about that. 00:52:10.860 |
competing in the game of Jeopardy against humans. 00:52:13.820 |
And you were a lead in that, a critical part of that. 00:53:02.000 |
determine whether or not you know the answer, 00:53:10.760 |
because the questions are not asked directly, right? 00:53:14.440 |
- They're all like, the way the questions are asked 00:53:36.940 |
And that's sort of an interesting realization 00:53:44.740 |
and you're still trying to process the question, 00:53:47.020 |
and the champions have answered and moved on. 00:54:02.240 |
if you look back at the Jeopardy games much earlier. 00:54:16.480 |
and subtle, and nuanced, and humorous, and witty over time, 00:54:24.280 |
in figuring out what the question was even asking. 00:54:26.880 |
So yeah, you have to figure out what the question is even asking. 00:54:34.520 |
And because you have to buzz in really quickly, 00:54:41.200 |
Otherwise, you lose the opportunity to buzz in. 00:54:43.440 |
- Even before you really know if you know the answer. 00:54:49.080 |
they'll look at it, process it very superficially. 00:54:53.000 |
In other words, what's the topic, what are some keywords, 00:55:12.440 |
They'll just assume they know all about Jeopardy, 00:55:15.920 |
Watson, interestingly, didn't even come close 00:55:25.960 |
which is like how many of all the Jeopardy questions, 00:55:29.400 |
how many could we even find the right answer for, 00:55:34.440 |
Like, could we come up with, if we had a big body of knowledge 00:55:39.800 |
I mean, from a web scale, it was actually very small. 00:55:44.360 |
I was talking about millions of books, right? 00:56:02.080 |
So, and so it was important to get a very quick sense of, 00:56:07.080 |
do you think you know the right answer to this question? 00:56:16.480 |
and at least spend some time essentially answering it 00:56:30.040 |
And that would depend on what else was going on in the game 00:56:35.120 |
where I have to take a guess, I have very little to lose, 00:56:40.280 |
- So that accounted for the financial standings 00:56:48.280 |
where you were in the standing and things like that. 00:56:58.440 |
- So, I mean, we targeted answering in under three seconds 00:57:14.520 |
whereas like we would say, let's estimate our confidence, 00:57:17.400 |
which was sort of a shallow answering process. 00:57:23.840 |
and then we may take another second or something 00:57:30.880 |
But by and large, we're saying like, we can't play the game. 00:57:33.920 |
We can't even compete if we can't, on average, 00:57:37.600 |
answer these questions in around three seconds or less. 00:57:41.720 |
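A minimal sketch of the kind of confidence-threshold buzz decision described here, assuming hypothetical names and thresholds rather than Watson's actual strategy logic:

```python
# Hypothetical sketch: estimate answer confidence quickly, then buzz only if
# confidence clears a threshold that depends on the game situation. The
# function name and thresholds are illustrative, not Watson's parameters.

def should_buzz(confidence: float, my_score: int, leader_score: int,
                clue_value: int, late_game: bool) -> bool:
    """Decide whether to attempt an answer given the estimated confidence."""
    threshold = 0.5  # baseline: only buzz when more likely right than wrong
    if late_game and my_score < leader_score - clue_value:
        # Far behind with little to lose: accept riskier guesses.
        threshold = 0.3
    elif my_score > leader_score + clue_value:
        # Comfortably ahead: protect the lead, demand higher confidence.
        threshold = 0.7
    return confidence >= threshold
```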
so there's these three humans playing a game, 00:57:45.320 |
and you stepped in with the idea that IBM Watson 00:57:52.000 |
Can you tell the story of Watson taking on this game? 00:57:58.720 |
- Yeah, so the story was that it was coming up, 00:58:03.520 |
I think, to the 10-year anniversary of Deep Blue. 00:58:08.800 |
IBM wanted to do sort of another kind of really, 00:58:16.400 |
and the kind of the cool stuff that we were doing. 00:58:18.640 |
I had been working in AI at IBM for some time. 00:58:28.640 |
which is, we're not gonna tell you what the questions are, 00:58:31.040 |
we're not even gonna tell you what they're about. 00:58:33.120 |
Can you go off and get accurate answers to these questions? 00:58:36.840 |
And it was an area of AI research that I was involved in. 00:58:41.400 |
And so it was a very specific passion of mine. 00:58:44.280 |
Language understanding had always been a passion of mine. 00:58:54.560 |
Factoid, meaning it essentially had an answer, 00:58:57.840 |
and being able to do that accurately and quickly. 00:59:00.920 |
So that was a research area that my team had already been in. 00:59:06.320 |
several IBM executives were like, what are we gonna do? 00:59:13.920 |
This was like, whatever it was, 2004, I think, 00:59:18.760 |
And someone thought, hey, that would be really cool 00:59:28.040 |
And everyone was telling the research execs, no way. 00:59:35.200 |
And we had some pretty senior people in the field 00:59:38.200 |
And it would come across my desk and I was like, 00:59:40.240 |
but that's kind of what I'm really interested in doing. 00:59:46.800 |
this is nuts, we're not gonna risk IBM's reputation on this, 00:59:50.280 |
And this happened in 2004, it happened in 2005. 00:59:53.200 |
At the end of 2006, it was coming around again. 01:00:01.120 |
I was doing the open domain question answering stuff, 01:00:03.120 |
but I was coming off a couple other projects. 01:00:10.240 |
And I argued it would be crazy not to do this. 01:00:17.640 |
what's the confidence that you had yourself, privately, 01:00:29.000 |
What was your estimation of the problem at that time? 01:00:34.360 |
And a lot of people thought it was impossible. 01:00:39.200 |
was because I did some brief experimentation. 01:00:50.960 |
for a lot of the points that we mentioned earlier. 01:01:00.560 |
None of this stuff had been done well enough before. 01:01:04.720 |
were the kinds of technologies that should work. 01:01:07.520 |
But more to the point, what was driving me was, 01:01:14.920 |
And this is the kind of stuff we were supposed to do. 01:01:17.160 |
In other words, we were basically supposed to-- 01:01:20.560 |
- I mean, we were supposed to take things and say, 01:01:27.560 |
if we have the opportunity, to push it to the limits. 01:01:30.160 |
And if it doesn't work, to understand more deeply 01:01:50.720 |
At the very least, we'd be able to come out and say, 01:01:57.080 |
Here's what we've tried and here's how we failed. 01:01:58.700 |
So I was very driven as a scientist from that perspective. 01:02:16.120 |
and sort of a high-level architectural approach 01:02:42.840 |
it was performing very poorly in the beginning. 01:02:46.340 |
So what were the initial approaches and why did they fail? 01:02:49.780 |
- Well, there were lots of hard aspects to it. 01:02:54.860 |
I mean, one of the reasons why prior approaches 01:03:02.400 |
was because the questions were difficult to interpret. 01:03:12.520 |
like what city, or what, even then it could be tricky, 01:03:28.140 |
in other words, we're gonna ask about these five types. 01:03:37.820 |
The answer will be a person of this type, right? 01:03:44.400 |
there were like tens of thousands of these things, 01:03:52.540 |
And so even if you focused on trying to encode the types 01:03:56.920 |
at the very top, like there's five that were the most, 01:04:01.620 |
you still cover a very small percentage of the data. 01:04:04.200 |
So you couldn't take that approach of saying, 01:04:18.240 |
And so we came up with an approach toward that. 01:04:26.000 |
to handle that problem throughout the project. 01:04:29.560 |
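The approach isn't spelled out here, but the published Watson work describes scoring how well each candidate fits the answer type named in the question ("type coercion") rather than enumerating answer types up front. A toy sketch of that idea, with hypothetical helpers standing in for real type resources:

```python
# Illustrative "type coercion"-style check: score how well a candidate answer
# fits the answer type mentioned in the question, instead of hard-coding
# answer types in advance. The known_types dictionary is a hypothetical
# stand-in for real ontology or corpus-based type lookups.

def type_coercion_score(candidate: str, lexical_answer_type: str,
                        known_types: dict[str, set[str]]) -> float:
    """Return a soft 0..1 score for 'is this candidate an instance of the type?'"""
    types_for_candidate = known_types.get(candidate.lower(), set())
    if lexical_answer_type.lower() in types_for_candidate:
        return 1.0
    # Partial credit for related types; a real system would consult multiple
    # typing sources and combine their evidence statistically.
    if any(lexical_answer_type.lower() in t for t in types_for_candidate):
        return 0.6
    return 0.1  # unknown is weak evidence against, not proof

known = {"emily dickinson": {"poet", "writer", "person"}}
print(type_coercion_score("Emily Dickinson", "poet", known))  # 1.0
```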
The other issue was that right from the outset, 01:04:34.620 |
I committed to doing this in three to five years. 01:04:47.820 |
I said, we're not going to actually understand language 01:04:57.480 |
and the domain of knowledge that the question refers to 01:05:00.200 |
and reason over that to answer these questions. 01:05:04.200 |
At the same time, simple search wasn't good enough 01:05:08.280 |
to confidently answer with a single correct answer. 01:05:16.160 |
and practical engineering, three, four, eight. 01:05:18.660 |
So you're not trying to solve the general NLU problem. 01:05:21.800 |
You're saying, let's solve this in any way possible. 01:05:57.180 |
And I also, there was another thing I believed. 01:06:00.580 |
I believed that the fundamental NLP technologies 01:06:04.620 |
and machine learning technologies would be adequate. 01:06:08.800 |
And this was an issue of how do we enhance them? 01:06:20.180 |
who had said, we're going to need Maxwell's equations 01:06:25.700 |
And I said, if we need some fundamental formula 01:06:28.740 |
that breaks new ground in how we understand language, 01:06:42.420 |
What I'm counting on is the ability to take everything 01:06:46.620 |
that has been done before to figure out an architecture 01:06:50.300 |
on how to integrate it well, and then see where it breaks 01:06:54.340 |
and make the necessary advances we need to make 01:07:03.260 |
I mean, that's the Elon Musk approach with rockets, 01:07:09.700 |
- And I happen to be, in this case, I happen to be right, 01:07:20.380 |
So, if you were to do, what's the brute force solution? 01:07:31.380 |
- Look, web search has come a long way, even since then. 01:07:38.020 |
there are a couple other constraints around the problem, 01:07:50.460 |
The device, if the device is as big as a room, 01:08:01.580 |
So it had to kind of fit in a shoebox, if you will, 01:08:08.060 |
So, but also you couldn't just get out there. 01:08:10.440 |
You couldn't go off network, right, to kind of go. 01:08:19.340 |
The problem was, even when we went and did a web search, 01:08:24.540 |
but somewhere in the order of 65% of the time, 01:08:35.440 |
You know, in other words, even if you could pull the, 01:08:49.120 |
unless you had enough confidence in it, right? 01:08:52.480 |
You'd have to have confidence it was the right answer. 01:09:00.320 |
which doesn't even put you in the winner's circle. 01:09:10.100 |
even if I had somewhere in the top 10 documents, 01:09:12.500 |
how do I figure out where in the top 10 documents 01:09:14.980 |
that answer is, and how do I compute a confidence 01:09:19.740 |
so it's not like I go in knowing the right answer 01:09:27.100 |
How do I, as a machine, go out and figure out 01:09:29.060 |
which one's right, and then how do I score it? 01:09:37.320 |
- First of all, if you pause on that, just think about it. 01:09:53.660 |
- Well, we solved that in some definition of solved, 01:09:59.020 |
So how do you take a body of work on a particular topic 01:10:19.840 |
And we ultimately did find the body of knowledge 01:10:31.000 |
like WordNet and other types of semantic resources, 01:10:36.100 |
In other words, where we went out and took that content 01:10:39.100 |
and then expanded it based on producing statistical, 01:11:08.060 |
where do you, I guess that's probably an encyclopedia, so. 01:11:12.440 |
- So that's an encyclopedia, but then we would take 01:11:15.640 |
that stuff and we would go out and we would expand. 01:11:20.180 |
that wasn't in the core resources and expand it. 01:11:26.200 |
but still, again, from a web scale perspective, 01:11:38.480 |
broke it down into all those individual words 01:11:44.320 |
you know, had computer algorithms that annotated it 01:11:47.000 |
and we indexed that in a very rich and very fast index. 01:11:55.240 |
let's say the equivalent of, for the sake of argument, 01:11:59.000 |
We've now analyzed all that, blowing up its size even more 01:12:15.840 |
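A toy stand-in for the offline analyze-and-index step being described: annotate the text and build an inverted index for fast passage lookup. Watson's actual index was far richer; this only shows the shape of the idea:

```python
# Toy sketch: break source text into passages, tokenize, and build an
# inverted index so searches can be answered quickly at question time.

from collections import defaultdict
import re

def build_inverted_index(documents: dict[str, str]) -> dict[str, set[str]]:
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index

docs = {"d1": "Emily Dickinson wrote poems in Amherst.",
        "d2": "Electricity is produced by water flowing over turbines."}
idx = build_inverted_index(docs)
print(idx["turbines"])  # {'d2'}
```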
I mean, I know 2000, maybe this is 2008, nine, 01:12:27.880 |
Like how hard is the infrastructure component, 01:12:36.080 |
but close to 3000 cores completely connected. 01:12:43.840 |
- And they were sharing memory in some kind of way. 01:13:07.240 |
So if I went and tried to find a piece of content, 01:13:27.840 |
So therein lies, you know, the Watson architecture. 01:13:38.800 |
We'd try to figure out what is it asking about? 01:13:47.240 |
That might be represented as a simple string, 01:14:12.720 |
using open source search engines, we modified them. 01:14:16.160 |
We had a number of different search engines we would use 01:14:24.520 |
ultimately to now take our question analysis, 01:14:28.520 |
produce multiple queries based on different interpretations 01:14:33.320 |
and fire out a whole bunch of searches in parallel. 01:14:44.840 |
And so now let's say you had a thousand passages. 01:14:51.800 |
So you went out and you parallelized the search. 01:15:03.280 |
For each passage now, you'd go and figure out 01:15:25.480 |
coming up better ways to generate search queries 01:15:31.960 |
- And speed, so better is accuracy and speed. 01:15:42.640 |
Like I focused purely on accuracy and confidence, 01:15:42.640 |
and then figuring out how to both parallelize 01:16:07.360 |
For each candidate answer, you're gonna score it. 01:16:10.400 |
So you're gonna use all the data that built up. 01:16:15.880 |
You're gonna use how the query was generated. 01:16:34.600 |
from however many candidate answers you have, 01:16:54.520 |
And I wanna rank them based on the likelihood 01:16:56.360 |
that they're a correct answer to the question. 01:16:58.680 |
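The pipeline just described, question analysis, multiple queries, parallel search, candidate extraction, scoring, and ranking by confidence, can be sketched roughly as below; every function passed in is a hypothetical placeholder, not a Watson component:

```python
# Sketch of the pipeline shape: analyze the question, fire several queries,
# pull back passages, extract candidate answers, score each candidate against
# the evidence, and rank by estimated correctness.

from concurrent.futures import ThreadPoolExecutor

def answer_question(question, analyze, make_queries, search, extract, score):
    analysis = analyze(question)                    # topic, answer type, cues
    queries = make_queries(analysis)                # multiple interpretations
    with ThreadPoolExecutor() as pool:              # searches run in parallel
        passage_lists = list(pool.map(search, queries))
    passages = [p for plist in passage_lists for p in plist]
    candidates = {c for p in passages for c in extract(p)}
    ranked = sorted(
        ((score(c, analysis, passages), c) for c in candidates),
        reverse=True,
    )
    return ranked  # [(confidence, candidate), ...] best first
```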
So every scorer was its own research project. 01:17:14.080 |
a human would be looking at a possible answer. 01:17:20.920 |
they'd be reading the passage in which that occurred. 01:17:25.400 |
and they'd be making a decision of how likely it is 01:17:28.400 |
that Emily Dickinson, given this evidence in this passage, 01:17:38.840 |
- But scoring implies zero to one kind of continuous-- 01:17:41.320 |
- That's right, you give it a zero to one score. 01:17:46.000 |
Give it a zero, yeah, exactly, a zero to one score. 01:17:50.520 |
so you have to somehow normalize and all that kind of stuff 01:17:59.440 |
- We actually looked at the raw scores as well 01:17:59.440 |
as standardized scores, because humans are not involved in this. 01:18:01.960 |
- Sorry, so I'm misunderstanding the process here. 01:18:13.320 |
- Ground truth is only the answers to the questions. 01:18:19.000 |
So I was always driving end to end performance. 01:18:22.360 |
It was a very interesting engineering approach, 01:18:27.360 |
and ultimately scientific and research approach, 01:18:31.200 |
Now, that's not to say we wouldn't make hypotheses 01:18:42.120 |
was related in some way to end to end performance. 01:18:44.400 |
Of course we would, because people would have to 01:18:52.340 |
you had to show impact on end to end performance, 01:18:56.320 |
- There's many very smart people working on this, 01:18:58.360 |
and they're basically trying to sell their ideas 01:19:01.520 |
as a component that should be part of the system. 01:19:04.560 |
And they would do research on their component, 01:19:09.720 |
I'm gonna improve this as a candidate generator, 01:19:13.120 |
or I'm gonna improve this as a question scorer, 01:19:15.840 |
or as a passage scorer, I'm gonna improve this, 01:19:23.920 |
on its component metric, like a better parse, 01:19:26.720 |
or a better candidate, or a better type estimation, 01:19:37.720 |
If you can't estimate that, and can't do experiments 01:19:43.360 |
- That's like the best run AI project I've ever heard. 01:19:51.800 |
like I'm sure there's a lot of day to day breakthroughs, 01:20:04.520 |
but one of the things that I think gave people confidence 01:20:08.960 |
that we can get there was that as we follow this procedure 01:20:13.960 |
of different ideas, build different components, 01:20:19.160 |
plug them into the architecture, run the system, 01:20:24.680 |
start off new research projects to improve things, 01:20:28.120 |
and the very important idea that the individual component 01:20:33.640 |
work did not have to deeply understand everything 01:20:38.640 |
that was going on with every other component. 01:20:42.240 |
And this is where we leverage machine learning 01:20:47.400 |
So while individual components could be statistically driven 01:20:50.360 |
machine learning components, some of them were heuristic, 01:20:52.760 |
some of them were machine learning components, 01:20:54.640 |
the system has a whole combined all the scores 01:21:00.560 |
This was critical because that way you can divide 01:21:04.400 |
So you can say, okay, you work on your candidate generator, 01:21:07.520 |
or you work on this approach to answer scoring, 01:21:22.080 |
now we can train and figure out how do we weigh 01:21:40.680 |
and to let the machine learning do the integration. 01:21:45.160 |
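A minimal sketch of that fusion step, assuming a per-candidate vector of scorer outputs and end-to-end right/wrong labels; logistic regression is an illustrative choice, not a claim about Watson's exact learner:

```python
# Each candidate answer carries a vector of scores from the independent
# scorers; a learned model combines them into one confidence using only
# end-to-end right/wrong labels.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Rows: candidate answers; columns: scorer outputs (passage match, type score,
# popularity, ...). Labels: 1 if the candidate was the correct final answer.
X_train = np.array([[0.9, 1.0, 0.4],
                    [0.2, 0.1, 0.8],
                    [0.7, 0.6, 0.5],
                    [0.1, 0.3, 0.2]])
y_train = np.array([1, 0, 1, 0])

fusion = LogisticRegression().fit(X_train, y_train)

new_candidates = np.array([[0.8, 0.9, 0.3], [0.3, 0.2, 0.9]])
confidences = fusion.predict_proba(new_candidates)[:, 1]
print(confidences)  # combined confidence per candidate, used for final ranking
```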
is doing the fusion, and then it's a human orchestrated 01:21:51.980 |
Still impressive that you're able to get it done 01:22:03.400 |
But when you look back at the Jeopardy challenge, 01:22:38.100 |
- That's beautiful because there's so much pressure 01:22:41.600 |
because it is a public event, it is a public show, 01:22:55.360 |
By your, I'm sure, exceptionally high standards, 01:22:59.720 |
is there something you regret you would do differently? 01:23:15.400 |
We went back to the old problems that we used to try 01:23:19.520 |
to solve and we did dramatically better on all of them, 01:23:30.160 |
I worry that the world would not understand it as a success 01:23:43.880 |
And that's a whole nother theme of the journey. 01:23:50.280 |
It was not a success in natural language understanding, 01:23:59.840 |
I understand what you're saying in terms of the science, 01:24:04.120 |
but I would argue that the inspiration of it, right, 01:24:21.140 |
What's the difference between how a human being 01:24:28.740 |
- Yeah, so that actually came up very early on 01:24:32.600 |
In fact, I had people who wanted to be on the project 01:24:44.300 |
And they were, you know, from a cognition perspective, 01:24:47.080 |
like human cognition and how that should play. 01:25:01.480 |
- I need to build, in the context of this project, 01:25:03.880 |
in NLU and in building an AI that understands 01:25:07.000 |
how it needs to ultimately communicate with humans, 01:25:16.280 |
In fact, as an AI scientist, I care a lot about that, 01:25:27.500 |
I had to kind of say, like, if I'm gonna get this done, 01:25:30.780 |
I'm gonna chart this path and this path says, 01:25:49.760 |
I'm not gonna get there from here in the time frame. 01:25:54.400 |
- I think that's a great way to lead the team. 01:25:59.240 |
when you look back, analyze what's the difference actually. 01:26:02.560 |
- Right, so I was a little bit surprised actually 01:26:18.900 |
that it might have been closer to the way humans 01:26:21.300 |
answer questions than I might have imagined previously. 01:26:24.740 |
- 'Cause humans are probably in the game of Jeopardy 01:26:29.620 |
probably also cheating their way to winning, right? 01:26:40.900 |
So they are very quickly analyzing the question 01:26:44.900 |
and coming up with some key vectors or cues, if you will. 01:27:03.220 |
would kind of score that in a very shallow way 01:27:08.940 |
And so it's interesting as we reflected on that, 01:27:12.460 |
so we may be doing something that's not too far off 01:27:17.260 |
but we certainly didn't approach it by saying, 01:27:28.780 |
because ultimately we're trying to do something 01:27:31.700 |
that is to make the intelligence of the machine 01:27:35.100 |
and the intelligence of the human very compatible. 01:27:37.780 |
Well, compatible in the sense they can communicate 01:27:44.540 |
So how they think about things and how they build answers, 01:27:49.780 |
becomes a very important question to consider. 01:27:52.140 |
- So what's the difference between this open domain, 01:27:56.920 |
but cold constructed question answering of Jeopardy 01:28:01.920 |
and more something that requires understanding 01:28:07.360 |
for shared communication with humans and machines? 01:28:10.280 |
- Yeah, well, this goes back to the interpretation 01:28:15.640 |
- Jeopardy, the system's not trying to interpret 01:28:18.560 |
the question and it's not interpreting the content 01:28:20.680 |
that it's using with regard to any particular framework. 01:28:23.880 |
I mean, it is parsing it and parsing the content 01:28:26.880 |
and using grammatical cues and stuff like that. 01:28:29.440 |
So if you think of grammar as a human framework, 01:28:33.400 |
But when you get into the richer semantic frameworks, 01:28:36.880 |
what are people, how do they think, what motivates them? 01:28:43.280 |
what causes what to happen and where are things in time and space? 01:28:47.440 |
And like when you start thinking about how humans formulate 01:28:51.280 |
and structure the knowledge that they acquire in their head, 01:28:56.400 |
- What do you think are the essential challenges 01:29:01.400 |
of free-flowing communication, free-flowing dialogue 01:29:05.840 |
versus question answering even with a framework 01:29:14.960 |
as fundamentally more difficult than question answering 01:29:23.560 |
- So dialogue is important in a number of different ways. 01:29:27.480 |
So first of all, when I think about the machine that, 01:29:30.520 |
when I think about a machine that understands language 01:29:33.280 |
and ultimately can reason in an objective way 01:29:36.760 |
that can take the information that it perceives 01:29:46.200 |
that system ultimately needs to be able to talk to humans 01:29:50.680 |
or it needs to be able to interact with humans. 01:29:57.560 |
sometimes people talk about dialogue and they think, 01:30:09.880 |
We're not trying to mimic casual conversations. 01:30:23.600 |
So instead of like talking to your friend down the street 01:30:30.480 |
this is more about like you would be communicating 01:30:44.240 |
I'm gonna figure out what your mental model is. 01:30:46.600 |
I'm gonna now relate that to the information I have 01:30:50.080 |
and present it to you in a way that you can understand it 01:30:54.920 |
So it's that type of dialogue that you wanna construct. 01:31:00.400 |
It's more goal-oriented, but it needs to be fluid. 01:31:15.720 |
in other words, the machine has to have a model 01:31:17.580 |
of how humans think through things and discuss them. 01:31:22.580 |
- So basically a productive, rich conversation, 01:31:30.120 |
- I'd like to think it's more similar to this podcast. 01:31:46.600 |
as a community from that still to be able to, 01:32:05.280 |
Which aspects of this whole problem that you specified 01:32:10.040 |
of having a productive conversation is the hardest? 01:32:20.780 |
- So I think to do this, you kind of have to be creative 01:32:25.880 |
If I were to do this as purely a machine learning approach 01:32:32.840 |
"fluent, structured knowledge acquisition conversation," 01:32:37.360 |
I'd go out and say, "Okay, I have to collect a bunch 01:32:40.080 |
"of data of people doing that, people reasoning well 01:32:50.220 |
"as well as produces answers and explanations 01:33:03.120 |
- Okay, okay, this one, there's a humorous commenter 01:33:08.560 |
But also, even if it's out there, say it was out there, 01:33:14.800 |
- Like how do you collect successful examples? 01:33:19.240 |
where you don't have enough data to represent 01:33:23.200 |
the phenomenon you wanna learn, in other words, 01:33:25.960 |
if you have enough data, you could potentially 01:33:34.400 |
What recently came out at IBM was the debater project, 01:33:36.960 |
sort of interesting, right, because now you do have 01:33:39.440 |
these structured dialogues, these debate things, 01:33:42.560 |
where they did use machine learning techniques 01:33:46.980 |
Dialogues are a little bit tougher, in my opinion, 01:33:56.080 |
where you have lots of other structured arguments like this. 01:34:00.800 |
this is a bad response in a particular domain. 01:34:03.240 |
Here, I have to be responsive and I have to be opportunistic 01:34:11.840 |
So I'm goal-oriented in saying I wanna solve the problem, 01:34:16.640 |
but I also have to be opportunistic and responsive 01:34:21.080 |
So I think that it's not clear that we could just train 01:34:24.120 |
on the body of data to do this, but we could bootstrap it. 01:34:31.480 |
What do we think the structure of a good dialogue is 01:34:44.720 |
and I can create a tool that now engages humans effectively, 01:34:48.000 |
I could start both, I could start generating data, 01:34:51.320 |
I could start with the human learning process 01:34:58.600 |
But I have to understand what features to even learn over. 01:35:01.880 |
So I have to bootstrap the process a little bit first. 01:35:13.400 |
- So some creativity and yeah, and bootstrapping. 01:35:21.120 |
So one of the benchmarks for me is humor, right? 01:35:30.340 |
So one of the greatest comedy sketches of all time, right, 01:35:44.120 |
With Sean Connery commentating on Alex Trebek's mother 01:35:49.440 |
And I think all of them are in the negative, points-wise. 01:35:58.360 |
So what do you think about humor in this whole interaction 01:36:06.520 |
Or even just whatever, what humor represents to me is 01:36:10.200 |
the same idea that you're saying about framework, 01:36:25.120 |
- I think there's a couple of things going on there. 01:36:34.720 |
we did a little bit of that with puns in Jeopardy. 01:36:54.880 |
if you have enough data to represent that phenomenon, 01:37:06.680 |
unless we sit back and think about that more formally. 01:37:10.200 |
I think, again, I think you do a combination of both. 01:37:16.720 |
are always a little bit combination of us reflecting 01:37:19.680 |
and being creative about how things are structured, 01:37:26.400 |
and figuring out how to combine these two approaches. 01:37:29.120 |
I think there's another aspect to humor though, 01:37:31.440 |
which goes to the idea that I feel like I can relate 01:37:44.200 |
do I feel differently when I know it's a robot? 01:37:51.480 |
that the robot is not conscious the way I'm conscious, 01:38:03.040 |
I don't imagine that the person's relating it to it 01:38:17.400 |
whether it's sculpture, it's music or whatever, 01:38:21.320 |
are the people who can evoke a similar emotional response 01:38:26.680 |
who can get you to emote, right, about the way they are. 01:38:31.680 |
In other words, who can basically make the connection 01:38:34.440 |
from the artifact, from the music or the painting 01:38:42.360 |
And then, and that's when it becomes compelling. 01:38:44.700 |
So they're communicating at a whole different level. 01:38:49.340 |
They're communicating their emotional response 01:39:00.640 |
- So the idea that you can connect to that person, 01:39:06.360 |
but we're also able to anthropomorphize objects pretty, 01:39:26.960 |
doesn't require anthropomorphization, but nevertheless-- 01:39:30.480 |
- Well, there was some interest in doing that. 01:39:49.200 |
and you're getting humans to react emotionally. 01:40:13.000 |
if you know that the machine is not conscious, 01:40:17.200 |
not having the same richness of emotional reactions 01:40:20.760 |
and understanding that it doesn't really share 01:40:31.600 |
Interesting, I think you probably would for a while. 01:40:40.080 |
- No, I'm pretty confident that majority of the world, 01:40:56.240 |
- So you, the scientist that made the machine, 01:40:58.560 |
is saying that this is how the algorithm works. 01:41:06.080 |
- So you're deep into the science fiction genre now, 01:41:10.040 |
- I don't think it's, it's actually psychology. 01:41:16.780 |
that we'll have to be exploring in the next few decades. 01:41:20.840 |
- It's a very interesting element of intelligence. 01:41:25.200 |
we've talked about social constructs of intelligence 01:41:33.960 |
What do you think is a good test of intelligence 01:41:36.540 |
So there's the Alan Turing with the Turing test. 01:41:41.320 |
Watson accomplished something very impressive with Jeopardy. 01:41:52.940 |
They would say, this is crossing a kind of threshold 01:42:18.520 |
will be better than us, will become more effective. 01:42:21.680 |
In other words, better predictors about a lot of things 01:42:48.640 |
your emotional response, can even generate language 01:42:51.480 |
that will sound smart, and what someone else might say 01:43:19.440 |
that would ultimately satisfy a critical interrogation 01:43:26.780 |
- I think you just described me in a nutshell. 01:43:42.580 |
And so upon deeper probing and deeper interrogation, 01:43:45.820 |
you may find out that there isn't a shared understanding, 01:43:50.300 |
Like, humans are statistical language model machines, 01:44:10.520 |
where we are in our social and political landscape. 01:44:14.720 |
Can you distinguish someone who can string words together 01:44:19.560 |
and sound like they know what they're talking about 01:44:27.740 |
So it's interesting, because humans are really good 01:44:32.400 |
at, in their own mind, justifying or explaining 01:44:39.880 |
So you could say, you could put together a string of words, 01:44:52.440 |
and they'll interpret it another way that suits their needs. 01:45:00.580 |
as AI gets better and better at mimicking, 01:45:09.580 |
Do you really know what you're talking about? 01:45:16.420 |
a powerful framework that you could reason over 01:45:19.420 |
and justify your answers, justify your predictions 01:45:24.420 |
and your beliefs, why you think they make sense? 01:45:27.140 |
Can you convince me what the implications are? 01:45:29.500 |
You know, can you, so can you reason intelligently 01:45:36.160 |
the implications of your prediction and so forth? 01:45:53.800 |
a large group of people with a certain standard 01:46:09.900 |
if that large community of people are not judging it 01:46:16.600 |
of objective logic and reason, you still have a problem. 01:46:32.820 |
- By the way, I have nothing against the one. 01:46:36.060 |
so you're a part of one of the great benchmarks, 01:46:47.220 |
AlphaStar accomplishments on video games recently, 01:46:50.740 |
which are also, I think, at least in the case of Go, 01:47:06.020 |
nobody thought like solving Go was gonna be easy, 01:47:12.700 |
hard for humans to learn, hard for humans to excel at. 01:47:15.540 |
And so it was another measure of intelligence. 01:47:24.980 |
I mean, and I loved how they solved the data problem, 01:47:45.580 |
Can the Go machine help make me a better Go player? 01:47:54.380 |
if we put it in very simple terms, it found the function. 01:48:05.540 |
- So one of the interesting ideas of that system 01:48:12.700 |
So like you're saying, it could have, by itself, 01:48:18.480 |
- Toward a goal, like, imagine you're sentencing, 01:48:36.220 |
So it's an interesting dilemma for the applications of AI. 01:48:42.340 |
Do we hold AI to this accountability that says, 01:48:48.060 |
you know, humans have to be able to take responsibility 01:48:56.380 |
In other words, can you explain why you would do the thing? 01:49:02.040 |
and convince them that this was a smart decision? 01:49:07.180 |
Can you get behind the logic that was made there? 01:49:10.220 |
- Do you think, sorry to linger on this point, 01:49:25.820 |
One is where AI systems do like medical diagnosis 01:49:32.420 |
without ever explaining to you why it fails when it does. 01:49:36.600 |
That's one possible world, and we're okay with it. 01:49:45.380 |
from getting too good before it gets able to explain. 01:49:48.780 |
Which of those worlds are more likely, do you think, 01:49:53.500 |
- I think the reality is it's gonna be a mix. 01:49:57.460 |
I mean, I think there are tasks that I'm perfectly fine with 01:50:00.460 |
machines showing a certain level of performance, 01:50:03.980 |
and that level of performance is already better than humans. 01:50:11.300 |
If driverless cars learn how to be more effective drivers 01:50:14.340 |
than humans, but can't explain what they're doing, 01:50:27.580 |
when something bad happens and we wanna decide 01:50:29.740 |
who's liable for that thing, and who made that mistake, 01:50:33.540 |
And I think those edge cases are interesting cases. 01:50:41.060 |
And it says, well, you didn't train it properly. 01:50:43.620 |
You know, you were negligent in the training data 01:51:01.620 |
I think that, I think it's gonna be interesting. 01:51:05.820 |
and social discourse are gonna get deeply intertwined 01:51:13.500 |
I think in other cases, it becomes more obvious 01:51:18.120 |
like, why did you decide to give that person, 01:51:21.180 |
you know, a longer sentence, or deny them parole? 01:51:26.060 |
Again, policy decisions, or why did you pick that treatment? 01:51:30.540 |
Like, that treatment ended up killing that guy. 01:51:32.260 |
Like, why was that a reasonable choice to make? 01:51:35.060 |
So people are gonna demand explanations. 01:51:45.940 |
I'm not sure humans are making reasonable choices 01:51:54.740 |
or even systematically using statistical averages 01:52:09.300 |
and it took a long time for the ambulance to get there, 01:52:12.380 |
and he was not resuscitated right away, and so forth. 01:52:14.540 |
And they came, they told me he was brain dead. 01:52:21.060 |
Under these conditions, with these four features, 01:52:29.660 |
go there and tell me his brain's not functioning 01:53:00.020 |
But the bottom line, it's a fascinating story, by the way, 01:53:02.060 |
about how I reasoned, and how the doctors reasoned 01:53:05.980 |
But I don't know, somewhere around 24 hours later 01:53:11.020 |
- I mean, what lessons do you draw from that story, 01:53:26.460 |
So in other words, you're getting shit wrong, sorry. 01:53:45.220 |
and that you should be reasoning about the specific case 01:54:08.140 |
- Well, so it's hard because it's hard to know that. 01:54:15.740 |
and you'd have to have enough data to essentially say, 01:54:18.260 |
and this goes back to the case of how do we decide 01:54:22.060 |
whether AI is good enough to do a particular task? 01:54:28.700 |
So, and what standards do we hold, right, for that? 01:54:51.540 |
Without it, he probably would have died much sooner. 01:55:04.860 |
but maybe not in that particular case, but overall. 01:55:08.140 |
The medical system overall does more good than bad. 01:55:35.700 |
And have you done enough studies to compare it? 01:55:40.120 |
To say, well, what if we dug in in a more direct, 01:55:45.120 |
let's get the evidence, let's do the deductive thing 01:55:57.600 |
because it depends how fast you have to make the decision. 01:56:09.080 |
and this is a lot of the argument that I had with a doctor, 01:56:20.120 |
- I mean, it raises questions for our society 01:56:22.880 |
to struggle with, as is the case with your father, 01:56:44.000 |
that most of the violent crime is committed by males. 01:56:53.880 |
if it's a male, more likely to commit the crime. 01:56:56.160 |
- So this is one of my very positive and optimistic views 01:57:08.000 |
logically and statistically, and how to combine them 01:57:12.200 |
because, regardless of what state AI devices 01:57:17.200 |
are in or not, it's causing this dialogue to happen. 01:57:24.840 |
that, in my view, the human species can have right now, 01:57:28.200 |
which is how to think well, how to reason well, 01:57:41.000 |
That has got to be one of the most important things 01:57:51.200 |
We've created amazing abilities to amplify noise 01:58:06.320 |
getting hit with enormous amounts of information. 01:58:18.800 |
This is such an important dialogue to be having. 01:58:23.200 |
And we are fundamentally, our thinking can be 01:58:31.440 |
And there are statistics, and we shouldn't blind ourselves, 01:58:37.320 |
but we should understand the nature of statistical inference. 01:58:40.920 |
As a society, we decide to reject statistical inference, 01:58:48.240 |
to favor understanding and deciding on the individual. 01:59:03.240 |
even if the statistics said males are more likely 01:59:27.480 |
We do that out of respect for the individual. 01:59:39.000 |
Because the Jeopardy challenge captivated the world, 01:59:50.280 |
Gary's bitterness aside, captivated the world. 01:59:57.880 |
for the next grand challenge, for future challenges of that kind? 02:00:08.480 |
which is can you demonstrate that they understand, 02:00:23.360 |
but it's a little bit more demanding than the Turing test. 02:00:26.560 |
It's not enough to convince me that you might be human 02:00:38.480 |
For example, can you, the standard is higher, 02:00:56.240 |
whether or not two people actually understand each other 02:01:04.400 |
So the challenge becomes something along the lines 02:01:07.440 |
of can you satisfy me that we have a shared purpose 02:01:14.800 |
So if I were to probe and probe and you probe me, 02:01:18.400 |
can machines really act like thought partners 02:01:23.400 |
where they can satisfy me that we have a shared, 02:01:29.400 |
that we can collaborate and produce answers together 02:01:33.320 |
and that they can help me explain and justify those answers. 02:01:38.120 |
So we'll have an AI system run for president and convince-- 02:01:46.960 |
- You have to convince the voters that they should vote. 02:01:53.800 |
- Again, that's why I think this is such a challenge 02:01:55.920 |
because we go back to the emotional persuasion. 02:02:00.040 |
We go back to, now we're checking off an aspect 02:02:06.080 |
of human cognition that is in many ways weak or flawed. 02:02:13.960 |
Our minds are drawn in, often for the wrong reasons. 02:02:18.960 |
Not the reasons that ultimately matter to us, 02:02:24.000 |
I think we can be persuaded to believe one thing or another 02:02:28.440 |
for reasons that ultimately don't serve us well 02:02:33.200 |
And a good benchmark should not play with those elements 02:02:41.640 |
And I think that's where we have to set the higher standard 02:03:07.480 |
can you identify where it's consistent or contradictory 02:03:15.600 |
So I think another way to think about it perhaps 02:03:22.780 |
Can it help you-- - Oh, that's a really nice, 02:03:50.120 |
and again, this borrows from some science fiction, 02:03:52.840 |
but can you go off and learn about this topic 02:03:58.440 |
and then work with me to help me understand it? 02:04:03.640 |
Well, a machine that passes that kind of test, 02:04:06.960 |
do you think it would need to have self-awareness 02:04:26.960 |
- People used to ask me if Watson was conscious, 02:04:28.760 |
and I used to say, conscious of what exactly? 02:04:34.280 |
- It depends what it is that you're conscious of. 02:04:38.520 |
it's certainly easy for it to answer questions about, 02:04:49.040 |
that would imply that it was aware of things. 02:04:54.440 |
I mean, I think that we differ from one another 02:05:02.680 |
There's degrees of consciousness in there, so-- 02:05:11.160 |
- But nevertheless, there's a very subjective element 02:05:39.080 |
- He wasn't, yeah, so there's an element of finiteness 02:05:42.880 |
to our existence that I think, like you mentioned, 02:06:01.640 |
to be fundamentally important for intelligence, 02:06:09.800 |
Again, I think you could have an intelligence capability 02:06:14.520 |
and a capability to learn, a capability to predict, 02:06:18.520 |
but I think without, I mean, again, you get a fear, 02:06:40.880 |
to try to protect its power source and survive. 02:06:42.760 |
I mean, so I don't know that that's philosophically 02:06:46.680 |
It sounds like a fairly easy thing to demonstrate 02:06:50.080 |
Well, will it come up with that goal by itself? 02:06:50.080 |
I think you have to program that goal in. 02:06:52.400 |
The fact that a robot will be protecting its power source 02:07:06.280 |
would add depth and grounding to its intelligence 02:07:29.520 |
and I don't think if you knew how trivial that was, 02:07:32.120 |
you would associate that with being intelligent. 02:07:35.400 |
I mean, I literally put in a statement of code 02:07:37.480 |
that says you have the following actions you can take. 02:07:44.000 |
or you have the ability to scream or screech or whatever, 02:07:48.960 |
and you say, if you see your power source threatened, 02:07:53.920 |
and you're gonna take these actions to protect it. 02:08:03.800 |
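To make concrete how trivial the behavior being described is, here is a minimal sketch of that kind of hard-coded rule. The sensor and action names are hypothetical placeholders for illustration; this is not Ferrucci's actual code or any real robot API.

```python
# Minimal sketch of a hard-coded "self-preservation" rule.
# "power_source_threatened", "screech", and "guard_power_source" are
# hypothetical names chosen for illustration only.

def react(sensors):
    """Return a list of actions given the current sensor readings."""
    actions = []
    if sensors.get("power_source_threatened", False):
        # The "survival instinct" is just an explicit if-statement:
        actions.append("screech")             # make noise to deter the threat
        actions.append("guard_power_source")  # move to protect the power source
    return actions

# Observed from the outside, the robot "defends itself"; inside, it is only this rule.
print(react({"power_source_threatened": True}))   # ['screech', 'guard_power_source']
print(react({"power_source_threatened": False}))  # []
```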
and you're gonna say, well, that's intelligence, 02:08:06.840 |
Maybe, but that's, again, this human bias that says, 02:08:10.220 |
the thing I, I identify my intelligence and my consciousness 02:08:31.080 |
something that would, that you would be comfortable calling 02:08:42.460 |
certainly not the next few months or 20 years away, 02:08:52.100 |
I mean, I would be, you know, I would be guessing, 02:08:57.000 |
including just how much we want to invest in it, 02:09:02.080 |
what kind of investment we're willing to make in it, 02:09:06.140 |
what kind of talent we end up bringing to the table, 02:09:10.140 |
So I think it is possible to do this sort of thing. 02:09:15.140 |
I think it's, I think trying to sort of ignore many 02:09:25.380 |
It's probably closer to a 20-year thing, I guess. 02:09:29.680 |
- No, I don't think it's several hundred years. 02:09:33.620 |
but again, so much depends on how committed we are 02:09:38.820 |
to investing and incentivizing this type of work. 02:09:45.160 |
like I don't think it's obvious how incentivized we are. 02:09:54.380 |
if we see business opportunities to take this technique 02:09:59.120 |
I think that's the main driver for many of these things. 02:10:09.560 |
And like we just struggled ourselves right now 02:10:14.760 |
So it's hard to incentivize when we don't even know 02:10:29.640 |
There's no clear directive to do precisely that thing. 02:10:32.280 |
- So assistance in a larger and larger number of tasks. 02:10:36.480 |
So being able to, a system that's particularly able 02:10:39.600 |
to operate my microwave and make a grilled cheese sandwich, 02:10:45.000 |
And then the same system would be doing the vacuum cleaning. 02:10:56.300 |
- I think that when you get into a general intelligence 02:11:04.280 |
and again, I wanna go back to your body question, 02:11:06.080 |
'cause I think your body question was interesting, 02:11:07.320 |
but you wanna go back to learning the abilities 02:11:19.040 |
whether it's mowing your lawn or driving a car 02:11:22.760 |
I think we will get better and better at that 02:11:35.600 |
The underlying mechanisms for doing that may be the same, 02:11:55.040 |
that the general learning infrastructure in there 02:12:04.720 |
can we effectively communicate and understand 02:12:23.480 |
How do you get the machine in the intellectual game? 02:12:31.960 |
So it's a little bit of a bootstrapping thing. 02:12:33.800 |
Can we get the machine engaged in the intellectual, 02:12:39.160 |
but in the intellectual dialogue with the humans? 02:13:07.300 |
but when I go back to, are we incentivized to do that? 02:13:13.160 |
Are we incentivized to do the latter significantly enough? 02:13:20.860 |
is to try to articulate that better and better 02:13:24.560 |
and through trying to craft these grand challenges 02:13:40.120 |
And to build up that incentive system around that. 02:13:45.080 |
- Yeah, I think if people don't understand yet, 02:13:47.680 |
I think there's a huge business potential here. 02:13:54.960 |
but I'm a huge fan of physical presence of things. 02:14:03.360 |
Do you think having a body adds to the interactive element 02:14:13.540 |
- So I think going back to that shared understanding bit, 02:14:26.360 |
to kind of be a compatible human intelligence 02:14:29.200 |
is that our physical bodies are generating a lot of features 02:14:42.760 |
but they also generate a lot of input for our brains. 02:14:46.400 |
So we generate emotion, we generate all these feelings, 02:14:49.640 |
we generate all these signals that machines don't have. 02:14:52.800 |
So the machines that have this as the input data, 02:14:58.840 |
okay, I've gotten this, I've gotten this emotion, 02:15:01.240 |
or I've gotten this idea, I now wanna process it, 02:15:04.360 |
and then I can, it then affects me as a physical being, 02:15:12.240 |
In other words, I could realize the implications of that, 02:15:14.080 |
'cause the implications, again, on my mind-body complex, 02:15:17.560 |
I then process that, and the implications, again, 02:15:20.000 |
are internal features are generated, I learn from them, 02:15:30.480 |
Well, if we want a human-compatible intelligence, 02:15:36.800 |
- Just to clarify, and both concepts are beautiful, 02:15:39.980 |
is humanoid robots, so robots that look like humans is one, 02:15:44.980 |
or did you mean actually sort of what Elon Musk 02:16:01.880 |
I meant like if you wanna create an intelligence 02:16:10.880 |
a shared understanding of the world around it, 02:16:13.080 |
you have to give it a lot of the same substrate. 02:16:18.240 |
that it generates these kinds of internal features, 02:16:21.160 |
the sort of emotional stuff, it has similar senses, 02:16:34.280 |
I think that's a fascinating scientific goal. 02:16:35.840 |
I think it has all kinds of other implications. 02:16:41.600 |
as I create intellectual thought partners for humans, 02:17:00.760 |
of what we process is that physical experience 02:17:22.000 |
put another way, sort of having a deep connection 02:17:41.920 |
So in other words, if you develop an emotional relationship 02:17:53.640 |
But at the same time, I think the opportunity 02:17:55.680 |
to use machines to provide human companionship 02:17:59.060 |
Intellectual and social companionship is not a crazy idea. 02:18:09.960 |
Elon Musk, Sam Harris, about long-term existential threats 02:18:18.720 |
We talked about bias, we talked about different misuses, 02:18:21.080 |
but do you have concerns about thought partners, 02:18:25.640 |
systems that are able to help us make decisions 02:18:28.560 |
together with humans, somehow having a significant 02:18:35.300 |
I think giving machines too much leverage is a problem. 02:18:41.480 |
And what I mean by leverage is too much control 02:18:45.640 |
over things that can hurt us, whether it's socially, 02:18:48.880 |
psychologically, intellectually, or physically. 02:18:51.640 |
And if you give the machines too much control, 02:18:54.800 |
Forget about the AI, just when you give them 02:18:57.260 |
too much control, human bad actors can hack them 02:19:10.040 |
the driverless car network and creating all kinds of havoc. 02:19:15.040 |
But you could also imagine, given the ease at which 02:19:20.200 |
humans could be persuaded one way or the other, 02:19:22.800 |
and now we have algorithms that can easily take control 02:19:32.000 |
I mean, humans do that to other humans all the time. 02:19:34.140 |
And we have marketing campaigns, we have political campaigns 02:19:37.120 |
that take advantage of our emotions or our fears. 02:19:44.180 |
But with machines, machines are like giant megaphones, right? 02:19:50.680 |
and fine tune its control so we can tailor the message. 02:19:54.840 |
We can now very rapidly and efficiently tailor the message 02:19:58.600 |
to the audience, taking advantage of their biases 02:20:03.600 |
and amplifying them and using them to persuade them 02:20:06.640 |
in one direction or another in ways that are not fair, 02:20:24.400 |
more quickly, and we see that already going on 02:20:43.760 |
we could be having is about the nature of intelligence 02:21:03.160 |
and how do we use them to complement it basically 02:21:06.040 |
so that in the end we have a stronger overall system. 02:21:15.800 |
So like telling your kids or telling your students, 02:21:24.480 |
Here's how easy it is to trick your brain, right? 02:21:29.480 |
you should appreciate the different types of thinking 02:21:36.800 |
and what do you prefer and under what conditions 02:22:00.760 |
beyond any definition of the Turing test or the benchmark, 02:22:19.320 |
if you get to pick one, would you have with that system? 02:22:43.560 |
I think what excites me is the beauty of it is 02:22:46.040 |
if I really have that system, I don't have to pick. 02:22:57.200 |
Go out, read this stuff in the next three milliseconds. 02:23:15.960 |
Here's what I'm thinking is the main implication. 02:23:21.060 |
Can you give me the evidence that supports that? 02:23:23.260 |
Can you give me evidence that supports this other thing? 02:23:30.360 |
- Just to be part of, whether it's a medical diagnosis 02:23:33.300 |
or whether it's the various treatment options 02:23:38.360 |
or whether it's a social problem that people are discussing, 02:23:41.600 |
be part of the dialogue, one that holds itself and us 02:23:48.240 |
accountable to reasons and objective dialogue. 02:23:56.140 |
- So when you create it, please come back on the podcast 02:24:04.780 |
This is a record for the longest conversation ever.