Stanford XCS224U: NLU | Behavioral Evaluation of NLU Models, Part 3: Compositionality | Spring 2023
Our focus is the principle of compositionality. This is a principle that is important to me as a linguistic semanticist, and it's also arguably a prerequisite for understanding the goals of a lot of the behavioral testing we do in this field. It says: "The meaning of a phrase is a function of the meanings of its immediate syntactic constituents and the way they are combined." That's the principle. Let's unpack it by way of an example.
Here we have a full sentence, "Every student admired the idea." The compositionality principle says that the meaning of the whole sentence is determined by the meaning of its two constituent parts: the subject noun phrase "every student" and the verb phrase "admired the idea."
You can see that this implies a recursive process. What determines the meaning of the subject noun phrase? Well, that is fully determined by the meanings of this Det (determiner) node and this N (noun) node. Those are lexical items, and that's where this recursive process grounds out.
Once you have learned all the lexical items of the languages that you speak and figured out how they combine with each other, you have a recursive process that allows you to combine them and understand novel combinations of these elements. The meaning of any phrase is determined by the meaning of the parts and how they are combined.
We could also think about this in a bottom-up fashion. The meanings of the lexical items determine the meanings of these parent nodes here, and those in turn determine the meanings of the complex nodes above them, and so forth, until we have derived, bottom-up, a meaning for the entire sentence.
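To make the recursion concrete, here is a minimal sketch in Python of bottom-up composition over a toy parse tree. The lexicon, the bracketing, and the tuple-building combination rule are all illustrative assumptions on my part, not the lecture's; real compositional semantics would use richer meaning representations, such as typed lambda terms.

```python
# A minimal sketch of bottom-up composition over a syntax tree.
# Lexicon, tree, and combination rule are all illustrative assumptions.

# Hypothetical lexicon mapping words to toy meaning representations:
LEXICON = {
    "every": "EVERY",
    "student": "student",
    "admired": "admire",
    "the": "THE",
    "idea": "idea",
}

def interpret(node):
    """Return a meaning for `node`, a word (str) or a tuple of subtrees.

    The meaning of a phrase is computed only from the meanings of its
    immediate constituents -- the compositionality principle."""
    if isinstance(node, str):        # lexical item: the recursion grounds out
        return LEXICON[node]
    child_meanings = [interpret(child) for child in node]
    return tuple(child_meanings)     # toy "combination" rule

# "Every student admired the idea", bracketed as
# [[every student] [admired [the idea]]]:
tree = (("every", "student"), ("admired", ("the", "idea")))
print(interpret(tree))
# (('EVERY', 'student'), ('admire', ('THE', 'idea')))
```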
Why do we adhere to this principle? Well, this can be a little bit hard to reconstruct, but I would say that the usual motivations are as follows. If the principle holds, we would model all the meaningful units of the language, and that would imply that we have gone all the way down to the individual lexical items. In practice, I should point out, that means there's a lot of abstraction in linguistic semantics, since we often model the meaning of a word in isolation from the things that it combines with.
Returning to our example: the meaning of the subject noun phrase, combined with the meaning of this verb phrase, finally gives us a meaning for this S node up here, and it's something like universal quantification: for everything that is a student, it has the property of admiring the idea. That would be the fundamental claim of the sentence, and you can see there that that claim was driven by the meanings of the parts and the way they were combined.
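As a rough logical form (my notation, not necessarily the lecture's slide; I treat "the idea" as a constant, glossing over the definite determiner), the claim is something like:

```latex
\forall x \, \big( \textit{student}(x) \rightarrow \textit{admire}(x, \textit{the\_idea}) \big)
```

read as: for every x, if x is a student, then x admires the idea.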
Another common motivation is the apparent infinitude of natural language. I grant that there is some sense in which this is true, because there seems to be no principled bound on the length of the sentences we can produce and understand. But I'm sad to report that we are all finite beings, and therefore there is only a finite capacity in all of us. We have encountered only finitely many sentences, and yet, nonetheless, we are able to instantly and effortlessly interpret entirely novel ones. That does imply that there is some capacity in us for recombining familiar parts, and compositionality could be seen as an explanation for that.
A closely related notion is systematicity, which I think is a slightly more general notion than compositionality and may be a more correct characterization of what we care about. A lot of work in this area travels under the heading of compositionality or systematicity. The classic definition of systematicity is that "the ability to produce or understand certain sentences is intrinsically connected to the ability to produce or understand certain other ones."
The idea is that if you understand the sentence "the puppy loves the turtle," then you effortlessly understand "the turtle loves the puppy," "the turtle loves Sandy," and so forth and so on. You get this instant explosion in the number of things that you can produce and understand, and that comes from your own understanding of language being so systematic.
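Here is a tiny illustration (mine, not the lecture's) of that explosion: recombining a handful of familiar parts yields many sentences, all of which a systematic understander should handle.

```python
# The systematicity "explosion": a few building blocks, many sentences.
# The vocabulary here is illustrative.
from itertools import product

subjects = ["the puppy", "the turtle", "Sandy"]
verbs = ["loves", "admires"]
objects = ["the puppy", "the turtle", "Sandy"]

sentences = [f"{s} {v} {o}" for s, v, o in product(subjects, verbs, objects)]
print(len(sentences))    # 18 sentences from 5 building blocks
print(sentences[:3])     # ['the puppy loves the puppy', ...]
```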
Compositionality could be one explanation for systematicity, but I think systematicity is arguably more general. There is no mention of syntactic structure in the characterization here, so it might allow for things that are systematic without being compositional in the strict sense.
Systematicity is a powerful idea for thinking about how we evaluate our models, especially the hypothesis-driven challenge tests that we run, because very often, when we express concerns about systems, the underlying worry is a failure of systematicity. Here is an example. This is from a real sentiment classification model that I developed, that I thought was pretty good, and I started posing little challenge problems to it.
I was initially very encouraged by these examples. An example like this makes a generally positive claim about this bakery's pies, and it involves this very unusual sense of "mean," on which a mean pie is an excellent pie. The model predicted positive, and I started to think that my model truly understood this very specialized sense of the adjective "mean."
But that fell apart with the next two examples. There the model predicted negative, whereas the gold label is, of course, still positive. I have no expectation that changing the subject to a pronoun, as opposed to a full noun phrase like "the bakery," should have any effect on the interpretation of the adjective "mean" in these cases. But the model's predictions changed, and that manifests for me as a lack of systematicity.
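Here is a sketch of the kind of challenge test being described, under stated assumptions: the `predict_sentiment` function is a hypothetical stand-in for a trained classifier, and the minimal pair is illustrative data I made up in the spirit of the bakery examples.

```python
# A minimal-pair challenge test. `predict_sentiment` is a hypothetical
# stand-in for a trained sentiment classifier; the pair below is
# illustrative data, not the lecture's exact examples.

def predict_sentiment(text: str) -> str:
    # Toy dummy model that mishandles pronoun subjects, mimicking the
    # non-systematic failure described above. Replace with a real model.
    return "negative" if text.startswith("They") else "positive"

minimal_pairs = [
    # (original, subject swapped for a pronoun, gold label for both)
    ("The bakery sells mean pies.", "They sell mean pies.", "positive"),
]

for original, variant, gold in minimal_pairs:
    pred_orig = predict_sentiment(original)
    pred_var = predict_sentiment(variant)
    if pred_orig != pred_var:
        print(f"Non-systematic: {original!r} -> {pred_orig}, "
              f"{variant!r} -> {pred_var} (gold: {gold})")
```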
I think that is typical of how challenge tests arise: researchers have a hypothesis grounded in the systematicity of language, they observe departures from that in their models, and they begin to worry about those models.
Let's put this in a bit of historical context. In earlier phases of the field, we got compositionality by design, because those systems were built on grammars that themselves adhere to the compositionality principle. In semantic parsing systems like this one, depicted from Percy Liang's work, underlyingly there was a compositional grammar, even with the flexibility associated with them being probabilistic models. In the earlier deep learning era, we again saw systems that were arguably compositional. Here is an example: it's a recursive, tree-structured neural network. It abides by the compositionality principle in the sense that the representation for each node is determined entirely by the representations of its child nodes.
There was a complicated deep learning function that combined the child representations, which is not the way of much work in linguistic semantics, but the result is still recognizably compositional.
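Here is a minimal sketch of that composition step, assuming the classic formulation where a parent vector is a learned function of its two child vectors; the dimensions, weights, and example tree are all illustrative, not the lecture's exact model.

```python
# Composition step of a recursive tree-structured neural network:
# each parent vector depends only on its two child vectors, which is
# what makes the model compositional by design.
import numpy as np

rng = np.random.default_rng(0)
d = 4                                # toy embedding dimension
W = rng.normal(size=(d, 2 * d))      # composition weights (learned in practice)
b = np.zeros(d)

def compose(h_left: np.ndarray, h_right: np.ndarray) -> np.ndarray:
    """Parent representation from the two child representations."""
    return np.tanh(W @ np.concatenate([h_left, h_right]) + b)

# Bottom-up pass over [[every student] [admired [the idea]]]:
emb = {w: rng.normal(size=d) for w in ["every", "student", "admired", "the", "idea"]}
np_subj = compose(emb["every"], emb["student"])
np_obj = compose(emb["the"], emb["idea"])
vp = compose(emb["admired"], np_obj)
sentence = compose(np_subj, vp)
print(sentence.shape)                # (4,)
```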
Contrast that with these huge, typically Transformer-based models, where everything is connected to everything else.
The worry was that they were learning non-systematic solutions, and that motivated a lot of challenge testing for them.
What we have seen since then, though, is often an amazing discovery about the power of even these models to deliver systematic solutions to language problems, and that may be part of how they solve the hard behavioral tasks that we pose for them.