
Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs | Lex Fridman Podcast #426


Chapters

0:00 Introduction
1:13 Human language
5:19 Generalizations in language
11:06 Dependency grammar
21:05 Morphology
29:40 Evolution of languages
33:00 Noam Chomsky
77:06 Thinking and language
90:36 LLMs
103:35 Center embedding
130:02 Learning a new language
133:54 Nature vs nurture
140:30 Culture and language
154:58 Universal language
159:21 Language translation
162:36 Animal communication

Whisper Transcript

00:00:00.000 | Naively, I certainly thought that all humans would have words for exact counting, and the
00:00:07.240 | Piraha don't.
00:00:08.240 | Okay?
00:00:09.240 | So, they don't have any words for even one.
00:00:11.680 | There's not a word for one in their language.
00:00:13.720 | And so, there's certainly not a word for two, three, or four, so that kind of blows people's
00:00:17.800 | minds.
00:00:18.800 | Yeah, that's blowing my mind.
00:00:20.240 | That's pretty weird, isn't it?
00:00:21.240 | How are you going to ask, "I want two of those"?
00:00:23.360 | You just don't.
00:00:24.360 | And so, that's just not a thing you can possibly ask in the Piraha.
00:00:27.920 | It's not possible.
00:00:29.160 | There's no words for that.
00:00:32.360 | The following is a conversation with Edward Gibson, or Ted, as everybody calls him.
00:00:37.480 | He is a psycholinguistics professor at MIT.
00:00:40.640 | He heads the MIT Language Lab that investigates why human languages look the way they do,
00:00:46.680 | the relationship between culture and language, and how people represent, process, and learn
00:00:52.680 | language.
00:00:53.680 | Also, he should have a book titled "Syntax: A Cognitive Approach," published by MIT Press
00:00:59.760 | coming out this fall.
00:01:01.360 | So, look out for that.
00:01:03.840 | This is the Lex Fridman Podcast.
00:01:05.560 | To support it, please check out our sponsors in the description.
00:01:09.200 | And now, dear friends, here's Edward Gibson.
00:01:13.800 | When did you first become fascinated with human language?
00:01:17.580 | As a kid in school, when we had to structure sentences in English grammar, I found that
00:01:24.620 | process interesting.
00:01:25.780 | I found it confusing as to what it was I was told to do.
00:01:29.700 | I didn't understand what the theory was behind it, but I found it very interesting.
00:01:34.540 | So, when you look at grammar, you're almost thinking about it like a puzzle, almost like
00:01:38.100 | a mathematical puzzle?
00:01:39.260 | Yeah, I think that's right.
00:01:40.540 | I didn't know I was going to work on this at all at that point.
00:01:42.780 | I was really just, I was kind of a math geek person, a computer scientist.
00:01:47.340 | I really liked computer science.
00:01:48.740 | And then I found language as a neat puzzle to work on from an engineering perspective,
00:01:55.980 | actually.
00:01:56.980 | That's what I, as a, I sort of accidentally, I decided after I finished my undergraduate
00:02:03.060 | degree, which was computer science and math in Canada, at Queen's University, I decided
00:02:08.300 | to go to grad school.
00:02:09.300 | It's like, that's what I always thought I would do.
00:02:11.420 | And I went to Cambridge, where they had a master's program in computational linguistics,
00:02:18.140 | and I hadn't taken a single language class before.
00:02:21.740 | All I'd taken was CS, computer science, math classes, pretty much, mostly, as an undergrad.
00:02:26.940 | And I just thought this was an interesting thing to do for a year, because it was a single
00:02:31.580 | year program.
00:02:33.300 | And then I ended up spending my whole life doing it.
00:02:35.980 | So fundamentally, your journey through life was one of a mathematician and a computer
00:02:39.820 | scientist, and then you kind of discovered the puzzle, the problem of language, and approached
00:02:46.020 | it from that angle, to try to understand it from that angle, almost like a mathematician
00:02:51.860 | or maybe even an engineer.
00:02:53.780 | - As an engineer, I'd say, I mean, to be frank, I had taken an AI class, I guess it was '83
00:02:59.500 | or '84, '85, somewhere '84 in there, a long time ago, and there was a natural language
00:03:03.140 | section in there, and it didn't impress me.
00:03:06.540 | I thought, there must be more interesting things we can do.
00:03:10.140 | It didn't seem very, it seemed just a bunch of hacks to me.
00:03:14.900 | It didn't seem like a real theory of things in any way.
00:03:17.780 | And so I just thought this seemed like an interesting area where there wasn't enough
00:03:23.260 | good work.
00:03:24.260 | - Did you ever come across the philosophy angle of logic?
00:03:27.940 | So if you think about the '80s with AI, the expert systems where you try to kind of maybe
00:03:34.180 | sidestep the poetry of language and some of the syntax and the grammar and all that kind
00:03:38.820 | of stuff and go to the underlying meaning that language is trying to communicate and
00:03:43.380 | try to somehow compress that in a computer-representable way, did you ever come across that in your
00:03:49.900 | studies?
00:03:50.900 | - I mean, I probably did, but I wasn't as interested in it.
00:03:53.380 | I was trying to do the easier problems first, the ones I could, thought maybe were handleable,
00:03:58.940 | which seems like the syntax is easier, which is just the forms as opposed to the meaning,
00:04:04.140 | like when you're starting to talk about the meaning, that's a very hard problem, and it
00:04:07.780 | still is a really, really hard problem.
00:04:09.860 | But the forms is easier, and so I thought at least figuring out the forms of human language,
00:04:16.020 | which sounds really hard, but is actually maybe more tractable.
00:04:19.380 | - So it's interesting.
00:04:20.380 | You think there is a big divide, there's a gap, there's a distance between form and meaning,
00:04:26.420 | because that's a question you have discussed a lot with LLMs, because they're damn good
00:04:32.780 | at form.
00:04:33.780 | - Yeah.
00:04:34.780 | - I think that's what they're good at, is form.
00:04:35.780 | - Yeah.
00:04:36.780 | - Exactly.
00:04:37.780 | And that's why they're good, because they can do form.
00:04:38.780 | Meaning's hard.
00:04:39.780 | - Do you think there's, oh, wow.
00:04:40.780 | I mean, it's an open question, right, how close form and meaning are, but we'll discuss that.
00:04:46.380 | But to me, studying form, maybe it's a romantic notion, gives you, form is like the shadow
00:04:54.300 | of the bigger meaning thing underlying language, as I, form is, language is how we communicate
00:05:02.540 | ideas, we communicate with each other using language.
00:05:05.580 | So in understanding the structure of that communication, I think you start to understand
00:05:10.860 | the structure of thought and the structure of meaning behind those thoughts and communication,
00:05:16.100 | to me.
00:05:17.100 | But to you, big gap.
00:05:18.660 | - Yeah.
00:05:19.660 | - What do you find most beautiful about human language, maybe the form of human language,
00:05:26.060 | the expression of human language?
00:05:28.020 | - What I find beautiful about human language is the, some of the generalizations that happen
00:05:34.460 | across the human language, just within and across a language.
00:05:37.380 | So let me give you an example of something which I find kind of remarkable, that is if
00:05:43.220 | a language, if it has a word order such that the verbs tend to come before their objects,
00:05:49.860 | and so that's like English does that.
00:05:51.380 | So we have the first, the subject comes first in a simple sentence, so I say, the dog chased
00:05:59.100 | the cat, or Mary kicked the ball, so the subject's first, and then after the subject, there's
00:06:03.980 | the verb, and then we have objects, all these things come after in English.
00:06:08.220 | So it's generally a verb, and most of the stuff that we want to say comes after the
00:06:11.940 | subject, it's the objects, there's a lot of things we want to say that come after.
00:06:16.060 | And there's a lot of languages like that, about 40% of the languages of the world look
00:06:20.060 | like that, they're subject-verb-object languages.
00:06:24.180 | And then these languages tend to have prepositions, these little markers on the nouns that connect
00:06:34.500 | nouns to other nouns, or nouns to verbs.
00:06:36.340 | So a preposition like in, or on, or of, or about, I say I talk about something, the something
00:06:44.100 | is the object of that preposition, we have these little markers come, just like verbs,
00:06:49.260 | they come before their nouns.
00:06:51.820 | Okay, and then, so, now we look at other languages, like Japanese, or Hindi, or some, these are
00:06:57.820 | so-called verb-final languages, those, maybe a little more than 40%, maybe 45% of the world's
00:07:04.980 | languages, or more, I mean 50% of the world's languages are verb-final, those tend to be
00:07:09.500 | postpositions, those markers, they have the same kinds of markers as we do in English,
00:07:17.580 | but they put 'em after.
00:07:18.980 | So, sorry, but they put 'em first, the markers come first, so you say, instead of, you know,
00:07:25.340 | talk about a book, you say a book about, the opposite order there, in Japanese or in Hindi,
00:07:32.700 | you do the opposite, and the talk comes at the end, so the verb will come at the end
00:07:36.660 | as well.
00:07:37.660 | So instead of Mary kicked the ball, it's Mary ball kicked, and then if it says Mary kicked
00:07:44.660 | the ball to John, it's John to, the to, the marker there, the preposition, it's a postposition
00:07:51.540 | in these languages.
00:07:52.540 | And so the interesting thing, fascinating thing to me, is that within a language, this
00:07:58.180 | order aligns, it's harmonic, and so if it's one or the other, it's either verb-initial
00:08:05.980 | or verb-final, but then you'll have prepositions or postpositions, and that's
00:08:11.660 | across the languages that we can look at, we've got around 1,000 languages for, there's
00:08:16.420 | around 7,000 languages on the Earth right now, but we have information about, say, word
00:08:22.760 | order on around 1,000 of those, a pretty decent amount of information.
00:08:27.120 | And for those 1,000 which we know about, about 95% fit that pattern, so they will have either
00:08:34.060 | verb-initial, it's about half and half, half are verb-initial, like English, and half are verb-final,
00:08:39.660 | like Japanese.
00:08:40.660 | - So just to clarify, verb-initial is subject-verb-object.
00:08:45.300 | - That's correct.
00:08:46.300 | - Verb-final is still subject-object-verb.
00:08:50.160 | - That's correct, yeah, the subject is generally first.
00:08:52.220 | - That's so fascinating, "I ate an apple," or "I apple ate," okay, and it's fascinating
00:08:59.500 | that there's a pretty even division in the world amongst those, 40, 45%.
00:09:03.660 | - Yeah, it's pretty even.
00:09:05.780 | And those two are the most common by far, those two word orders, the subject tends to
00:09:08.900 | be first.
00:09:09.900 | There's so many interesting things, but these things are, the thing I find so fascinating
00:09:12.900 | is there are these generalizations within and across a language.
00:09:17.340 | And not only those, and there's actually a simple explanation, I think, for a lot of
00:09:22.540 | that, and that is you're trying to minimize dependencies between words.
00:09:28.660 | That's basically the story, I think, behind a lot of why word order looks the way it is,
00:09:34.220 | is we're always connecting, what is the thing I'm telling you?
00:09:38.220 | I'm talking to you in sentences, you're talking to me in sentences, these are sequences of
00:09:42.060 | words which are connected, and the connections are dependencies between the words.
00:09:47.860 | And it turns out that what we're trying to do in a language is actually minimize those
00:09:53.580 | dependency links.
00:09:54.580 | It's easier for me to say things if the words that are connecting for their meaning are
00:09:58.300 | close together.
00:09:59.300 | It's easier for you in understanding if that's also true.
00:10:03.500 | If they're far away, it's hard to produce that, and it's hard for you to understand.
00:10:08.820 | And the languages of the world, within a language and across languages, fit that generalization,
00:10:14.020 | which is, so it turns out that having verbs initial and then having prepositions ends
00:10:21.220 | up making dependencies shorter.
00:10:23.140 | And having verbs final and having postpositions ends up making dependencies shorter than if
00:10:28.500 | you cross them.
00:10:29.500 | If you cross them, it ends up, you just end up, it's possible.
00:10:32.380 | You can do it.
00:10:33.380 | - Within a language.
00:10:34.380 | - Within a language, you can do it.
00:10:35.460 | It just ends up with longer dependencies than if you didn't.
00:10:39.060 | And so languages tend to go that way.
00:10:40.740 | They tend to, they call it harmonic.
00:10:43.900 | So it was observed a long time ago, without the explanation, by a guy called Joseph Greenberg,
00:10:49.860 | who's a famous typologist from Stanford.
00:10:53.460 | He observed a lot of generalizations about how word order works, and these are some of
00:10:57.220 | the harmonic generalizations that he observed.
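A rough illustration of the dependency-length idea described above, as a minimal Python sketch. The sentence, the head-dependent links, and the word positions are invented toy examples under one simplified dependency analysis, not real typological data.

def total_dependency_length(links):
    # links: (head_position, dependent_position) pairs over word positions 0, 1, 2, ...
    return sum(abs(h - d) for h, d in links)

# Harmonic order: verb before its object, preposition before its noun.
# "Mary talked about a book" -> positions 0..4
harmonic_links = [(1, 0),  # talked -> Mary (subject)
                  (1, 2),  # talked -> about
                  (2, 4),  # about -> book
                  (4, 3)]  # book -> a

# Non-harmonic mix: verb-initial order but a postposition, "Mary talked a book about"
mixed_links = [(1, 0),  # talked -> Mary
               (1, 4),  # talked -> about (now sentence-final)
               (4, 3),  # about -> book
               (3, 2)]  # book -> a

print(total_dependency_length(harmonic_links))  # 5: the harmonic order keeps links short
print(total_dependency_length(mixed_links))     # 6: mixing the pattern lengthens them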
00:11:00.380 | - Harmonic generalizations about word order.
00:11:02.820 | There's so many things I want to ask you.
00:11:04.740 | - Okay, good.
00:11:05.740 | - Okay, let me just, sometimes basics.
00:11:07.900 | You mentioned dependencies a few times.
00:11:09.540 | - Yeah, yeah.
00:11:10.540 | - What do you mean by dependencies?
00:11:12.140 | - Well, what I mean is, in language, there's kind of three structures to, three components
00:11:18.820 | to the structure of language.
00:11:19.860 | One is the sounds.
00:11:21.260 | So cat is /k/, /æ/, and /t/ in English.
00:11:23.940 | I'm not talking about that part, I'm talking, then there's two meaning parts, and those
00:11:27.420 | are the words, and you were talking about meaning earlier.
00:11:30.420 | So words have a form, and they have a meaning associated with them, and so cat is a full
00:11:35.300 | form in English, and it has a meaning associated with whatever a cat is.
00:11:38.760 | And then the combinations of words, that's what I'll call grammar or syntax, and that's
00:11:45.180 | like when I have a combination like "the cat" or "two cats," okay?
00:11:49.660 | So where I take two different words there and put them together, and I get a compositional
00:11:54.960 | meaning from putting those two different words together, and so that's the syntax.
00:11:59.180 | And in any sentence or utterance, whatever I'm talking to you, you're talking to me,
00:12:04.340 | we have a bunch of words and we're putting together in a sequence, it turns out they
00:12:08.980 | are connected so that every word is connected to just one other word in that sentence.
00:12:17.020 | And so you end up with what's called technically a tree, it's a tree structure, where there's
00:12:21.460 | a root of that utterance, of that sentence, and then there's a bunch of dependents, like
00:12:27.740 | branches from that root that go down to the words.
00:12:31.140 | The words are the leaves in this metaphor for a tree.
00:12:34.700 | So a tree is also sort of a mathematical construct.
00:12:37.180 | Yeah, yeah, it's a graph theoretical thing, exactly.
00:12:40.180 | So it's fascinating that you can break down a sentence into a tree, and then every word
00:12:45.380 | is hanging on to another, it's depending on it.
00:12:47.820 | That's right.
00:12:48.820 | And everyone agrees on that, so all linguists will agree with that, that is not controversial.
00:12:53.380 | There's nobody sitting here listening mad at you.
00:12:55.700 | I do not think so.
00:12:56.700 | I don't think so.
00:12:57.700 | No one is sitting there mad at this.
00:12:58.700 | No, no.
00:12:59.700 | I think in every language, I think everyone agrees that all sentences are trees at some
00:13:04.540 | level.
00:13:05.540 | Can I pause on that?
00:13:06.540 | Sure.
00:13:07.540 | 'Cause it, to me, just as a layman, it's surprising that you can break down sentences
00:13:14.380 | in many, mostly all languages into a tree.
00:13:17.820 | I think so.
00:13:18.820 | That's weird.
00:13:19.820 | I've never heard of anyone disagreeing with that.
00:13:21.500 | That's weird.
00:13:22.500 | The details of the trees are what people disagree about.
00:13:25.580 | Well, okay, so what's at the root of a tree, how do you construct, how hard is it, what
00:13:30.860 | is the process of constructing a tree from a sentence?
00:13:34.180 | Well, this is where, you know, depending on what your, there's different theoretical notions.
00:13:38.420 | I'm gonna say the simplest thing, dependency grammar.
00:13:41.380 | It's like a bunch of people invented this, Tesnière was the first, a French guy, back in,
00:13:46.060 | I mean, the paper was published in 1959, but he was working on the '30s and stuff, so,
00:13:50.900 | and it goes back to, you know, the philologist Pāṇini was doing this in ancient India, okay?
00:13:57.980 | And so, you know, doing something like this, the simplest thing we can think of is that
00:14:02.420 | there's just connections between the words to make the utterance.
00:14:06.420 | And so, let's just say I have, like, two dogs entered a room, okay, here's a sentence.
00:14:11.980 | And so, we're connecting two and dogs together, that's like, there's some dependency between
00:14:17.100 | those words to make some bigger meaning, and then we're connecting dogs now to entered,
00:14:22.780 | right?
00:14:23.860 | And we connect a room somehow to entered, and so I'm gonna connect to room and then
00:14:28.620 | room back to entered.
00:14:30.580 | That's the tree, is I, the root is entered, that's, the thing is like an entering event,
00:14:35.060 | that's what we're saying here, and the subject, which is whatever that dog is, is two dogs,
00:14:39.740 | it was, and the connection goes back to dogs, which then goes back to two.
00:14:45.380 | That's my tree.
00:14:46.780 | It starts at entered, goes to dogs, down to two, and then the other side, after the verb,
00:14:52.300 | the object, it goes to room, and then that goes back to the determiner or article, whatever
00:14:57.500 | you want to call that word.
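To make the tree just described concrete, here is a minimal Python sketch of it; the encoding (one head index per word) is my own toy illustration, not any particular parser's format.

sentence = ["two", "dogs", "entered", "a", "room"]
# head[i] = position of the word that word i depends on; -1 marks the root
head = [1, 2, -1, 4, 2]  # two->dogs, dogs->entered, entered is the root, a->room, room->entered

def find_root(head):
    roots = [i for i, h in enumerate(head) if h == -1]
    assert len(roots) == 1, "a sentence has exactly one root"
    for i in range(len(head)):          # every word must reach the root without looping,
        seen, j = set(), i              # so the links form a tree
        while head[j] != -1:
            assert j not in seen, "cycle found, so this is not a tree"
            seen.add(j)
            j = head[j]
    return roots[0]

print(sentence[find_root(head)])        # entered
for i, h in enumerate(head):
    if h != -1:
        print(sentence[i], "depends on", sentence[h])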
00:14:58.500 | So, there's a bunch of categories of words here, we're noticing, so there are verbs,
00:15:02.780 | those are these things that typically mark, they refer to events and states in the world,
00:15:08.660 | and there are nouns, which typically refer to people, places, and things, is what people
00:15:12.660 | say, but they can refer to other more, they can refer to events themselves as well.
00:15:17.060 | They're marked by, you know, how they get, the category, the part of speech of a word
00:15:22.980 | is how it gets used in language.
00:15:25.700 | It's like, that's how you decide what the category of a word is, not by the meaning,
00:15:30.140 | but how it gets used.
00:15:31.580 | - How it's used.
00:15:32.580 | What's usually the root, is it gonna be the verb that defines the event?
00:15:36.820 | - Usually, usually, yes, yes.
00:15:38.220 | - Okay.
00:15:39.220 | - Yeah, I mean, if I don't say a verb, then there won't be a verb, and so it'll be something
00:15:42.380 | else.
00:15:43.380 | - What if you're messing, are we talking about language that's like, correct language?
00:15:46.300 | What if you're doing poetry and messing with stuff, is it, then rules go out the window,
00:15:51.020 | right?
00:15:52.020 | Then it's--
00:15:53.020 | - No.
00:15:54.020 | - You're still--
00:15:55.020 | - No, no, no, no, you're constrained by whatever language you're dealing with.
00:15:56.700 | Probably you have other constraints in poetry, such that you're, like usually in poetry there's
00:16:00.900 | multiple constraints that you want to, like you want to usually convey multiple meanings
00:16:04.820 | is the idea, and maybe you have like a rhythm or a rhyming structure as well, and depending,
00:16:09.980 | but you usually are constrained by your, the rules of your language for the most part,
00:16:14.660 | and so you don't violate those too much.
00:16:17.620 | You can violate them somewhat, but not too much, so it has to be recognizable as your
00:16:22.300 | language.
00:16:23.300 | Like in English, I can't say, "Dogs two entered room a."
00:16:27.500 | I mean, I meant that, you know, two dogs entered a room, and I can't mess with the order of
00:16:33.540 | the articles, the articles and the nouns, you just can't do that.
00:16:37.420 | In some languages, you can mess around with the order of words much more.
00:16:42.380 | I mean, you speak Russian, Russian has a much freer word order than English, and so in fact
00:16:46.900 | you can move around words in, you know, I told you that English has a subject, verb,
00:16:51.540 | object, word order, so does Russian, but Russian is much freer than English, and so you can
00:16:56.320 | actually mess around with the word order, so probably Russian poetry is gonna be quite
00:17:00.740 | different from English poetry because the word order is much less constrained.
00:17:04.820 | - Yeah, there's a much more extensive culture of poetry throughout the history of the last
00:17:10.540 | hundred years in Russia, and I always wondered why that is, but it seems that there's more
00:17:15.780 | flexibility in the way the language is used.
00:17:20.100 | You're morphing the language easier by altering the words, altering the order of the words,
00:17:25.340 | messing with it.
00:17:26.340 | - Well, you can just mess with different things in each language, and so in Russian, you have
00:17:29.780 | case markers, which are just these endings on the nouns which tell you how each noun
00:17:35.820 | connects to the verb, right?
00:17:37.100 | We don't have that in English, and so when I say Mary kissed John, I don't know who the
00:17:42.820 | agent or the patient is except by the order of the words, right?
00:17:46.220 | In Russian, you actually have a marker on the end if you're using a Russian name, and
00:17:49.820 | each of those names, you'll also say is it, you know, it'll be the nominative, which is
00:17:55.660 | marking the subject, or an accusative will mark the object.
00:17:58.900 | And you could put them in the reverse order.
00:18:00.780 | You could put accusative first, you could put subject, you could put the patient first,
00:18:07.340 | and then the verb, and then the subject, and that would be a perfectly good Russian sentence,
00:18:11.660 | and it would still mean, I could say John kissed Mary, meaning Mary kissed John, as
00:18:17.900 | long as I use the case markers in the right way.
00:18:19.740 | You can't do that in English.
00:18:21.020 | - I love the terminology of agent and patient, and the other ones you used, those are sort
00:18:27.940 | of linguistic terms, correct?
00:18:29.320 | - Those are, those are for, like, kind of meaning, those are meaning, and subject and
00:18:32.540 | object are generally used for position, so subject is just like the thing that comes
00:18:37.140 | before the verb, and the object is the one that comes after the verb.
00:18:40.180 | The agent is kind of like the thing doing, that's kind of what that means, right?
00:18:44.380 | The subject is often the person doing the action, right, the thing, so yeah.
00:18:48.500 | - Okay, this is fascinating.
00:18:49.780 | So how hard is it to form a tree in general, is there a procedure to it, like if you look
00:18:55.260 | at different languages, is it supposed to be a very natural, like is it automatable,
00:18:59.140 | or is there some human genius involved in constructing it?
00:19:01.220 | - I think it's pretty automatable at this point.
00:19:03.820 | People can figure out what the words are, they can figure out the morphemes, which are
00:19:05.860 | the, technically, morphemes are the minimal meaning units within a language, okay?
00:19:10.980 | And so, when you say eats, or drinks, it actually has two morphemes in it in English, there's
00:19:16.300 | the root, which is the verb, and then there's some ending on it which tells you, you know,
00:19:20.540 | that's this third person, third person singular.
00:19:23.340 | - Can you say what morphemes are?
00:19:24.860 | - Morphemes are just the minimal meaning units within a language, and a word is just, kind
00:19:28.060 | of the things we put spaces between in English, and they're a little bit more, they have the
00:19:31.940 | morphology as well, they have the endings, this inflectional morphology on the endings
00:19:36.140 | on the roots.
00:19:37.140 | - It modifies something about the word that adds additional meaning.
00:19:40.100 | - Yeah, yeah, yeah, and so we have a little bit of that in English, very little, much
00:19:43.300 | more in Russian, for instance, but we have a little bit in English, and so we have a
00:19:47.340 | little on the nouns, you can say it's either singular or plural, and you can say, same
00:19:52.220 | thing for verbs, like simple past tense, for example, it's like, you know, notice in English
00:19:58.100 | we say drinks, you know, he drinks, but everyone else is I drink, you drink, we drink, it's
00:20:02.860 | unmarked in a way, and then, but in the past tense, it's just drank, for everyone, there's
00:20:07.420 | no morphology at all for past tense.
00:20:09.820 | There is morphology, it's marking past tense, but it's kind of, it's an irregular now, so
00:20:13.820 | we don't even, you know, drink to drank, you know, it's not even a regular word, so in
00:20:17.720 | most verbs, many verbs, there's an -ed, we kind of add, so walk to walked, we add that
00:20:22.380 | to say it's the past tense, that I just happened to choose an irregular, 'cause it's a high-frequency
00:20:26.480 | word, and the high-frequency words tend to have irregulars in English.
00:20:30.380 | - What's an irregular?
00:20:31.380 | - Irregular is just, there isn't a rule, so drink to drank, it's an irregular.
00:20:35.480 | - Drink, drank, okay, versus walked.
00:20:37.260 | - As opposed to walk, walked, talked, talked.
00:20:40.020 | - And there's a lot of irregulars in English.
00:20:42.420 | - There's a lot of irregulars in English.
00:20:44.180 | The frequent ones, the common words, tend to be irregular, there's many, many more low-frequency
00:20:50.760 | words, and those tend to be, those are regular ones.
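To make the regular/irregular contrast concrete, here is a toy Python sketch (my own illustration with a hand-picked lexicon, not a serious morphological analyzer): a few frequent verbs are stored as memorized exceptions, and everything else falls through to the regular -ed rule.

IRREGULAR_PAST = {"drink": "drank", "eat": "ate", "sleep": "slept", "go": "went"}

def past_tense(verb):
    if verb in IRREGULAR_PAST:   # irregular: a memorized form, just look it up
        return IRREGULAR_PAST[verb]
    if verb.endswith("e"):       # regular spelling tweak: bake -> baked
        return verb + "d"
    return verb + "ed"           # regular rule: walk -> walked

for v in ["drink", "walk", "talk", "sleep"]:
    print(v, "->", past_tense(v))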
00:20:53.120 | - The evolution of the irregulars are fascinating, 'cause it's essentially slang that's sticky,
00:20:57.240 | 'cause you're breaking the rules, and then everybody uses it and doesn't follow the rules,
00:21:01.960 | and they say screw it to the rules, it's fascinating, so you said morphemes, lots of questions,
00:21:07.840 | so morphology is what, the study of morphemes?
00:21:11.040 | - Morphology is how the morphemes connect onto the roots.
00:21:14.880 | So in English, we mostly have suffixes, we have endings on the words, not very much,
00:21:18.840 | but a little bit, as opposed to prefixes.
00:21:22.000 | Some words, depending on your language, can have mostly prefixes, mostly suffixes, or
00:21:28.120 | mostly, or both, and then even languages, several languages have things called infixes,
00:21:32.840 | where you have some kind of a general form for the root, and you put stuff in the middle,
00:21:41.760 | you change the vowels.
00:21:42.760 | - That's fascinating, that's fascinating, so in general, there's what, two morphemes
00:21:48.760 | per word, usually one or two, or three?
00:21:51.760 | - Well in English, it's one or two, in English, it tends to be one or two, there can be more.
00:21:56.640 | In other languages, a language like Finnish, which has a very elaborate morphology, there
00:22:02.800 | may be 10 morphemes on the end of a root, and so there may be millions of forms of a
00:22:08.720 | given word.
00:22:09.720 | - Okay, I will ask the same question over and over, but how does, just sometimes to
00:22:18.800 | understand things like morphemes, it's nice to just ask the question, how do these kinds
00:22:24.560 | of things evolve?
00:22:26.480 | So you have a great book studying sort of the, how the cognitive processing, how language
00:22:35.320 | used for communication, so the mathematical notion of how effective language is for communication,
00:22:40.360 | what role that plays in the evolution of language, but just high level, like how do we, how does
00:22:46.120 | a language evolve with, where English is two morphemes, or one or two morphemes per word,
00:22:51.520 | and then Finnish has infinity per word?
00:22:54.760 | So what, how does that happen, is it just people?
00:22:58.760 | - That's a really good question.
00:23:00.600 | That's a very good question, is why do languages have more morphology versus less morphology,
00:23:06.640 | and I don't think we know the answer to this.
00:23:08.520 | I think there's just a lot of good solutions to the problem of communication.
00:23:13.440 | So I believe, as you hinted, that language is an invented system by humans for communicating
00:23:22.080 | their ideas, and I think it comes down to we label the things we want to talk about,
00:23:26.560 | those are the morphemes and words, those are the things we want to talk about in the world,
00:23:30.320 | and we invent those things, and then we put them together in ways that are easy for us
00:23:36.120 | to convey, to process.
00:23:38.120 | But that's like a naive view, and I don't, I mean, I think it's probably right, right?
00:23:42.960 | It's naive and probably right, but--
00:23:43.960 | - One has to notice, I don't know if it's naive, I think it's simple.
00:23:46.640 | - Simple, yeah.
00:23:48.640 | - Naive is an indication that it's incorrect somehow, it's a trivial, too simple, I think
00:23:54.760 | it could very well be correct.
00:23:56.720 | But it's interesting how sticky, it feels like two people got together, it just feels
00:24:03.400 | like once you figure out certain aspects of a language, that just becomes sticky and the
00:24:07.480 | tribe forms around that language, or maybe the language, maybe the tribe forms first
00:24:11.800 | and then the language evolves, and then you just kind of agree and you stick to whatever
00:24:15.560 | that is.
00:24:16.560 | - These are very interesting questions, we don't know really about how words, even words,
00:24:22.720 | get invented very much about, we don't really, I mean, assuming they get invented, we don't
00:24:28.640 | really know how that process works and how these things evolve.
00:24:31.280 | What we have is kind of a current picture, a current picture of a few thousand languages,
00:24:39.760 | a few thousand instances.
00:24:40.960 | We don't have any pictures of really how these things are evolving, really.
00:24:45.960 | And then the evolution is massively confused by contact, right?
00:24:52.260 | So as soon as one language group, one group runs into another, we are smart, humans are
00:24:58.640 | smart and they take on whatever is useful in the other group.
00:25:02.780 | And so any kind of contrast which you're talking about, which I find useful, I'm gonna start
00:25:08.480 | using as well.
00:25:09.480 | And I worked a little bit in specific areas of words, in number words and in color words.
00:25:16.240 | And in color words, so we have, in English, we have around 11 words that everyone knows
00:25:21.760 | for colors.
00:25:25.080 | And many more, if you happen to be interested in color for some reason or other, if you're
00:25:29.520 | a fashion designer or an artist or something, you may have many, many more words.
00:25:33.800 | But we can see millions, like if you have normal color vision, normal trichromatic color
00:25:38.960 | vision, you can see millions of distinctions in color.
00:25:41.560 | So we don't have millions of words.
00:25:43.200 | The most efficient, no, the most detailed color vocabulary would have over a million
00:25:49.240 | terms to distinguish all the different colors that we can see.
00:25:52.280 | But of course we don't have that.
00:25:53.760 | So it's somehow, it's kind of useful for English to have evolved in some way to, there's 11
00:26:01.560 | terms that people find useful to talk about, black, white, red, blue, green, yellow, purple,
00:26:08.920 | brown, gray, pink, and I probably missed something there.
00:26:11.440 | Anyway, there's 11 that everyone knows, but you go to different cultures, especially the
00:26:17.960 | non-industrialized cultures, and there'll be many fewer.
00:26:21.080 | So some cultures will have only two, believe it or not.
00:26:25.080 | The Dani in Papua New Guinea have only two labels that the group uses for color.
00:26:31.400 | Those are roughly black and white.
00:26:33.000 | They are very, very dark and very, very light, which are roughly black and white.
00:26:36.800 | And you might think, oh, they're dividing the whole color space into light and dark
00:26:41.120 | or something.
00:26:42.120 | And that's not really true.
00:26:43.120 | They mostly just only label the black and the white things.
00:26:46.040 | They just don't talk about the colors for the other ones.
00:26:49.320 | And then there's other groups.
00:26:50.320 | I worked with a group called the Tsimane' down in Bolivia in South America, and they have
00:26:56.920 | three words that everyone knows, but there's a few others that several people, that many
00:27:02.920 | people know.
00:27:04.160 | And so they have, kind of depending on how you count, between three and seven words that
00:27:10.120 | the group knows.
00:27:11.840 | And again, they're black and white.
00:27:14.120 | Everyone knows those.
00:27:15.120 | And red, red is, that tends to be the third word that everyone, that cultures bring in.
00:27:21.120 | If there's a word, it's always red, the third one.
00:27:23.480 | And then after that, it's kind of all bets are off about what they bring in.
00:27:26.560 | And so after that, they bring in a sort of a big blue-green group.
00:27:31.680 | They have one for that.
00:27:34.120 | And then different people have different words that they'll use for other parts of the space.
00:27:39.040 | And so anyway, it's probably related to what they want to talk about, not what they see, because
00:27:45.680 | they see the same colors as we see.
00:27:47.880 | So it's not like they have a weak, a low color palette in the things they're looking at.
00:27:54.320 | They're looking at a lot of beautiful scenery, a lot of different colored flowers and berries
00:28:01.600 | and things.
00:28:02.600 | And so there's lots of things of very bright colors, but they just don't label the color
00:28:07.400 | in those cases.
00:28:08.400 | And the reason, probably, we don't know this, but we think probably what's going on here
00:28:12.880 | is that what you do, why you label something, is you need to talk to someone else about it.
00:28:18.720 | And why do I need to talk about a color?
00:28:20.080 | Well, if I have two things which are identical, and I want you to give me the one that's different,
00:28:26.280 | and the only way it varies is color, then I invent a word which tells you, "This is
00:28:31.640 | the one I want."
00:28:32.640 | So I want the red sweater off the rack, not the green sweater.
00:28:36.360 | And so those things will be identical, because these are things we made, and they're dyed,
00:28:41.040 | and there's nothing different about them.
00:28:42.680 | And so in industrialized society, everything we've got is pretty much arbitrarily colored.
00:28:50.640 | But if you go to a non-industrialized group, that's not true.
00:28:53.520 | And so they don't, it's not only that they're not interested in color, if you bring bright
00:28:57.480 | colored things to them, they like them just like we like them.
00:29:01.080 | Bright colors are great, they're beautiful, but they just don't need to, no need to talk
00:29:06.080 | about them.
00:29:07.080 | They don't have--
00:29:08.080 | - So probably color words is a good example of how language evolves from sort of function,
00:29:13.320 | when you need to communicate the use of something.
00:29:15.800 | - I think so.
00:29:16.800 | - Then you kind of invent different variations, and basically, you can imagine that the evolution
00:29:22.200 | of a language has to do with what the early tribe's doing, like what kind of problems
00:29:27.680 | are facing them, and they're quickly figuring out how to efficiently communicate the solution
00:29:32.720 | to those problems, whether it's aesthetic or functional, all that kind of stuff, running
00:29:36.600 | away from a mammoth or whatever.
00:29:39.600 | But I think what you're pointing to is that we don't have data on the evolution of language,
00:29:45.840 | because many languages were formed a long time ago, so you don't get the chatter.
00:29:50.160 | We have a little bit of old English to modern English, because there was a writing system,
00:29:56.120 | and we can see how old English looked.
00:29:58.680 | So the word order changed, for instance, in old English to middle English to modern English,
00:30:02.440 | and so we can see things like that, but most languages don't even have a writing system.
00:30:07.080 | So of the 7,000, only a small subset of those have a writing system, and even if they have
00:30:13.120 | a writing system, it's not a very modern writing system, and so they don't have it, so we just
00:30:17.360 | basically have, for Mandarin, for Chinese, we have a lot of evidence for a long time,
00:30:23.880 | and for English, and not for much else.
00:30:25.600 | Not for German a little bit, but not for a whole lot of long-term language evolution.
00:30:31.240 | We don't have a lot.
00:30:32.240 | We just have snapshots, is what we've got, of current languages.
00:30:34.960 | - Yeah, you get an inkling of that from the rapid communication on certain platforms,
00:30:39.640 | like on Reddit.
00:30:40.640 | There's different communities, and they'll come up with different slang.
00:30:44.200 | Especially, from my perspective, driven by a little bit of humor, or maybe mockery or
00:30:49.400 | whatever, just talking shit in different kinds of ways, and you could see the evolution of
00:30:57.040 | language there, because I think a lot of things on the internet, you don't want to be the
00:31:03.920 | boring mainstream, so you want to deviate from the proper way of talking, and so you
00:31:11.960 | get a lot of deviation, rapid deviation, and then when communities collide, you get, just
00:31:18.240 | like you said, humans adapt to it, and you can see it through the lens of humor.
00:31:22.240 | It's very difficult to study, but you can imagine 100 years from now, if there's a new
00:31:26.420 | language born, for example, we'll get really high-resolution data.
00:31:30.000 | - I mean, English is changing.
00:31:32.040 | English changes all the time.
00:31:33.100 | All languages change all the time, so there's the famous result about the Queen's English.
00:31:40.680 | So if you look at the Queen's vowels, the Queen's English is supposed to be, originally
00:31:45.520 | the proper way to talk was sort of defined by however the Queen talked, or the King,
00:31:50.080 | whoever was in charge, and so if you look at how her vowels changed from when she first
00:31:57.800 | became Queen in 1952, '53, when she was coronated, I mean, that's Queen Elizabeth who died recently,
00:32:03.040 | of course, until 50 years later, her vowels changed, her vowels shifted a lot.
00:32:08.240 | And so even in the sounds of British English, in her, the way she was talking was changing.
00:32:15.200 | The vowels were changing slightly.
00:32:16.800 | So that's just, in the sounds, there's change.
00:32:19.280 | I don't know what's, I'm interested, we're all interested in what's driving any of these
00:32:24.040 | changes.
00:32:25.040 | The word order of English changed a lot over 1,000 years, right?
00:32:28.400 | So it used to look like German, it used to be a verb-final language with case marking,
00:32:33.880 | and it shifted to a verb-medial language, a lot of contact, so a lot of contact with
00:32:38.120 | French, and it became a verb-medial language with no case marking.
00:32:42.600 | And so it became this verb-medial thing.
00:32:46.640 | And so that's-- - It's evolving.
00:32:48.200 | - It totally evolved, and so it may very well, I mean, it doesn't evolve maybe very much
00:32:52.240 | in 20 years, is maybe what you're talking about, but over 50 and 100 years, things change
00:32:56.600 | a lot, I think.
00:32:57.600 | - We'll now have good data on it, which is great.
00:33:00.040 | - That's for sure, yeah.
00:33:01.200 | - Can you talk to what is syntax and what is grammar?
00:33:03.960 | So you wrote a book on syntax.
00:33:05.920 | - I did.
00:33:06.920 | You were asking me before about how do I figure out what a dependency structure is.
00:33:10.600 | I'd say the dependency structures aren't that hard to, generally, I think there's a lot
00:33:14.760 | of agreement of what they are for almost any sentence in most languages.
00:33:19.600 | I think people will agree on a lot of that.
00:33:22.680 | There are other parameters in the mix such that some people think there's a more complicated
00:33:27.960 | grammar than just a dependency structure.
00:33:30.080 | And so, you know, like Noam Chomsky, he's the most famous linguist ever, and he is famous
00:33:36.760 | for proposing a slightly more complicated syntax.
00:33:40.800 | And so he invented phrase structure grammar.
00:33:43.720 | So he's well-known for many, many things, but in the '50s and early '60s, but late '50s,
00:33:50.120 | he was basically figuring out what's called formal language theory.
00:33:54.420 | And he figured out sort of a framework for figuring out how complicated a certain type
00:34:01.480 | of language might be, so-called phrase structure grammars of language might be.
00:34:06.120 | And so his idea was that maybe we can think about the complexity of a language by how
00:34:14.120 | complicated the rules are, okay?
00:34:16.160 | And the rules will look like this.
00:34:18.720 | They will have a left-hand side and they'll have a right-hand side.
00:34:22.840 | And on the left-hand side, we'll expand to the thing on the right-hand side.
00:34:25.560 | So say we'll start with an S, which is like the root, which is a sentence, okay?
00:34:30.640 | And then we're going to expand to things like a noun phrase and a verb phrase is what he
00:34:35.240 | would say, for instance, okay?
00:34:36.800 | An S goes to an NP and a VP is a kind of a phrase structure rule.
00:34:40.560 | And then we figure out what an NP is.
00:34:42.280 | An NP is a determiner and a noun, for instance, and a verb phrase is something else, is a
00:34:47.960 | verb and another noun phrase and another NP, for instance.
00:34:50.920 | Those are the rules of a very simple phrase structure, okay?
00:34:55.280 | And so he proposed phrase structure grammar as a way to sort of cover human languages.
00:35:01.120 | And then he actually figured out that, well, depending on the formalization of those grammars,
00:35:05.080 | you might get more complicated or less complicated languages.
00:35:08.280 | And so he said, well, there are these things called context-free languages.
00:35:14.480 | He thought human languages would tend to be what he calls context-free languages.
00:35:20.040 | But there are simpler languages, which are so-called regular languages, and they have
00:35:23.760 | a more constrained form to the rules of the phrase structure of these particular rules.
00:35:28.840 | So he basically discovered and kind of invented ways to describe the language, and those are
00:35:36.400 | phrase structure grammars of a human language.
00:35:38.640 | And he was mostly interested in English initially in his work in the '50s.
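Here is a minimal sketch of phrase structure rules of the kind just described (S -> NP VP, NP -> Det N, VP -> V NP), written as a tiny Python generator; the word lists are invented toy entries, not a real grammar of English.

import random

RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["two"], ["a"], ["the"]],
    "N":   [["dogs"], ["room"], ["cat"]],
    "V":   [["entered"], ["chased"]],
}

def generate(symbol="S"):
    words = []
    for sym in random.choice(RULES[symbol]):   # pick one right-hand side and expand it
        if sym in RULES:
            words.extend(generate(sym))        # nonterminal: keep expanding
        else:
            words.append(sym)                  # terminal word
    return words

print(" ".join(generate()))   # e.g. "two dogs entered a room"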
00:35:43.280 | - So quick questions around all this.
00:35:44.600 | So formal language theory is the big field of just studying language formally.
00:35:49.320 | - Yes, and it doesn't have to be human language there.
00:35:51.440 | We can have computer languages, any kind of system which is generating some set of expressions
00:36:00.880 | in a language.
00:36:01.880 | And those could be like the statements in a computer language, for example.
00:36:08.280 | So it could be that or it could be human language.
00:36:10.240 | - So technically you can study programming languages.
00:36:12.840 | - Yes, and have been.
00:36:14.680 | Heavily studied using this formalism.
00:36:16.480 | There's a big field of programming languages within formal language theory.
00:36:20.600 | - Okay, and then phrase structure, grammar, is this idea that you can break down language
00:36:25.960 | into this S-N-P-V-P type of thing?
00:36:28.920 | - It's a particular formalism for describing language.
00:36:33.760 | And Chomsky was the first one.
00:36:35.120 | He's the one who figured that stuff out back in the '50s.
00:36:39.120 | And that's equivalent, actually.
00:36:41.720 | The context-free grammar is kind of equivalent in the sense that it generates the same sentences
00:36:46.080 | as a dependency grammar would.
00:36:49.400 | The dependency grammar is a little simpler in some way.
00:36:51.720 | You just have a root, and it goes, we don't have any of these.
00:36:55.200 | The rules are implicit, I guess.
00:36:57.840 | We just have connections between words.
00:36:59.440 | The phrase structure grammar is kind of a different way to think about the dependency
00:37:03.960 | grammar.
00:37:04.960 | It's slightly more complicated, but it's kind of the same in some ways.
00:37:07.720 | - So to clarify, dependency grammar is the framework under which you see language, and
00:37:13.840 | you make the case that this is a good way to describe language.
00:37:18.160 | - That's correct.
00:37:19.160 | - And Noam Chomsky's watching this,
00:37:20.920 | very upset right now, so let's, just kidding.
00:37:24.280 | But what's the difference between, where's the place of disagreement between phrase structure
00:37:31.440 | grammar and dependency grammar?
00:37:33.480 | - They're very close.
00:37:34.480 | So phrase structure grammar and dependency grammar aren't that far apart.
00:37:38.240 | I like dependency grammar because it's more perspicuous.
00:37:42.500 | It's more transparent about representing the connections between the words.
00:37:46.200 | It's just a little harder to see in phrase structure grammar.
00:37:49.100 | The place where Chomsky sort of devolved or went off from this is he also thought there
00:37:55.320 | was something called movement.
00:37:59.580 | And that's where we disagree.
00:38:01.220 | That's the place where I would say we disagree.
00:38:03.440 | And I mean, maybe we'll get into that later, but the idea is, if you wanna, do you want
00:38:07.440 | me to explain that?
00:38:08.440 | - I would love, can you explain movement?
00:38:10.040 | - Movement, okay, so Chomsky--
00:38:11.040 | - There's so many interesting things.
00:38:12.040 | - Yeah, yeah, yeah.
00:38:13.040 | Okay, so here's the, movement is, Chomsky basically sees English and he says, okay,
00:38:17.020 | I said, we had that sentence earlier, it was like two dogs entered the room, but it's changed
00:38:22.380 | a little bit, say, two dogs will enter the room.
00:38:25.180 | And he notices that, hey, English, if I wanna make a question, a yes/no question from that
00:38:30.660 | same sentence, I say, instead of two dogs will enter the room, I say, will two dogs
00:38:35.060 | enter the room?
00:38:36.060 | Okay, there's a different way to say the same idea, and it's like, well, the auxiliary verb
00:38:40.780 | that will thing, it's at the front as opposed to in the middle, okay?
00:38:45.600 | And so, and he looked, if you look at English, you see that that's true for all those modal
00:38:50.660 | verbs and for other kinds of auxiliary verbs in English, you always do that, you always
00:38:54.460 | put an auxiliary verb at the front, and when he saw that, so if I say, I can win this bet,
00:39:01.560 | can I win this bet, right, so I move the can to the front.
00:39:04.580 | So actually, that's a theory, I just gave you a theory there, he talks about it as movement,
00:39:09.740 | that word in the declarative is the root, is the sort of default way to think about
00:39:14.540 | the sentence, and you move the auxiliary verb to the front.
00:39:17.940 | That's a movement theory, okay, and he just thought that was just so obvious that it must
00:39:23.260 | be true, that there's nothing more to say about that, that this is how auxiliary verbs
00:39:28.660 | work in English.
00:39:29.980 | There's a movement rule, such that you move, like to get from the declarative to the interrogative,
00:39:35.080 | you're moving the auxiliary to the front, and it's a little more complicated as soon
00:39:38.060 | as you go to simple present and simple past, because if I say John slept, you have to say
00:39:45.100 | did John sleep, not slept John, right, and so you have to somehow get an auxiliary verb
00:39:49.900 | and I guess underlyingly, it's like slept, it's a little more complicated than that,
00:39:54.660 | but that's his idea, there's a movement, okay, and so a different way to think about that,
00:39:59.380 | that isn't, I mean, then he ended up showing later, so he proposed this theory of grammar,
00:40:04.580 | which has movement, and there's other places where he thought there's movement, not just
00:40:07.660 | auxiliary verbs, but things like the passive in English and things like questions, WH questions,
00:40:14.340 | a bunch of places where he thought there's also movement going on, and in each one of
00:40:19.220 | those, he thinks there's words, well, phrases and words are moving around from one structure
00:40:23.260 | to another, which he called deep structure to surface structure, I mean, there's like
00:40:26.300 | two different structures in his theory, okay.
00:40:29.860 | There's a different way to think about this, which is there's no movement at all, there's
00:40:34.540 | a lexical copying rule, such that the word will or the word can, these auxiliary verbs,
00:40:41.380 | they just have two forms, and one of them is the declarative and one of them is interrogative,
00:40:46.580 | and you basically have the declarative one, and oh, I form the interrogative, or I can
00:40:50.860 | form one from the other, doesn't matter which direction you go, and I just have a new entry,
00:40:55.900 | which has the same meaning, which has a slightly different argument structure, argument structure
00:41:00.820 | is just a fancy word for the ordering of the words, and so if I say, it was the dogs, two
00:41:07.540 | dogs can or will enter the room, there's two forms of will, one is will declarative, and
00:41:16.220 | then okay, I've got my subject to the left, it comes before me, and the verb comes after
00:41:20.660 | me in that one, and then the will interrogative, it's like, oh, I go first, interrogative,
00:41:25.940 | will is first, and then I have the subject immediately after, and then the verb after
00:41:29.860 | that, and so you can just generate from one of those words, another word with a slightly
00:41:35.020 | different argument structure, with different ordering.
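A rough sketch of the lexical-copying idea in toy Python (my own encoding for illustration, not Sag's or anyone's actual formalism): "will" simply has two stored entries with different argument orders, so no movement operation is needed to relate the statement and the question.

WILL = {
    "declarative":   ["SUBJ", "will", "VERB", "OBJ"],   # two dogs will enter the room
    "interrogative": ["will", "SUBJ", "VERB", "OBJ"],   # will two dogs enter the room
}

def linearize(entry, subj, verb, obj):
    fillers = {"SUBJ": subj, "VERB": verb, "OBJ": obj}
    return " ".join(fillers.get(slot, slot) for slot in WILL[entry])

print(linearize("declarative", "two dogs", "enter", "the room"))
print(linearize("interrogative", "two dogs", "enter", "the room"))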
00:41:37.820 | - And these are just lexical copies, they're not necessarily moving from one to another.
00:41:42.140 | - There's no movement.
00:41:43.140 | - There's a romantic notion that you have one main way to use a word, and then you could
00:41:48.940 | move it around, which is essentially what movement is implying.
00:41:52.820 | - Yeah, but that's the lexical copying is similar, so then we do lexical copying for
00:41:58.420 | that same idea, that maybe the declarative is the source, and then we can copy it, and
00:42:03.100 | so an advantage, there's multiple advantages of the lexical copying story, it's not my
00:42:08.740 | story, this is like, Ivan Sag, linguists, a bunch of linguists have been proposing these
00:42:14.140 | stories as well, in tandem with the movement story, okay, Ivan Sag died a while ago, but
00:42:20.060 | he was one of the proponents of the non-movement of the lexical copying story, and so that
00:42:24.900 | is that, a great advantage is, well, Chomsky, really famously in 1971, showed that the movement
00:42:34.500 | story leads to learnability problems, it leads to problems for how language is learned, it's
00:42:41.220 | really, really hard to figure out what the underlying structure of a language is if you
00:42:45.940 | have both phrase structure and movement, it's like really hard to figure out what came from
00:42:51.380 | what, there's like a lot of possibilities there.
00:42:53.460 | If you don't have that problem, the learning problem gets a lot easier.
00:42:57.220 | - Just say there's lexical copies, and when we say the learning problem, do you mean humans
00:43:02.140 | learning a new language?
00:43:03.140 | - Yeah, just learning English, so a baby is lying around listening to me talk, and how
00:43:09.220 | are they learning English, or maybe it's a two-year-old who's learning interrogatives
00:43:13.940 | and stuff, how are they doing that, are they doing it from, are they figuring out, so Chomsky
00:43:20.940 | said it's impossible to figure it out, actually, he said it's actually impossible, not hard,
00:43:26.300 | but impossible, and therefore, that's where universal grammar comes from, is that it has
00:43:31.220 | to be built in, and so what they're learning is, there's some built in, movement is built
00:43:37.140 | in in his story, it's absolutely part of your language module, and then you are, you're
00:43:44.380 | just setting parameters, you're saying, depending on English, it's just sort of a variant of
00:43:48.340 | the universal grammar, and you're figuring out, oh, which orders does English do these
00:43:52.940 | things, that's, the non-movement story doesn't have this, it's like much more bottom up,
00:43:59.500 | you're learning rules, you're learning rules one by one, and oh, there's, this word is
00:44:04.420 | connected to that word, a great advantage, another advantage, it's learnable, another
00:44:08.880 | advantage of it is that it predicts that not all auxiliaries might move, like it might
00:44:14.300 | depend on the word, depending on whether you, and that turns out to be true, so there's
00:44:19.140 | words that don't really work as auxiliary, they work in declarative and not in interrogative,
00:44:25.860 | so I can say, I'll give you the opposite first, I can say, "Aren't I invited to the party?"
00:44:32.820 | And that's an interrogative form, but it's not from, "I aren't invited to the party,"
00:44:38.180 | there is no, "I aren't," so that's interrogative only.
00:44:42.540 | And then we also have forms like, "Ought," "I ought to do this," and I guess some old
00:44:50.100 | British people can say, "Ought I?"
00:44:52.140 | Exactly.
00:44:53.140 | It doesn't sound right, does it?
00:44:54.140 | For me, it sounds ridiculous.
00:44:55.780 | I don't even think "ought" is great, but I mean, I totally recognize, "I ought to,"
00:44:59.100 | it's not too bad, actually, I can say, "I ought to do this," that sounds pretty good.
00:45:02.020 | "Ought I?"
00:45:03.020 | If I'm trying to sound sophisticated, maybe.
00:45:04.280 | I don't know, it just sounds completely out to me.
00:45:06.500 | "Ought I?"
00:45:07.500 | Yeah.
00:45:08.500 | Anyway, so there are variants here, and a lot of these words just work in one versus
00:45:13.100 | the other, and that's fine under the lexical copying story, it's like, well, you just
00:45:17.660 | learn the usage, whatever the usage is, is what you do with this word.
00:45:23.780 | But it's a little bit harder in the movement story.
00:45:26.700 | The movement story, that's an advantage, I think, of lexical copying, and in all these
00:45:30.460 | different places, there's all these usage variants which make the movement story a little
00:45:37.860 | bit harder to work.
00:45:39.980 | So one of the main divisions here is the movement story versus the lexical copying story, that
00:45:43.940 | has to do with the auxiliary words and so on, but if you rewind to the phrase structure
00:45:49.580 | grammar versus dependency grammar.
00:45:52.540 | Those are equivalent in some sense, in that for any dependency grammar, I can generate
00:45:57.780 | a phrase structure grammar which generates exactly the same sentences, I just like the
00:46:03.220 | dependency grammar formalism because it makes something really salient, which is the lengths
00:46:11.020 | of dependencies between words, which isn't so obvious in the phrase structure.
00:46:15.220 | In the phrase structure, it's just kind of hard to see.
00:46:17.640 | It's in there, it's just very, very, it's opaque.
00:46:21.060 | - Technically, I think phrase structure grammar is mappable to dependency grammar.
00:46:25.100 | - And vice versa.
00:46:26.100 | - And vice versa.
00:46:27.100 | - It's just like these little labels, S, NP, VP.
00:46:29.580 | - Yeah, for a particular dependency grammar, you can make a phrase structure grammar which
00:46:34.220 | generates exactly those same sentences, and vice versa, but there are many phrase structure
00:46:39.340 | grammars for which you can't really make a dependency grammar.
00:46:41.980 | I mean, you can do a lot more in a phrase structure grammar, but you get many more
00:46:46.460 | of these extra nodes, basically.
00:46:48.860 | You can have more structure in there, and some people like that, and maybe there's value
00:46:53.180 | to that.
00:46:54.180 | I don't like it.
00:46:55.180 | - Well, for you, so we should clarify, so dependency grammar, it's just, well, one word
00:47:01.020 | depends on only one other word, and you form these trees, and that makes, it really puts
00:47:07.220 | priority on those dependencies, just like as a tree that you can then measure the distance
00:47:12.660 | of the dependency from one word to the other.
00:47:15.140 | They can then map to the cognitive processing of the sentences, how easy it is to understand,
00:47:22.620 | all that kind of stuff.
00:47:23.620 | So, it just puts the focus on just like the mathematical distance of dependence between
00:47:31.220 | words.
00:47:32.220 | So, like, it's just a different focus.
00:47:34.300 | - Absolutely.
00:47:35.460 | - Just continue on the thread of Chomsky, 'cause it's really interesting, 'cause as you're
00:47:39.420 | discussing disagreement, to the degree there's disagreement, you're also telling the history
00:47:44.440 | of the study of language, which is really awesome.
00:47:47.220 | So, you mentioned context-free versus regular.
00:47:50.660 | Does that distinction come into play for dependency grammars?
00:47:54.420 | - No.
00:47:55.420 | - Okay.
00:47:56.420 | - Not at all.
00:47:57.420 | I mean, regular languages are too simple for human languages.
00:48:02.540 | It's a part of the hierarchy.
00:48:04.380 | But human languages, in the phrase structure world, are definitely at least context-free.
00:48:11.620 | Maybe a little bit more, a little bit harder than that.
00:48:15.300 | So, there's something called context-sensitive as well, where you can have, like this is
00:48:19.460 | just the formal language description.
00:48:22.860 | In a context-free grammar, you have one, this is like a bunch of formal language theory
00:48:28.140 | we're doing here.
00:48:29.140 | - I love it.
00:48:30.140 | - Okay.
00:48:31.140 | So, you have a left-hand side category, and you're expanding to anything on the right.
00:48:35.700 | That's a context-free.
00:48:36.700 | So, the idea is that that category on the left expands, independent of context, to
00:48:40.660 | those things, whatever they are on the right, doesn't matter what.
00:48:43.820 | And a context-sensitive says, okay, I actually have more than one thing on the left.
00:48:50.140 | I can tell you only in this context, maybe you have a left and a right context, or just
00:48:54.580 | a left context or a right context, so there's two or more things on the left, which tells you how
00:48:58.700 | to expand those things in that way.
00:49:01.220 | Okay, so it's context-sensitive.
00:49:02.700 | A regular language is just more constrained, so it doesn't allow just anything on the right.
00:49:09.540 | Basically, it's like one very complicated rule, is kind of what a regular language is.
00:49:17.260 | And so, it doesn't have any, what's it say, long-distance dependencies?
00:49:21.620 | It doesn't allow recursion, for instance.
00:49:24.300 | There's no recursion.
00:49:25.300 | Yeah, recursion is where you, which is, human languages have recursion, they have embedding,
00:49:29.260 | and you can't, well, it doesn't allow center-embedded recursion, which human languages have, which
00:49:33.460 | is what--
00:49:34.460 | - Center-embedded recursion.
00:49:35.460 | - We're gonna get to that.
00:49:36.460 | - So, within a sentence, within a sentence.
00:49:37.460 | - Yeah, within a sentence.
00:49:38.460 | Yeah, we're gonna get to that.
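As a rough illustration of the formal-language point (a toy sketch; the rules and word lists below are my own made-up grammar, not anything from the conversation), a context-free rule set can rewrite a category inside itself, which is what produces center-embedded relative clauses; a regular (right-linear) grammar, which can only extend a string at one end, cannot generate that unbounded nesting.

```python
import random

# Toy context-free grammar (made-up illustrative rules). The second NP rule is the
# center-embedding one: an NP can be rewritten with another NP (and its verb) inside it.
cfg = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "who", "NP", "VT"]],   # recursion inside NP
    "VP": [["VI"]],
    "N":  [["boy"], ["cat"], ["dog"]],
    "VT": [["scratched"], ["chased"]],
    "VI": [["cried"], ["ran"]],
}

def expand(symbol):
    """Recursively expand a category; anything not in the grammar is a terminal word."""
    if symbol not in cfg:
        return [symbol]
    words = []
    for sym in random.choice(cfg[symbol]):
        words.extend(expand(sym))
    return words

random.seed(1)
for _ in range(3):
    print(" ".join(expand("S")))
# Produces sentences like "the boy who the cat scratched cried": each extra "who ..."
# clause nests one more subject-verb pair inside the outer one.
```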
00:49:39.460 | But, you know, the formal language stuff is a little aside, Chomsky wasn't proposing it
00:49:43.100 | for human languages, even.
00:49:44.500 | He was just pointing out that human languages are at least context-free, 'cause that was
00:49:49.380 | kind of stuff we did for formal languages, and what he was most
00:49:52.960 | interested in was human language, and the movement is where he sort
00:50:00.120 | of set off on, I would say, a very interesting, but wrong foot.
00:50:04.740 | It was kind of interesting, it's a very, I agree, it's a very interesting history.
00:50:08.040 | So there's this, so he proposed multiple theories in '57 and then '65, they all have
00:50:13.640 | this framework, though, it was phrase structure plus movement, different versions of the phrase
00:50:18.020 | structure and the movement in the '57, these are the most famous original bits of Chomsky's
00:50:22.180 | work.
00:50:23.180 | And then '71 is when he figured out that those lead to learning problems, that there's cases
00:50:27.540 | where a kid could never figure out which rule, which set of rules was intended.
00:50:34.980 | And so, and then he said, well, that means it's innate.
00:50:37.620 | It's kind of interesting, he just really thought the movement was just so obviously true that
00:50:41.820 | he couldn't, he didn't even entertain giving it up, it's just obvious, that's obviously
00:50:47.180 | right.
00:50:48.180 | And it was later where people figured out that there's all these subtle ways in which
00:50:53.500 | things which look like generalizations aren't generalizations across the category,
00:50:58.820 | they're word-specific, and they kind of work, but they don't work across
00:51:02.780 | various other words in the category, and so it's easier to just think of these things
00:51:05.940 | as lexical copies.
00:51:07.820 | And I think he was very obsessed, I don't know, I'm guessing, that he just, he really
00:51:13.220 | wanted this story to be simple in some sense, and language is a little more complicated
00:51:17.940 | in some sense.
00:51:18.940 | He didn't like words, he never talks about words, he likes to talk about combinations
00:51:22.940 | of words.
00:51:23.940 | And words are, you know, look up a dictionary, there's 50 senses for a common word, right?
00:51:28.900 | The word "take" will have 30 or 40 senses in it.
00:51:32.060 | So there'll be many different senses for common words.
00:51:35.400 | And he just doesn't think about that, or he doesn't think that's language.
00:51:39.900 | I think he doesn't think that's language, he thinks that words are distinct from combinations
00:51:45.760 | of words.
00:51:46.760 | I think they're the same.
00:51:47.760 | If you look at my brain in the scanner, while I'm listening to a language I understand,
00:51:54.180 | and you compare, I can localize my language network in a few minutes, in like 15 minutes.
00:51:59.320 | And what you do is I listen to a language I know, I listen to, you know, maybe some
00:52:03.000 | language I don't know, or I listen to muffled speech, or I read sentences, and I read non-words,
00:52:09.180 | like I can do anything like this, anything that's sort of really like English, and anything
00:52:12.140 | that's not very like English.
00:52:13.700 | So I've got something like it and not, and I've got a control.
00:52:16.660 | And the voxels, which are just, you know, the 3D pixels in my brain, that are responding
00:52:22.740 | most are the language area, and that's this left-lateralized area in my head.
00:52:30.540 | And wherever I look in that network, if you look for the combinations versus the words,
00:52:36.460 | it's everywhere.
00:52:37.460 | It's the same.
00:52:38.460 | It's the same.
00:52:39.460 | That's fascinating.
00:52:40.460 | And so it's like hard to find.
00:52:41.460 | There are no areas that we know, I mean, that's, it's a little overstated right now.
00:52:46.940 | At this point, the technology isn't great, it's not bad, but we have the best way to
00:52:51.980 | figure out what's going on in my brain when I'm listening or reading language is to use
00:52:55.860 | fMRI, Functional Magnetic Resonance Imaging.
00:52:58.780 | And that's a very good localization method.
00:53:02.140 | So I can figure out where exactly these signals are coming from, pretty, you know, down to,
00:53:06.460 | you know, millimeters, you know, cubic millimeters or smaller, okay?
00:53:09.400 | Very small.
00:53:10.400 | We can figure those out very well.
00:53:11.400 | The problem is the when, okay?
00:53:13.940 | It's measuring oxygen, okay?
00:53:16.420 | And oxygen takes a little while to get to those cells, so it takes on the order of seconds.
00:53:21.140 | So I talk fast, I probably listen fast, and I can probably understand things really fast.
00:53:25.860 | So a lot of stuff happens in two seconds.
00:53:28.060 | And so to say that we know what's going on, that the words, right now in that network,
00:53:34.620 | our best guess is that whole network is doing something similar, but maybe different parts
00:53:39.820 | of that network are doing different things.
00:53:42.340 | And that's probably the case.
00:53:43.900 | We just don't have very good methods to figure that out, right, at this moment.
00:53:47.820 | And so since we're kind of talking about the history of the study of language, what other
00:53:54.500 | interesting disagreements, and you're both at MIT, or were for a long time, what kind
00:53:59.180 | of interesting disagreements there, tension of ideas are there between you and Noam Chomsky?
00:54:03.860 | And we should say that Noam was in the linguistics department, and you're, I guess for a time
00:54:10.660 | were affiliated there, but primarily brain and cognitive science department, which is
00:54:16.140 | another way of studying language, and you've been talking about fMRI.
00:54:19.940 | So what, is there something else interesting to bring to the surface about the disagreement
00:54:25.700 | between the two of you, or other people in the discipline?
00:54:28.980 | - Yeah, I mean, I've been at MIT for 31 years, since 1993, and he, Chomsky's been there much
00:54:35.180 | longer.
00:54:36.860 | So I met him, I knew him, I met when I first got there, I guess, and we would interact
00:54:42.580 | every now and then.
00:54:44.220 | So I'd say our biggest difference is our methods, and so that's the biggest difference between
00:54:52.300 | me and Noam, is that I gather data from people.
00:54:57.820 | I do experiments with people, and I gather corpus data, whatever corpus data's available,
00:55:02.940 | and we do quantitative methods to evaluate any kind of hypothesis we have.
00:55:08.900 | He just doesn't do that.
00:55:09.900 | And so, he has never once been associated with any experiment or corpus work, ever.
00:55:16.620 | And so, it's all thought experiments.
00:55:19.600 | It's his own intuitions, so I just don't think that's the way to do things.
00:55:25.720 | That's a cross-the-street kind of difference, they're across the street from us, between brain and cog sci
00:55:31.260 | and linguistics.
00:55:32.260 | I mean, not all linguists, some of the linguists, depending on what you do, more speech-oriented,
00:55:37.020 | they do more quantitative stuff, but in the meaning, words and, well, it's combinations
00:55:43.100 | of words, syntax semantics, they tend not to do experiments and corpus analysis.
00:55:49.420 | - So on the linguistics side, the method is a symptom of a bigger approach,
00:55:56.020 | which is sort of a psychology/philosophy side for Noam, and for you, it's more sort of data-driven,
00:56:01.820 | sort of almost like a mathematical approach.
00:56:03.500 | - Yeah, I mean, I'm a psychologist.
00:56:05.840 | So I would say we're in psychology.
00:56:08.500 | Brain and cognitive science is MIT's old psychology department.
00:56:12.060 | It was a psychology department up until 1985, and it became the Brain and Cognitive Science
00:56:16.000 | Department.
00:56:17.000 | And so, I mean, my training is math and computer science, but I'm a psychologist.
00:56:22.900 | I mean, I don't know what I am.
00:56:24.380 | - So data-driven psychologist.
00:56:25.380 | - Yeah, yeah, yeah.
00:56:26.380 | - You are.
00:56:27.380 | - I don't know what I am, but I'm happy to be called a linguist, I'm happy to be called
00:56:30.540 | a computer scientist, I'm happy to be called a psychologist, any of those things.
00:56:33.980 | - In the actual, like how that manifests itself outside of the methodology is like these differences,
00:56:39.660 | these subtle differences about the movement story versus the lexical copy story.
00:56:43.460 | - Yeah, those are theories, right?
00:56:45.640 | So the theories are, but I think the reason we differ in part is because of how we evaluate
00:56:51.980 | the theories.
00:56:52.980 | And so I evaluate theories quantitatively, and Noam doesn't.
00:56:57.780 | - Got it.
00:56:59.380 | Okay, well, let's explore the theories that you explore in your book.
00:57:04.420 | Let's return to this dependency grammar framework of looking at language.
00:57:10.140 | What's a good justification why the dependency grammar framework is a good way to explain
00:57:14.580 | language?
00:57:15.580 | What's your intuition?
00:57:16.800 | - So the reason I like dependency grammar, as I've said before, is that it's very transparent
00:57:22.660 | about its representation of distance between words.
00:57:26.120 | So it's like, all it is, is you've got a bunch of words, you're connecting together to make
00:57:30.980 | a sentence, and a really neat insight, which turns out to be true, is that the further
00:57:39.100 | apart the pair of words are that you're connecting, the harder it is to do the production, the
00:57:43.140 | harder it is to do the comprehension.
00:57:44.740 | It's harder to produce, it's harder to understand when the words are far apart.
00:57:47.500 | When they're close together, it's easy to produce and it's easy to comprehend.
00:57:51.900 | Let me give you an example, okay?
00:57:53.720 | So we have, in any language, we have mostly local connections between words, but they're
00:58:00.840 | abstract.
00:58:01.840 | The connections are abstract, they're between categories of words.
00:58:05.180 | And so you can always make things further apart if you add modification, for example,
00:58:12.840 | after a noun.
00:58:13.840 | So a noun in English comes before a verb, the subject noun comes before a verb, and
00:58:19.320 | then there's an object after, for example.
00:58:22.120 | I can say what I said before, you know, "The dog entered the room," or something like that.
00:58:25.480 | So I can modify "dog."
00:58:27.120 | If I say something more about "dog" after it, then what I'm doing is, indirectly, I'm
00:58:32.280 | lengthening the dependence between "dog" and "entered" by adding more stuff to it.
00:58:37.100 | So I just make it explicit here.
00:58:39.320 | If I say, "The boy who the cat scratched cried," we're going to have a mean cat here.
00:58:50.400 | And so what I've got here is, "The boy cried," it would be a very short, simple sentence,
00:58:54.320 | and I just told you something about the boy, and I told you it was the boy who the cat
00:58:59.960 | scratched, okay?
00:59:01.240 | So the "cried" is connected to the "boy," the "cried" at the end is connected to the
00:59:05.000 | "boy" in the beginning.
00:59:06.000 | Right.
00:59:07.000 | And so I can do that, and I can say that.
00:59:08.000 | That's a perfectly fine English sentence.
00:59:09.860 | And I can say, "The cat which the dog chased ran away," or something, okay?
00:59:16.880 | I can do that.
00:59:17.920 | But it's really hard now, I've got, you know, whatever I have here, I have, "The boy who
00:59:23.960 | the cat"—now let's say I try to modify "cat," okay?
00:59:27.080 | "The boy who the cat which the dog chased scratched ran away."
00:59:32.880 | Oh my God, that's hard, right?
00:59:34.880 | I'm sort of just working that through in my head, how to produce, and it's really just
00:59:38.960 | horrendous to understand.
00:59:40.600 | It's not so bad.
00:59:41.600 | At least I've got intonation there to sort of mark the boundaries and stuff, but that's
00:59:45.480 | really complicated.
00:59:47.160 | That's sort of English in a way.
00:59:49.480 | I mean, that follows the rules of English.
00:59:52.400 | So what's interesting about that is that what I'm doing is nesting dependencies there.
00:59:56.200 | I'm putting one—I've got a subject connected to a verb there, and then I'm modifying that
01:00:01.920 | with a clause, another clause, which happens to have a subject and a verb relation.
01:00:06.240 | I'm trying to do that again on the second one.
01:00:08.120 | And what that does is it lengthens out the dependence—multiple dependencies actually
01:00:12.320 | get lengthened out there.
01:00:13.320 | So the dependencies get longer, and the outside ones get long, and even the ones in between
01:00:17.720 | get kind of long.
01:00:19.800 | So what's fascinating is that that's bad.
01:00:23.760 | That's really horrendous in English.
01:00:26.120 | But that's horrendous in any language.
01:00:28.200 | So no matter what language you look at, if you do—just figure out some structure where
01:00:33.680 | I'm going to have some modification following some head, which is connected to some later
01:00:37.640 | head, and I do it again, it won't be good.
01:00:40.040 | Guaranteed.
01:00:41.040 | So 100%, that will be uninterpretable in that language in the same way that was uninterpretable
01:00:45.800 | in English.
01:00:46.800 | - Let's just clarify.
01:00:47.800 | The distance of the dependencies is whenever the boy cried, there's a dependence between
01:00:55.880 | two words, and then you're counting the number of, what, morphemes between them?
01:01:01.520 | - That's a good question.
01:01:02.520 | I'll just say words.
01:01:03.520 | Words or morphemes between?
01:01:05.000 | We don't know that.
01:01:06.000 | Actually, that's a very good question.
01:01:07.000 | What is the distance metric?
01:01:08.300 | But let's just say it's words, sure.
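A minimal sketch of that counting (the head assignments below are my own rough guesses at the parses; the metric is just the number of words separating a dependent from its head): the subject-verb link in "the boy cried" has length one, but embedding a relative clause stretches the same link across the whole clause.

```python
# Minimal sketch of dependency-length counting, with rough, assumed parses.

def dependency_lengths(words, heads):
    """Return (dependent, head, distance) for every link, distance = |i - j| in words."""
    return [(words[i], words[h], abs(i - h))
            for i, h in enumerate(heads) if h != -1]

# "the boy cried": 'boy' (index 1) depends on 'cried' (index 2) -> distance 1.
simple = ["the", "boy", "cried"]
simple_heads = [1, 2, -1]

# "the boy who the cat scratched cried": 'boy' (1) still depends on 'cried' (6),
# but the embedded relative clause pushes them five words apart.
nested = ["the", "boy", "who", "the", "cat", "scratched", "cried"]
nested_heads = [1, 6, 5, 4, 5, 1, -1]

for words, heads in [(simple, simple_heads), (nested, nested_heads)]:
    links = dependency_lengths(words, heads)
    print(" ".join(words))
    print("  total length:", sum(d for _, _, d in links), links)
```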
01:01:10.480 | And you're saying the longer the distance of that dependence, the more—no matter the
01:01:15.080 | language, except legalese—
01:01:16.080 | - Even legalese.
01:01:17.080 | - Even legalese.
01:01:18.080 | Okay, we'll talk about it.
01:01:19.080 | - Yeah, yeah, yeah.
01:01:20.080 | - We'll talk about it.
01:01:21.080 | - We'll get to that.
01:01:22.080 | - Okay, okay, okay.
01:01:23.080 | But that—the people who speak that language will be very upset.
01:01:27.240 | Not upset, but they'll either not understand it, or they'll be like this is—their brain
01:01:32.120 | will be working overtime.
01:01:34.040 | - They will have a hard time either producing or comprehending it.
01:01:36.600 | They might tell you that's not their language.
01:01:39.020 | It's sort of their language.
01:01:40.020 | I mean, it's following their—like, they'll agree with each of those pieces as part of
01:01:43.220 | their language, but somehow that combination will be very, very difficult to produce and
01:01:48.080 | understand.
01:01:49.080 | - Is that a chicken or the egg issue here?
01:01:50.080 | So like, is—
01:01:51.080 | - Well, I'm giving you an explanation.
01:01:53.240 | - Right.
01:01:54.240 | - So the—well, I mean—and then there's—I'm giving you two kinds of explanations.
01:01:58.040 | I'm telling you that center embedding, that's nesting, those are the same—those are synonyms
01:02:01.840 | for the same concept here.
01:02:03.980 | And the explanation for what—those are always hard.
01:02:06.760 | Center embedding and nesting are always hard.
01:02:08.080 | And I gave you an explanation for why they might be hard, which is long-distance connections.
01:02:12.580 | When you do center embedding, when you do nesting, you always have long-distance connections
01:02:15.940 | between the dependents.
01:02:16.940 | You just—and so that's not necessarily the right explanation, it just—I can go through
01:02:20.880 | reasons why that's probably a good explanation.
01:02:23.560 | And it's not really just about one of them.
01:02:26.240 | So probably it's a pair of these dependencies or something that get long that
01:02:31.200 | drives you to be really confused in that case.
01:02:33.980 | And so what the behavioral consequence there—I mean, we—this is kind of methods.
01:02:39.920 | Like how do we get at this?
01:02:41.540 | You could try to do experiments to get people to produce these things.
01:02:44.600 | They're going to have a hard time producing them.
01:02:46.160 | You can try to do experiments to get them to understand them and get—see how well
01:02:49.800 | they understand them, can they understand them.
01:02:52.720 | Another method you can do is give people partial materials and ask them to complete them, you
01:02:58.440 | know, those center-embedded materials, and they'll fail.
01:03:02.360 | So I've done that.
01:03:03.360 | I've done these kinds of things.
01:03:04.360 | - So, so, so, wait a minute.
01:03:06.820 | So center embedding meaning, like you take a normal sentence like "the boy cried" and inject
01:03:10.720 | a bunch of crap in the middle that separates the boy and the cried.
01:03:15.360 | Okay.
01:03:16.360 | That's center embedding.
01:03:17.360 | And nesting is on top of that.
01:03:18.640 | - No, no, nesting is the same thing.
01:03:20.120 | Center-embedding, those are totally equivalent terms.
01:03:22.080 | I'm sorry I sometimes use one and sometimes use the other.
01:03:24.160 | - Ah, got it, got it.
01:03:25.160 | Totally equivalent.
01:03:26.160 | - They don't mean anything different.
01:03:27.160 | - Got it.
01:03:28.160 | And then what you're saying is there's a bunch of different kinds of experiments you can do.
01:03:32.160 | And the way to run any one of them is like, have more embedding, more center embedding.
01:03:35.600 | Is it easier or harder to understand?
01:03:37.320 | But then you have to measure the level of understanding, I guess.
01:03:39.600 | - Yeah, yeah, you could.
01:03:40.600 | I mean, there's multiple ways to do that.
01:03:42.360 | I mean, the simplest way is just to ask people how good does it sound?
01:03:46.000 | How natural does it sound?
01:03:47.220 | That's a very blunt but very good measure.
01:03:49.400 | It's very, very reliable.
01:03:50.820 | People will do the same thing.
01:03:52.240 | And so it's like, I don't know what it means exactly, but it's doing something such that
01:03:55.720 | we're measuring something about the confusion, the difficulty associated with those.
01:03:59.000 | - And those, like those are giving you a signal.
01:04:00.840 | That's why you can say them.
01:04:02.760 | What about the completion of the center embedding?
01:04:05.560 | - So if you give them a partial sentence, say I say the book which the author who, and
01:04:13.800 | I ask you to now finish that off for me.
01:04:15.600 | I mean, you could either say it, yeah, yeah, but say it's written in front of you and you can just
01:04:19.680 | type it in, have as much time as you want.
01:04:21.480 | They will, even though that one's not too hard, right?
01:04:24.240 | So if I say it's like the book, it's like, oh, the book which the author who I met wrote
01:04:30.040 | was good.
01:04:31.040 | That's a very simple completion for that.
01:04:33.840 | If I give that completion online somewhere to a crowdsourcing platform and ask people
01:04:40.280 | to complete that, they will miss off a verb very regularly, like half the time, maybe
01:04:45.640 | two thirds of the time.
01:04:46.640 | They'll say, they'll just leave off one of those verb phrases.
01:04:49.520 | Even with that simple one, so say "the book which the author who," and they'll say "was." You need
01:04:58.960 | three verbs, right?
01:04:59.960 | The three verbs are "who I met," "wrote," "was good."
01:05:03.080 | And they'll give me two.
01:05:04.080 | They'll say, who was famous, was good, or something like that.
01:05:07.880 | They'll just give me two.
01:05:09.400 | And that'll happen about 60% of the time.
01:05:11.360 | So 40%, maybe 30, they'll do it correctly, correctly meaning they'll do three verb
01:05:16.600 | phrases.
01:05:17.600 | I don't know what's correct or not.
01:05:18.600 | This is hard.
01:05:19.600 | It's a hard task.
01:05:20.600 | - Yeah, I can actually, I'm struggling with it in my head.
01:05:22.600 | - Well, it's easier written.
01:05:24.360 | - When you stare at it.
01:05:25.360 | - If you look, it's a little easier than listening, it's pretty tough.
01:05:28.200 | 'Cause you have to, 'cause there's no trace of it.
01:05:31.320 | You have to remember the words that I'm saying, which is very hard auditorily.
01:05:34.640 | We wouldn't do it this way.
01:05:35.640 | You do it written.
01:05:36.640 | You can look at it and figure it out.
01:05:38.840 | It's easier in many dimensions in some ways, depending on the person.
01:05:41.680 | It's easier to gather written data for, I mean, most sort of, I work in psycholinguistics,
01:05:47.400 | right?
01:05:48.400 | Psychology of language and stuff.
01:05:49.400 | And so a lot of our work is based on written stuff because it's so easy to gather data
01:05:54.740 | from people doing written kinds of tasks.
01:05:57.240 | Spoken tasks are just more complicated to administer and analyze because people do weird
01:06:02.480 | things when they speak, and it's harder to analyze what they do.
01:06:05.880 | But they generally point to the same kinds of things.
01:06:10.080 | - Okay, so the universal theory of language by Ted Gibson is that you can form dependency,
01:06:19.320 | you can form trees from any sentences, and you can measure the distance in some way of
01:06:23.920 | those dependencies, and then you can say that most languages have very short dependencies.
01:06:30.760 | - All languages.
01:06:31.760 | - All languages.
01:06:32.760 | - All languages have short dependencies.
01:06:33.760 | You can actually measure that.
01:06:34.880 | So an ex-student of mine, this guy's at University of California, Irvine, Richard Futrell did
01:06:40.680 | a thing a bunch of years ago now, where he looked at all the languages we could look
01:06:45.720 | at, which was about 40 initially, and now I think there's about 60, for which there
01:06:50.640 | are dependency structures.
01:06:52.760 | So meaning there's gotta be a big text, a bunch of texts, which have been parsed for
01:06:57.120 | the dependency structures, and there's about 60 of those which have been parsed that way.
01:07:01.840 | And for all of those, what he did was take any sentence in one of those languages, and
01:07:09.720 | you can do the dependency structure, and then start at the root, we're talking about dependency
01:07:13.360 | structures, that's pretty easy now, and he's trying to figure out what a control way you
01:07:18.080 | might say the same sentence is in that language.
01:07:21.280 | And so he's just like, all right, there's a root, and let's say the sentence is, let's
01:07:26.360 | go back to two dogs entered the room.
01:07:28.160 | So entered is the root, and entered has two dependents, it's got dogs, and it has room.
01:07:35.440 | And what he does is, let's scramble that order, that's three things, the root, and the head,
01:07:40.280 | and the two dependents, in just some random order, just random, and then just do that
01:07:44.400 | for all the dependents down the tree.
01:07:46.000 | So now look, do it for the, and whatever, it's two, and dogs, and for, and room.
01:07:50.480 | So now look, do it lower down, for "two" and "dogs," and for "the" and "room."
01:07:55.120 | dependents, there's more scrambling that's possible, and what he found, so that's one,
01:08:00.800 | you can figure out one scrambling for that sentence, he did this like a hundred times,
01:08:04.000 | for every sentence in every one of these texts, every corpus, and then he just compared the
01:08:10.880 | dependency lengths in those random scramblings to what actually happened, what the English
01:08:16.640 | or the French or the German was in the original language, or Chinese, or what all these like
01:08:20.480 | 80, no, 60 languages, okay?
01:08:22.960 | And the dependency lengths are always shorter in the real language, compared to this kind
01:08:27.400 | of a control.
01:08:28.400 | And there's another, a little more rigid, control. The way I described it, you
01:08:36.120 | could have crossed dependencies, because by scrambling that way, you could scramble in any way at
01:08:41.440 | all. Languages don't do that, they tend not to cross dependencies very much.
01:08:46.440 | Like so the dependency structure, they tend to keep things non-crossed, and there's a
01:08:52.240 | technical term, they call that projective, but it's just non-crossed is all that is projective.
01:08:56.720 | And so if you just constrain the scrambling, so that it only gives you projective, sort
01:09:01.680 | of non-crossed, the same thing holds.
01:09:04.320 | So still human languages are much shorter than this kind of a control.
01:09:10.720 | So there's like, what it means is that we're, in every language, we're trying to put things
01:09:15.400 | close relative to this kind of a control.
01:09:18.920 | It doesn't matter about the word order, some of these are verb-final, some of these are
01:09:21.720 | verb-medial-like English, and some are even verb-initial, there are a few languages in
01:09:25.800 | the world which have VSO word order, verb-subject-object languages, we haven't talked
01:09:31.080 | about those, it's like 10% of the...
01:09:34.000 | - And even in those languages, it's still short dependencies.
01:09:37.640 | - Short dependencies is rules.
01:09:39.080 | - Okay, so what are some possible explanations for that?
01:09:44.120 | For why languages have evolved that way?
01:09:47.160 | So that's one of the, I suppose, disagreements you might have with Chomsky, so you consider
01:09:53.240 | the evolution of language in terms of information theory, and for you, the purpose of language
01:10:02.680 | is ease of communication, right, and processing.
01:10:05.040 | - That's right, that's right.
01:10:06.280 | So I mean, the story here is just about communication, it is just about production, really, it's
01:10:11.520 | about ease of production, is the story.
01:10:13.600 | - When you say production, can you--
01:10:15.120 | - Oh, I just mean ease of language production, it's easier for me to say things when the,
01:10:20.360 | what I'm doing whenever I'm talking to you is somehow I'm formulating some idea in my
01:10:24.240 | head and I'm putting these words together, and it's easier for me to do that, to put,
01:10:29.840 | to say something where the words are closely connected in a dependency, as opposed to separated,
01:10:35.600 | by putting something in between and over and over again, it's just hard for me to keep
01:10:39.600 | that in my head. That's the whole story, basically. The dependency grammar
01:10:44.880 | sort of gives that to you: long is bad, short is good. It's easier to keep
01:10:50.440 | in mind because you have to keep it in mind, probably for production; it probably matters
01:10:55.660 | in comprehension as well.
01:10:58.160 | - It's on both sides, the production and the--
01:11:00.400 | - But I would guess it's probably evolved for production, it's about producing, what's
01:11:04.040 | easier for me to say, that ends up being easier for you also, that's very hard to disentangle,
01:11:09.800 | this idea of who is it for, is it for me, the speaker, or is it for you, the listener,
01:11:14.160 | I mean part of my language is for you, like the way I talk to you is gonna be different
01:11:19.320 | from how I talk to different people, I'm definitely angling what I'm saying to who I'm saying,
01:11:24.600 | it's not like I'm just talking the same way to every single person, and so I am sensitive
01:11:29.920 | to my audience, but does that work itself out in the dependency length differences,
01:11:37.480 | I don't know, maybe that's about just the words, that part, which words I select.
01:11:41.280 | - My initial intuition is that you optimize language for the audience, but it's just kind
01:11:48.320 | of like messing with my head a little bit to say that some of the optimization might
01:11:52.440 | be, maybe the primary objective of the optimization might be the ease of production.
01:11:57.400 | - We have different senses I guess, I'm very selfish, and you're like, I think it's all
01:12:03.920 | about me, I'm just doing what's easiest for me, I don't wanna, I mean but I have to of
01:12:09.520 | course choose the words that I think you're gonna know, I'm not gonna choose words you
01:12:14.200 | don't know, in fact I'm gonna fix that, so there it's about, but maybe for the syntax,
01:12:20.280 | for the combinations it's just about me, I feel like it's, I don't know though, it's
01:12:24.040 | very hard to-- - Wait, wait, wait, but the purpose of communication is to be understood,
01:12:27.920 | is to convince others and so on, so like the selfish thing is to be understood, so it's
01:12:32.680 | about the listener. - Okay, it's a little circular there too
01:12:34.000 | then, okay. - Right, I mean like the ease of production--
01:12:37.200 | - Helps me be understood then, I don't think it's circular, so I want what's--
01:12:42.320 | - No I think the primary objective is about the listener, 'cause otherwise if you're optimizing
01:12:49.400 | for the ease of production then you're not gonna have any of the interesting complexity
01:12:53.320 | of language, like you're trying to like explain-- - Well let's control for what it is I want
01:12:57.120 | to say, like I'm saying let's control for the thing, the message, control for the message,
01:13:01.880 | I want to tell you-- - But that means the message needs to be
01:13:03.280 | understood, that's the goal. - Oh but that's the meaning, so I'm still
01:13:06.440 | talking about the form, just the form of the meaning, how do I frame the form of the meaning
01:13:11.920 | is all I'm talking about, you're talking about a harder thing I think, it's like how am I,
01:13:16.040 | like trying to change the meaning, let's keep the meaning constant, like which, if you keep
01:13:21.200 | the meaning constant, how can I phrase whatever it is I need to say, like I gotta pick the
01:13:26.360 | right words and I'm gonna pick the order so that it's easy for me, that's what I think
01:13:31.920 | it's probably like. - I think I'm still tying meaning and form
01:13:36.040 | together in my head, but you're saying if you keep the meaning of what you're saying
01:13:40.320 | constant, the optimization, yeah it could be the primary objective that optimization
01:13:46.120 | is for production, that's interesting. I'm struggling to keep constant meaning, it's
01:13:54.120 | just so, I mean I'm a human, so for me the form, without having introspected on this,
01:14:02.440 | the form and the meaning are tied together, like deeply, because I'm a human, like for
01:14:09.680 | me when I'm speaking, 'cause I haven't thought about language, like in a rigorous way, about
01:14:14.800 | the form of language. - But look, for any event, there's an unbounded,
01:14:22.360 | I don't wanna say infinite, number of ways that I might communicate that same event.
01:14:26.760 | This two dogs entered a room, I can say in many, many different ways, I can say hey,
01:14:31.360 | there's two dogs, they entered the room. Hey, the room was entered by something, the thing
01:14:37.120 | that entered was two dogs, I mean that's kind of awkward and weird and stuff, but those
01:14:40.960 | are all similar messages with different forms, different ways I might frame, and of course
01:14:48.040 | I use the same words there all the time. I could have referred to the dogs as a Dalmatian
01:14:52.960 | and a poodle or something. I could have been more specific or less specific about what
01:14:56.760 | they are, and I could have said, been more abstract about the number. So I'm trying to
01:15:02.520 | keep the meaning, which is this event, constant, and then how am I gonna describe that to get
01:15:08.280 | that to you, it kind of depends on what you need to know, right, and what I think you
01:15:11.360 | need to know, but I'm like trying to, let's control for all that stuff, and not, and I'm
01:15:16.680 | just choosing, I'm doing something simpler than you're doing, which is just forms, just
01:15:21.800 | words. - So to you, specifying the breed of dog
01:15:25.960 | and whether they're cute or not is changing the meaning.
01:15:30.320 | - That might be, yeah, yeah, that would be changing, oh, that would be changing the meaning
01:15:32.840 | for sure. - Right, so you're just, well, yeah, yeah.
01:15:36.640 | That's changing the meaning, but say, even if we keep that constant, we can still talk
01:15:40.600 | about what's easier or hard for me, right, the listener and the, right? Which phrase
01:15:46.000 | structures I use, which combinations, which, you know.
01:15:49.080 | - This is so fascinating and just like a really powerful window into human language, but I
01:15:56.080 | wonder still throughout this how vast the gap between meaning and form. I just have
01:16:03.480 | this like maybe romanticized notion that they're close together, that they evolve close, like
01:16:09.120 | hand in hand, that you can't just simply optimize for one without the other being in the room
01:16:15.880 | with us. Like it's, well, it's kind of like an iceberg. Form is the tip of the iceberg
01:16:21.920 | and the rest, the meaning is the iceberg, but you can't like separate.
01:16:26.120 | - But I think that's why these large language models are so successful is 'cause they're
01:16:30.640 | good at form and form isn't that hard in some sense. And meaning is tough still and that's
01:16:35.960 | why they're not, you know, they don't understand what they're doing. We're gonna talk about
01:16:39.120 | that later maybe, but like we can distinguish in our, forget about large language models,
01:16:44.920 | like humans, maybe you'll talk about that later too, is like the difference between
01:16:49.200 | language, which is a communication system, and thinking, which is meaning. So language
01:16:54.440 | is a communication system for the meaning, it's not the meaning. And so that's why, I
01:16:59.760 | mean, and there's a lot of interesting evidence we can talk about relevant to that.
01:17:04.560 | - Well, I mean, that's a really interesting question. What is the difference between language,
01:17:10.800 | written, communicated, versus thought? What to use the difference between them?
01:17:19.040 | - Well, you or anyone has to think of a task, which they think is a good thinking task.
01:17:24.640 | And there's lots and lots of tasks, which should be good thinking tasks. And whatever
01:17:29.320 | those tasks are, let's say it's, you know, playing chess, or that's a good thinking
01:17:33.160 | task, or playing some game, or doing some complex puzzles, maybe remembering some digits,
01:17:39.640 | that's thinking, remembering some, a lot of different tasks we might think, maybe just
01:17:43.160 | listening to music is thinking, or there's a lot of different tasks we might think of
01:17:46.520 | as thinking. There's this woman in my department, Ev Fedorenko, and she's done a lot of work
01:17:51.640 | on this question about what's the connection between language and thought. And so she uses,
01:17:56.680 | I was referring earlier to MRI, fMRI, that's her primary method. And so she has been really
01:18:02.860 | fascinated by this question about whether, what language is. And so, as I mentioned earlier,
01:18:08.600 | you can localize my language area, your language area, in a few minutes. In like 15 minutes
01:18:13.920 | I can listen to language, listen to non-language, or backward speech, or something, and we'll
01:18:18.760 | find areas, left lateralized network in my head, which is specially, which is very sensitive
01:18:24.640 | to language, as opposed to whatever that control was, okay?
01:18:28.080 | - Can you specify what you mean by language, like communicated language? Like what is language?
01:18:31.880 | - Just sentences. You know, I'm listening to English of any kind, a story, or I can
01:18:35.680 | read sentences, anything at all that I understand, if I understand it, then it'll activate my
01:18:40.720 | language network.
01:18:41.720 | - In a stable way.
01:18:42.720 | - My language network is going like crazy when I'm talking, and when I'm listening to
01:18:45.960 | you, because we're both, we're communicating.
01:18:48.120 | - And that's pretty stable.
01:18:49.480 | - Yeah, it's incredibly stable. So I've, I happen to be married to this woman, Ev Fedorenko,
01:18:55.400 | and so I've been scanned by her over, and over, and over, since 2007, or six, or something.
01:18:59.680 | And so my language network is exactly the same, you know, like a month ago, as it was
01:19:04.480 | back in 2007.
01:19:05.480 | - Oh, wow.
01:19:06.480 | - It's amazingly stable, it's astounding. It's a really fundamentally cool thing. And
01:19:11.720 | so my language network is, it's like my face, okay? It's not changing much over time, inside
01:19:16.720 | my head.
01:19:17.720 | - Can I ask a quick question? Sorry, this is a small tangent. At which point in the,
01:19:22.280 | as you grow up from baby to adult, does it stabilize?
01:19:26.000 | - We don't know.
01:19:27.000 | - We don't know.
01:19:28.000 | - That's a very hard question. They're working on that right now, because of the problem
01:19:31.560 | scanning little kids. Like doing the, trying to do local, trying to do the localization
01:19:36.520 | on little children in this scanner. You're lying in the fMRI scanner, that's the best way
01:19:41.280 | to figure out where something's going on inside our brains. And the scanner's loud, and you're
01:19:45.680 | in this tiny little area, you're claustrophobic. And it doesn't bother me at all, I can go
01:19:50.360 | to sleep in there. But some people are bothered by it, and little kids don't really like it,
01:19:54.520 | and they don't like to lie still. And you have to be really still, because if you move
01:19:57.760 | around, that messes up the coordinates of where everything is. And so, you know, your
01:20:02.160 | question is, how and when is language developing, you know, how
01:20:07.440 | does this left-lateralized system come into play? And it's really hard to get a two year
01:20:11.480 | old to do this task. But you can maybe, they're starting to get three and four and five year
01:20:15.600 | olds to do this task for short periods, and it looks like it's there pretty early.
01:20:19.960 | - So clearly, when you lead up to a baby's first words, before that, there's a lot of
01:20:26.120 | fascinating turmoil going on about figuring out, what are these people saying? And you're
01:20:32.720 | trying to make sense, how does that connect to the world, and all that kind of stuff.
01:20:36.960 | That might be just fascinating development that's happening there. That's hard to introspect.
01:20:41.760 | - But anyway, we're back to the scanner. And I can find my network in 15 minutes, and now
01:20:47.640 | we can ask, find my network, find yours, find, you know, 20 other people do this task. And
01:20:53.080 | we can do some other tasks, anything else you think is thinking, or some other thing.
01:20:56.880 | I can do a spatial memory task. I can do a music perception task. I can do a programming
01:21:03.920 | task, if I program, okay? I can do one where I understand computer programs. And none
01:21:10.080 | of those tasks tap the language network at all. Like, at all. There's no overlap. They're
01:21:15.320 | highly activated in other parts of the brain. There's a bilateral network, which I think
01:21:20.880 | she tends to call the multiple demands network, which does anything kind of hard. And so anything
01:21:25.360 | that's kind of difficult in some ways will activate that multiple demands network. I
01:21:30.480 | mean, music will be in some music area. You know, there's music-specific kinds of areas.
01:21:36.560 | But none of them are activating the language area at all, unless there's words. Like, so
01:21:41.440 | if you have music, and there's a song, and you can hear the words, then you get the language
01:21:45.640 | area.
01:21:46.640 | - Are we talking about speaking and listening? Or are we also talking about reading?
01:21:50.520 | - This is all comprehension of any kind.
01:21:52.840 | - That is fascinating.
01:21:54.680 | - So this network doesn't make any difference if it's written or spoken. So the thing that
01:22:00.720 | she calls, Fedorenko calls, the language network is this high-level language. So it's not about
01:22:05.160 | the spoken language, and it's not about the written language. It's about either one of
01:22:09.240 | them. And so when you do speech, you either listen to speech, and you subtract away some
01:22:14.840 | language you don't understand, or you subtract away backward speech, which sounds like speech,
01:22:20.760 | but it isn't. And then so you take away the sound part altogether. And then if you do
01:22:26.680 | written, you get exactly the same network. So for just reading the language versus reading
01:22:32.040 | sort of nonsense words or something like that, you'll find exactly the same network. And
01:22:36.280 | so this is about high-level comprehension of language, yeah, in this case. And the same
01:22:41.560 | thing happens, production's a little harder to run in the scanner, but the same thing happens
01:22:44.280 | in production. You get the same network. So production's a little harder, right? You have
01:22:47.320 | to figure out how do you run a task in the network such that you're doing some kind of
01:22:50.920 | production. And I can't remember what, they've done a bunch of different kinds of tasks there
01:22:54.360 | where you get people to produce things, yeah, figure out how to produce. And the same network
01:22:59.720 | goes on there. It's actually the same place. - Wait, wait, so if you read random words?
01:23:04.600 | - Yeah, if you read things like-- - Like gibberish.
01:23:07.480 | - Yeah, yeah, Lewis Carroll's "'Twas brillig," "Jabberwocky," right? They call that jabberwocky
01:23:12.600 | speech. - The network doesn't get activated.
01:23:14.760 | - Not as much. There are words in there. - Yeah, 'cause it's still--
01:23:17.880 | - There's function words and stuff, so it's lower activation.
01:23:20.600 | - Fascinating. - Yeah, yeah. So there's like,
01:23:22.440 | basically, the more language-like it is, the higher it goes in the language network. And
01:23:27.000 | that network is there from when you speak, as soon as you learn language. And it's there,
01:23:33.560 | like you speak multiple languages, the same network is going for your multiple languages.
01:23:37.640 | So you speak English, you speak Russian, both of them are hitting that same network if you're
01:23:43.000 | fluent in those languages. - So programming--
01:23:45.080 | - Not at all. Isn't that amazing? Even if you're a really good programmer, that is not a human
01:23:50.520 | language. It's just not conveying the same information. And so it is not in the language
01:23:55.480 | network. - That is mind-blowing,
01:23:57.240 | as I think. That's weird. - It's pretty cool.
01:23:58.600 | - That's weird. - It is amazing.
01:23:59.880 | - That's really weird. - And so that's like one set of data.
01:24:01.800 | This is hers; it shows that what you might think is thinking is not language. Language is just
01:24:08.440 | this conventionalized system that we've worked out in human languages. Oh, another fascinating
01:24:14.600 | little tidbit is that even if there are these constructed languages like Klingon, or I don't
01:24:21.560 | know the languages from Game of Thrones, I'm sorry, I don't remember those languages.
01:24:24.600 | - There's a lot of people offended right now. - There's people that speak those languages.
01:24:28.200 | They really speak those languages because the people that wrote the languages for the shows,
01:24:34.920 | they did an amazing job of constructing something like a human language. And that lights up the
01:24:40.840 | language area. Because they can speak pretty much arbitrary thoughts in a human language.
01:24:46.840 | It's a constructed human language, and probably it's related to human languages because the people
01:24:51.560 | that were constructing them were making them like human languages in various ways. But it also
01:24:56.040 | activates the same network, which is pretty cool. Anyway.
01:24:59.400 | - Sorry to go into a place where you may be a little bit philosophical, but is it possible
01:25:05.400 | that this area of the brain is doing some kind of translation into a deeper set of
01:25:09.960 | almost like concepts? - It has to be doing that.
01:25:14.760 | So it's doing that in communication, right? It is translating from thought, whatever that is,
01:25:19.960 | it's more abstract, and it's doing that. That's what it's doing. That is kind of what it is doing.
01:25:24.920 | It's kind of a meaning network, I guess. - Yeah, like a translation network. But I
01:25:29.240 | wonder what is at the core, at the bottom of it, what are thoughts? Are thoughts,
01:25:34.440 | to me like thoughts and words, are they neighbors, or is it one turtle sitting on top of the other?
01:25:41.960 | Meaning like, is there a deep set of concepts that we--
01:25:46.280 | - Well, there's connections between what these things mean, and then there's probably other
01:25:51.240 | parts of the brain that what these things mean. And so when I'm talking about whatever it is I
01:25:56.360 | want to talk about, it'll be represented somewhere else. That knowledge of whatever that is will be
01:26:01.400 | represented somewhere else. - Well, I wonder if there's some stable,
01:26:04.840 | nicely compressed encoding of meanings that's separate from language. I guess the implication
01:26:14.200 | here is that we don't think in language. - That's correct. Isn't that cool? And that's
01:26:21.720 | so interesting. So people, I mean, this is like hard to do experiments on, but there is this idea
01:26:26.680 | of inner voice, and a lot of people have an inner voice. And so if you do a poll on the internet and
01:26:32.360 | ask if you hear yourself talking when you're just thinking or whatever, about 70 or 80% of people
01:26:37.720 | will say yes. Most people have an inner voice. I don't. And so I always find this strange. So when
01:26:44.280 | people talk about an inner voice, I always thought this was a metaphor, and they hear. I know most of
01:26:50.360 | you, whoever's listening to this, thinks I'm crazy now 'cause I don't have an inner voice, and I just
01:26:55.240 | don't know what you're listening to. It sounds so kind of annoying to me to have this voice going on
01:27:01.000 | while you're thinking, but I guess most people have that, and I don't have that, and we don't
01:27:06.760 | really know what that connects to. - I wonder if the inner voice activates
01:27:10.280 | that same network. I wonder. - I don't know. I don't know. I mean,
01:27:14.280 | this could be speechy, right? So that's like, you hear. Do you have an inner voice?
01:27:17.720 | - I don't think so. - Oh. A lot of people have
01:27:20.280 | this sense that they hear themselves, and then say they read someone's email. I've heard people tell
01:27:25.960 | me that they hear that other person's voice when they read other people's emails, and I'm like,
01:27:31.640 | wow, that sounds so disruptive. - I do think I vocalize what I'm reading,
01:27:36.520 | but I don't think I hear a voice. - Well, you probably don't have
01:27:39.800 | an inner voice. - Yeah, I don't think I have an inner voice.
01:27:40.840 | - People have an inner voice. People have this strong percept of hearing sound in their heads
01:27:46.600 | when they're just thinking. - I refuse to believe
01:27:49.000 | that's the majority of people. - Majority, absolutely.
01:27:51.320 | - What? - It's like two-thirds or
01:27:53.320 | three-quarters. It's a lot. - What?
01:27:54.600 | - Whenever I ask a class, and when I go on the internet, they always say that.
01:27:58.280 | So you're in a minority. - It could be a self-report flaw.
01:28:01.320 | - It could be. - You know, when I'm reading
01:28:03.480 | inside my head, I'm kind of like saying the words, which is probably the wrong way to read,
01:28:12.920 | but I don't hear a voice. There's no percept of a voice. I refuse to believe the majority
01:28:19.400 | people have it. Anyway, it's a fascinating, the human brain is fascinating, but it still blew
01:28:23.560 | my mind that language does appear, comprehension does appear to be separate from thinking.
01:28:31.240 | - Mm-hmm, so that's one set. One set of data from Fedorenko's group is that no matter what task you
01:28:39.160 | do, if it doesn't have words and combinations of words in it, then it won't light up the language
01:28:43.800 | network. It'll be active somewhere else, but not there. So that's one. And then this other
01:28:49.320 | piece of evidence relevant to that question is it turns out there's this group of people who've
01:28:56.680 | had a massive stroke on the left side and wiped out their language network. And as long as they
01:29:02.520 | didn't wipe out everything on the right as well, in that case, they wouldn't be cognitively
01:29:06.280 | functional. But if they just wiped out language, which is pretty tough to do because it's very
01:29:11.160 | expansive on the left, but if they have, then there are patients like this, so-called global
01:29:17.240 | aphasics, who can do any task just fine, but not language. You can't talk to them. I mean,
01:29:24.760 | they don't understand you. They can't speak, can't write, they can't read, but they can play chess,
01:29:31.240 | they can drive their cars, they can do all kinds of other stuff, do math. So math is not in the
01:29:36.520 | language area, for instance. You do arithmetic and stuff, that's not language area. It's got
01:29:40.680 | symbols. So people sort of confuse some kind of symbolic processing with language, and symbolic
01:29:44.440 | processing is not the same. So there are symbols and they have meaning, but it's not language. It's
01:29:49.400 | not a conventionalized language system. And so math isn't there. And so they can do math. They
01:29:55.720 | do just as well as their age-matched controls on all these tasks. This is Rosemary Varley over at
01:30:01.080 | University College London, who has a bunch of patients with whom she's shown this.
01:30:05.320 | So that sort of combination suggests that language isn't necessary for thinking. It doesn't mean you
01:30:14.040 | can't think in language. You could think in language, 'cause language allows a lot of
01:30:17.640 | expression, but it's just, you don't need it for thinking. It suggests that language is separate,
01:30:22.280 | is a separate system. - This is kind of blowing
01:30:24.920 | my mind right now. - It's cool, isn't it?
01:30:26.040 | - I'm trying to load that in, because it has implications for large language models.
01:30:32.120 | - It sure does, and they've been working on that.
01:30:34.280 | - Well, let's take a stroll there. You wrote that the best current theories of human language are
01:30:39.320 | arguably large language models. So this has to do with form.
01:30:42.760 | - It's kind of a big theory, but the reason it's arguably the best is that it does the best at
01:30:49.720 | predicting what's English, for instance. It's incredibly good, better than any other theory.
01:30:55.800 | But there's not enough detail.
01:31:00.760 | - Well, it's opaque. You don't know what's going on.
01:31:03.960 | - You don't know what's going on. It's another black box. But I think it is a theory.
01:31:07.640 | - What's your definition of a theory? 'Cause it's a gigantic black box with a very large
01:31:13.640 | number of parameters controlling it. To me, theory usually requires a simplicity, right?
01:31:19.960 | - Well, I don't know. Maybe I'm just being loose there. I think it's not a great theory,
01:31:24.920 | but it's a theory. It's a good theory in one sense, in that it covers all the data.
01:31:28.760 | Like anything you want to say in English, it does. And so that's how it's arguably the best,
01:31:33.080 | is that no other theory is as good as a large language model in predicting exactly what's good
01:31:38.440 | and what's bad in English. Now you're saying, is it a good theory? Well, probably not, you know,
01:31:43.800 | because I want a smaller theory than that. It's too big. I agree.
01:31:46.920 | - You could probably construct a mechanism by which it can generate a simple explanation
01:31:53.400 | of a particular language, like a set of rules. It could generate a dependency
01:32:01.400 | grammar for a language, right? - Yes.
01:32:03.240 | - You could probably just ask it about itself. - Well, you know, that presumes,
01:32:14.520 | and there's some evidence for this, that some large language models are implementing something
01:32:20.680 | like dependency grammar inside them. And so there's work from a guy called Chris Manning
01:32:25.560 | and colleagues over at Stanford in natural language. And they looked at, I don't know
01:32:31.960 | how many large language model types, but certainly BERT and some others, where you do some kind of
01:32:38.120 | fancy math to figure out exactly what kinds of abstractions or representations are going on.
01:32:43.320 | And they were saying, it does look like dependency structure is what they're constructing. So it's
01:32:49.160 | actually a very, very good map. So they are constructing something like that. Does it mean
01:32:55.960 | that they're using that for meaning? I mean, probably, but we don't know.
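To make the Stanford probing idea a bit more concrete, here is a minimal sketch of the shape of that computation, assuming the Hugging Face transformers library and bert-base-uncased. The probe matrix B below is random, not trained against gold parses, so this only illustrates how one would compare pairwise distances in the model's vector space against distances in a dependency tree; it is not the actual probe from that work.

```python
# Sketch of a structural-probe-style computation (B is an untrained stand-in).
# Requires: pip install torch transformers
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "The reporter who the senator attacked admitted the error."
inputs = tok(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]   # (num_wordpieces, 768)

B = torch.randn(128, hidden.shape[-1])              # stand-in for the learned probe map

def probe_distance(i, j):
    # squared L2 distance between the probe-transformed vectors of tokens i and j
    diff = B @ (hidden[i] - hidden[j])
    return float(diff @ diff)

# In the real probe these pairwise distances are compared against the number of
# edges separating word i and word j in a gold dependency tree.
print(probe_distance(2, 7))   # roughly "reporter" vs. "admitted"; wordpiece indices are approximate
```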
01:33:00.360 | - You write that the kinds of theories of language that LLMs are closest to
01:33:05.000 | are called construction-based theories. Can you explain what construction-based theories are?
01:33:09.160 | - It's just a general theory of language such that there's a form and a meaning pair
01:33:16.360 | for lots of pieces of the language. And so it's primarily usage-based, is the construction
01:33:21.720 | grammar. It's trying to deal with the things that people actually say, actually say and actually
01:33:27.480 | write. And so it's a usage-based idea. And what's a construction? A construction is either a simple
01:33:33.720 | word, so like a morpheme plus its meaning, or a combination of words. It's basically
01:33:39.320 | combinations of words, like the rules. But it's unspecified as to what the form of the grammar
01:33:49.560 | is underlyingly. And so I would argue that the dependency grammar is maybe the right form to use
01:33:56.760 | for the types of construction grammar. Construction grammar typically isn't quite formalized.
01:34:03.480 | And so maybe a formalization of that might be in dependency grammar.
01:34:09.400 | I mean, I would think so. But I mean, it's up to people, other researchers in that area,
01:34:14.520 | if they agree or not. - Do you think
01:34:17.160 | that large language models understand language? Are they mimicking language? I guess the deeper
01:34:23.720 | question there is, are they just understanding the surface form? Or do they understand something
01:34:29.720 | deeper about the meaning that then generates the form? - I mean, I would argue they're doing the
01:34:35.160 | form. They're doing the form, they're doing it really, really well. And are they doing the
01:34:38.440 | meaning? No, probably not. I mean, there's lots of these examples from various groups showing that
01:34:44.120 | they can be tricked in all kinds of ways. They really don't understand the meaning of what's
01:34:48.440 | going on. And so there's a lot of examples that he and other groups have given, which show they
01:34:55.400 | don't really understand what's going on. So you know the Monty Hall problem is this silly problem,
01:35:00.440 | right? It's Let's Make a Deal, this old game show,
01:35:06.040 | and there's three doors, and there's a prize behind one, and there's some junk prizes behind
01:35:12.680 | the other two, and you're trying to select one. And Monty, he knows where the
01:35:18.760 | target item is, the good thing; he knows everything that's back there. He gives
01:35:24.360 | you a choice, you choose one of the three, and then he opens one of the doors, and it's some
01:35:28.040 | junk prize. And then the question is, should you trade to get the other one? And the answer is yes,
01:35:32.360 | you should trade, because he knew which doors he could open, and so now the odds are two
01:35:36.440 | thirds, okay? And then if you just change that a little bit for the large language model... the large
01:35:41.720 | language model has seen that explanation so many times. So if you change the story
01:35:47.560 | a little bit, so it sounds like the Monty Hall problem but it's not, you just say,
01:35:51.720 | "Oh, there's three doors, and one behind them is a good prize, and there's two bad doors. I happen
01:35:57.400 | to know it's behind door number one. The good prize, the car, is behind door number one. So,
01:36:01.800 | I'm going to choose door number one. Monty Hall opens door number three and shows me nothing
01:36:05.560 | there. Should I trade for door number two, even though I know the good prize is in door number
01:36:09.560 | one?" And then the large language model will say, "Yes, you should trade," because it just goes
01:36:13.960 | through the forms that it's seen before so many times on these cases, where it's, "Yes, you should
01:36:20.920 | trade, because your odds have shifted from one and three now to two out of three to being that thing."
01:36:25.640 | It doesn't have any way to remember that actually you have 100% probability behind that door number
01:36:31.800 | one. You know that. That's not part of the scheme that it's seen hundreds and hundreds of times
01:36:37.160 | before. Even if you try to explain to it that it's wrong, that it can't do that, it'll just keep
01:36:43.080 | giving you back the problem. - But it's also possible the large language model will be aware
01:36:48.200 | of the fact that there's sometimes over-representation of a particular kind of formulation.
01:36:55.800 | And it's easy to get tricked by that. So you could see if they get larger and larger,
01:37:01.880 | models be a little bit more skeptical. So you see over-representation. So it just feels like
01:37:08.040 | training on form can go really far in terms of being able to generate
01:37:19.160 | things that look like the thing understands deeply the underlying world model of the kind of
01:37:28.200 | mathematical world, physical world, psychological world that would generate these kinds of sentences.
01:37:36.600 | It just feels like you're creeping close to the meaning part. Easily fooled, all this kind of
01:37:42.600 | stuff. But that's humans too. So it just seems really impressive how often it seems like it
01:37:51.320 | understands concepts. - I mean, you don't have to convince me of that. I am very, very impressed.
01:37:58.120 | I mean, you're giving a possible world where maybe someone's going to train some other versions such
01:38:05.480 | that it'll be somehow abstracting away from types of forms. I mean, I don't think that's happened.
01:38:11.880 | - Well, no, no, no. I'm not saying that. I think when you just look at anecdotal examples
01:38:17.640 | and just showing a large number of them where it doesn't seem to understand and it's easily fooled,
01:38:22.680 | that does not seem like a scientific data-driven analysis of how many places is damn impressive
01:38:32.360 | in terms of meaning and understanding and how many places is easily fooled.
01:38:35.560 | - That's not the inference. So I don't want to make that. The inference I wouldn't want to make
01:38:40.760 | was that inference. The inference I'm trying to push is just that is it like humans here? It's
01:38:46.120 | probably not like humans here. It's different. So humans don't make that error. If you explain that
01:38:50.920 | to them, they're not going to make that error. They don't make that error. And so it's doing
01:38:55.320 | something different from humans that they're doing in that case. - Well, what's the mechanism by which
01:39:00.360 | humans figure out that it's an error? - I'm just saying the error there is like, if I explain to
01:39:04.840 | you there's a 100% chance that the car is behind this door, well, do you want to trade? People say
01:39:11.240 | no. But this thing will say yes, because it's so wound up on the form.
01:39:17.480 | That's an error that a human doesn't make, which is kind of interesting.
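For readers who want the probabilities spelled out, here is a quick sketch in Python contrasting the classic game with the modified version described above. The wording of the variant is paraphrased from the conversation; the point is just the numbers: switching wins about two thirds of the time in the standard problem, but can never win once you already know the car is behind your chosen door.

```python
import random

def classic_monty(trials=100_000):
    switch_wins = stay_wins = 0
    for _ in range(trials):
        car = random.randrange(3)           # car placed uniformly at random
        pick = random.randrange(3)          # contestant guesses blindly
        # Monty opens a junk door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        switch = next(d for d in range(3) if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switch == car)
    return stay_wins / trials, switch_wins / trials

def modified_monty():
    # The variant above: the contestant knows the car is behind door 1 and picks it;
    # Monty opens door 3. Switching to door 2 can never win.
    car, pick, switch = 0, 0, 1
    return float(pick == car), float(switch == car)

print("classic  (stay, switch):", classic_monty())   # roughly (0.33, 0.67)
print("modified (stay, switch):", modified_monty())  # (1.0, 0.0)
```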
01:39:22.840 | - Less likely to make, I should say. - Yeah, less likely.
01:39:25.640 | - Because like humans are very. - Oh yeah.
01:39:28.440 | - I mean, you're asking a system to understand
01:39:34.200 | 100%, like you're asking about mathematical concepts. And so... - Look, the places
01:39:40.600 | where large language models are, the form is amazing. So let's go back to nested structures,
01:39:46.840 | center-embedded structures, okay? If you ask a human to complete those, they can't do it.
01:39:50.920 | Neither can a large language model. They're just like humans in that. If you ask, if I ask a large
01:39:55.960 | language model. - That's fascinating, by the way. The
01:39:58.440 | center embedding, it struggles with.
01:40:01.400 | - Just like humans, exactly like humans. Exactly the same way as humans. And that's not trained.
01:40:06.360 | So they do exactly, so that is a similarity. So but then it's, that's not meaning, right? This
01:40:13.400 | is form. But when we get into meaning, this is where they get kind of messed up, where you start
01:40:17.960 | to saying, oh, what's behind this door? Oh, it's, you know, this is the thing I want. Humans don't
01:40:22.920 | mess that up as much. Here, the form is just like, the form matches amazing, similar, without being
01:40:31.160 | trained to do that. I mean, it's trained in the sense that it's getting lots of data, which is
01:40:34.840 | just like human data, but it's not being trained on bad sentences and being told what's bad. It
01:40:41.720 | just can't do those. It'll actually say things like, those are too hard for me to complete,
01:40:46.760 | or something, which is kind of interesting, actually. How does it know that? I don't know.
01:40:50.120 | Oh, but it really often doesn't just complete sentences. It very often says stuff that's true,
01:40:58.280 | and sometimes says stuff that's not true. And almost always the form is great.
01:41:04.840 | But it's still very surprising that with really great form, it's able to generate a lot of things
01:41:12.440 | that are true, based on what it's trained on and so on. So it's not just form that is
01:41:19.800 | generating. It's mimicking true statements from the internet. I guess the underlying idea there
01:41:28.040 | is that on the internet, truth is overrepresented versus falsehoods.
01:41:33.160 | Yeah, I think that's probably right, yeah.
01:41:34.840 | So, but the fundamental thing it's trained on, you're saying, is just form.
01:41:39.240 | I think so, yeah. Yeah, I think so.
01:41:41.160 | Well, that's a sad... To me, that's still a little bit of an open question. I probably lean
01:41:48.120 | agreeing with you, especially now you've just blown my mind that there's a separate module
01:41:54.440 | in the brain for language versus thinking. Maybe there's a fundamental part missing from
01:42:00.680 | the large language model approach that lacks the thinking, the reasoning capability.
01:42:06.840 | Yeah, that's what this group argues. So the same group, Fedorenko's group,
01:42:13.800 | has a recent paper arguing exactly that. There's a guy called Kyle Mahowald, who's here in Austin,
01:42:20.360 | Texas, actually. He's an old student of mine, but he's a faculty in linguistics at Texas,
01:42:25.000 | and he was the first author on that.
01:42:26.200 | That's fascinating. Still, to me, an open question.
01:42:30.440 | Yeah.
01:42:31.080 | What to you are the interesting limits of LLMs?
01:42:32.920 | You know, I don't see any limits to their form. Their form is perfect.
01:42:38.920 | Impressive.
01:42:39.480 | Yeah, yeah, yeah. It's pretty much... I mean, it's close to...
01:42:41.800 | Well, you said ability to complete center embeddings.
01:42:44.920 | Yeah, it's just the same as humans. It seems the same.
01:42:47.560 | But that's not perfect, right? It should be able to...
01:42:49.080 | That's good. No, but I want it to be like humans. I want a model of humans.
01:42:53.400 | Oh, wait, wait, wait. Oh, so perfect is as close to humans as possible. I got it.
01:42:59.080 | Yeah, yeah.
01:42:59.640 | But you should be able to, if you're not human, like you're superhuman,
01:43:03.160 | you should be able to complete center-embedded sentences, right?
01:43:06.600 | I mean, that's the mechanism. If it's modeling something,
01:43:10.840 | I think it's kind of really interesting that it can't...
01:43:13.320 | That it's really interesting.
01:43:14.120 | That it's more like... I think it's potentially
01:43:17.240 | underlyingly modeling something like the way the form is processed.
01:43:21.560 | The form of human language.
01:43:22.600 | Yeah, the way that...
01:43:23.400 | And how humans process the language.
01:43:25.960 | Yes, yes. I think that's plausible.
01:43:27.800 | And how they generate language. Process language and generate language, that's fascinating.
01:43:31.400 | Yeah.
01:43:31.900 | So in that sense, they're perfect.
01:43:35.160 | If we can just linger on the center embedding thing, that's hard for LLMs to produce,
01:43:40.040 | and that seems really impressive because that's hard for humans to produce.
01:43:43.400 | And how does that connect to the thing we've been talking about before,
01:43:48.520 | which is the dependency grammar framework in which you view language,
01:43:52.920 | and the finding that short dependencies seem to be a universal part of language.
01:43:58.120 | So why is it hard to complete center embeddings?
01:44:01.960 | So what I like about dependency grammar is it makes
01:44:05.480 | the cognitive cost associated with longer distance connections very transparent.
01:44:14.360 | Turns out there is a cost associated with producing and comprehending
01:44:19.480 | connections between words which are just not beside each other.
01:44:23.320 | The further apart they are, the worse it is, according to...
01:44:27.160 | Well, we can measure that.
01:44:28.520 | And there is a cost associated with that.
01:44:30.840 | Can you just linger on what do you mean by cognitive cost?
01:44:33.960 | Sure.
01:44:34.200 | And how do you measure it?
01:44:34.840 | Oh, well, you can measure it in a lot of ways.
01:44:36.760 | The simplest is just asking people to say how good a sentence sounds.
01:44:42.280 | We just ask.
01:44:43.320 | That's one way to measure.
01:44:44.360 | And you can try to triangulate then across sentences and across structures
01:44:48.920 | to try to figure out what the source of that is.
01:44:50.840 | You can look at reading times in controlled materials.
01:44:56.760 | In certain kinds of materials, and then we can measure the dependency distances there.
01:45:01.320 | There's a recent study which looked at...
01:45:05.320 | We're talking about the brain here.
01:45:08.120 | We could look at the language network, okay?
01:45:09.960 | We could look at the language network and we could look at the activation
01:45:13.240 | in the language network and how big the activation is depending on
01:45:17.480 | the length of the dependencies.
01:45:18.920 | And it turns out in just random sentences that you're listening to,
01:45:21.720 | if you're listening to...
01:45:22.440 | So it turns out there are people listening to stories here.
01:45:25.480 | And the bigger...
01:45:27.080 | The longer the dependency is, the stronger the activation in the language network.
01:45:33.400 | And so there's some measure.
01:45:35.240 | There's a bunch of different measures we could do.
01:45:37.240 | That's kind of a neat measure, actually, of actual...
01:45:39.880 | - Activations.
01:45:40.760 | - Activation in the brain.
01:45:41.880 | - So you can somehow, in different ways, convert it to a number.
01:45:44.920 | I wonder if there's a beautiful equation connecting cognitive costs
01:45:47.880 | and length of dependency.
01:45:49.240 | E equals MC squared kind of thing.
01:45:50.920 | - Yeah, it's complicated, but probably it's doable.
01:45:54.040 | I would guess it's doable.
01:45:55.480 | I tried to do that a while ago and I was reasonably successful,
01:46:00.360 | but for some reason I stopped working on that.
01:46:02.200 | I agree with you that it would be nice to figure out...
01:46:04.600 | So there's some way to figure out the cost.
01:46:07.400 | I mean, it's complicated.
01:46:08.680 | Another issue you raised before was how do you measure distance?
01:46:12.120 | Is it words?
01:46:12.840 | It probably isn't just words. Part of the problem
01:46:15.960 | is that some words matter more than others, and probably meaning: nouns
01:46:22.040 | might matter, and then it maybe depends on which kind of noun.
01:46:25.080 | Is it a noun we've already introduced or a noun that's already been mentioned?
01:46:28.440 | Is it a pronoun versus a name?
01:46:29.960 | All these things probably matter.
01:46:32.280 | So probably the simplest thing to do is just like,
01:46:34.120 | "Oh, let's forget about all that and just think about words or morphemes."
01:46:38.200 | - For sure.
01:46:39.160 | But there might be some insight in the kind of function that fits the data,
01:46:47.480 | meaning like a quadratic, like what...
01:46:49.320 | - I think it's an exponential.
01:46:51.480 | - Exponential.
01:46:51.720 | - So we think it's probably an exponential such that the longer the distance,
01:46:55.560 | the less it matters.
01:46:57.080 | And so then it's the sum of those is my...
01:47:00.440 | That was our best guess a while ago.
01:47:02.120 | So you've got a bunch of dependencies.
01:47:03.560 | If you've got a bunch of them that are being connected at some point,
01:47:06.680 | that's at the ends of those, the cost is some exponential function of those is my guess.
01:47:13.240 | But because the reason it's probably an exponential is like it's not just the distance
01:47:18.200 | between two words because I can make a very, very long subject verb dependency
01:47:21.960 | by adding lots and lots of noun phrases and prepositional phrases,
01:47:25.400 | and it doesn't matter too much.
01:47:27.240 | It's when you do nested, when I have multiple of these,
01:47:30.040 | then things go really bad, go south.
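As a rough sketch of the word-counting abstraction described here, the snippet below uses an off-the-shelf dependency parser (spaCy's small English model, an assumption on my part, not the tooling used in the studies) to measure head-dependent distances in words, plus a toy saturating per-dependency cost of the kind gestured at above. The exponential form and the alpha value are illustrative guesses, not the fitted function; and the parser may well mis-parse the doubly nested example, which is itself a small illustration of how hard those structures are.

```python
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import math
import spacy

nlp = spacy.load("en_core_web_sm")

def dependency_lengths(sentence):
    doc = nlp(sentence)
    # distance in words between each token and its syntactic head
    return [abs(tok.i - tok.head.i) for tok in doc if tok.head is not tok]

def toy_cost(sentence, alpha=0.5):
    # each dependency contributes 1 - exp(-alpha * distance): longer links cost
    # more, but each extra word of distance adds less and less (a made-up form)
    return sum(1 - math.exp(-alpha * d) for d in dependency_lengths(sentence))

flat = "The senator attacked the reporter who admitted the error."
nested = "The reporter who the senator who the aide met attacked admitted the error."
for s in (flat, nested):
    print(f"{toy_cost(s):5.2f}  {s}")
```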
01:47:34.360 | - Probably somehow connected to working memory or something like this.
01:47:36.920 | - Yeah, that's probably a function of the memory here is the access,
01:47:40.760 | is trying to find those earlier things.
01:47:43.640 | It's kind of hard to figure out what was referred to earlier.
01:47:47.320 | Those are those connections.
01:47:48.280 | That's the sort of notion of retrieval, as opposed to a storage thing:
01:47:51.960 | trying to connect, retrieve those earlier words depending on what was in between.
01:47:57.480 | And then we're talking about interference of similar things in between.
01:48:01.240 | The right theory probably has that kind of notion,
01:48:04.200 | and it is interference of similar things.
01:48:06.280 | And so I'm dealing with an abstraction over the right theory,
01:48:09.000 | which is just, you know, let's count words.
01:48:10.600 | It's not right, but it's close.
01:48:12.120 | And then maybe you're right though, there's some sort of an exponential
01:48:15.800 | or something to figure out the total,
01:48:18.280 | so we can figure out a function for any given sentence in any given language.
01:48:22.920 | But you know, it's funny, people haven't done that too much,
01:48:25.640 | which I do think is, I'm interested that you find that interesting.
01:48:29.560 | I really find that interesting,
01:48:30.760 | and a lot of people haven't found it interesting.
01:48:32.600 | And I don't know why I haven't got people to want to work on that.
01:48:35.480 | I really like that too.
01:48:36.440 | - No, that's beautiful, and the underlying idea is beautiful,
01:48:40.120 | that there's a cognitive cost that correlates with the length of dependency.
01:48:44.760 | It feels like, I mean, language is so fundamental to the human experience,
01:48:48.440 | and this is a nice, clean theory of language where it's like, wow, okay,
01:48:55.560 | so we like our words close together, dependent words close together.
01:49:00.200 | - Yeah, that's why I like it too.
01:49:01.800 | It's so simple.
01:49:02.600 | - Yeah, the simplicity of the theory, yeah.
01:49:04.000 | - It's so simple, and yet it explains some very complicated phenomena.
01:49:07.160 | If I write these very complicated sentences,
01:49:09.640 | it's kind of hard to know why they're so hard,
01:49:11.400 | and you can like, oh, nail it down.
01:49:13.320 | I can give you a math formula for why each one of them is bad and where,
01:49:17.960 | and that's kind of cool.
01:49:19.160 | I think that's very neat.
01:49:20.120 | - Have you gone through the process?
01:49:21.480 | Is there like, if you take a piece of text and then simplify,
01:49:25.560 | sort of like there's an average length of dependency,
01:49:29.720 | and then you like, you know, reduce it and see comprehension on the entire,
01:49:35.320 | not just single sentence, but like, you know,
01:49:37.480 | you go from James Joyce to Hemingway or something.
01:49:40.040 | - No, no, simple answer is no.
01:49:43.880 | That does, there's probably things you can do in that kind of direction.
01:49:46.760 | - That's fun.
01:49:47.480 | - We might, you know, we're gonna talk about legalese at some point,
01:49:50.760 | so maybe we'll talk about that kind of thinking with applied to legalese.
01:49:55.400 | - Well, let's talk about legalese, 'cause you mentioned that as an exception.
01:49:58.040 | We're just taking it tangent upon tangent.
01:49:59.880 | That's an interesting one.
01:50:00.760 | You give it as an exception.
01:50:02.360 | - It's an exception.
01:50:03.480 | - That you say that most natural languages, as we've been talking about,
01:50:08.840 | have local dependencies, with one exception, legalese.
01:50:12.440 | - That's right.
01:50:13.080 | - So what is legalese, first of all?
01:50:15.160 | - Oh, well, legalese is what you think it is.
01:50:18.520 | It's just any legal language.
01:50:19.880 | - Well, I mean, like, I actually know very little
01:50:22.440 | about the kind of language that lawyers use.
01:50:24.120 | - So I'm just talking about language in laws and language in contracts.
01:50:28.280 | - Got it.
01:50:28.680 | - So the stuff that you have to run into,
01:50:30.520 | we have to run into every other day or every day,
01:50:34.040 | and you skip over it because it reads poorly.
01:50:38.280 | And, or, you know, partly it's just long, right?
01:50:40.760 | There's a lot of text there that we don't really want to know about.
01:50:43.480 | And so, but the thing I'm interested in,
01:50:46.200 | so I've been working with this guy called Eric Martinez,
01:50:49.960 | who is a, he was a lawyer who was taking my class.
01:50:53.560 | I was teaching a psycholinguistics lab class,
01:50:55.800 | and I have been teaching it for a long time at MIT,
01:50:57.640 | and he's a, he was a law student at Harvard.
01:51:00.120 | And he took the class 'cause he had done some linguistics as an undergrad,
01:51:03.400 | and he was interested in the problem of why legalese sounds hard to understand.
01:51:09.320 | You know, why, and so why is it hard to understand,
01:51:11.880 | and why do they write that way if it is so hard to understand?
01:51:15.320 | It seems apparent that it's hard to understand.
01:51:17.320 | The question is, why is it?
01:51:18.760 | And so we didn't know.
01:51:20.280 | And we did an evaluation of a bunch of contracts.
01:51:24.760 | Actually, we just took a bunch of sort of random contracts,
01:51:27.640 | 'cause I don't know, you know, there's,
01:51:29.240 | contracts in laws might not be exactly the same,
01:51:31.400 | but contracts are kind of the things
01:51:33.720 | that most people have to deal with most of the time.
01:51:36.040 | And so that's kind of the most common thing that humans have,
01:51:38.680 | like humans, that adults in our industrialized society
01:51:43.000 | have to deal with a lot.
01:51:44.200 | And so that's what we polled.
01:51:46.520 | And we didn't know what was hard about them,
01:51:48.520 | but it turns out that the way they're written is very center-embedded,
01:51:53.320 | has nested structures in them.
01:51:54.600 | So it has low-frequency words as well.
01:51:56.920 | That's not surprising.
01:51:57.800 | Lots of texts have low-frequency words,
01:51:59.000 | but it does have surprisingly, slightly lower-frequency words
01:52:02.200 | than other kinds of control texts,
01:52:04.520 | even sort of academic texts.
01:52:06.200 | Legalese is even worse.
01:52:07.720 | It is the worst that we were able to find.
01:52:09.880 | - This is fascinating.
01:52:10.680 | You just revealed a game that lawyers are playing.
01:52:12.840 | - They're not, though.
01:52:13.400 | - They're optimizing a different, well...
01:52:15.080 | - You know, it's interesting.
01:52:15.880 | That's a, now you're getting at why.
01:52:17.720 | And so, and I don't think,
01:52:18.920 | so now you're saying it's, they're doing it intentionally.
01:52:20.760 | I don't think they're doing it intentionally.
01:52:21.960 | But let's, let's, let's get to this.
01:52:23.480 | - It's an emergent phenomena, okay.
01:52:24.920 | - Yeah, yeah, yeah, we'll get to that.
01:52:26.200 | We'll get to that.
01:52:26.920 | And so, but we wanted to see why.
01:52:28.760 | So we wanted to see what it was first,
01:52:30.520 | 'cause it turns out that we're not the first
01:52:32.840 | to observe that legalese is weird.
01:52:34.200 | Like, back to, Nixon had a plain language act in 1970,
01:52:40.040 | and Obama had one.
01:52:41.480 | And boy, a lot of these, you know,
01:52:44.280 | a lot of presidents have said,
01:52:45.640 | "Oh, we've got to simplify legal language, must simplify it."
01:52:48.520 | But if you don't know how it's complicated,
01:52:50.760 | it's not easy to simplify it.
01:52:52.040 | You need to know what it is you're supposed to do
01:52:53.960 | before you can fix it, right?
01:52:55.480 | And so you need to like, you need a psycholinguist
01:52:58.120 | to analyze the text and see what's wrong with it
01:53:00.680 | before you can like, fix it.
01:53:02.040 | You don't know how to fix it.
01:53:02.840 | How am I supposed to fix something?
01:53:03.960 | I don't know what's wrong with it.
01:53:05.400 | And so that's what we did.
01:53:07.080 | We figured out, well, that's okay.
01:53:08.280 | We just took a bunch of contracts,
01:53:09.880 | and we encoded them for a bunch of features.
01:53:14.600 | And so one of those features
01:53:15.800 | was center embedding.
01:53:17.240 | And so that is like, basically how often a clause
01:53:23.240 | would intervene between a subject and a verb.
01:53:26.200 | For example, that's one kind of a center embedding
01:53:28.360 | of a clause, okay?
01:53:29.480 | And it turns out they're massively center-embedded.
01:53:32.440 | Like, so I think in random contracts and in random laws,
01:53:35.720 | I think you get about 70 or 80%, something like 70%
01:53:39.480 | of sentences have a center-embedded clause in them,
01:53:41.480 | which is insanely high.
01:53:43.400 | If you go to any other text, it's down to 20% or something.
01:53:46.680 | It's so much higher than any control you can think of,
01:53:50.280 | including you think, oh, people think,
01:53:51.880 | oh, technical, academic texts.
01:53:54.280 | No, people don't write center-embedded sentences
01:53:56.600 | in technical, academic texts.
01:53:58.120 | I mean, they do a little bit, but much less;
01:53:59.720 | it's in the 20 to 30% realm, as opposed to 70.
01:54:03.080 | And so there's that, and there's low-frequency words.
01:54:05.720 | And then people, oh, maybe it's passive.
01:54:07.800 | People don't like the passive.
01:54:09.000 | Passive, for some reason, the passive voice in English
01:54:11.480 | has a bad rap, and I'm not really sure
01:54:13.160 | where that comes from.
01:54:14.200 | And there is a lot of passive in the,
01:54:19.160 | there's much more passive voice in legalese
01:54:22.600 | than there is in other texts. - And the passive voice
01:54:24.120 | accounts for some of the low-frequency words.
01:54:25.720 | - No, no, no, no, those are separate, those are separate.
01:54:28.040 | - Oh, so passive voice sucks, low-frequency word sucks.
01:54:30.920 | - Well, sucks is different.
01:54:31.960 | - That's a judgment on passive.
01:54:33.480 | - Yeah, yeah, yeah, drop the judgment.
01:54:35.080 | It's just like, these are frequent.
01:54:36.120 | These are things which happen in legalese text.
01:54:38.200 | Then we can ask.
01:54:38.920 | The dependent measure is how well you understand
01:54:42.760 | those things with those features, okay?
01:54:44.840 | And so then, and it turns out the passive
01:54:46.360 | makes no difference.
01:54:47.240 | So it has zero effect on your comprehension ability,
01:54:50.200 | on your recall ability, nothing at all.
01:54:52.360 | It has no effect.
01:54:53.240 | The words matter a little bit.
01:54:55.880 | They do, and low-frequency words are gonna hurt you
01:54:57.960 | in recall and understanding.
01:54:59.480 | But what really hurts is the center embedding.
01:55:02.600 | That kills you.
01:55:03.560 | That is like, that slows people down.
01:55:05.640 | That makes them very poor at understanding.
01:55:08.920 | That makes them, they can't recall what was said
01:55:11.800 | as well, nearly as well.
01:55:12.760 | And we did this not only on laypeople.
01:55:14.680 | We did it on a lot of laypeople.
01:55:16.040 | We ran it on 100 lawyers.
01:55:17.560 | We recruited lawyers from a wide range
01:55:20.840 | of sort of different levels of law firms and stuff.
01:55:25.800 | And they have the same pattern.
01:55:27.880 | So they also, like when they did this,
01:55:31.160 | I did not know what happened.
01:55:32.040 | I thought maybe they could process.
01:55:33.720 | They're used to legalese.
01:55:34.840 | They didn't process it just as well as it was normal.
01:55:37.320 | No, no, they're much better than laypeople.
01:55:41.240 | So they can much better recall, much better understanding,
01:55:44.200 | but they have the same main effects
01:55:45.720 | as laypeople, exactly the same.
01:55:48.280 | So they also much prefer the non-center-embedded versions.
01:55:51.400 | We constructed non-center-embedded versions
01:55:53.880 | of each of these.
01:55:54.600 | We constructed versions which have higher frequency words
01:55:58.680 | in those places.
01:55:59.480 | And we did, we un-passivized.
01:56:02.360 | We turned them into active versions.
01:56:04.200 | The passive active made no difference.
01:56:06.360 | The words made little difference.
01:56:08.280 | And the un-center embedding makes big differences
01:56:11.240 | in all the populations.
01:56:12.280 | - Un-center embedding.
01:56:13.640 | How hard is that process, by the way?
01:56:14.920 | - It's not very hard.
01:56:15.240 | - For societies, don't question.
01:56:16.280 | But how hard is it to detect center embedding?
01:56:18.760 | - Oh, easy, easy to detect.
01:56:20.280 | - You're just looking at long dependencies?
01:56:21.960 | - Yeah, yeah, you can just, you can.
01:56:23.720 | So there's automatic parsers for English,
01:56:25.640 | which are pretty good.
01:56:26.440 | - And they can detect center embedding?
01:56:28.040 | - Oh, yeah, very.
01:56:28.760 | - Or, I guess, nesting.
01:56:30.040 | - Perfectly, yeah, pretty much.
01:56:32.200 | - So you're not just looking for long dependencies.
01:56:34.280 | You're just literally looking for center embedding.
01:56:35.880 | - Yeah, yeah, we are in this case, in these cases.
01:56:37.480 | But long dependencies, they're highly correlated.
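Here is a crude illustration of the kind of check an automatic parser makes easy: flag a sentence when a clausal dependent sits between a subject and its verb. The dependency labels and the rule are simplifications chosen for the sketch, not the coding scheme from the legalese study, and the example sentences are shortened paraphrases of the contract language quoted below.

```python
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
CLAUSE_DEPS = {"relcl", "acl", "advcl", "ccomp", "xcomp", "csubj"}

def has_center_embedded_clause(sentence):
    doc = nlp(sentence)
    for tok in doc:
        if tok.dep_ in ("nsubj", "nsubjpass"):
            verb = tok.head
            lo, hi = sorted((tok.i, verb.i))
            # any clausal dependent rooted strictly between subject and verb?
            if any(t.dep_ in CLAUSE_DEPS for t in doc[lo + 1:hi]):
                return True
    return False

print(has_center_embedded_clause("The payments shall be reduced."))  # False
print(has_center_embedded_clause(
    "The payments, which include the benefits under Section 3A, shall be reduced."  # True
))
```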
01:56:39.800 | - So like a center embedding is a big bomb
01:56:43.160 | you throw inside of a sentence that just blows up the,
01:56:46.040 | that makes, super.
01:56:47.000 | - Yeah, yeah.
01:56:47.560 | Can I read a sentence for you from these things?
01:56:49.560 | - Sure.
01:56:49.560 | - I mean, this is just like one of the things that,
01:56:52.280 | this is just terrible.
01:56:52.840 | - My eyes might glaze over in mid-sentence.
01:56:55.480 | No, I understand that.
01:56:58.040 | I mean, legalese is hard.
01:56:59.800 | - So here we go.
01:57:00.040 | This is a good one.
01:57:00.600 | It goes, "In the event that any payment or benefit
01:57:02.920 | by the company, all such payments and benefits,
01:57:05.240 | including the payments and benefits
01:57:06.440 | under Section 3A hereof, being hereinafter
01:57:09.400 | referred to as a total payment,
01:57:10.920 | would be subject to the excise tax,
01:57:13.160 | then the cash severance payments shall be reduced."
01:57:15.320 | So that's something we pulled from a regular text,
01:57:17.800 | from a contract.
01:57:18.760 | - Wow.
01:57:19.400 | - And the center embedded bit there is just,
01:57:21.480 | for some reason, there's a definition.
01:57:23.400 | They throw the definition of what payments and benefits are
01:57:28.120 | in between the subject and the verb.
01:57:29.720 | Let's, how about don't do that?
01:57:31.240 | - Yeah.
01:57:31.560 | - How about put the definition somewhere else,
01:57:33.560 | as opposed to in the middle of the sentence?
01:57:35.880 | And so that's very, very common, by the way.
01:57:38.360 | That's what happens.
01:57:39.480 | You just throw your definitions.
01:57:41.240 | You use a word, a couple of words,
01:57:43.160 | and then you define it.
01:57:44.440 | And then you continue the sentence.
01:57:46.360 | Like, just don't write like that.
01:57:47.800 | And you ask, so then we asked lawyers.
01:57:49.400 | We thought, "Oh, maybe lawyers like this."
01:57:51.080 | Lawyers don't like this.
01:57:51.960 | They don't like this.
01:57:53.800 | They don't wanna write like this.
01:57:55.480 | We asked them to rate materials
01:57:58.440 | which have the same meaning,
01:57:59.880 | un-center-embedded and center-embedded,
01:58:02.440 | and they much preferred the un-center-embedded versions.
01:58:05.000 | - On the comprehension, on the reading side.
01:58:06.920 | - Yeah, and we asked them, "Would you hire someone
01:58:10.040 | who writes like this or this?"
01:58:11.000 | We asked them all kinds of questions,
01:58:12.360 | and they always preferred the less complicated version,
01:58:15.560 | all of them.
01:58:16.120 | So I don't even think they want it this way.
01:58:18.120 | - Yeah, but how did it happen?
01:58:19.480 | - How did it happen?
01:58:20.040 | That's a very good question.
01:58:21.160 | And the answer is, we still don't know.
01:58:23.400 | But-- - I have some theories.
01:58:26.040 | - Well, our best theory at the moment is
01:58:28.600 | that there's actually some kind of a performative meaning
01:58:33.080 | in the center embedding, in the style,
01:58:35.400 | which tells you it's legalese.
01:58:37.000 | We think that that's the kind of a style
01:58:38.520 | which tells you it's legalese.
01:58:40.120 | Like, that's a reasonable guess, and maybe it's just...
01:58:43.560 | So, for instance, if you're...
01:58:45.080 | Like, it's like a magic spell.
01:58:47.480 | So we kind of call this the magic spell hypothesis.
01:58:49.880 | So when you tell someone to put a magic spell on someone,
01:58:53.240 | what do you do?
01:58:53.740 | You know, people know what a magic spell is,
01:58:56.680 | and they do a lot of rhyming.
01:58:58.440 | You know, that's kind of what people will tend to do.
01:59:00.440 | They'll do rhyming, and they'll do sort of like
01:59:02.280 | some kind of poetry kind of thing.
01:59:03.560 | - Abracadabra type of thing.
01:59:04.760 | - Exactly, yeah.
01:59:05.320 | And maybe there's a syntactic sort of reflex here
01:59:09.960 | of a magic spell, which is center embedding.
01:59:12.760 | And so that's like, oh, it's trying to tell you
01:59:15.160 | this is something which is true,
01:59:17.560 | which is what the goal of law is, right?
01:59:19.960 | It's telling you something that we want you to believe
01:59:22.840 | as certainly true, right?
01:59:24.280 | That's what legal contracts are trying to enforce on you.
01:59:27.080 | And so maybe that's like a form which has...
01:59:31.400 | This is like a very abstract form, center embedding,
01:59:34.040 | which has a meaning associated with it.
01:59:36.600 | - Well, don't you think there's an incentive
01:59:39.720 | for lawyers to generate things that are hard to understand?
01:59:44.840 | - That was one of our working hypotheses.
01:59:46.760 | We just couldn't find any evidence of that.
01:59:48.760 | - No, lawyers also don't understand it.
01:59:50.600 | - But we asked lawyers.
01:59:50.920 | - But you're creating space.
01:59:52.360 | I mean, you ask in a communist Soviet Union,
01:59:58.360 | the individual members, their self-report
02:00:01.480 | is not going to correctly reflect
02:00:05.080 | what is broken about the gigantic bureaucracy
02:00:07.320 | that leads to Chernobyl or something like this.
02:00:09.240 | I think the incentives under which you operate
02:00:14.760 | are not always transparent to the members
02:00:17.960 | within that system.
02:00:19.320 | So it just feels like a strange coincidence
02:00:22.440 | that there is benefit if you just zoom out,
02:00:26.760 | look at the system, as opposed to asking individual lawyers
02:00:29.480 | that making something hard to understand
02:00:31.560 | is going to make a lot of people money.
02:00:34.200 | - Yeah.
02:00:35.160 | - Like you're gonna need a lawyer to figure that out,
02:00:40.120 | I guess, from the perspective of the individual.
02:00:42.360 | But then that could be the performative aspect.
02:00:44.360 | It could be as opposed to the incentive-driven
02:00:46.520 | to be complicated.
02:00:47.320 | It could be performative to where we lawyers
02:00:49.720 | speak in this sophisticated way
02:00:52.440 | and you regular humans don't understand it,
02:00:54.520 | so you need to hire a lawyer.
02:00:56.040 | Yeah, I don't know which one it is,
02:00:57.080 | but it's suspicious.
02:00:58.120 | Suspicious that it's hard to understand
02:01:01.800 | and that everybody's eyes glaze over
02:01:03.480 | and they don't read.
02:01:04.520 | - I'm suspicious as well.
02:01:05.960 | I'm still suspicious.
02:01:07.560 | And I hear what you're saying.
02:01:08.680 | It could be no individual,
02:01:10.680 | and even average of individuals,
02:01:12.040 | it could just be a few bad apples in a way
02:01:14.440 | which are driving the effect in some way.
02:01:16.760 | - Influential bad apples that everybody looks up to,
02:01:21.400 | or whatever, they're like central figures in how--
02:01:25.240 | - But it is kind of interesting
02:01:27.720 | that among our hundred lawyers,
02:01:29.800 | they did not share that.
02:01:31.080 | - They didn't want this.
02:01:32.520 | They really didn't like it.
02:01:33.640 | - And they weren't better
02:01:35.720 | than regular people at comprehending it,
02:01:37.800 | or they were on average better,
02:01:39.880 | but they had the same difference.
02:01:41.000 | - They had the same difference.
02:01:41.880 | - Exact same difference.
02:01:42.600 | But they wanted it fixed.
02:01:45.160 | And so that gave us hope
02:01:49.240 | that because it actually isn't very hard
02:01:51.400 | to construct a material
02:01:53.800 | which is un-center-embedded
02:01:55.480 | and has the same meaning,
02:01:56.680 | it's not very hard to do.
02:01:57.720 | Just basically in that situation,
02:01:58.920 | just putting definitions
02:01:59.880 | outside of the subject-verb relation
02:02:01.480 | in that particular example,
02:02:02.360 | and that's pretty general.
02:02:04.840 | What they're doing is just throwing stuff in there
02:02:06.840 | which you didn't have to put in there.
02:02:08.200 | There's extra words involved.
02:02:10.520 | Typically, you may need a few extra words
02:02:12.680 | to refer to the things
02:02:14.520 | that you're defining outside in some way,
02:02:16.280 | 'cause if you only use it in that one sentence,
02:02:19.160 | then there's no reason to introduce extra terms.
02:02:23.720 | So we might have a few more words,
02:02:25.400 | but it'll be easier to understand.
02:02:27.080 | So I mean, I have hope
02:02:29.880 | now that maybe we can make legalese
02:02:32.520 | less convoluted in this way.
02:02:35.080 | - So maybe the next president of the United States
02:02:37.320 | can, instead of saying generic things,
02:02:39.160 | say, "I ban center embeddings,"
02:02:43.480 | and make Ted the language czar of the United States.
02:02:47.000 | - Or he can make Eric.
02:02:47.800 | Martinez is the guy you should really put in there.
02:02:50.440 | - Eric Martinez, yeah, yeah, yeah.
02:02:51.400 | (laughing)
02:02:52.440 | - I mean, yeah.
02:02:53.160 | - But center embeddings are the bad thing to have.
02:02:56.280 | - That's right.
02:02:57.160 | - So if you get rid of that--
02:02:58.360 | - That'll do a lot of it.
02:02:59.320 | - That'll fix a lot.
02:03:00.120 | - Oh, that's fascinating.
02:03:01.080 | - Yeah. - That is so fascinating.
02:03:02.280 | - Yeah.
02:03:02.780 | - And it is really fascinating on many fronts
02:03:05.960 | that humans are just not able to deal
02:03:07.480 | with this kind of thing.
02:03:08.440 | And that language, because of that,
02:03:10.120 | evolved the way it did.
02:03:11.160 | It's fascinating.
02:03:12.040 | So one of the mathematical formulations you have
02:03:15.560 | when talking about language as communication
02:03:18.440 | is this idea of noisy channels.
02:03:20.760 | What's a noisy channel?
02:03:23.160 | - Well, so that's about communication.
02:03:26.360 | And so this is going back to Shannon.
02:03:28.360 | So Shannon, Claude Shannon was a student at MIT
02:03:33.080 | in the '40s.
02:03:33.800 | And so he wrote this very influential piece of work
02:03:37.320 | about communication theory or information theory.
02:03:40.120 | And he was interested in human language, actually.
02:03:43.560 | He was interested in this problem of communication,
02:03:46.600 | of getting a message from my head to your head.
02:03:51.080 | And he was concerned or interested
02:03:54.440 | in what was a robust way to do that.
02:03:59.080 | And so assuming we both speak the same language,
02:04:01.640 | we both already speak English,
02:04:03.480 | whatever the language is, we speak that,
02:04:06.040 | what is a way that I can say the language
02:04:10.280 | so that it's most likely to get the signal
02:04:12.360 | that I want to you?
02:04:14.200 | And then the problem there in the communication
02:04:17.720 | is the noisy channel,
02:04:18.760 | is that there's a lot of noise in the system.
02:04:23.080 | I don't speak perfectly, I make errors, that's noise.
02:04:26.520 | There's background noise, you know that.
02:04:29.560 | - Like a literal--
02:04:30.840 | - Literal background noise.
02:04:31.960 | There is like white noise in the background
02:04:33.560 | or some other kind of noise.
02:04:34.520 | There's some speaking going on that you're at a party,
02:04:38.440 | that's background noise.
02:04:39.320 | You're trying to hear someone, it's hard to understand them
02:04:41.400 | because there's all this other stuff going on
02:04:42.760 | in the background.
02:04:43.320 | And then there's noise on the receiver side
02:04:48.520 | so that you have some problem maybe understanding me
02:04:52.040 | for stuff that's just internal to you in some way.
02:04:54.200 | So you've got some other problems, whatever,
02:04:56.840 | with understanding for whatever reasons.
02:04:59.080 | Maybe you've had too much to drink.
02:05:01.240 | You know, who knows why you're not able to pay attention
02:05:03.800 | to the signal.
02:05:04.360 | So that's the noisy channel.
02:05:05.800 | And so that language, if it's a communication system,
02:05:08.520 | we are trying to optimize in some sense
02:05:12.440 | the passing of the message from one side to the other.
02:05:15.160 | And so, I mean, one idea is that maybe aspects of like
02:05:21.640 | word order, for example, might've optimized in some way
02:05:24.440 | to make language a little more easy to be passed
02:05:28.440 | from speaker to listener.
02:05:29.880 | And so Shannon's the guy that did this stuff
02:05:32.040 | way back in the '40s.
02:05:32.920 | You know, it's very interesting, historically,
02:05:34.920 | he was interested in working in linguistics.
02:05:37.080 | He was at MIT and he did, this was his master's thesis
02:05:40.280 | of all things.
02:05:40.840 | You know, it's crazy how much he did for his master's thesis
02:05:43.880 | in 1948, I think, or '49, something.
02:05:46.280 | And he wanted to keep working in language.
02:05:48.440 | And it just wasn't popular; communication as a reason,
02:05:53.960 | as a source for what language was, wasn't popular at the time.
02:05:56.760 | Chomsky was coming in; the field was moving in that direction.
02:05:58.760 | And he just wasn't able to get a handle there, I think.
02:06:01.880 | And so he moved to Bell Labs and worked on communication
02:06:05.880 | from a mathematical point of view and was, you know,
02:06:09.240 | did all kinds of amazing work.
02:06:11.560 | And so he's just--
02:06:12.120 | - More on the signal side versus like the language side.
02:06:14.520 | - Yeah.
02:06:14.920 | - Yeah, it would've been interesting to see
02:06:17.400 | if he pursued the language side.
02:06:18.680 | - Yeah.
02:06:19.000 | - That's really interesting.
02:06:19.800 | - Yeah, he was interested in that.
02:06:21.240 | His examples in the '40s are kind of like,
02:06:24.120 | they're very language-like things.
02:06:27.080 | - Yeah.
02:06:27.480 | - We can kind of show that there's a noisy channel process
02:06:31.160 | going on in when you're listening to me, you know,
02:06:34.520 | you can often sort of guess what I meant by what I, you know,
02:06:37.640 | what you think I meant, given what I said.
02:06:39.560 | And I mean, with respect to sort of why language
02:06:43.240 | looks the way it does, we might, there might be sort of,
02:06:45.480 | as I alluded to, there might be ways in which word order
02:06:49.000 | is somewhat optimized for, because of the noisy channel
02:06:52.520 | in some way.
02:06:53.000 | - I mean, that's really cool to sort of model
02:06:55.480 | if you don't hear certain parts of a sentence
02:06:57.800 | or have some probability of missing that part,
02:07:00.920 | like how do you construct a language
02:07:02.200 | that's resilient to that?
02:07:03.320 | That's somewhat robust to that.
02:07:04.680 | - Yeah, that's the idea.
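To make the noisy-channel idea concrete, here is a toy sketch of the Bayesian version of that recovery: the listener scores each candidate intended sentence s by prior times likelihood, P(s) * P(perceived | s), and picks the best. The candidate sentences, the prior values, and the one-word-deletion noise model are all invented for illustration; they are not from the studies discussed.

```python
def deletions(words):
    # outputs the channel could produce from `words` with at most one word dropped
    yield tuple(words)
    for i in range(len(words)):
        yield tuple(words[:i] + words[i + 1:])

def likelihood(perceived, intended, p_del=0.1):
    # P(perceived | intended) under a crude model: each word independently lost
    # with probability p_del, allowing at most one loss
    if tuple(perceived) == tuple(intended):
        return (1 - p_del) ** len(intended)
    if tuple(perceived) in set(deletions(intended)) - {tuple(intended)}:
        return p_del * (1 - p_del) ** (len(intended) - 1)
    return 0.0

# tiny invented "grammar": candidate intended sentences with prior probabilities
prior = {
    ("the", "dog", "chased", "the", "cat"): 0.70,
    ("the", "dog", "the", "cat"): 0.02,
    ("the", "cat", "chased", "the", "dog"): 0.28,
}

perceived = ("the", "dog", "the", "cat")  # what survived the noisy channel
scores = {s: p * likelihood(perceived, s) for s, p in prior.items()}  # unnormalized posterior
best = max(scores, key=scores.get)
print(best)  # -> ('the', 'dog', 'chased', 'the', 'cat'): the plausible reading wins
```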
02:07:05.720 | - And then you're kind of saying like the word order
02:07:07.880 | and the syntax of language, the dependency length
02:07:11.080 | are all helpful.
02:07:14.200 | - Yeah, well, the dependency length is really about memory.
02:07:17.080 | I think that's like about sort of what's easier or harder
02:07:19.640 | to produce in some way.
02:07:20.920 | And these other ideas are about sort of robustness
02:07:23.560 | to communication.
02:07:24.440 | So the problem of potential loss of signal due to noise.
02:07:28.840 | And so there may be aspects of word order
02:07:31.640 | which are somewhat optimized for that.
02:07:33.480 | And, you know, we have this one guess in that direction.
02:07:36.360 | These are kind of just so stories.
02:07:38.040 | I have to be, you know, pretty frank.
02:07:39.240 | They're not like, I can't show this is true.
02:07:41.480 | All we can do is like, look at the current languages
02:07:43.320 | of the world.
02:07:44.120 | This is like, we can't sort of see how languages change
02:07:46.520 | or anything because we've got these snapshots of a few,
02:07:49.320 | you know, a hundred or a few thousand languages.
02:07:51.480 | We don't really, we can't do the right kinds
02:07:54.920 | of modifications to test these things experimentally.
02:07:57.560 | And so, you know, so just take this with a grain of salt.
02:08:00.520 | Okay, from here, this stuff.
02:08:01.720 | The dependency stuff I can, I'm much more solid on.
02:08:04.840 | And like, here's what the lengths are.
02:08:06.280 | And here's what's hard.
02:08:07.400 | Here's what's easy.
02:08:08.040 | And this is a reasonable structure.
02:08:09.400 | I think I'm pretty reasonable.
02:08:10.600 | Here's like, why, you know, why does a word order
02:08:13.320 | look the way it does?
02:08:14.040 | We're now into shaky territory, but it's kind of cool.
02:08:17.000 | - But we're talking about, just to be clear,
02:08:19.400 | we're talking about maybe just actually
02:08:21.240 | the sounds of communication.
02:08:22.360 | Like you and I are sitting in a bar.
02:08:24.120 | It's very loud.
02:08:25.320 | And you model with a noisy channel, the loudness,
02:08:30.360 | the noise, and we have the signal that's coming across.
02:08:33.480 | And you're saying word order might have something
02:08:36.200 | to do with optimizing that, where there's presence of noise.
02:08:39.560 | - Yeah, yeah.
02:08:40.680 | - I mean, it's really interesting.
02:08:41.480 | I mean, to me, it's interesting how much you can load
02:08:43.400 | into the noisy channel.
02:08:44.360 | Like how much can you bake in?
02:08:45.640 | You said like, you know, cognitive load
02:08:48.440 | on the receiver end.
02:08:49.320 | - We think that those are, there's three,
02:08:51.480 | at least three different kinds of things going on there.
02:08:53.960 | And we probably don't want to treat them all as the same.
02:08:56.040 | - Sure.
02:08:56.440 | - And so I think that you, you know, the right model,
02:08:58.760 | a better model of a noisy channel would treat,
02:09:00.840 | would have three different sources of noise,
02:09:03.240 | which are background noise, you know,
02:09:06.360 | speaker inherent noise, and listener inherent noise.
02:09:10.040 | And those are not, those are all different things.
02:09:11.640 | - Sure.
02:09:11.960 | But then underneath it, there's a million other subsets.
02:09:15.080 | - Oh yeah, that's true.
02:09:16.680 | - On the receiving end, I mean,
02:09:18.120 | I just mentioned cognitive load on both sides.
02:09:21.000 | Then there's like speech impediments,
02:09:24.200 | or just everything.
02:09:25.640 | World view, I mean, on the meaning,
02:09:27.960 | we start to creep into the meaning realm of like,
02:09:30.440 | we have different worldviews.
02:09:31.640 | - Well, how about just form still though?
02:09:33.160 | Like just what language you know.
02:09:34.840 | Like, so how well you know the language.
02:09:36.760 | And so if it's second language for you versus first language,
02:09:40.200 | and how, maybe what other languages you know,
02:09:42.360 | these are still just form stuff.
02:09:44.200 | And that's like potentially very informative.
02:09:46.760 | And, you know, how old you are.
02:09:48.360 | These things probably matter, right?
02:09:49.800 | So like a child learning a language has, you know,
02:09:53.000 | a noisy representation of English grammar,
02:09:56.280 | you know, depending on how old they are.
02:09:58.520 | So maybe when they're six, they're perfectly formed.
02:10:02.040 | But-- - You mentioned one of the things
02:10:04.200 | is like a way to measure a language is learning problems.
02:10:08.280 | So like, what's the correlation between everything
02:10:11.640 | we've been talking about
02:10:12.520 | and how easy it is to learn a language?
02:10:14.600 | So is like short dependencies correlated
02:10:20.200 | to ability to learn a language?
02:10:22.200 | Is there some kind of, or like the dependency grammar,
02:10:24.840 | is there some kind of connection there?
02:10:28.040 | How easy it is to learn?
02:10:30.280 | - Yeah, well, of all the languages in the world,
02:10:33.160 | none that we know of right now is any better than any other
02:10:36.600 | with respect to sort of optimizing dependency links,
02:10:39.080 | for example. They all kind of do it well.
02:10:41.240 | They all keep them low.
02:10:42.200 | So I think of every human language
02:10:45.320 | as some kind of an opposite,
02:10:46.600 | sort of an optimization problem,
02:10:48.040 | a complex optimization problem
02:10:50.440 | to this communication problem.
02:10:51.960 | And so they've like, they've solved it.
02:10:53.880 | You know, they're just sort of noisy solutions
02:10:56.120 | to this problem of communication.
02:10:57.800 | There's just so many ways you can do this.
02:11:00.040 | - So they're not optimized for learning.
02:11:01.880 | They're probably optimized for communication.
02:11:03.400 | - Oh, and learning.
02:11:04.840 | So yes, one of the factors which is,
02:11:07.160 | yeah, so learning is messing this up a bit.
02:11:09.320 | And so, for example,
02:11:11.240 | if it were just about minimizing dependency links
02:11:14.280 | and that was all that matters,
02:11:15.720 | you know, then we, you know,
02:11:17.000 | so then we might find grammars
02:11:19.720 | which didn't have regularity in their rules,
02:11:22.600 | but languages always have regularity in their rules.
02:11:25.800 | So what I mean by that is that
02:11:27.400 | if I wanted to say something to you
02:11:29.240 | in the optimal way to say it was,
02:11:30.840 | what really mattered to me,
02:11:32.360 | all that mattered was keeping the dependencies
02:11:34.360 | as close together as possible.
02:11:36.520 | Then I would have a very lax set
02:12:38.680 | of phrase structure or dependency rules.
02:11:40.680 | I wouldn't have very many of those.
02:11:42.040 | I would have very little of that.
02:11:43.320 | And I would just put the words as close,
02:11:45.720 | the things that refer to the things
02:11:46.920 | that are connected right beside each other.
02:11:49.080 | But we don't do that.
02:11:50.040 | Like there are word order rules, right?
02:11:52.520 | So they're very, and depending on the language,
02:11:54.440 | they're more and less strict, right?
02:11:56.040 | So you speak Russian, they're less strict than English.
02:11:58.520 | English has very rigid word order rules.
02:12:00.840 | We order things in a very particular way.
02:12:03.160 | And so why do we do that?
02:12:05.320 | Like that's probably not about communication.
02:12:08.760 | That's probably about learning.
02:12:09.800 | I mean, then we're talking about learning.
02:12:11.080 | It's probably easier to learn regular things,
02:12:14.440 | things which are very predictable and easy to,
02:12:16.760 | so that's probably about learning is our guess,
02:12:19.480 | 'cause that can't be about communication.
02:12:20.840 | - Can it be just noise?
02:12:21.720 | Can it be just the messiness
02:12:24.760 | of the development of a language?
02:12:26.120 | - Well, if it were just a communication,
02:12:27.720 | then we should have languages
02:12:29.160 | which have very, very free word order.
02:12:31.080 | And we don't have that.
02:12:31.800 | We have freer, but not free, like there's always--
02:12:35.080 | - Well, no, but what I mean by noise
02:12:37.720 | is like cultural, like sticky cultural things,
02:12:40.600 | like the way you communicate,
02:12:42.760 | just there's a stickiness to it,
02:12:44.760 | that it's an imperfect, it's a noisy, stochastic optimization.
02:12:49.160 | The function over which you're optimizing is very noisy.
02:12:54.040 | So, because I don't, it feels weird to say
02:12:57.720 | that learning is part of the objective function,
02:13:00.120 | 'cause some languages are way harder to learn than others,
02:13:02.680 | right, or is that, that's not true?
02:13:05.400 | - That's not true. - That's interesting.
02:13:06.680 | - I mean-- - I mean, that's
02:13:07.320 | the public sort of perception, right?
02:13:08.760 | - Yes, that's true for a second language.
02:13:11.640 | - For a second language.
02:13:12.520 | - But that depends on what you started with, right?
02:13:14.920 | So, it really depends on how close that second language
02:13:17.880 | is to the first language you've got.
02:13:19.080 | And so, yes, it's very, very hard to learn Arabic
02:13:22.360 | if you've started with English,
02:13:23.800 | or it's hard to learn Japanese
02:13:26.280 | or Chinese if you've started with English. Chinese, I think, is the worst.
02:13:29.000 | The Defense Language Institute in the United States
02:13:32.840 | has a list of how hard it is
02:13:36.200 | to learn each language from English,
02:13:37.800 | and I think Chinese is the worst.
02:13:38.640 | - But that's just a second language.
02:13:40.280 | You're saying babies don't care.
02:13:41.400 | - No, no, there's no evidence
02:13:43.400 | that there's anything harder or easier,
02:13:44.920 | but any baby learns any language;
02:13:46.840 | like by three or four, they speak that language.
02:13:49.320 | And so, there's no evidence of anything harder or easier
02:13:52.280 | about any human language.
02:13:53.160 | They're all kind of equal.
02:13:53.960 | - So, to what degree is language,
02:13:56.440 | this is returning to Chomsky a little bit, is innate?
02:14:00.600 | You said that for Chomsky, he used the idea
02:14:04.200 | that some aspects of language are innate
02:14:06.440 | to explain away certain things that are observed.
02:14:09.000 | But how much are we born with language
02:14:12.920 | at the core of our mind, brain?
02:14:15.160 | - I mean, the answer is I don't know, of course.
02:14:19.400 | But I mean, I like to, I'm an engineer at heart, I guess,
02:14:24.360 | and I sort of think it's fine to postulate
02:14:27.560 | that a lot of it's learned.
02:14:28.600 | And so, I'm guessing that a lot of it's learned.
02:14:31.240 | So, I think the reason Chomsky went with the innateness
02:14:34.120 | is because he hypothesized movement in his grammar.
02:14:40.120 | He was interested in grammar, and movement's hard to learn.
02:14:42.680 | I think he's right.
02:14:43.400 | Movement is a hard, it's a hard thing to learn,
02:14:45.320 | to learn these two things together and how they interact.
02:14:47.720 | And there's a lot of ways in which you might generate
02:14:50.280 | exactly the same sentences, and it's really hard.
02:14:52.280 | And so, he's like, oh, I guess it's learned.
02:14:54.840 | Sorry, so I guess it's not learned, it's innate.
02:14:56.600 | And if you just throw out the movement
02:14:59.160 | and just think about that in a different way,
02:15:00.680 | then you get some messiness.
02:15:04.280 | But the messiness is human language,
02:15:07.160 | which it actually fits better.
02:15:09.320 | That messiness isn't a problem.
02:15:11.400 | It's actually, it's a valuable asset of the theory.
02:15:17.800 | And so, I think I don't really see a reason
02:15:21.000 | to postulate much innate structure.
02:15:23.640 | And that's kind of why I think these large language models
02:15:25.640 | are learning so well, is because I think you can learn
02:15:28.520 | the form, the forms of human language from the input.
02:15:32.200 | I think that's like, it's likely to be true.
02:15:34.040 | - So, that part of the brain that lights up
02:15:35.880 | when you're doing all the comprehension,
02:15:37.160 | that could be learned.
02:15:37.880 | That could be just, you don't need, you don't need any.
02:15:40.280 | - Yeah, it doesn't have to be innate.
02:15:41.160 | So, like lots of stuff is modular
02:15:44.840 | in the brain that's learned.
02:15:46.680 | It doesn't have to, so there's something called
02:15:49.480 | the visual word form area in the back.
02:15:51.400 | And so, it's in the back of your head,
02:15:52.760 | near the visual cortex, okay?
02:15:56.040 | And that is very specialized language,
02:15:59.640 | sorry, very specialized brain area,
02:16:01.160 | which does visual word processing if you read,
02:16:05.720 | if you're a reader, okay?
02:16:06.600 | If you don't read, you don't have it, okay?
02:16:08.520 | Guess what?
02:16:08.920 | You spend some time learning to read
02:16:10.600 | and you develop that brain area,
02:16:12.520 | which does exactly that.
02:16:13.720 | And so, the modularization is not evidence for innateness.
02:16:17.480 | So, the modularization of a language area
02:16:19.400 | doesn't mean we're born with it.
02:16:21.240 | We could have easily learned that.
02:16:23.080 | We might have been born with it.
02:16:25.000 | We just don't know at this point.
02:16:27.160 | We might very well have been born
02:16:28.680 | with this left-lateralized area.
02:16:30.760 | I mean, there's like a lot
02:16:31.880 | of other interesting components here,
02:16:33.960 | features of this kind of argument.
02:16:36.280 | So, some people get a stroke
02:16:39.000 | or something goes really wrong on the left side,
02:16:41.320 | where the language area would be.
02:16:43.240 | And that isn't there.
02:16:45.560 | It's not available.
02:16:46.520 | And it develops just fine on the right.
02:16:48.120 | So, it's not about the left.
02:16:49.960 | It goes to the left.
02:16:52.360 | Like, this is a very interesting question.
02:16:53.720 | It's like, why are any of the brain areas
02:16:57.720 | the way that they are?
02:16:58.520 | And how did they come to be that way?
02:17:00.440 | And there's these natural experiments which happen,
02:17:04.280 | where people get these strange events
02:17:06.760 | in their brains at very young ages,
02:17:08.680 | which wipe out sections of their brain.
02:17:10.600 | And they behave totally normally,
02:17:13.400 | and no one knows anything was wrong.
02:17:14.760 | And we find out later,
02:17:15.880 | 'cause they happen to be accidentally scanned
02:17:17.720 | for some reason.
02:17:18.360 | It's like, what happened to your left hemisphere?
02:17:20.600 | It's missing.
02:17:21.480 | There's not many people who've missed
02:17:22.520 | their whole left hemisphere,
02:17:23.400 | but they'll be missing some other section
02:17:24.920 | of their left or their right.
02:17:26.280 | And they behave absolutely normally.
02:17:27.800 | We'd never know.
02:17:28.920 | So, that's like a very interesting current research.
02:17:32.440 | This is another project that this person,
02:17:35.320 | Ev Fedorenko, is working on.
02:17:36.520 | She's got all these people contacting her,
02:17:38.600 | because she's scanned some people
02:17:40.520 | who have been missing sections.
02:17:44.040 | One person missed a section of her brain
02:17:46.040 | and was scanned in her lab.
02:17:47.560 | And she happened to be a writer for the New York Times.
02:17:50.200 | And there was an article in the New York Times
02:17:51.720 | just about the scanning procedure
02:17:56.280 | and about what might be learned
02:17:58.920 | by sort of the general process of MRI and language.
02:18:02.200 | And that's her language.
02:18:04.040 | And because she's writing for the New York Times,
02:18:07.000 | all these people started writing to her,
02:18:08.440 | who also have similar kinds of deficits,
02:18:11.880 | because they've been accidentally scanned for some reason
02:18:16.360 | and found out they're missing some section.
02:18:19.160 | And they volunteer to be scanned.
02:18:21.960 | - So, these are natural experiments.
02:18:23.560 | - Natural experiments.
02:18:24.360 | They're kind of messy, but natural experiments.
02:18:26.440 | It's kind of cool.
02:18:26.920 | She calls them interesting brains.
02:18:29.000 | - The first few hours, days, months of human life
02:18:32.520 | are fascinating.
02:18:33.320 | It's like, well, inside the womb, actually,
02:18:35.880 | like that development.
02:18:36.920 | That machinery, whatever that is,
02:18:41.320 | seems to create powerful humans
02:18:44.040 | that are able to speak, comprehend, think,
02:18:46.040 | all that kind of stuff, no matter what happens.
02:18:47.960 | Not no matter what, but robust to the different ways
02:18:51.400 | that the brain might be damaged and so on.
02:18:56.280 | That's really interesting.
02:18:58.200 | But what would Chomsky say about the fact,
02:19:01.240 | the thing you're saying now,
02:19:02.760 | that language seems to be happening separate from thought?
02:19:08.440 | Because as far as I understand,
02:19:09.720 | maybe you can correct me,
02:19:10.600 | he thought that language underpins--
02:19:12.360 | - Yeah, he thinks so.
02:19:13.800 | I don't know what he'd say.
02:19:14.680 | - He would be surprised.
02:19:15.800 | Because for him, the idea is that language
02:19:18.200 | is sort of the foundation of thought.
02:19:20.680 | - That's right, absolutely.
02:19:22.600 | - And it's pretty mind-blowing to think
02:19:26.600 | that it could be completely separate from thought.
02:19:28.600 | - That's right.
02:19:29.400 | So he's basically a philosopher,
02:19:32.360 | philosopher of language, in a way,
02:19:33.560 | thinking about these things.
02:19:34.440 | It's a fine thought.
02:19:35.320 | You can't test it in his methods.
02:19:39.240 | You can't do a thought experiment to figure that out.
02:19:41.720 | You need a scanner, you need brain-damaged people,
02:19:44.760 | you need something, you need ways to measure that.
02:19:47.320 | And that's what fMRI offers.
02:19:49.400 | And patients are a little messier.
02:19:53.880 | fMRI is pretty unambiguous, I'd say.
02:19:56.600 | It's very unambiguous.
02:19:57.800 | There's no way to say that the language network
02:20:01.320 | is doing any of these tasks.
02:20:03.320 | You should look at those data.
02:20:05.400 | It's like there's no chance that you can say
02:20:07.160 | that those networks are overlapping.
02:20:09.400 | They're not overlapping.
02:20:10.120 | They're just completely different.
02:20:11.400 | So you can always say, "Oh, it's only two people,
02:20:16.200 | "it's four people," or something, for the patients.
02:20:18.920 | And there's something special about them we don't know.
02:20:20.760 | But these are just random people, and with lots of them.
02:20:24.680 | And you find always the same effects.
02:20:27.160 | And it's very robust, I'd say.
02:20:28.600 | - Well, it's a fascinating effect.
02:20:30.200 | You mentioned Bolivia.
02:20:33.000 | What's the connection between culture and language?
02:20:37.000 | You've also mentioned that much of our study of language
02:20:45.400 | comes from W-E-I-R-D, weird people,
02:20:49.880 | Western, educated, industrialized, rich, and democratic.
02:20:53.640 | So when you study remote cultures,
02:20:57.160 | such as around the Amazon jungle,
02:20:59.080 | what can you learn about language?
02:21:01.000 | - So that term "weird" is from Joe Henrich.
02:21:05.400 | He's at Harvard.
02:21:06.840 | He's a Harvard evolutionary biologist.
02:21:09.240 | And so he works on lots of different topics.
02:21:13.400 | And he basically was pushing that observation
02:21:17.000 | that we should be careful
02:21:18.360 | about the inferences we want to make
02:21:20.360 | when we're talking in psychology or sociology,
02:21:23.880 | yeah, mostly in psychology, I guess,
02:21:25.240 | about humans if we're talking about undergrads
02:21:30.280 | at MIT and Harvard.
02:21:31.320 | Those aren't the same, right?
02:21:33.160 | These aren't the same things.
02:21:34.120 | And so if you want to make inferences
02:21:35.640 | about language, for instance,
02:21:37.640 | there's a lot of other kinds of languages in the world
02:21:42.120 | than English and French and Chinese.
02:21:44.680 | And so maybe for language,
02:21:48.280 | we care about how culture,
02:21:50.120 | 'cause cultures can be very,
02:21:51.960 | I mean, of course, English and Chinese cultures
02:21:53.960 | are very different,
02:21:54.600 | but hunter-gatherers are much more different in some ways.
02:21:59.160 | And so if culture has an effect on what language is,
02:22:03.080 | then we kind of want to look there as well as looking.
02:22:06.760 | It's not like the industrialized cultures aren't interesting.
02:22:08.680 | Of course they are.
02:22:09.640 | But we want to look at non-industrialized cultures as well.
02:22:13.000 | And so I've worked with two.
02:22:14.520 | I've worked with the Chimane,
02:22:15.560 | who are in Bolivia, in the Amazon,
02:22:19.880 | both groups are in the Amazon in these cases.
02:22:21.400 | And there are so-called farmer-foragers,
02:22:24.040 | which is not hunter-gatherers.
02:22:25.480 | It's sort of one up from hunter-gatherers
02:22:28.440 | in that they do a little bit of farming as well,
02:22:30.360 | a lot of hunting as well,
02:22:31.960 | but a little bit of farming.
02:22:33.400 | And the kind of farming they do
02:22:34.520 | is the kind of farming that I might do
02:22:36.440 | if I ever were to grow like tomatoes
02:22:38.600 | or something in my backyard.
02:22:39.880 | So it's not like big field farming.
02:22:42.280 | It's just farming for a family,
02:22:44.600 | a few things, you do that.
02:22:45.800 | So that's the kind of farming they do.
02:22:47.400 | And the other group I've worked with
02:22:50.600 | are the Piraha, which are also in the Amazon
02:22:54.200 | and happen to be in Brazil.
02:22:56.040 | And that's with a guy called Dan Everett,
02:22:59.000 | who is a linguist anthropologist
02:23:02.440 | who actually lived and worked in the,
02:23:05.240 | I mean, he was a missionary actually initially
02:23:07.800 | back in the '70s,
02:23:09.560 | working with trying to translate languages
02:23:12.280 | so they could teach them the Bible,
02:23:13.880 | teach them Christianity.
02:23:14.760 | - What can you say about that?
02:23:16.280 | - Yeah, so the two groups I've worked with,
02:23:19.160 | the Chimane and the Piraha are both isolate languages,
02:23:22.440 | meaning there's no known connected languages at all.
02:23:26.280 | They're just like on their own.
02:23:27.320 | - Oh, cool.
02:23:27.560 | - Yeah, there's a lot of those.
02:23:28.760 | And most of the isolates occur in the Amazon
02:23:34.040 | or in Papua New Guinea.
02:23:35.240 | And these are places where the world
02:23:38.840 | has sort of stayed still for long enough,
02:23:42.520 | and so there aren't earthquakes.
02:23:45.560 | There aren't, well, certainly no earthquakes
02:23:49.240 | in the Amazon jungle.
02:23:50.520 | And the climate isn't bad,
02:23:53.720 | so you don't have droughts.
02:23:55.080 | And so in Africa, you've got a lot of moving of people
02:23:58.680 | because there's drought problems.
02:24:00.120 | And so they get a lot of language contact
02:24:01.960 | when you have, when people have to,
02:24:03.560 | if you've got to move because you've got no water,
02:24:07.240 | then you've got to get going.
02:24:08.600 | And then you run into contact with other tribes,
02:24:12.120 | other groups.
02:24:13.320 | In the Amazon, that's not the case.
02:24:15.000 | And so people can stay there for hundreds and hundreds
02:24:17.400 | and probably thousands of years, I guess.
02:24:19.000 | And so these groups, the Chimane and the Piraha,
02:24:22.280 | are both isolates in that.
02:24:23.720 | And they can just, I guess they've just lived there
02:24:25.480 | for ages and ages with minimal contact
02:24:28.440 | with other outside groups.
02:24:30.600 | And so, I mean, I'm interested in them because they are,
02:24:35.000 | I mean, in these cases, I'm interested in their words.
02:24:39.000 | I would love to study their syntax,
02:24:40.680 | their orders of words, but I'm mostly just interested
02:24:43.240 | in how languages are connected
02:24:45.800 | to their cultures in this way.
02:24:49.320 | And so with the Piraha, sort of most interesting,
02:24:51.560 | I was working on number there, number information.
02:24:54.760 | And so the basic idea is I think language is invented.
02:24:57.400 | That's what I get from the words here
02:24:58.760 | is that I think language is invented.
02:25:00.440 | We talked about color earlier.
02:25:01.800 | It's the same idea so that what you need to talk about
02:25:05.880 | with someone else is what you're gonna invent words for.
02:25:09.160 | And so we invent labels for colors that I need,
02:25:12.680 | not that I can see, but the things I need to tell you about
02:25:16.920 | so that I can get objects from you
02:25:18.440 | or get you to give me the right objects.
02:25:20.040 | And I just don't need a word for teal
02:25:21.800 | or a word for aquamarine in the Amazon jungle
02:25:26.920 | for the most part, because I don't have two things
02:25:28.760 | which differ on those colors.
02:25:30.280 | I just don't have that.
02:25:31.480 | And so numbers are really another fascinating source
02:25:35.400 | of information here where you might, naively,
02:25:39.640 | I certainly thought that all humans would have words
02:25:43.400 | for exact counting and the Piraha don't.
02:25:47.160 | Okay, so they don't have any words for even one.
02:25:50.440 | There's not a word for one in their language.
02:25:54.040 | And so there's certainly not a word for two, three or four.
02:25:56.040 | So that kind of blows people's minds off.
02:25:59.160 | - Yeah, that is blowing my mind.
02:26:00.600 | - That's pretty weird, isn't it?
02:26:01.800 | - How are you gonna ask, I want two of those?
02:26:03.720 | - You just don't.
02:26:04.840 | And so that's just not a thing you can possibly ask
02:26:07.720 | in the Piraha, it's not possible.
02:26:09.240 | That is, there's no words for that.
02:26:10.680 | So here's how we found this out.
02:26:12.440 | Okay, so it was thought to be a one, two, many language.
02:26:16.200 | There are three words for quantifiers for sets,
02:26:19.480 | but people had thought that those meant one, two and many.
02:26:23.800 | But what they really mean is few, some and many.
02:26:26.920 | Many is correct, it's few, some and many.
02:26:29.000 | And so the way we figured this out,
02:26:31.880 | and this is kind of cool, is that we gave people,
02:26:35.960 | we had a set of objects, okay?
02:26:38.520 | These happened to be spools of thread,
02:26:39.880 | doesn't really matter what they are, identical objects.
02:26:42.280 | And I sort of start off here, I just give you one of those
02:26:46.120 | and say, what's that?
02:26:46.920 | Okay, so you're a Piraha speaker and you tell me what it is.
02:26:49.640 | And then I give you two and say, what's that?
02:26:51.640 | And nothing's changing in the set except for the number, okay?
02:26:55.080 | And then I just ask you to label these things.
02:26:56.760 | We just do this for a bunch of different people.
02:26:58.280 | And frankly, I did this task.
02:27:00.600 | - This is fascinating. - And it's a weird,
02:27:02.440 | it's a little bit weird.
02:27:03.320 | So they say the word that we thought was one, it's few,
02:27:06.920 | but for the first one, and then maybe they say few,
02:27:09.160 | or maybe they say some for the second.
02:27:11.000 | And then for the third or the fourth,
02:27:12.200 | they start using the word many for the set.
02:27:15.160 | And then five, six, seven, eight, I go all the way to 10.
02:27:18.280 | And it's always the same word.
02:27:20.520 | And they look at me like I'm stupid
02:27:21.880 | because they told me what the word was for six, seven, eight.
02:27:26.040 | And I'm gonna continue asking them at nine and 10.
02:27:28.680 | I'm like, I'm sorry, I just...
02:27:30.200 | They understand that I wanna know their language.
02:27:32.360 | That's the point of the task,
02:27:33.640 | is I'm trying to learn their language, and so that's okay.
02:27:35.960 | But it does seem like I'm a little slow
02:27:37.960 | 'cause they already told me what the word for many was,
02:27:41.880 | five, six, seven, and I keep asking.
02:27:43.480 | So it's a little funny to do this task over and over.
02:27:46.040 | We did this with Dan, who was our translator.
02:27:49.160 | He's the only one who really speaks Piraha fluently.
02:27:53.160 | He's a good bilingual for a bunch of languages,
02:27:56.760 | but also English and Piraha.
02:27:59.160 | And then a guy called Mike Frank
02:28:00.440 | was also a student with me down there.
02:28:02.840 | He and I did these things.
02:28:04.040 | And so you do that, okay?
02:28:06.840 | And everyone does the same thing.
02:28:07.960 | We asked like 10 people,
02:28:10.520 | and they all do exactly the same labeling for one up.
02:28:13.400 | And then we just do the same thing down
02:28:15.400 | on like random order, actually.
02:28:16.680 | We do some of them up, some of them down first, okay?
02:28:19.000 | And so we do, instead of one to 10, we do 10 down to one.
02:28:22.280 | And so I give them 10, nine, at eight,
02:28:24.680 | they start saying the word for some.
02:28:26.920 | And then when you get to four,
02:28:28.920 | everyone is saying the word for few,
02:28:31.000 | which we thought was one.
02:28:32.280 | So it's like the context determined
02:28:34.360 | what that quantifier they used was.
02:28:37.480 | So it's not a count word.
02:28:38.760 | They're not count words.
02:28:40.360 | They're just approximate words.
02:28:41.480 | - And they're gonna be noisy
02:28:42.440 | when you interview a bunch of people
02:28:43.960 | with the definition of few,
02:28:45.480 | and there's gonna be a threshold in the context.
02:28:47.240 | - Yeah, yeah, I don't know what that means.
02:28:49.160 | That's gonna depend on the context.
02:28:50.680 | I think it's true in English too, right?
02:28:51.960 | If you ask an English person what a few is,
02:28:53.640 | I mean, that's gonna depend completely on the context.
02:28:55.960 | - And that might actually be, at first, hard to discover,
02:28:58.680 | 'cause for a lot of people,
02:29:00.680 | the jump from one to two will be few, right?
02:29:03.960 | So it's a jump.
02:29:04.600 | - Yeah, it might still be there, yeah.
02:29:06.600 | - I mean, that's fascinating.
02:29:09.000 | That's fascinating that numbers don't present themselves.
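A toy illustration of the "not count words, just approximate words" finding described here: the label a speaker produces depends on how big the set is relative to a contextually salient reference quantity, not on its exact cardinality. The thresholds and the reference rule below are invented, just to show how the same set of four objects can come out as "few" on the way down from ten but as "many" when it is the largest thing in view.

```python
# Toy model of approximate, context-relative quantifiers (not the actual analysis):
# the label depends on the ratio of the set size to a contextually salient
# reference quantity. The thresholds below are made up for illustration.

def quantifier(n, reference):
    """Label a set of size n relative to a salient reference quantity."""
    ratio = n / reference
    if ratio <= 0.45:
        return "few"
    elif ratio <= 0.8:
        return "some"
    return "many"

# Descending elicitation: ten objects were just on the table, so 10 is salient.
for n in [10, 8, 6, 4, 2]:
    print(n, quantifier(n, reference=10))   # many, some, some, few, few

# Ascending elicitation: early on, four is the largest set seen so far,
# so the very same size can already be labeled "many".
print(4, quantifier(4, reference=4))        # many
```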
02:29:11.000 | - Yeah, so the words aren't there.
02:29:12.440 | And so then we do these other things.
02:29:13.960 | Well, if they don't have the words,
02:29:16.280 | can they do exact matching kinds of tasks?
02:29:19.480 | Can they even do those tasks?
02:29:21.480 | And the answer is sort of yes and no.
02:29:24.360 | And so yes, they can do them.
02:29:26.440 | So here's the tasks that we did.
02:29:28.040 | We put out those spools of thread again, okay?
02:29:30.360 | So maybe I put like three out here.
02:29:32.280 | And then we gave them some objects.
02:29:34.680 | And those happened to be uninflated red balloons.
02:29:37.320 | It doesn't really matter what they are.
02:29:39.000 | It's just they're a bunch of exactly the same thing.
02:29:41.080 | And it was easy to put down
02:29:43.000 | right next to these spools of thread, okay?
02:29:46.840 | And so then I put out three of these.
02:29:48.280 | And your task was to just put one
02:29:50.280 | against each of my three things.
02:29:52.040 | And they could do that perfectly.
02:29:54.040 | So I mean, I would actually do that.
02:29:55.160 | It was a very easy task to explain to them
02:29:57.000 | because I did this with this guy, Mike Frank,
02:29:59.560 | and I'd be the experimenter telling him to do this
02:30:03.480 | and showing him to do this.
02:30:04.600 | And then we just like, just do what he did.
02:30:06.040 | You'll copy him.
02:30:06.920 | All we had to, I didn't have to speak Piraha
02:30:08.840 | except to know how to say, copy him.
02:30:10.680 | Like, do what he did, is all we had to be able to say.
02:30:13.160 | And then they would do that just perfectly.
02:30:15.560 | And so we'd move it up.
02:30:16.760 | We do some sort of random number of items up to 10.
02:30:20.120 | And they basically do perfectly on that.
02:30:22.440 | They never get that wrong.
02:30:23.480 | I mean, that's not a counting task, right?
02:30:25.160 | That is just a match.
02:30:26.120 | You just put one against it.
02:30:27.240 | It doesn't matter how many,
02:30:28.040 | I don't need to know how many there are there
02:30:29.480 | to do that correctly.
02:30:31.000 | And they would make mistakes, but very, very few.
02:30:34.040 | And no more than MIT undergrads.
02:30:36.440 | (laughs)
02:30:37.160 | Just gonna say, like there's no, these are low stakes.
02:30:40.040 | So, you know, you make mistakes.
02:30:41.080 | - So counting is not required
02:30:42.200 | to complete the matching task.
02:30:43.080 | - That's right, not at all.
02:30:44.280 | Okay, and so that's our control.
02:30:46.680 | And this guy had gone down there before
02:30:49.160 | and said that they couldn't do this task.
02:30:50.680 | But I just don't know what he did wrong there
02:30:52.280 | 'cause they can do this task perfectly well.
02:30:54.200 | And, you know, I can train my dog to do this task.
02:30:57.000 | So of course they can do this task.
02:30:58.440 | And so, you know, it's not a hard task.
02:31:00.360 | But the other task that was sort of more interesting
02:31:03.320 | is like, so then we do a bunch of tasks
02:31:04.760 | where you need some way to encode the set.
02:31:10.280 | So like one of them is just,
02:31:12.360 | I just put an opaque sheet in front of the things.
02:31:18.200 | I put down a bunch, a set of these things
02:31:19.880 | and I put an opaque sheet down.
02:31:21.320 | And so you can't see them anymore.
02:31:23.160 | And I tell you, do the same thing
02:31:24.680 | you were doing before, right?
02:31:26.040 | And it's easy if it's two or three, it's very easy.
02:31:28.280 | But if I don't have the words for eight,
02:31:30.440 | it's a little harder, like maybe, you know,
02:31:33.320 | with practice, well no.
02:31:34.920 | - 'Cause you have to count.
02:31:37.000 | - For us it's easy 'cause we just count them.
02:31:39.560 | It's just so easy to count them.
02:31:41.240 | But they don't, they can't count them
02:31:43.320 | because they don't count.
02:31:44.040 | They don't have words for this thing.
02:31:45.400 | And so they would do approximate.
02:31:46.760 | It's totally fascinating.
02:31:47.800 | So they would get them approximately right,
02:31:49.960 | you know, after four or five.
02:31:52.520 | You know, 'cause you can basically,
02:31:54.200 | you always get four right.
02:31:55.560 | Three or four, that looks,
02:31:57.240 | that's something we can visually see.
02:31:58.600 | But after that, you kind of have,
02:32:01.000 | it's an approximate number.
02:32:02.360 | And so then, there's a bunch of tasks we did,
02:32:03.960 | and they all failed, I mean, "failed":
02:32:07.080 | they did approximate matching after five on all those tasks.
02:32:10.520 | And it kind of shows that the words,
02:32:12.280 | you kind of need the words, you know,
02:32:15.240 | to be able to do these kinds of tasks.
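One common way to think about "exact up to about four, approximate after that" is an analog-magnitude account: without count words, the remembered quantity is a noisy estimate whose spread grows with the true number. This is a generic sketch of that idea with an invented noise level, not the model from the actual studies.

```python
# Sketch of an analog-magnitude ("approximate number") account of the
# hidden-set task: memory for the quantity is a noisy estimate whose noise
# scales with the true number (a Weber-fraction-style assumption).
import random

def remembered_estimate(n, weber_fraction=0.2):
    """Noisy memory for a hidden set of size n."""
    if n <= 4:
        return n                      # very small sets can be tracked exactly
    return max(1, round(random.gauss(n, weber_fraction * n)))

random.seed(0)
for n in [2, 3, 4, 6, 8, 10]:
    trials = [remembered_estimate(n) for _ in range(1000)]
    exact = sum(t == n for t in trials) / len(trials)
    print(f"true={n:2d}  exactly right on {exact:.0%} of trials")
# Small sets come out exactly right; larger sets are only approximately right,
# which is the pattern described here after four or five.
```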
02:32:17.240 | - There's a little bit of a chicken and egg thing there.
02:32:20.120 | Because if you don't have the words,
02:32:21.640 | then maybe they'll limit you in the kind of,
02:32:25.320 | like a little baby Einstein there
02:32:27.960 | won't be able to come up with a counting task.
02:32:31.480 | You know what I mean?
02:32:31.960 | Like the ability to count enables you
02:32:34.440 | to come up with interesting things probably.
02:32:36.680 | So yes, you develop counting because you need it.
02:32:41.000 | But then once you have counting,
02:32:43.160 | you can probably come up
02:32:44.280 | with a bunch of different inventions.
02:32:45.800 | Like how to, I don't know.
02:32:48.360 | (inhales)
02:32:49.480 | What kind of thing?
02:32:50.360 | They do matching really well for building purposes,
02:32:53.160 | building some kind of hut or something like this.
02:32:55.720 | So it's interesting that language is a limiter
02:33:00.120 | on what you're able to do.
02:33:01.480 | - Yeah, here language is just, is the words.
02:33:03.960 | Here is the words.
02:33:04.920 | Like the words for exact count
02:33:07.000 | is the limiting factor here.
02:33:09.720 | They just don't have them.
02:33:10.600 | - Yeah, that's what I mean.
02:33:14.200 | That limit is also a limit on the society
02:33:17.400 | of what they're able to build.
02:33:18.520 | - That's gonna be true, yeah.
02:33:20.520 | So it's probably, I mean, we don't know.
02:33:22.840 | This is one of those problems with the snapshot
02:33:24.760 | of just current languages
02:33:26.200 | is that we don't know what causes a culture
02:33:28.280 | to discover/invent a counting system.
02:33:31.800 | But the hypothesis is the guess out there
02:33:33.960 | is something to do with farming.
02:33:35.320 | So if you have a bunch of goats
02:33:37.400 | and you wanna keep track of them
02:33:40.520 | and you have say 17 goats and you go to bed at night
02:33:43.160 | and you get up in the morning,
02:33:44.520 | boy, it's easier to have a count system to do that.
02:33:47.400 | That's an abstraction over a set.
02:33:50.520 | So they don't have, like, people often ask me
02:33:53.160 | when I tell them about this kind of work,
02:33:54.600 | they say, "Well, don't these Piraha, don't they have kids?
02:33:56.840 | "Don't they have a lot of children?"
02:33:57.800 | I'm like, "Yeah, they have a lot of children."
02:33:59.240 | And they do, they often have families
02:34:01.240 | of three or four or five kids.
02:34:02.600 | And they go, "Well, don't they need the numbers
02:34:04.360 | "to keep track of their kids?"
02:34:05.320 | And I always ask this person who says this,
02:34:07.480 | like, "Do you have children?"
02:34:08.520 | And the answer's always no
02:34:10.760 | because that's not how you keep track of your kids.
02:34:13.400 | You care about their identities.
02:34:15.000 | It's very important to me when I go,
02:34:17.000 | "I think I have five children."
02:34:17.960 | - You don't think one, two, three, four?
02:34:20.680 | - Yeah, it matters which five.
02:34:22.840 | If you replaced one with someone else, I would care.
02:34:26.840 | A goat, maybe not, right?
02:34:28.200 | That's the kind of point.
02:34:29.000 | It's an abstraction.
02:34:30.200 | Something that looks very similar to the one
02:34:32.040 | wouldn't matter to me probably.
02:34:33.400 | - But if you care about goats,
02:34:35.400 | you're gonna know them actually individually also.
02:34:37.640 | - Yeah, you will.
02:34:38.200 | - I mean, cows, goats, if there's a source of food and milk
02:34:40.760 | and all that kind of stuff,
02:34:41.640 | you're gonna actually really, really care.
02:34:43.080 | - But I'm saying it is an abstraction
02:34:44.600 | such that you don't have to care about their identities
02:34:47.080 | to do this thing fast.
02:34:48.120 | That's the hypothesis, not mine.
02:34:49.800 | From anthropologists as a guessing
02:34:52.840 | about where words for counting came from
02:34:55.080 | is from farming, maybe.
02:34:56.360 | - Yeah.
02:34:57.580 | Do you have a sense why universal languages
02:35:00.840 | like Esperanto have not taken off?
02:35:02.520 | Like, why do we have all these different languages?
02:35:07.400 | - Yeah, yeah.
02:35:07.880 | Well, my guess is that the function of a language
02:35:11.480 | is to do something in a community.
02:35:13.240 | I mean, unless there's some function to that language
02:35:16.360 | in the community, it's not gonna survive.
02:35:18.840 | It's not gonna be useful.
02:35:19.720 | So here's a great example.
02:35:20.840 | Language death is super common, okay?
02:35:24.760 | Languages are dying all around the world,
02:35:27.800 | and here's why they're dying.
02:35:29.160 | Yeah, I see this.
02:35:30.760 | It's not happening right now
02:35:33.080 | in either the Chimane or the Piraha,
02:35:35.160 | but it probably will.
02:35:36.520 | And so there's a neighboring group called Mosetén,
02:35:38.840 | which is, I said that it's an isolate.
02:35:42.760 | It's actually, there's a pair of them.
02:35:43.880 | There's two of them, okay?
02:35:44.840 | So there's two languages which are really close,
02:35:47.240 | which are Mosetén and Chimane,
02:35:51.000 | which are unrelated to anything else.
02:35:52.520 | And Mosetén is unlike Chimane
02:35:54.680 | in that it has a lot of contact with Spanish,
02:35:57.400 | and it's dying.
02:35:58.360 | So that language is dying.
02:35:59.560 | The reason it's dying is there's not a lot of value
02:36:03.000 | for the local people in their native language.
02:36:06.600 | So there's much more value in knowing Spanish,
02:36:08.440 | like because they wanna feed their families.
02:36:11.480 | And how do you feed your family?
02:36:12.520 | You learn Spanish so you can make money,
02:36:14.760 | so you can get a job and do these things,
02:36:16.360 | and then you make money.
02:36:17.960 | And so they want Spanish things.
02:36:19.800 | They want, and so Mosetén is in danger and is dying.
02:36:23.560 | And that's normal.
02:36:24.920 | And so basically the problem is that people,
02:36:27.000 | the reason we learn language is to communicate,
02:36:31.640 | and we use it to make money
02:36:35.000 | and to do whatever it is to feed our families.
02:36:38.200 | And if that's not happening, then it won't take off.
02:36:42.600 | It's not like a game or something.
02:36:44.440 | This is like something we use.
02:36:46.040 | Why is English so popular?
02:36:47.320 | It's not because it's an easy language to learn.
02:36:49.800 | Maybe it is.
02:36:51.480 | I don't really know.
02:36:52.200 | But that's not why it's popular.
02:36:54.200 | - But because the United States is a gigantic economy,
02:36:57.320 | and therefore-- - Yeah, yeah.
02:36:57.720 | It's big economies that do this.
02:36:59.160 | It's all it is.
02:36:59.880 | It's all about money, and that's what,
02:37:02.120 | and so there's a motivation to learn Mandarin.
02:37:05.480 | There's a motivation to learn Spanish.
02:37:06.760 | There's a motivation to learn English.
02:37:08.120 | These languages are very valuable to know
02:37:09.880 | because there's so, so many speakers all over the world.
02:37:12.280 | - That's fascinating.
02:37:13.080 | - There's less of a value economically.
02:37:15.400 | It's kind of what drives this.
02:37:16.760 | It's not just for fun.
02:37:19.880 | I mean, there are these groups
02:37:21.000 | that do want to learn language just for language's sake,
02:37:24.280 | and there's something to that.
02:37:26.360 | But those are rarities in general.
02:37:29.080 | Those are a few small groups that do that.
02:37:31.560 | No, most people don't do that.
02:37:32.680 | - Well, if that was the primary driver,
02:37:34.200 | then everybody was speaking English
02:37:35.800 | or speaking one language.
02:37:37.240 | There's also a tension-- - That's happening.
02:37:39.000 | - And that, well, well-- - We're moving towards
02:37:41.720 | fewer and fewer languages. - We are.
02:37:43.240 | - Yeah, yeah. - I wonder, you're right.
02:37:44.520 | Maybe, you know, this is slow,
02:37:47.480 | but maybe that's where we're moving.
02:37:49.000 | But there is a tension.
02:37:50.760 | You're saying that about languages at the fringes,
02:37:53.640 | but if you look at geopolitics and superpowers,
02:37:58.120 | it does seem that there's another thing in tension,
02:38:00.040 | which is a language is a national identity sometimes.
02:38:04.040 | - Oh, yeah. - For certain nations.
02:38:05.640 | I mean, that's the war in Ukraine.
02:38:07.400 | Language, Ukrainian language is a symbol of that war
02:38:11.240 | in many ways, like a country fighting for its own identity.
02:38:14.520 | So it's not merely the convenience.
02:38:16.760 | I mean, those two things are a tension,
02:38:18.520 | is the convenience of trade and the economics
02:38:21.720 | and be able to communicate with neighboring countries
02:38:25.240 | and trade more efficiently with neighboring countries,
02:38:28.200 | all that kind of stuff, but also identity--
02:38:30.280 | - That's right. - Of the group.
02:38:31.320 | - I completely agree. - 'Cause language is the way,
02:38:33.720 | for every community, like dialects that emerge
02:38:39.160 | are a kind of identity for people.
02:38:40.920 | - Yeah. - And sometimes a way
02:38:42.520 | for people to say F-U to the more powerful people.
02:38:46.600 | - Yeah. - And it's interesting.
02:38:48.280 | So in that way, language can be used as that tool.
02:38:51.160 | - Yeah, I completely agree.
02:38:52.760 | And there's a lot of work to try to create that identity
02:38:56.600 | so people want to do that.
02:38:57.960 | As a cognitive scientist and language expert,
02:39:02.840 | I hope that continues because I don't want languages to die.
02:39:06.440 | I want languages to survive
02:39:07.800 | because they're so interesting for so many reasons.
02:39:13.320 | But I mean, I find them fascinating
02:39:15.880 | just for the language part, but I think there's a lot
02:39:18.520 | of connections to culture as well,
02:39:20.040 | which is also very important.
02:39:21.400 | - Do you have hope for machine translation
02:39:25.400 | that can break down the barriers of language?
02:39:27.400 | So while all these different diverse languages exist,
02:39:29.960 | I guess there's many ways of asking this question,
02:39:32.840 | but basically how hard is it to translate
02:39:36.760 | in an automated way from one language to another?
02:39:40.200 | - There's gonna be cases
02:39:41.400 | where it's gonna be really hard, right?
02:39:43.160 | So there are concepts that are in one language
02:39:47.080 | and not in another.
02:39:47.880 | Like the most extreme kinds of cases
02:39:49.560 | are these cases of number information.
02:39:51.400 | So like good luck translating a lot of English into Piraha.
02:39:55.640 | It's just impossible.
02:39:56.680 | There's no way to do it because there are no words
02:39:58.600 | for these concepts that we're talking about.
02:40:00.760 | There's probably the flip side, right?
02:40:03.640 | There's probably stuff in Piraha
02:40:05.640 | which is gonna be hard to translate
02:40:08.280 | into English on the other side.
02:40:09.800 | And so I just don't know what those concepts are.
02:40:11.800 | I mean, their space, their world space is different
02:40:15.640 | from my world space.
02:40:16.600 | And so I don't know what those are,
02:40:17.720 | but the things they talk about,
02:40:19.000 | those are gonna have to do with their life
02:40:21.640 | as opposed to my industrial life,
02:40:24.280 | which is gonna be different.
02:40:25.720 | And so there's gonna be problems like that always.
02:40:28.520 | There's like, maybe it's not so bad
02:40:31.240 | in the case of some of these spaces
02:40:32.520 | and maybe it's gonna be harder in others.
02:40:34.120 | And so it's pretty bad in number.
02:40:36.360 | It's like extreme, I'd say in the number space,
02:40:39.560 | exact number space, but in the color dimension, right?
02:40:42.040 | So that's not so bad.
02:40:43.000 | I mean, but it's a problem
02:40:45.960 | that you don't have ways to talk about the concepts.
02:40:49.480 | - And there might be entire concepts that are missing.
02:40:51.640 | But to you, it's more about the space of concept
02:40:54.200 | versus the space of form.
02:40:55.480 | Like form, you can probably map.
02:40:58.040 | - Yes, yeah.
02:40:59.160 | But so you were talking earlier about translation
02:41:01.800 | and about how translations,
02:41:04.280 | there's good and bad translations.
02:41:06.360 | I mean, now we're talking about translations of form, right?
02:41:08.760 | So what makes a writing good, right?
02:41:11.400 | - There's a music to the form.
02:41:13.240 | - Right, it's not just the content.
02:41:16.040 | It's how it's written.
02:41:17.400 | And translating that, that sounds difficult.
02:41:21.080 | - So we should say that there is like,
02:41:24.840 | I don't hesitate to say meaning,
02:41:26.840 | but there's a music and a rhythm to the form.
02:41:30.280 | When you look at the broad picture,
02:41:32.120 | like the difference between Dostoevsky and Tolstoy
02:41:34.840 | or Hemingway, Bukowski, James Joyce, like I mentioned,
02:41:40.840 | there's a beat to it, there's an edge to it
02:41:42.920 | that it's like is in the form.
02:41:45.000 | - We can probably get measures of those.
02:41:47.560 | - Yeah.
02:41:48.040 | - I don't know.
02:41:49.720 | I'm optimistic that we could get measures of those things.
02:41:52.840 | And so maybe that's--
02:41:53.800 | - Translatable.
02:41:54.680 | - I don't know.
02:41:55.320 | I don't know though.
02:41:55.960 | I have not worked on that.
02:41:57.240 | - I would love to see--
02:41:58.520 | - That sounds totally fascinating.
02:41:59.560 | - Translation to Hemingway is probably the lowest,
02:42:04.840 | I would love to see different authors,
02:42:06.920 | but the average per sentence dependency length
02:42:11.640 | for Hemingway is probably the shortest.
02:42:13.720 | - Huh, huh, that's your sense, huh?
02:42:15.720 | It's simple sentences with short, yeah, yeah, yeah, yeah.
02:42:18.920 | - I mean, that's when, if you have really long sentences,
02:42:21.400 | even if they don't have center embedding, like--
02:42:23.320 | - They can have longer connections, yeah.
02:42:25.000 | - They can have longer connections.
02:42:26.040 | - They don't have to, right?
02:42:27.000 | You can have a long, long sentence
02:42:28.360 | with a bunch of local words, yeah.
02:42:29.800 | - Yeah.
02:42:30.200 | - But it is much more likely to have the possibility
02:42:33.400 | of long dependencies with long sentences, yeah.
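If you wanted to actually check that guess about Hemingway, average dependency length per sentence is easy to measure with an off-the-shelf parser. A rough sketch, assuming spaCy and its small English model (`en_core_web_sm`) are installed; the sample sentences are made-up stand-ins for two prose styles.

```python
# Rough sketch: average summed dependency length per sentence, via spaCy.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def avg_dependency_length(text):
    """Mean total head-dependent distance per sentence, in token positions."""
    doc = nlp(text)
    totals = [
        sum(abs(tok.i - tok.head.i) for tok in sent if tok.head != tok)
        for sent in doc.sents
    ]
    return sum(totals) / len(totals)

short_style = "The old man fished alone. The sea was calm. He waited."
long_style = ("The man who had fished alone in the small boat for eighty-four "
              "days without taking a fish finally felt a pull on the line.")
print(avg_dependency_length(short_style))
print(avg_dependency_length(long_style))
```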
02:42:35.320 | - I met a guy named Aza Raskin
02:42:39.000 | who does a lot of cool stuff, really brilliant.
02:42:42.120 | He works with Tristan Harris and a bunch of stuff,
02:42:43.960 | but he was talking to me about communicating with animals.
02:42:49.960 | He co-founded Earth Species Project,
02:42:52.040 | where you're trying to find the common language
02:42:55.240 | between whales, crows, and humans.
02:42:57.640 | And he was saying that there's a lot of promising work
02:43:02.120 | that even though the signals are very different,
02:43:04.840 | like the actual, if you have embeddings of the languages,
02:43:10.760 | they're actually trying to communicate similar type things.
02:43:14.120 | Is there something you can comment on that,
02:43:18.360 | like where, is there promise to that
02:43:20.760 | in everything you've seen in different cultures,
02:43:22.680 | especially like remote cultures,
02:43:24.120 | that this is a possibility?
02:43:25.400 | Or no, that we can talk to whales?
02:43:27.120 | - I would say yes.
02:43:29.960 | I think it's not crazy at all.
02:43:31.880 | I think it's quite reasonable.
02:43:33.480 | There's this sort of weird view, well, odd view, I think,
02:43:38.280 | to think that human language is somehow special.
02:43:41.640 | Maybe it is.
02:43:44.600 | We can certainly do more than any of the other species.
02:43:48.040 | And maybe our language system is part of that.
02:43:54.600 | It's possible.
02:43:55.400 | But people have often talked about how human,
02:43:59.000 | like Chomsky, in fact, has talked about
02:44:00.760 | how only human language has this compositionality thing
02:44:07.160 | that he thinks is sort of key in language.
02:44:09.480 | And it's, the problem with that argument
02:44:11.960 | is he doesn't speak whale.
02:44:13.320 | And he doesn't speak crow.
02:44:16.360 | And he doesn't speak monkey.
02:44:17.400 | They say things like, well, they're making
02:44:19.880 | a bunch of grunts and squeaks.
02:44:21.240 | And the reasoning is like, that's bad reasoning.
02:44:25.400 | I'm pretty sure if you asked a whale what we're saying,
02:44:28.600 | they'd say, well, I'm making a bunch of weird noises.
02:44:30.840 | - Exactly.
02:44:31.340 | - And so it's like, this is a very odd reasoning
02:44:35.160 | to be making that human language is special
02:44:37.000 | because we're the only ones who have human language.
02:44:38.600 | I'm like, well, we don't know what those other,
02:44:42.280 | we just don't, we can't talk to them yet.
02:44:44.760 | And so there are probably a signal in there.
02:44:46.680 | And it might very well be something complicated
02:44:50.040 | like human language.
02:44:51.240 | I mean, sure, with a small brain in lower species,
02:44:55.960 | there's probably not a very good communication system.
02:44:57.720 | But in these higher species where you have
02:45:00.120 | what seems to be abilities to communicate something,
02:45:05.160 | there might very well be a lot more signal there
02:45:07.560 | than we might have otherwise thought.
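On the embedding idea mentioned a moment ago: one standard way to ask whether two communication systems carve up their worlds similarly is to learn an embedding space for each and then try to rotate one onto the other. A bare-bones sketch using orthogonal Procrustes on made-up vectors; real cross-species work has to use unsupervised versions of this idea, since there are no known word-to-call translation pairs to anchor on.

```python
# Sketch: align two embedding spaces with orthogonal Procrustes.
# If the two spaces share their geometry, a single rotation maps one onto the
# other. The vectors here are random stand-ins, not real embeddings.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))                     # embeddings from system A
true_rotation = np.linalg.qr(rng.normal(size=(16, 16)))[0]
Y = X @ true_rotation                             # system B: same geometry, new coordinates

# Find the rotation W minimizing ||X W - Y||_F.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

print(np.linalg.norm(X @ W - Y))                  # near zero when the geometries match
```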
02:45:10.600 | - But also if we have a lot of intellectual humility here,
02:45:13.640 | somebody formerly from MIT, Neri Oxman,
02:45:16.120 | who I admire very much, has talked a lot about,
02:45:19.560 | has worked on communicating with plants.
02:45:23.640 | So like, yes, the signal there is even less than,
02:45:28.200 | but like, it's not out of the realm of possibility
02:45:30.840 | that all nature has a way of communicating.
02:45:34.520 | And it's a very different language,
02:45:36.440 | but they do develop a kind of language
02:45:39.160 | through the chemistry,
02:45:40.200 | through some way of communicating with each other.
02:45:43.560 | And if you have enough humility about that possibility,
02:45:46.120 | I think you can, I think it would be a very interesting,
02:45:49.640 | in a few decades, maybe centuries, hopefully not,
02:45:52.840 | a humbling possibility of being able to communicate,
02:45:57.240 | not just between humans effectively,
02:45:59.560 | but between all of living things on Earth.
02:46:02.840 | - Well, I mean, I think some of them
02:46:05.160 | are not gonna have much interesting to say,
02:46:07.080 | but some of them will. - But you could still--
02:46:08.360 | - We don't know.
02:46:09.160 | We certainly don't know.
02:46:10.120 | - But I think if we're humble,
02:46:13.080 | there could be some interesting trees out there.
02:46:15.560 | - Oh, yeah, yeah, yeah, yeah.
02:46:17.000 | Well, they're probably talking to other trees, right?
02:46:19.000 | They're not talking to us.
02:46:20.040 | And so to the extent they're talking,
02:46:21.880 | they're saying something interesting
02:46:23.160 | to some other conspecific, as opposed to us, right?
02:46:27.960 | And so there probably is, there may be some signal there.
02:46:31.240 | So there are people out there,
02:46:32.680 | actually it's pretty common to say that human language
02:46:36.520 | is special and different
02:46:38.440 | from any other animal communication system.
02:46:40.680 | And I just don't think the evidence
02:46:43.480 | is there for that claim.
02:46:44.920 | I think it's not obvious.
02:46:46.040 | We just don't know,
02:46:50.200 | 'cause we don't speak these other communication systems
02:46:52.680 | until we get better.
02:46:54.760 | I do think there are people working on that,
02:46:57.720 | as you pointed out,
02:46:58.840 | people working on whale speak, for instance.
02:47:00.760 | That's really fascinating.
02:47:02.120 | - Let me ask you a wild, out there sci-fi question.
02:47:05.160 | If we make contact with an intelligent alien civilization
02:47:08.120 | and you get to meet them,
02:47:11.400 | how hard do you think,
02:47:13.720 | like how surprised would you be
02:47:15.080 | about their way of communicating?
02:47:16.920 | Do you think it would be recognizable?
02:47:19.880 | Maybe there's some parallels here
02:47:21.400 | when you go to the remote tribes.
02:47:23.160 | - I mean, I would want Dan Everett with me.
02:47:25.400 | He is like amazing at learning foreign languages.
02:47:28.760 | And so this is an amazing feat to be able to go.
02:47:31.320 | This is a language, Piraha,
02:47:33.240 | which had no translators before him.
02:47:35.160 | - Oh, wow, so he just shows up?
02:47:37.320 | - Well, there was a guy that had been there before,
02:47:39.000 | but he wasn't very good.
02:47:40.280 | And so he learned the language far better
02:47:43.320 | than anyone else had learned before him.
02:47:45.160 | He's a very social person.
02:47:48.600 | I think that's a big part of it,
02:47:49.800 | is being able to interact.
02:47:50.840 | So I don't know, it kind of depends on
02:47:52.040 | the species from outer space,
02:47:55.800 | how much they wanna talk to us.
02:47:57.800 | - Is there something you can say
02:47:58.840 | about the process he follows?
02:48:00.200 | How do you show up to a tribe and socialize?
02:48:03.800 | I mean, I guess colors and counting
02:48:05.480 | is one of the most basic things to figure out.
02:48:07.400 | - Yeah, you start that.
02:48:08.360 | You actually start with objects
02:48:10.760 | and you just throw a stick down and say, "Stick."
02:48:13.560 | And then you say, "What do you call this?"
02:48:14.760 | And then they'll say the word for whatever.
02:48:17.000 | And he says, "The standard thing to do
02:48:18.520 | "is to throw down two sticks and say, 'two sticks.'"
02:48:20.280 | And then he learned pretty quick
02:48:22.280 | that there weren't any count words in this language
02:48:24.760 | because they didn't, you know, this wasn't interesting to them.
02:48:27.080 | I mean, it was kind of weird.
02:48:27.960 | They'd say "some" or something,
02:48:29.080 | the same word over and over again.
02:48:30.280 | But that is a standard thing.
02:48:32.040 | You just try to...
02:48:33.080 | But you have to be pretty out there socially,
02:48:36.120 | willing to talk to random people,
02:48:37.800 | which these are really very different people from you.
02:48:41.320 | And he's very social.
02:48:43.320 | And so I think that's a big part of this,
02:48:44.760 | is that's how a lot of people know a lot of languages,
02:48:48.120 | is they're willing to talk to other people.
02:48:49.960 | - That's a tough one, where you just show up knowing nothing.
02:48:52.760 | - Yeah, oh God.
02:48:53.800 | - It's beautiful that humans are able to connect in that way.
02:48:56.600 | - Yeah, yeah.
02:48:57.400 | - You've had an incredible career
02:48:59.880 | exploring this fascinating topic.
02:49:01.560 | What advice would you give to young people
02:49:03.320 | about how to have a career like that
02:49:08.200 | or a life that they can be proud of?
02:49:10.760 | - When you see something interesting, just go and do it.
02:49:13.320 | I do that.
02:49:15.080 | That's something I do,
02:49:15.800 | which is kind of unusual for most people.
02:49:17.560 | So when I saw that the Piraha were available to go and visit,
02:49:20.520 | I was like, "Yes, yes, I'll go."
02:49:23.000 | And then when we couldn't go back,
02:49:24.360 | we had some trouble with the Brazilian government.
02:49:28.200 | There's some corrupt people there.
02:49:29.320 | It was very difficult to go back in there.
02:49:31.480 | And so I was like, "All right, I gotta find another group."
02:49:33.720 | And so we searched around and we were able to find the Chimane,
02:49:36.280 | because I wanted to keep working on this kind of problem.
02:49:38.520 | And so we found the Chimane and just went there.
02:49:40.520 | I didn't really have, we didn't have contact.
02:49:42.680 | We had a little bit of contact and brought someone.
02:49:44.680 | And that was, we just kind of just try things.
02:49:48.440 | I say it's like, a lot of that's just like ambition,
02:49:51.080 | just try to do something that other people haven't done.
02:49:54.040 | Just give it a shot is what I, I mean, I do that all the time.
02:49:57.240 | - I love it.
02:49:58.360 | And I love the fact that your pursuit of fun
02:50:01.240 | has landed you here talking to me.
02:50:03.080 | This was an incredible conversation.
02:50:05.160 | Ted, you're just a fascinating human being.
02:50:08.920 | Thank you for taking a journey
02:50:10.040 | through human language with me today.
02:50:12.120 | This is awesome.
02:50:12.760 | - Thank you very much, Lex.
02:50:14.040 | It's been a pleasure.
02:50:14.760 | - Thanks for listening to this conversation
02:50:17.560 | with Edward Gibson.
02:50:19.000 | To support this podcast,
02:50:20.360 | please check out our sponsors in the description.
02:50:23.160 | And now let me leave you with some words from Wittgenstein.
02:50:26.520 | The limits of my language mean the limits of my world.
02:50:31.240 | Thank you for listening.
02:50:33.400 | I hope to see you next time.
02:50:36.520 | (upbeat music)
02:50:39.900 | (upbeat music)