
Yann LeCun: Sophia and Does AI Need a Body? | AI Podcast Clips



00:00:00.000 | (gentle music)
00:00:02.580 | - You've criticized the art project
00:00:13.000 | that is Sophia the Robot.
00:00:14.740 | And what that project essentially does
00:00:17.600 | is use our natural inclination to anthropomorphize
00:00:21.800 | things that look human and give them more.
00:00:24.840 | Do you think that could be used by AI systems
00:00:27.800 | like in the movie "Her"?
00:00:30.080 | So do you think that a body is needed
00:00:33.440 | to create a feeling of intelligence?
00:00:37.240 | - Well, if Sophia was just an art piece,
00:00:39.360 | I would have no problem with it,
00:00:40.400 | but it's presented as something else.
00:00:43.120 | - Let me add on that comment real quick.
00:00:45.360 | If the creators of Sophia could change something
00:00:48.600 | about their marketing or behavior in general,
00:00:50.800 | what would it be?
00:00:51.640 | What's--
00:00:52.880 | - Oh, just about everything.
00:00:54.120 | (laughing)
00:00:55.720 | - I mean, don't you think,
00:00:58.720 | here's a tough question.
00:01:00.160 | Let me, so I agree with you.
00:01:01.760 | So Sophia is not,
00:01:04.120 | the general public feels that Sophia can do way more
00:01:08.300 | than she actually can.
00:01:09.360 | - That's right.
00:01:10.240 | - And the people who created Sophia
00:01:12.800 | are not honestly and publicly communicating,
00:01:17.800 | trying to teach the public.
00:01:19.520 | - Right.
00:01:20.360 | - But here's a tough question.
00:01:23.320 | Don't you think the same thing is true
00:01:28.120 | of scientists in industry and research,
00:01:32.160 | who are taking advantage of the same misunderstanding
00:01:34.720 | in the public when they create AI companies
00:01:37.400 | or publish stuff?
00:01:39.980 | - Some companies, yes.
00:01:41.200 | I mean, there is no sense of,
00:01:43.200 | there's no desire to delude.
00:01:44.960 | There's no desire to kind of over-claim
00:01:48.720 | what something has done.
00:01:48.720 | Right, you publish a paper on AI
00:01:49.880 | that has this result on ImageNet,
00:01:52.280 | it's pretty clear.
00:01:53.120 | I mean, it's not even interesting anymore.
00:01:55.040 | But I don't think there is that,
00:01:58.020 | I mean, the reviewers are generally not very forgiving
00:02:02.960 | of unsupported claims of this type.
00:02:07.240 | But there are certainly quite a few startups
00:02:09.760 | that have had a huge amount of hype around this
00:02:12.760 | that I find extremely damaging
00:02:15.560 | and I've been calling it out when I've seen it.
00:02:18.120 | So yeah, but to go back to your original question,
00:02:20.320 | like the necessity of embodiment.
00:02:23.120 | I don't think embodiment is necessary.
00:02:25.680 | I think grounding is necessary.
00:02:27.200 | So I don't think we're gonna get machines
00:02:29.000 | that really understand language
00:02:30.560 | without some level of grounding in the real world.
00:02:32.480 | And it's not clear to me that language
00:02:34.440 | is a high enough bandwidth medium
00:02:36.200 | to communicate how the real world works.
00:02:38.300 | I think for this-
00:02:40.400 | - Can you talk about what grounding means?
00:02:42.400 | - So grounding means that,
00:02:44.120 | so there is this classic problem of common sense reasoning,
00:02:47.800 | you know, the Winograd schema, right?
00:02:51.080 | And so I tell you the trophy doesn't fit in the suitcase
00:02:55.040 | because it's too big,
00:02:56.440 | or the trophy doesn't fit in the suitcase
00:02:57.840 | because it's too small.
00:02:59.240 | And the "it" in the first case refers to the trophy,
00:03:01.880 | in the second case to the suitcase.
00:03:03.680 | And the reason you can figure this out
00:03:05.240 | is because you know what the trophy and the suitcase are,
00:03:07.040 | you know, one is supposed to fit in the other one
00:03:08.720 | and you know the notion of size
00:03:10.640 | and that a big object doesn't fit in a small object
00:03:13.080 | unless it's a TARDIS, you know, things like that, right?
00:03:15.360 | So you have this knowledge of how the world works,
00:03:18.720 | of geometry and things like that.
00:03:20.720 | I don't believe you can learn everything about the world
00:03:24.720 | by just being told in language how the world works.
00:03:28.080 | I think you need some low-level perception of the world,
00:03:31.760 | you know, be it visual, touch, you know, whatever,
00:03:33.780 | but some higher bandwidth perception of the world.
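[Editorial note: a minimal sketch in Python of the Winograd schema pair described above. The SCHEMA list and resolve_pronoun helper are illustrative assumptions; the hard-coded size rule stands in for the grounded knowledge of containment and size that, per the conversation, cannot be picked up from text alone.]

# Winograd schema pair from the conversation: resolving "it" requires knowing
# that a container must be larger than its contents. The rule below hard-codes
# that world knowledge; nothing here is learned from text.

SCHEMA = [
    ("The trophy doesn't fit in the suitcase because it is too big.", "trophy"),
    ("The trophy doesn't fit in the suitcase because it is too small.", "suitcase"),
]

def resolve_pronoun(sentence: str) -> str:
    """Resolve 'it' with a hand-coded size/containment constraint."""
    if "too big" in sentence:
        return "trophy"    # the thing that is too big must be the contents
    if "too small" in sentence:
        return "suitcase"  # the thing that is too small must be the container
    return "unknown"

for sentence, referent in SCHEMA:
    assert resolve_pronoun(sentence) == referent
    print(f'"{sentence}" -> it = {referent}')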
00:03:36.640 | - So by reading all the world's text,
00:03:38.880 | you still may not have enough information.
00:03:41.200 | - That's right.
00:03:42.600 | There's a lot of things that just will never appear in text
00:03:45.320 | and that you can't really infer.
00:03:47.040 | So I think common sense will emerge from, you know,
00:03:51.800 | certainly a lot of language interaction,
00:03:53.480 | but also from watching videos
00:03:55.680 | or perhaps even interacting in virtual environments
00:03:58.960 | and possibly, you know, robots interacting in the real world.
00:04:01.820 | But I don't actually believe necessarily
00:04:03.680 | that this last one is absolutely necessary,
00:04:06.040 | but I think there's a need for some grounding.
00:04:08.400 | - But the final product
00:04:11.960 | doesn't necessarily need to be embodied, you're saying.
00:04:15.240 | It just needs to have an awareness, a grounding.
00:04:17.720 | - Right, but it needs to know how the world works
00:04:20.120 | to have, you know, to not be frustrating to talk to.
00:04:24.480 | - And you talked about emotions being important.
00:04:29.560 | That's a whole nother topic.
00:04:31.840 | - Well, so, you know, I talked about this,
00:04:34.400 | the basal ganglia as the thing that calculates
00:04:39.400 | your level of miscontentment,
00:04:43.000 | and then there is this other module
00:04:44.720 | that sort of tries to do a prediction
00:04:46.720 | of whether you're going to be content or not.
00:04:48.600 | That's the source of some emotion.
00:04:50.320 | So fear, for example, is an anticipation
00:04:53.140 | of bad things that can happen to you, right?
00:04:56.480 | You have this inkling that there is some chance
00:04:59.320 | that something really bad is going to happen to you,
00:05:00.960 | and that creates fear.
00:05:02.360 | When you know for sure that something bad
00:05:03.760 | is going to happen to you, you kind of give up, right?
00:05:05.960 | It's not there anymore.
00:05:07.560 | It's uncertainty that creates fear.
00:05:09.480 | So the punchline is,
00:05:11.240 | we're not going to have autonomous intelligence
00:05:12.560 | without emotions.
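[Editorial note: a rough Python sketch of the two modules described above, not LeCun's implementation. The intrinsic_cost and fear_signal functions and the mean-times-spread formula are assumptions: a cost module scores how discontent a state is, a critic-like estimate looks over predicted future states, and the "fear" term is large only when the predicted cost is both bad and uncertain, matching the remark that certainty removes fear.]

from statistics import mean, pstdev
from typing import Callable, List

def intrinsic_cost(state: List[float]) -> float:
    """Hypothetical cost module: larger value = more discontentment."""
    return sum(x * x for x in state)

def fear_signal(predicted_states: List[List[float]],
                cost: Callable[[List[float]], float] = intrinsic_cost) -> float:
    """Anticipated-cost term over sampled future states.

    Scales with both the mean predicted cost and its spread (uncertainty);
    identical rollouts (a certain outcome) give zero.
    """
    costs = [cost(s) for s in predicted_states]
    return mean(costs) * pstdev(costs)

# Uncertain future: some rollouts fine, some very bad -> nonzero "fear".
print(fear_signal([[0.1, 0.0], [2.0, 2.0], [0.2, 0.1]]))
# Certain bad future: all rollouts equally bad -> zero spread -> no "fear" term.
print(fear_signal([[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]))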
00:05:13.420 | - Whatever the heck emotions are.
00:05:18.920 | So you mentioned very practical things of fear,
00:05:21.120 | but there's a lot of other mess around it.
00:05:23.480 | - But they are kind of the results of drives.
00:05:26.400 | - Yeah, there's deeper biological stuff going on,
00:05:29.360 | and I've talked to a few folks on this.
00:05:31.440 | There's fascinating stuff that ultimately connects
00:05:34.280 | to our brain.