
Yann LeCun: Sophia and Does AI Need a Body? | AI Podcast Clips



00:00:00.000 | (gentle music)
00:00:02.580 | - You've criticized the art project
00:00:13.000 | that is Sophia the Robot.
00:00:14.740 | And what that project essentially does
00:00:17.600 | is use our natural inclination to anthropomorphize
00:00:21.800 | things that look human and give them more.
00:00:24.840 | Do you think that could be used by AI systems
00:00:27.800 | like in the movie "Her"?
00:00:30.080 | So do you think that a body is needed
00:00:33.440 | to create a feeling of intelligence?
00:00:37.240 | - Well, if Sophia was just an art piece,
00:00:39.360 | I would have no problem with it,
00:00:40.400 | but it's presented as something else.
00:00:43.120 | - Let me add on that comment real quick.
00:00:45.360 | If the creators of Sophia could change something
00:00:48.600 | about their marketing or behavior in general,
00:00:50.800 | what would it be?
00:00:51.640 | What's--
00:00:52.880 | - Oh, just about everything.
00:00:54.120 | (laughing)
00:00:55.720 | - I mean, don't you think,
00:00:58.720 | here's a tough question.
00:01:00.160 | Let me, so I agree with you.
00:01:01.760 | So Sophia is not,
00:01:04.120 | the general public feels that Sophia can do way more
00:01:08.300 | than she actually can.
00:01:09.360 | - That's right.
00:01:10.240 | - And the people who created Sophia
00:01:12.800 | are not honestly and publicly communicating,
00:01:17.800 | trying to teach the public.
00:01:19.520 | - Right.
00:01:20.360 | - But here's a tough question.
00:01:23.320 | Don't you think the same thing is true
00:01:28.120 | of scientists in industry and research,
00:01:32.160 | who are taking advantage of the same misunderstanding
00:01:34.720 | in the public when they create AI companies
00:01:37.400 | or publish stuff?
00:01:39.980 | - Some companies, yes.
00:01:41.200 | I mean, there is no sense of,
00:01:43.200 | there's no desire to delude.
00:01:44.960 | There's no desire to kind of over-claim
00:01:48.720 | what something has done.
00:01:48.720 | Right, you publish a paper on AI
00:01:49.880 | that has this result on ImageNet,
00:01:52.280 | it's pretty clear.
00:01:53.120 | I mean, it's not even interesting anymore.
00:01:55.040 | But I don't think there is that,
00:01:58.020 | I mean, the reviewers are generally not very forgiving
00:02:02.960 | of unsupported claims of this type.
00:02:07.240 | But there are certainly quite a few startups
00:02:09.760 | that have had a huge amount of hype around this
00:02:12.760 | that I find extremely damaging
00:02:15.560 | and I've been calling it out when I've seen it.
00:02:18.120 | So yeah, but to go back to your original question,
00:02:20.320 | like the necessity of embodiment.
00:02:23.120 | I don't think embodiment is necessary.
00:02:25.680 | I think grounding is necessary.
00:02:27.200 | So I don't think we're gonna get machines
00:02:29.000 | that really understand language
00:02:30.560 | without some level of grounding in the real world.
00:02:32.480 | And it's not clear to me that language
00:02:34.440 | is a high enough bandwidth medium
00:02:36.200 | to communicate how the real world works.
00:02:38.300 | I think for this-
00:02:40.400 | - Can you talk about what grounding means?
00:02:42.400 | - So grounding means that,
00:02:44.120 | so there is this classic problem of common sense reasoning,
00:02:47.800 | you know, the Winograd schema, right?
00:02:51.080 | And so I tell you the trophy doesn't fit in the suitcase
00:02:55.040 | because it's too big,
00:02:56.440 | or the trophy doesn't fit in the suitcase
00:02:57.840 | because it's too small.
00:02:59.240 | And the "it" in the first case refers to the trophy,
00:03:01.880 | in the second case to the suitcase.
00:03:03.680 | And the reason you can figure this out
00:03:05.240 | is because you know what the trophy and the suitcase are,
00:03:07.040 | you know, one is supposed to fit in the other one
00:03:08.720 | and you know the notion of size
00:03:10.640 | and that a big object doesn't fit in a small object
00:03:13.080 | unless it's a TARDIS, you know, things like that, right?
00:03:15.360 | So you have this knowledge of how the world works,
00:03:18.720 | of geometry and things like that.
00:03:20.720 | I don't believe you can learn everything about the world
00:03:24.720 | by just being told in language how the world works.
00:03:28.080 | I think you need some low-level perception of the world,
00:03:31.760 | you know, be it visual, touch, you know, whatever,
00:03:33.780 | but some higher bandwidth perception of the world.
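[Editorial note: a minimal sketch in Python of the Winograd schema pair described above. The SCHEMA list and resolve_pronoun helper are illustrative assumptions; the hard-coded size rule stands in for the grounded knowledge of containment and size that, per the conversation, cannot be picked up from text alone.]

# Winograd schema pair from the conversation: resolving "it" requires knowing
# that a container must be larger than its contents. The rule below hard-codes
# that world knowledge; nothing here is learned from text.

SCHEMA = [
    ("The trophy doesn't fit in the suitcase because it is too big.", "trophy"),
    ("The trophy doesn't fit in the suitcase because it is too small.", "suitcase"),
]

def resolve_pronoun(sentence: str) -> str:
    """Resolve 'it' with a hand-coded size/containment constraint."""
    if "too big" in sentence:
        return "trophy"    # the thing that is too big must be the contents
    if "too small" in sentence:
        return "suitcase"  # the thing that is too small must be the container
    return "unknown"

for sentence, referent in SCHEMA:
    assert resolve_pronoun(sentence) == referent
    print(f'"{sentence}" -> it = {referent}')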
00:03:36.640 | - So by reading all the world's text,
00:03:38.880 | you still may not have enough information.
00:03:41.200 | - That's right.
00:03:42.600 | There's a lot of things that just will never appear in text
00:03:45.320 | and that you can't really infer.
00:03:47.040 | So I think common sense will emerge from, you know,
00:03:51.800 | certainly a lot of language interaction,
00:03:53.480 | but also from watching videos
00:03:55.680 | or perhaps even interacting in virtual environments
00:03:58.960 | and possibly, you know, robots interacting in the real world.
00:04:01.820 | But I don't actually believe necessarily
00:04:03.680 | that this last one is absolutely necessary,
00:04:06.040 | but I think there's a need for some grounding.
00:04:08.400 | - But the final product
00:04:11.960 | doesn't necessarily need to be embodied, you're saying.
00:04:15.240 | It just needs to have an awareness, a grounding.
00:04:17.720 | - Right, but it needs to know how the world works
00:04:20.120 | to have, you know, to not be frustrating to talk to.
00:04:24.480 | - And you talked about emotions being important.
00:04:29.560 | That's a whole nother topic.
00:04:31.840 | - Well, so, you know, I talked about this,
00:04:34.400 | the basal ganglia as the thing that calculates
00:04:39.400 | your level of miscontentment,
00:04:43.000 | and then there is this other module
00:04:44.720 | that sort of tries to do a prediction
00:04:46.720 | of whether you're going to be content or not.
00:04:48.600 | That's the source of some emotion.
00:04:50.320 | So fear, for example, is an anticipation
00:04:53.140 | of bad things that can happen to you, right?
00:04:56.480 | You have this inkling that there is some chance
00:04:59.320 | that something really bad is going to happen to you,
00:05:00.960 | and that creates fear.
00:05:02.360 | When you know for sure that something bad
00:05:03.760 | is going to happen to you, you kind of give up, right?
00:05:05.960 | It's not there anymore.
00:05:07.560 | It's uncertainty that creates fear.
00:05:09.480 | So the punchline is,
00:05:11.240 | we're not going to have autonomous intelligence
00:05:12.560 | without emotions.
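[Editorial note: a rough Python sketch of the two modules described above, not LeCun's implementation. The intrinsic_cost and fear_signal functions and the mean-times-spread formula are assumptions: a cost module scores how discontent a state is, a critic-like estimate looks over predicted future states, and the "fear" term is large only when the predicted cost is both bad and uncertain, matching the remark that certainty removes fear.]

from statistics import mean, pstdev
from typing import Callable, List

def intrinsic_cost(state: List[float]) -> float:
    """Hypothetical cost module: larger value = more discontentment."""
    return sum(x * x for x in state)

def fear_signal(predicted_states: List[List[float]],
                cost: Callable[[List[float]], float] = intrinsic_cost) -> float:
    """Anticipated-cost term over sampled future states.

    Scales with both the mean predicted cost and its spread (uncertainty);
    identical rollouts (a certain outcome) give zero.
    """
    costs = [cost(s) for s in predicted_states]
    return mean(costs) * pstdev(costs)

# Uncertain future: some rollouts fine, some very bad -> nonzero "fear".
print(fear_signal([[0.1, 0.0], [2.0, 2.0], [0.2, 0.1]]))
# Certain bad future: all rollouts equally bad -> zero spread -> no "fear" term.
print(fear_signal([[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]))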
00:05:13.420 | - Whatever the heck emotions are.
00:05:18.920 | So you mentioned very practical things of fear,
00:05:21.120 | but there's a lot of other mess around it.
00:05:23.480 | - But they are kind of the results of drives.
00:05:26.400 | - Yeah, there's deeper biological stuff going on,
00:05:29.360 | and I've talked to a few folks on this.
00:05:31.440 | There's fascinating stuff that ultimately connects
00:05:34.280 | to our brain.