Yann LeCun: Benchmarks for Human-Level Intelligence

(gentle music) - You've written advice saying, "Don't get fooled by people who claim to have a solution to artificial general intelligence, who claim to have an AI system that works just like the human brain, or who claim to have figured out how the brain works. Ask them what error rate they get on MNIST or ImageNet." - Yeah, this is a little dated, by the way.

(laughs) I mean, five years, who's counting? Okay, but I think your opinion is still, MNIST and ImageNet, yes, may be dated, there may be new benchmarks, right? But I think that philosophy is one you still somewhat hold, that benchmarks and practical testing, the practical application, are where you really get to test the ideas.

- Well, it may not be completely practical. Like, for example, it could be a toy dataset, but it has to be some sort of task that the community as a whole has accepted as some sort of standard kind of benchmark, if you want. It doesn't need to be real.

So for example, many years ago here at FAIR, people, Jason Weston, Antoine Bordes, and a few others proposed the bAbI tasks, which were kind of a toy problem to test the ability of machines to reason, actually, to access working memory and things like this. And it was very useful, even though it wasn't a real task.

MNIST is kind of halfway a real task. So, you know, toy problems can be very useful. It's just that I was really struck by the fact that a lot of people, particularly a lot of people with money to invest, would be fooled by people telling them, oh, we have, you know, the algorithm of the cortex and you should give us 50 million.

- Yes, absolutely. So there's a lot of people who try to take advantage of the hype for business reasons and so on. But let me sort of talk to this idea that the new ideas, the ideas that push the field forward may not yet have a benchmark, or it may be very difficult to establish a benchmark.

- I agree. That's part of the process. Establishing benchmarks is part of the process. - So what are your thoughts about, so we have these benchmarks around stuff we can do with images, from classification to captioning, to just every kind of information you can pull from images at the surface level.

There's audio data sets, there's some video. What can we start, natural language, what kind of stuff, what kind of benchmarks do you see that start creeping on to more something like intelligence, like reasoning, like, maybe you don't like the term, but AGI, echoes of that kind of formulation? - Yeah, so a lot of people are working on interactive environments in which you can train and test intelligent systems.

So there, for example, the classical paradigm of supervised learning is that you have a data set, you partition it into a training set, validation set, test set, and there's a clear protocol, right? But what if, that assumes that the samples are statistically independent, you can exchange them, the order in which you see them shouldn't matter, things like that.
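The supervised protocol described above can be sketched in a few lines. This is only an illustration: the function name `split_dataset`, the 80/10/10 fractions, and the fixed seed are assumptions, not anything specified in the conversation. The single shuffle is where the exchangeability assumption enters, since it treats the order of the samples as irrelevant.

```python
import random

def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once, then cut into train / validation / test partitions."""
    rng = random.Random(seed)
    shuffled = list(samples)
    # Shuffling is only harmless if the samples are exchangeable,
    # i.e. the order in which we see them carries no information.
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```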

But what if the answer you give determines the next sample you see, which is the case, for example, in robotics, right? Your robot does something and then it gets exposed to a new room, and depending on where it goes, the room would be different. So that creates the exploration problem.

It also creates a dependency between samples, right? If you can only move in space, the next sample you're gonna see is gonna be probably in the same building, most likely. So all the assumptions about the validity of this training set, test set hypothesis break.

That happens whenever a machine can take an action that has an influence on the world, and on what it's gonna see. So people are setting up artificial environments where that takes place, right? The robot runs around a 3D model of a house and can interact with objects and things like this.
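A toy version of such an environment shows why the samples stop being independent. Everything here is hypothetical: `GridHouse`, `rollout`, and `random_policy` are illustrative names, and a line of rooms stands in for a 3D house. The point is only that each observation depends on the previous one through the action taken.

```python
import random

class GridHouse:
    """Hypothetical toy environment: an agent walking along a row of rooms."""

    def __init__(self, n_rooms=10, start=0):
        self.n_rooms = n_rooms
        self.position = start

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the move
        self.position = max(0, min(self.n_rooms - 1, self.position + action))
        return self.position  # the observation is simply the room index

def random_policy(observation, rng):
    return rng.choice([-1, 1])

def rollout(env, policy, n_steps, rng):
    observations = [env.position]
    for _ in range(n_steps):
        action = policy(observations[-1], rng)
        # the action chosen now determines which sample is seen next
        observations.append(env.step(action))
    return observations

rng = random.Random(0)
trajectory = rollout(GridHouse(n_rooms=10), random_policy, n_steps=20, rng=rng)
jumps = [abs(b - a) for a, b in zip(trajectory, trajectory[1:])]
print(trajectory)
print(max(jumps))  # consecutive observations are never more than one room apart
```

Shuffling `trajectory` and splitting it into training and test sets would put near-duplicate neighbors on both sides of the split, which is the breakdown described above.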

So you do robotics by simulation, you have those, you know, an OpenAI Gym type thing, or MuJoCo kind of simulated robots, and you have games, you know, things like that. So that's where the field is going, really, this kind of environment. Now, back to the question of AGI, like, I don't like the term AGI, because it implies that human intelligence is general, and human intelligence is nothing like general, it's very, very specialized.

We think it's general, we like to think of ourselves as having general intelligence, we don't, we're very specialized. We're only slightly more general than-- - Why does it feel general? So you kind of, the term general, I think what's impressive about humans is the ability to learn, as we were talking about learning, to learn in just so many different domains.

It's perhaps not arbitrarily general, but just you can learn in many domains and integrate that knowledge somehow. - Okay. - The knowledge persists. - So let me take a very specific example. It's not an example, it's more like a quasi-mathematical demonstration. So you have about one million fibers coming out of one of your eyes, okay, two million total, but let's talk about just one of them.

It's one million nerve fibers, your optical nerve. Let's imagine that they are binary, so they can be active or inactive, right? So the input to your visual cortex is one million bits. Now, they're connected to your brain in a particular way, and your brain has connections that are kind of a little bit like a convolutional net, they're kind of local, you know, in space and things like this.

Now imagine I play a trick on you. It's a pretty nasty trick, I admit. I cut your optical nerve, and I put a device that makes a random perturbation, a permutation of all the nerve fibers. So now what comes to your brain is a fixed but random permutation of all the pixels.

There's no way in hell that your visual cortex, even if I do this to you in infancy, will actually learn vision to the same level of quality that you can. - Got it, and you're saying there's no way you'd relearn that? - No, because now two pixels that are nearby in the world will end up in very different places in your visual cortex.

And your neurons there have no connections with each other because they're only connected locally. - So this whole, our entire, the hardware is built in many ways to support? - The locality of the real world. Yes, that's specialization. - Yeah, but it's still pretty damn impressive. So it's not perfect generalization.
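The permutation thought experiment can be made concrete with a small sketch. It is one-dimensional and uses 10,000 fibers instead of one million, both simplifying assumptions, and `mean_neighbor_gap` is an illustrative name. It measures how far apart two originally adjacent fibers land after the rewiring.

```python
import random

def mean_neighbor_gap(wiring):
    """Average distance, after rewiring, between fibers that started out adjacent."""
    gaps = [abs(wiring[i + 1] - wiring[i]) for i in range(len(wiring) - 1)]
    return sum(gaps) / len(gaps)

n_fibers = 10_000  # stand-in for the one million fibers in the example
intact = list(range(n_fibers))  # fiber i lands at cortical position i

scrambled = intact[:]
random.Random(0).shuffle(scrambled)  # a fixed but random permutation

print(mean_neighbor_gap(intact))     # 1.0: neighbors stay neighbors
print(mean_neighbor_gap(scrambled))  # on the order of n_fibers / 3
```

A circuit that only connects positions a few steps apart can relate neighboring pixels in the intact wiring, but not in the scrambled one, where neighbors end up thousands of positions apart on average.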

It's not even close. - No, no, it's not that it's not even close. It's not at all. - Yeah, it's not, it's specialized. - So how many Boolean functions? So let's imagine you want to train your visual system to recognize particular patterns of those one million bits. Okay, so that's a Boolean function, right?

Either the pattern is here or not here. It's a two-way classification with one million binary inputs. How many such Boolean functions are there? Okay, you have two to the one million combinations of inputs. For each of those, you have an output bit. And so you have two to the two to the one million Boolean functions of this type, okay?
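The count can be checked by brute force for tiny inputs. With n binary inputs there are 2^n input patterns, and choosing one output bit per pattern gives 2^(2^n) distinct functions. The helper names below are illustrative, and the enumeration is only feasible for very small n.

```python
import math
from itertools import product

def count_boolean_functions(n_inputs):
    """Closed form: one output bit for each of the 2**n input patterns."""
    return 2 ** (2 ** n_inputs)

def enumerate_boolean_functions(n_inputs):
    """Brute-force list of truth tables; only feasible for tiny n."""
    input_patterns = list(product([0, 1], repeat=n_inputs))
    return [dict(zip(input_patterns, outputs))
            for outputs in product([0, 1], repeat=len(input_patterns))]

def decimal_digits_of_power_of_two(exponent):
    """Number of decimal digits of 2**exponent, without building the number."""
    return math.floor(exponent * math.log10(2)) + 1

tables = enumerate_boolean_functions(2)
print(len(tables), count_boolean_functions(2))
print(decimal_digits_of_power_of_two(1_000_000))
```

Even the number of input patterns, 2^1,000,000, already has hundreds of thousands of decimal digits. The number of functions is 2 raised to that power.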

Which is an unimaginably large number. How many of those functions can actually be computed by your visual cortex? And the answer is a tiny, tiny, tiny, tiny, tiny, tiny sliver like an enormously tiny sliver. - Yeah, yeah. - So we are ridiculously specialized. - But, okay, that's an argument against the word general.

I think there's a, I agree with your intuition, but I'm not sure it's, it seems the brain is impressively capable of adjusting to things. So-- - It's because we can't imagine tasks that are outside of our comprehension, right? So we think we are general because we are general of all the things that we can apprehend.

But there is a huge world out there of things that we have no idea about. We call that heat, by the way. - Heat. - Heat. So, at least physicists call that heat, or they call it entropy, which is-- - Entropy. - You have a thing full of gas, right?

- Closed system full of gas. - Right? Closed or not closed. It has pressure, it has temperature, it has, and you can write equations, PV equals nRT, things like that, right? When you reduce the volume, the temperature goes up, the pressure goes up, things like that, right? For a perfect gas, at least.

Those are the things you can know about that system. And it's a tiny, tiny number of bits compared to the complete information of the state of the entire system. Because the state of the entire system will give you the position and momentum of every molecule of the gas. And what you don't know about it is the entropy, and you interpret it as heat.
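The gap between the two descriptions can be put in numbers. The sketch below evaluates PV = nRT for one mole of an ideal gas and compares the handful of macroscopic variables with the count of positions and momenta needed for the full state. The chosen volume and temperature are illustrative assumptions, and the temperature is held fixed when the volume is halved, which is a simplification of the compression described above.

```python
R = 8.314462618          # molar gas constant, J/(mol K)
AVOGADRO = 6.02214076e23  # molecules per mole

def pressure(n_moles, temperature_k, volume_m3):
    """Ideal gas law solved for pressure: P = nRT / V, in pascals."""
    return n_moles * R * temperature_k / volume_m3

n_moles = 1.0
p_full = pressure(n_moles, 273.15, 0.0224)  # roughly atmospheric pressure
p_half = pressure(n_moles, 273.15, 0.0112)  # same temperature, half the volume

# Macroscopic description: a handful of numbers (P, V, n, T).
# Microscopic description: 3 position + 3 momentum coordinates per molecule.
microstate_coordinates = 6 * n_moles * AVOGADRO

print(p_full, p_half)
print(microstate_coordinates)
```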

The energy contained in that thing is what we call heat. Now, it's very possible that, in fact, there is some very strong structure in how those molecules are moving. It's just that they're structured in a way that we are just not wired to perceive. - Yeah, we're ignorant of it.

And there's an infinite amount of things we're not wired to perceive. And you're right, that's a nice way to put it. We're general to all the things we can imagine, which is a very tiny subset of all things that are possible. - So it's like Kolmogorov complexity or the Kolmogorov-Chaitin-Solomonoff complexity.

Every bit string or every integer is random, except for all the ones that you can actually write down. (both laughing) - Yeah, okay, so beautifully put. But so we can just call it artificial intelligence. We don't need to have a general. - Or human level. Human level intelligence is good.
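The Kolmogorov complexity remark above can be illustrated loosely with an ordinary compressor. Kolmogorov complexity itself is not computable, so zlib is used here only as a crude stand-in for "the shortest description we know how to find"; that substitution, and the names below, are assumptions of this sketch.

```python
import random
import zlib

def compressed_ratio(data):
    """Compressed size divided by original size (zlib, maximum effort)."""
    return len(zlib.compress(data, 9)) / len(data)

n_bytes = 10_000
structured = b"01" * (n_bytes // 2)  # an obvious pattern we can write down

rng = random.Random(0)
random_looking = bytes(rng.getrandbits(8) for _ in range(n_bytes))

print(compressed_ratio(structured))      # far below 1
print(compressed_ratio(random_looking))  # about 1: no shorter description found
```

The second string is actually generated by a short program and a seed, so it has a short description. The compressor just is not built to perceive that structure, which is the same situation as the gas molecules in the heat example.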

Anytime you touch human, it gets interesting because we attach ourselves to human and it's difficult to define what human intelligence is. Nevertheless, my definition is maybe damn impressive intelligence. Okay, damn impressive demonstration of intelligence, whatever. (upbeat music)

Yann LeCun: Benchmarks for Human-Level Intelligence | AI Podcast Clips
