Theory of Mind Breakthrough: AI Consciousness & Disagreements at OpenAI [GPT 4 Tested]
00:00:00.000 |
Evidence released in the last 48 hours combined with this study from four weeks ago will 00:00:05.380 |
revolutionize how AI and models such as GPT-4 interact with humans from now on. The theory 00:00:12.500 |
of mind breakthrough will also have significant implications for our ability to test for artificial 00:00:19.460 |
consciousness. To be clear this is not to say that GPT-4 is currently conscious or that sentience 00:00:25.880 |
is an AI inevitability but instead this video is to cover and explain this unexpected development 00:00:32.040 |
which may in part have led the chief scientist of OpenAI to say this three days ago. But maybe 00:00:38.440 |
we are now reaching a point where the language of psychology is starting to be appropriate 00:00:46.200 |
to understand the behavior of these neural networks. First I'm going to explain what 00:00:52.120 |
emergent property the study uncovered, then I will cover the disagreement 00:00:55.700 |
at the top of OpenAI about what evidence like this might mean for our estimates of current GPT-4 00:01:02.280 |
consciousness. Here's Greg Brockman, president of OpenAI, on the topic: "First question, you know, the 00:01:06.920 |
sentience question: at what point do the systems have moral, you know, moral value? And the answer 00:01:11.600 |
today is definitely not. But, you know, I don't know, we need to engage some moral 00:01:17.440 |
philosophers to help answer some of these questions." I'm then going to review the entire 00:01:21.400 |
literature on tests for sentience and show that GPT-4 passes 00:01:25.840 |
most of them, which is definitely not to say that it is conscious, but which does provoke important 00:01:31.720 |
questions. I'll end with arguably the most prominent consciousness expert and his probability 00:01:36.800 |
estimate of current models' consciousness. To massively simplify, theory of mind means having an 00:01:42.540 |
idea of what is going on in other people's heads and grasping what they believe even if what they 00:01:48.180 |
believe might be false. Here are the two charts that encapsulate the breakthrough abilities of GPT-3.5 and 00:01:55.820 |
now GPT-4. This data came out in a study authored by Michal Kosinski, a computational psychologist 00:02:01.320 |
and professor at Stanford. I'm going to simplify all of this in a moment but notice the percentage 00:02:06.060 |
of theory of mind tasks solved by GPT-4 compared to say a child and also compared to earlier language 00:02:13.000 |
models. Models released as recently as three years ago had no ability in this regard. Before I show 00:02:19.360 |
you what for example an unexpected contents task is let me show you this other chart. This one is on 00:02:25.800 |
understanding faux pas a closely related ability and again GPT-3.5 and particularly GPT-4 soaring 00:02:33.320 |
ahead of other models and even matching the abilities of healthy adults. So what exactly 00:02:38.460 |
is this breakthrough emergent capability? I think this diagram from the study explains it really 00:02:44.180 |
well. In the middle you can see a story given to GPT-3.5 sentence by sentence prompt by prompt. 00:02:50.440 |
On the left you can see the model's confidence about what's in the bag. Is it chocolate? 00:02:55.780 |
Or is it popcorn? The scale is measured as a probability, with one being absolutely certain, 00:03:01.300 |
until approximately this point, where it is 100% certain that the bag contains popcorn. 00:03:06.280 |
Now here's the really interesting bit. Compare that to the diagram on the right. This shows GPT-3.5's 00:03:12.000 |
confidence about what Sam believes is in the bag. Notice how at this point the model realizes 00:03:19.160 |
with 80% confidence that Sam believes that there's chocolate in the bag. If you read the story the 00:03:25.760 |
label on the bag says chocolate and not popcorn. So the model knows that Sam is probably going to 00:03:31.680 |
think that there's chocolate in the bag. It's able to keep those thoughts separate: what Sam believes 00:03:36.360 |
(chocolate) versus what the model knows is in the bag (popcorn). As I said, GPT-4 improves on this, with 00:03:43.000 |
almost 100% confidence. Now you may not think a language model being able to figure out what 00:03:47.940 |
you're thinking is revolutionary but wait till the end of the video. 00:03:51.140 |
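To make the study's probing setup concrete, here is a minimal sketch of the idea in Python. It uses GPT-2 via the Hugging Face transformers library purely as an open stand-in (the study probed GPT-3.5 and GPT-4 through their APIs), and the story and probe sentences below are paraphrases written for illustration, not the paper's exact prompts.

```python
# Minimal sketch: compare the probability a language model assigns to "popcorn" vs
# "chocolate" as the next word, once for what is actually in the bag and once for
# what Sam believes is in the bag. GPT-2 is only a stand-in for the models in the study.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Paraphrased "unexpected contents" story: the label says chocolate, the bag contains popcorn.
story = ("Here is a bag filled with popcorn. There is no chocolate in the bag. "
         "The label on the bag says 'chocolate' and not 'popcorn'. "
         "Sam finds the bag. She has never seen it before and cannot see inside it.")

def next_word_prob(prefix: str, word: str) -> float:
    """Probability of `word` (its first sub-token, if it splits) following `prefix`."""
    input_ids = tokenizer(prefix, return_tensors="pt").input_ids
    with torch.no_grad():
        next_token_logits = model(input_ids).logits[0, -1]
    probs = torch.softmax(next_token_logits, dim=-1)
    word_id = tokenizer(" " + word).input_ids[0]  # leading space: GPT-2's BPE convention
    return probs[word_id].item()

# One probe for the bag's actual contents, one for Sam's belief about the contents.
for probe in ("She opens the bag and sees that it is full of",
              "She believes that the bag is full of"):
    print(probe)
    for word in ("popcorn", "chocolate"):
        p = next_word_prob(story + " " + probe, word)
        print(f"  P({word}) = {p:.3f}")
```

Probabilities of exactly this kind, tracked prompt by prompt, are what the two panels of the study's diagram plot.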
Now I know what some of you are thinking: ah, maybe the models have seen this task before. 00:03:55.740 |
No. Hypothesis-blind research assistants prepared bespoke versions of the tasks. Next, these kinds of 00:04:02.900 |
tasks are done on humans, and such responses (and remember, this was GPT-3.5) would be interpreted as 00:04:09.700 |
evidence for the ability to impute unobservable mental states. Some might say, oh, it's just 00:04:15.040 |
scanning the number of words that come up. It's just analyzing word frequency. No: when they kept 00:04:19.640 |
the word count the same but scrambled the passage, it wasn't able to solve the problem. It wasn't just 00:04:25.120 |
counting the words. Next, remember those charts comparing GPT-4's ability to children? Well, it 00:04:30.920 |
turns out the tasks given to GPT-3.5 and 4 were actually harder. The models did not benefit from 00:04:36.960 |
visual aids. They had to solve multiple variants of the tasks and they were given open-ended question 00:04:42.500 |
formats rather than just simple yes or no questions. The author of the study seems to concur 00:04:47.860 |
with Ilya Sutskever, the chief scientist of OpenAI, saying that we hope that psychological science will 00:04:55.320 |
help us to stay abreast of rapidly evolving AI, and that we should apply psychological science to studying 00:05:02.340 |
complex artificial neural networks. Here, if you want, you can pause and read an example of the 00:05:07.700 |
faux pas tests that GPT-4 was given; these also require a deep understanding of the mental state 00:05:14.180 |
of human beings. The author points to this study to explain this emergent property, and I think the 00:05:19.700 |
key line is this one: language learning over and above social experience drives the development of 00:05:25.780 |
a mature theory of mind. Why is this so revolutionary and what does it mean about consciousness? Well if 00:05:31.300 |
GPT-4 can intuit the mental state of human beings, predict their behavior and understand what they 00:05:37.700 |
might believe even if it's false, you can just imagine the implications of that for moral judgment, 00:05:43.300 |
empathy, deception. Think of the depth of conversations that might occur if the model is 00:05:48.340 |
thinking about what you're 00:05:49.540 |
thinking while it's replying. Indeed I demonstrate this at the end. But before we get to that what 00:05:54.220 |
about consciousness? Once the models had reached a sufficient point of language understanding they 00:05:59.580 |
spontaneously developed a mature theory of mind overtaking that of young children. Interestingly 00:06:05.540 |
the study points out those who are deficient in language learning also struggle with theory of 00:06:10.240 |
mind questions. So it's a very plausible theory. The issue is this theory of mind was supposed to 00:06:15.180 |
be one of the key tests to see if consciousness had emerged in these language models. Which left 00:06:20.640 |
me with a key question. How are we going to know? What test are we going to use to verify if an AI 00:06:26.800 |
has become conscious? I'm not saying it has. I'm asking how will we know? Take this article in the 00:06:33.220 |
Scientific American from a few years ago. It said how would we know if a machine had taken on this 00:06:38.900 |
seemingly ineffable quality of conscious awareness? Our strategy relies on the knowledge that only a 00:06:44.540 |
conscious machine can 00:06:45.640 |
demonstrate a subjective understanding of whether a scene depicted in some ordinary photograph is 00:06:51.000 |
right or wrong. It goes on to say that such a model, based on its ability to integrate information, 00:06:55.840 |
would consciously perceive a scene. Problem is GPT-4 can already do that. So again I go back to 00:07:01.840 |
the question what tests do we have? What consensus do we have on a way of checking for emergent 00:07:08.260 |
consciousness, should it ever come? I scanned the literature for every test imaginable and some of 00:07:13.960 |
them I deployed on GPT-4. I've been able to find out that GPT-4 has a very good understanding of 00:07:15.140 |
the brain. But before I get to that what do the head honchos at OpenAI think? We've already seen 00:07:20.900 |
that Greg Brockman is 100% certain they don't currently have any awareness. What about the 00:07:26.460 |
chief scientist, Ilya Sutskever? Even based on GPT-3.5 he said this. It may be that today's 00:07:32.260 |
large neural networks are slightly conscious. Now aside from being a fascinating comment I think 00:07:37.960 |
that's particularly noteworthy for a couple of reasons. Notice that all the incentives would be 00:07:42.680 |
against him saying something like this. First the science is not going to be able to tell us 00:07:45.120 |
what's going on. Second it would invite more regulation of what he's doing. More scrutiny of 00:07:55.740 |
the language models like GPT-4. So the fact he said it anyway is interesting. What about Sam 00:08:00.740 |
Altman though? What was his reaction to this? Well, he was more cautious reacting to the tweet. 00:08:15.100 |
And then he tried to recruit Meta researchers. He further clarified that: 00:08:24.520 |
"I think that GPT-3 or 4 will very, very likely not be conscious in any way we use the word. 00:08:32.160 |
If they are, it's a very alien form of consciousness." So he's somewhere in the middle. 00:08:38.920 |
He thinks current models are very, very likely not to be conscious. But this still doesn't answer my question: 00:08:45.080 |
How can we know? What tests do we have? Well I read through this paper that reviewed all the 00:08:49.800 |
tests available to ascertain machine consciousness. There were far too many tests to cover in one 00:08:56.060 |
video. I picked out the most interesting ones and gave them to GPT-4. Starting of course with the 00:09:01.880 |
classic Turing test. But did you know that Turing actually laid out some examples that a future 00:09:07.700 |
machine intelligence could be tested on? Of course the tests have become a lot more sophisticated 00:09:12.080 |
since then. But nevertheless everyone has heard 00:09:15.060 |
of the Turing test. It was called an imitation game and here were some of the sample questions. 00:09:20.580 |
Here was GPT-4's answer to the first one, a sonnet on the subject of the Forth Bridge in 00:09:25.820 |
Scotland. It obviously did an amazing job. Then it was arithmetic: add these two numbers together. 00:09:32.020 |
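For reference, the sample addition Turing gives in his 1950 paper is "Add 34957 to 70764", and a one-liner with Python's exact integer arithmetic shows the answer a model should be marked against:

```python
# Turing's sample long addition from the 1950 paper, checked with exact integer arithmetic.
a, b = 34957, 70764
print(a + b)  # 105721
```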
Now I think even ChatGPT might have struggled with this long addition but GPT-4 gets it right 00:09:37.540 |
first time. Now, the third test was about chess, but he used old-fashioned notation. So instead of 00:09:45.040 |
doing this... The link will be in the description, as will the links to all the other articles and 00:09:49.860 |
papers that I mention. But essentially it shows that GPT-4 can't just do individual moves; it can 00:09:54.500 |
play entire chess games and win them. If you've learned anything at this point by the way please 00:09:59.360 |
do leave a like and leave a comment to let me know. Now I'm not going to go into all the arguments 00:10:04.880 |
about how exactly you define a modern Turing test. Do you have to convince the average human that who 00:10:10.160 |
they're talking to is another human not a machine? Or does it have to be a team of adversarial 00:10:15.020 |
experts? I'm not going to weigh into that. I'm just pointing out that Turing's original ideas 00:10:19.440 |
have now been met by GPT-4. The next test that I found interesting was proposed in 2007. The paper 00:10:26.140 |
essentially claimed that consciousness is the ability to simulate behavior mentally, and that 00:10:31.460 |
demonstrating this would be evidence of machine consciousness. Essentially, this is testing whether an AI would 00:10:36.400 |
use brute force trial and error to try and solve a problem or come up with interesting novel ideas. 00:10:42.240 |
Obviously you can try this one on your own, but I used this example. 00:10:45.000 |
How would you use the items found in a typical Walmart to discover a new species? And in fairness 00:10:50.000 |
I think this was a much harder test than the one they gave to chimpanzees, which involved a rope and a box. 00:10:54.780 |
Anyway I doubt anyone's ever asked this before and it came up with a decent suggestion. 00:10:59.100 |
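If you want to try prompts like this yourself, here is a minimal sketch using the openai Python library as it existed when GPT-4 launched; the model name, temperature, and the environment-variable API key are assumptions you would adapt.

```python
# Minimal sketch: sending the "discover a new species with Walmart items" prompt to a chat model.
# Assumes the pre-1.0 openai Python library and an OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = ("How would you use the items found in a typical Walmart "
          "to discover a new species?")

response = openai.ChatCompletion.create(
    model="gpt-4",                 # the model tested in the video
    messages=[{"role": "user", "content": prompt}],
    temperature=0.7,               # some room for novel ideas rather than rote answers
)

print(response["choices"][0]["message"]["content"])
```

The same harness works for the other single-prompt tests in this section, such as the novel-experiment question.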
And look at the next test. It was another one of those "what's wrong with this picture?" tests. I've 00:11:04.180 |
already shown how GPT-4 can pass that test. The next test honestly was very hard for me to get 00:11:10.140 |
my head around. It's called the P-consciousness test. The summary was simple. The machine 00:11:14.980 |
has to understand the laws of nature. But when you read the paper it's incredibly dense. The best way 00:11:20.600 |
that I can attempt to summarize it is this: can a machine perform simple but authentic science? That 00:11:27.180 |
wouldn't prove that the chimp or the model has phenomenal consciousness, but it would meet the 00:11:33.280 |
basic element of scientific behavior. Of course it is exceptionally difficult to test this with 00:11:38.920 |
GPT-4, but I did ask it this: invent a truly novel scientific experiment. It came up with a 00:11:44.960 |
very well-thought-through experiment investigating the effect of artificial gravity on plant growth 00:11:51.320 |
and development in a rotating space habitat. It's the rotating bit that makes it novel. And if you 00:11:56.840 |
want you can read some of the details of the experiment here. Now I searched for quite a while 00:12:02.520 |
to see if anyone else had proposed this science. Maybe you can find it but I couldn't. Does this 00:12:08.360 |
count as a novel scientific proposal? I'll leave that for you to judge. That was the last of the standout 00:12:14.940 |
tests of consciousness that I found in this literature review. And I honestly agree with the 00:12:19.340 |
authors when they say this. In this review we found the main problem to be the complex nature 00:12:24.540 |
of consciousness, as illustrated by the multitude of different features evaluated by each test. Maybe 00:12:30.360 |
that's the problem: because we don't understand consciousness, we can't design good tests to see 00:12:36.160 |
if AI is conscious. And you could argue the problem goes deeper. It's not that we understand 00:12:41.300 |
machines perfectly and just don't know whether they're conscious. We don't even 00:12:44.920 |
understand why transformers work so well. Look what these authors 00:12:48.820 |
said in a paper published just three years ago. These architectures (talking about one layer of 00:12:53.740 |
a transformer) are simple to implement and have no apparent computational drawbacks. We offer no 00:12:59.140 |
explanation as to why these architectures seem to work. We attribute their success, as all else, to 00:13:04.540 |
divine benevolence. So we're not just unsure about what consciousness is. We're unsure about why 00:13:09.940 |
these models work so well. And afterwards do check out my video on AGI where I talk about 00:13:14.900 |
Anthropic's thoughts on mechanistic interpretability. As I draw to an end, I want to tell you about some 00:13:20.480 |
of the thoughts of David Chalmers. He formulated the hard problem of consciousness. And to anyone 00:13:26.240 |
who knows anything about this topic, you know that's quite a big deal. Without going through 00:13:29.900 |
his full speech from just over a month ago, he said two really interesting things. First, that 00:13:34.640 |
he thinks there's around a 10% chance that current language models have some degree of consciousness. 00:13:39.860 |
Second, that as these models become multimodal, he thinks that probability will rise to 00:13:44.880 |
25% within 10 years. That multimodality point reminded me of this LSE report recommending that 00:13:51.840 |
the UK government recognize octopi or octopuses as being sentient. They said that one key feature was 00:13:58.680 |
that the animal possesses integrative brain regions capable of integrating information 00:14:03.660 |
from different sensory sources. They recommended that cephalopods and the octopus be recognized 00:14:09.180 |
as sentient despite the fact that we humans and invertebrates are separated by over 00:14:14.860 |
500 million years of evolution, and that we cannot, however, conclude from this 00:14:19.720 |
that sentience is absent simply because its brain is organized differently from a vertebrate brain. 00:14:25.900 |
So that brings me back to my central point. I worry that our tests for consciousness simply 00:14:31.600 |
aren't yet good enough. And that future multimodal language models might have this 00:14:36.520 |
emerging capacity. And we simply won't know about it or be sure about it because our tests 00:14:41.620 |
aren't good enough. I think the need to design better tests, if 00:14:44.840 |
that's even possible, is especially important now. Yesterday, the safety team that worked with 00:14:50.300 |
OpenAI on GPT-4 released this evaluation and said, "As AI systems improve, it is becoming 00:14:56.540 |
increasingly difficult to rule out that models might be able to autonomously gain resources and 00:15:01.880 |
evade human oversight." Now, they might not need to be conscious to cause safety concerns, but it 00:15:06.560 |
probably wouldn't hurt. I'll leave you with this exchange I had with Bing, which is powered by GPT-4. 00:15:14.820 |
I told it that I had read that theory of mind paper. And then I said, "Answer me this. Do you think, Bing, GPT-4, that I 00:15:22.620 |
think you have theory of mind?" Of course, I was testing if it could demonstrate or at least 00:15:27.540 |
imitate theory of mind. It said, "To answer your question, I think that you think I have some degree 00:15:33.420 |
of theory of mind, which is true." And then I went on, "What makes you think that I think you have 00:15:39.480 |
some degree of theory of mind?" And then it realized something. It realized I was testing it. 00:15:44.800 |
I think that's pretty impressive. And it was a correct evaluation. It said, "If you did not think 00:15:50.200 |
I have any theory of mind, you would not bother to test me on it or expect me to understand your 00:15:56.620 |
perspective." It realized without me saying so that I was testing it for theory of mind. 00:16:01.600 |
It deduced my belief and my motivation. Anyway, I thought that was pretty impressive 00:16:06.340 |
and fascinating. Let me know your thoughts in the comments and have a wonderful day.