‘We Must Slow Down the Race’ – X AI, GPT-4 Can Now Do Science and Altman GPT-5 Statement
Chapters
0:00 Intro
0:30 Godlike AI
2:15 Altman's Statement
2:49 X AI
3:44 OpenAI
5:57 Emergent Abilities
6:53 Undesirable emergent abilities
8:45 The landscape of AI capabilities
9:47 The AI Dilemma
10:57 Rob Miles
00:00:00.000 |
There were several significant developments in the last few days linked to GPT-4 and OpenAI. 00:00:06.200 |
I could honestly have done a video on each of them, but realized that it might be better to 00:00:10.640 |
do a single video tracing a single article covering seven major points. I'm going to use 00:00:16.680 |
this fascinating piece from the FT, which millions of people have now read, to run you through what 00:00:21.380 |
has happened, including Sam Altman's revelation on GPT-5, Elon Musk's new AI company, and GPT-4 00:00:29.120 |
conducting science. The author, by the way, is an investor in Anthropic and a co-author of the 00:00:35.480 |
State of AI Annual Report, and he puts it like this. A three-letter acronym doesn't capture the 00:00:41.980 |
enormity of what AGI would represent, so I will refer to it as what it is, godlike AI. This would 00:00:48.740 |
be a super intelligent computer that learns and develops autonomously, that understands its 00:00:54.140 |
environment without the need for supervision, and that can transform the world around it. And the 00:00:59.000 |
author, Ian Hogarth, says we are not there yet, but the nature of the technology makes it 00:01:04.500 |
exceptionally difficult to predict exactly when we will get there. The article presents this as a 00:01:09.480 |
diagram, with the exponential curve going up towards AGI, and a much less impressive curve 00:01:15.580 |
on the progress on alignment, which he describes as aligning AI systems with human values. 00:01:21.440 |
Now I know what some of you may be thinking. Surely those at the top of OpenAI disagree on this? 00:01:28.880 |
Well, first here is Jan Leike, who is the alignment team lead at OpenAI. What does he think? 00:01:35.280 |
He wants everyone to be reminded that aligning smarter-than-human AI systems with human values 00:01:41.080 |
is an open research problem, which basically means it's unsolved. But what about those at the very 00:01:46.680 |
top of OpenAI, like Sam Altman? When he was drafting his recent statement on the path to AGI, 00:01:52.120 |
he sent it to Nate Soares of the Machine Intelligence Research Institute. For one of the paragraphs, Nate wrote 00:01:59.360 |
"I think that if we do keep running ahead with the current capabilities to alignment ratio, or even a 00:02:04.960 |
slightly better one, we die." After this, Sam Altman actually adjusted the statement, adding: 00:02:10.560 |
"That said, it's important that the ratio of safety progress to capability progress increases." 00:02:15.600 |
Going back to the article, the author makes the point that there are not that many people 00:02:20.000 |
directly employed in this area of alignment across the core AGI labs. 00:02:25.360 |
And what happened to that "pause the experiment" letter that I did a video 00:02:28.640 |
on? Well, as Hogarth points out, the letter itself became a controversy. So many people in my comments 00:02:34.600 |
wrote that the only reason certain people are signing this is to slow OpenAI down so that they 00:02:39.800 |
can catch up. And this cynicism unfortunately has some new evidence that it can cite, with 00:02:44.680 |
Musk forming his new AI company called XAI. This was reported 48 hours ago in the Wall Street Journal, 00:02:52.200 |
but people have seen this coming for months now. Apparently the company has recruited Igor Babushkin 00:02:58.520 |
from DeepMind but has not been that successful at recruiting people from OpenAI. And I do have one 00:03:04.760 |
theory as to why. Again, according to the Wall Street Journal, when Musk left OpenAI in February 00:03:11.560 |
of 2018, he explained that he thought he had a better chance of creating AGI through Tesla, 00:03:18.200 |
where he had access to greater resources. When he announced his departure, a young researcher 00:03:23.160 |
at OpenAI questioned whether Mr. Musk had thought through the safety implications. According to the 00:03:28.400 |
reporting, he then got frustrated and insulted that intern. Since then, he's also paused OpenAI's 00:03:35.320 |
access to Twitter's database for training its new models. So it could be that GPT-5 isn't quite as 00:03:42.040 |
good at tweeting as GPT-4. A few days ago, Sam Altman responded to the letter and also broke 00:03:48.120 |
news about GPT-5. Apologies for the quality, this was a private event and this was the only footage 00:03:54.920 |
available. But unfortunately, I think the letter is missing like 00:03:58.280 |
most technical nuance about where we need to pause. Like an earlier version of the letter 00:04:03.280 |
claimed that OpenAI is training GPT-5 right now. We are not, and won't for some time. So in that sense, 00:04:07.760 |
it was sort of silly. But we are doing other things on top of GPT-4 that I think have all 00:04:12.640 |
sorts of safety issues that are important to address and were totally left out of the letter. 00:04:16.400 |
It is impossible to know how much this delay in the training of GPT-5 is motivated by safety 00:04:22.320 |
concerns or by merely setting up the requisite compute. For example, the article quotes again 00:04:28.160 |
Jan Leike, the head of alignment at OpenAI. He recently tweeted, "Before we scramble to deeply 00:04:34.680 |
integrate LLMs everywhere in the economy like GPT-4, can we pause and think whether it is wise 00:04:40.920 |
to do so? This is quite immature technology and we don't understand how it works. If we're not 00:04:46.040 |
careful, we're setting ourselves up for a lot of correlated failures." This is the head of 00:04:50.920 |
alignment at OpenAI. But this was just days before OpenAI then announced it had connected GPT-4 to a 00:04:58.040 |
massive range of tools including Slack and Zapier. So at this point we can only speculate as to what's 00:05:03.520 |
going on at the top of OpenAI. Meanwhile, compute and emergent capabilities are marching on. As the 00:05:10.560 |
author puts it, "These large AI systems are quite different. We don't really program them, we grow 00:05:16.320 |
them. And as they grow, their capabilities jump sharply. You add 10 times more compute or data, 00:05:22.640 |
and suddenly the system behaves very differently." We also have this epic graph charting the 00:05:27.920 |
exponential rise in compute of the latest language models. If you remember when Bard was launched, 00:05:33.400 |
it was powered by LaMDA. Well, apparently now Google's Bard is powered by PaLM, which has 8 00:05:40.520 |
times as much computing power. That sounds impressive until you see from the graph that 00:05:45.080 |
the estimate for the computing power inside GPT-4 is 10 times more again. And remember, 00:05:51.000 |
this is not a linear graph. This is a log scale. There is a 100 times multiple between each of the lines. 00:05:57.800 |
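To make that log-scale point concrete, here is a quick back-of-the-envelope sketch in Python. It uses only the rough multiples quoted above (PaLM at roughly 8 times LaMDA, and the GPT-4 estimate roughly 10 times PaLM again); the numbers are normalised illustrations, not real FLOP counts.

```python
import math

# Rough relative training-compute multiples quoted in the video (illustrative only):
lamda = 1.0            # normalise LaMDA's compute to 1
palm = 8 * lamda       # PaLM: said to be roughly 8x LaMDA
gpt4_est = 10 * palm   # GPT-4 estimate: roughly 10x PaLM again, i.e. ~80x LaMDA

for name, compute in [("LaMDA", lamda), ("PaLM", palm), ("GPT-4 (estimate)", gpt4_est)]:
    # On a log-scale chart where adjacent gridlines sit 100x apart,
    # each gridline corresponds to 2 units of log10(compute).
    print(f"{name:17s} relative compute {compute:6.1f}   log10 = {math.log10(compute):.2f}")
```

Run it and the GPT-4 estimate comes out at roughly 80 times LaMDA, which is still just under one of those 100-times gridlines; that is why a log-scale chart can make enormous jumps in compute look visually modest.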
And what abilities emerge at this scale? Here is a slide from Jason Wei, who now works at OpenAI, 00:06:03.600 |
formerly of Google. This is from just a few days ago, and he says, "Emergent abilities are 00:06:08.160 |
abilities that are not present in small models, but are present in large models." He says that 00:06:13.440 |
there are a lot of emergent abilities, and I'm going to show you a table from this paper in a 00:06:17.680 |
moment, but he has four profound observations of emergence. One, that it's unpredictable. 00:06:23.840 |
Emergence cannot be predicted by extrapolating scaling curves from 00:06:27.680 |
smaller models. Two, that they are unintentional: emergent abilities are not explicitly 00:06:33.440 |
specified by the trainer of the model. Third, and very interestingly, since we haven't tested 00:06:38.640 |
all possible tasks, we don't know the full range of abilities that have emerged. And of course, 00:06:44.160 |
fourth, further scaling can be expected to elicit more emergent abilities. And he asked the 00:06:50.160 |
question, "Any undesirable emergent abilities?" There will be a link to the paper in the description, 00:06:56.480 |
because there's no way I'll be able to answer that question. And he also asked the question, "Any 00:06:57.560 |
undesirable emergent abilities?" There's no way I'll be able to get through all of it. But here is a table showing some of the abilities that emerge when you reach a certain amount of compute power or parameters. Things like chain of thought reasoning. You can't do that with all models. That's an ability that emerged after a certain scale. Same thing with following instructions and doing addition and subtraction. And how about this for another emergent capacity? The ability to do autonomous scientific research. This paper shows how GPT-4 can 00:07:27.440 |
design, plan and execute scientific experiments. This paper was released on the same day, four days ago, 00:07:34.600 |
and it followed a very similar design. The model in the center, GPT-4, thinks, reasons and plans, 00:07:40.920 |
and then interacts with real tools. When the authors say that they were inspired by successful 00:07:46.520 |
applications in other fields, I looked at the appendix and they were talking about HuggingGPT. 00:07:51.560 |
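As a rough illustration of this design (a central model that reasons, picks a tool, observes the result, and repeats), here is a minimal sketch in Python. Everything in it is a hypothetical stand-in: the `llm_plan` stub, the tool names and the dispatch loop are illustrative, not the actual implementation from the paper or from HuggingGPT.

```python
from typing import Callable, Dict, List

# Hypothetical tools the central model can call; a real system would wire these
# up to actual APIs (web search, code execution, lab equipment, and so on).
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"(stub) search results for: {query}",
    "calculator": lambda expr: f"(stub) would evaluate: {expr}",
}

def llm_plan(task: str, observations: List[str]) -> dict:
    """Stand-in for the central LLM. A real agent would prompt the model with the
    task plus everything observed so far and parse its chosen next action."""
    if not observations:
        return {"tool": "search", "input": task}
    return {"tool": "finish", "input": f"(stub) answer based on: {observations[-1]}"}

def run_agent(task: str, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action = llm_plan(task, observations)        # the "brain" reasons and plans
        if action["tool"] == "finish":
            return action["input"]                   # done: return the final answer
        tool = TOOLS[action["tool"]]                 # look up the chosen tool...
        observations.append(tool(action["input"]))   # ...act, then feed the result back in
    return "Stopped after max_steps without finishing."

print(run_agent("find the boiling point of a given compound"))
```

The point is just the loop structure: plan, act with a tool, observe, plan again; in the systems described above, all of the real capability sits inside the model call at the top of that loop.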
I've done a video on that, but it's a similar design with the brain in the center, GPT-4, deciding 00:07:57.320 |
which tools to use. And let me just give you a glimpse of what happens when you do this. If you 00:08:02.240 |
look at this chart on the top left, you can see how GPT-4 on its own performs in yellow. And then 00:08:08.240 |
in purple, you can see how GPT-4 performs when you hook it up to other tools. I'll show you some of 00:08:13.600 |
the tasks in a moment, but look at the dramatic increase in performance. The human evaluators gave 00:08:19.200 |
GPT-4 when it had tools a perfect score on seven of the tasks. These were things like proposing 00:08:25.440 |
similar novel non-toxic solutions. And then the 00:08:28.520 |
model could be abused to propose the synthesis of chemical weapons. And GPT-4 only refused to 00:08:35.320 |
continue after it had calculated all the required quantities. And the authors conclude that guard 00:08:41.560 |
rails must be put in place on this emergent capability. I think this diagram from Max 00:08:47.160 |
Tegmark's Life 3.0 shows the landscape of capabilities that AI has and might soon have. 00:09:00.560 |
Now most people believe that it has not scaled those peaks yet. But what new emergent capabilities 00:09:06.320 |
might come with GPT-5 or 4.2? I know many people might comment that it doesn't matter if we pause or 00:09:12.240 |
slow down because China would develop AGI anyway. But the author makes this point. He says that it is 00:09:17.680 |
unlikely that the Chinese Communist Party will allow a Chinese company to build an AGI that 00:09:23.840 |
could become more powerful than their leader or cause a global crisis. 00:09:43.320 |
And the Center for Humane Technology put it like this: 00:09:54.060 |
- They don't ship them publicly to their own population. 00:09:57.800 |
- Slowing down the public release of AI capabilities 00:09:59.620 |
would actually slow down Chinese advances too. 00:10:02.720 |
- China is often fast following what the US has done. 00:10:09.940 |
- And then lastly, the recent US export controls 00:10:13.660 |
have also been really good at slowing down China's progress 00:10:20.640 |
- Instead, the author proposes this: the island idea. 00:10:24.100 |
In this scenario, the experts trying to build what he calls 00:10:26.960 |
Godlike AGI systems do so in a single, highly secure facility. 00:10:38.960 |
And he says, once an AI system is proven to be safe, 00:10:43.800 |
There might be a few problems with this idea, as pointed out by Rob Miles, 00:10:50.260 |
who has a fantastic YouTube channel by the way. 00:11:04.980 |
- Constraining an AI necessarily means outwitting it. 00:11:16.340 |
You can't rely on outwitting a super intelligence. 00:11:32.900 |
An AI properly contained may as well just be a rock, right? 00:11:42.700 |
You've got something you don't know is benevolent. 00:11:44.960 |
You don't know that what it wants is what you want. 00:11:58.460 |
- I also have my own questions about this idea. 00:12:02.780 |
It seems inevitable that future models like GPT-5 will be trained on data 00:12:06.560 |
that includes conversations about GPT models. 00:12:09.960 |
Therefore, either consciously or unconsciously, 00:12:23.160 |
they may pick up on the fact that they are being trained in a secure facility. 00:12:28.140 |
It's not a big stretch to think that they might realize that. 00:12:36.820 |
And what would stop them from realizing that whatever terminal goal they may have 00:12:39.600 |
would be better achieved outside the facility? 00:12:47.100 |
And sadly, I think the author has a point when he says, 00:12:50.320 |
it will likely take a major misuse event or catastrophe 00:12:58.500 |
At some point, someone will figure out how to cut us out of the loop, 00:13:02.000 |
creating a godlike AI capable of infinite self-improvement. 00:13:11.640 |
The leader of a major lab who plays a statesman role 00:13:17.720 |
will be much more respected as a world figure 00:13:21.960 |
As always, thank you so much for watching to the end. 00:13:24.440 |
And let me know what you think in the comments.