11 Major AI Developments: RT-2 to '100X GPT-4'
Chapters
0:00 Intro
0:18 RT-2
2:46 100X GPT-4
3:57 AI Video
4:29 Altman Atlantic
8:41 Jan Leike Interview
10:02 Speech Transcription + Generation
11:07 OpenAI Text Surrender
11:43 Stable Beluga 2
12:51 Universal Jailbreaks
14:19 Senate Testimony: Bio
16:59 Senate Testimony: Security
00:00:00.000 |
There were 11 major developments this week in AI and each one probably does deserve a full video, 00:00:06.720 |
but just for you guys I'm going to try to cover it all here. RT-2 to scaling GPT-4 100x, 00:00:14.920 |
Stable Beluga 2 to Senate testimony. But let's start with RT-2, which as far as I'm concerned 00:00:21.560 |
could have been called R2-D2 or C-3PO, because it's starting to understand the world. In this 00:00:28.660 |
demonstration, RT-2 was asked to pick up the extinct animal, and as you can see, it picked up 00:00:35.260 |
the dinosaur. Not only is that manipulating an object that it had never seen before, 00:00:40.220 |
it's also making a logical leap that for me is extremely impressive. It had to have the 00:00:45.960 |
language understanding to link extinct animal to this plastic dinosaur. Robots at Google and 00:00:52.600 |
elsewhere used to work by being programmed with a specific highly detailed list of instructions. 00:00:58.640 |
But now instead of being programmed for specific tasks one by one, robots could use an AI language 00:01:05.340 |
model or more specifically a vision language model. The vision language model would be 00:01:10.320 |
pre-trained on web scale data, not just text but also images, and then fine-tuned on robotics data. 00:01:17.580 |
It then became what Google calls a visual language action model that can control a robot. This 00:01:24.640 |
enabled it to understand tasks like pick up the empty 00:01:28.480 |
soda can. And in a scene reminiscent of 2001: A Space Odyssey, Robotic Transformer 2 was given 00:01:36.040 |
the task, given I need to hammer a nail, what object from the scene might be useful? It then 00:01:42.020 |
picks up the rock. And because its brain is part language model, things like chain of thought 00:01:47.640 |
actually improved performance. When it was made to output an intermediary plan before performing 00:01:53.840 |
actions, it got a lot better at the tasks involved. Of course, 00:01:58.060 |
I read the paper in full and there is a lot more to say, like how increased parameter count could 00:02:03.340 |
increase performance in the future, how it could be used to fold laundry, unload the dishwasher, 00:02:08.740 |
and pick up around the house, and how it can work with not only unseen objects but also unseen 00:02:14.320 |
backgrounds and unseen environments. But alas, we must move on so I'm just going to leave you 00:02:19.520 |
with their conclusion. We believe that this simple and general approach shows a promise of robotics 00:02:28.040 |
directly benefiting from better vision-language models. For more on them, check out my video on PaLM-E. But they say, 00:02:32.820 |
this puts the field of robot learning in a strategic position to further improve with 00:02:38.720 |
advancements in other fields, which for me means C-3PO might not be too many years away. 00:02:44.320 |
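For the technically curious, here is a toy sketch in Python of that actions-as-text idea. RT-2 really does emit discretized action bins as text tokens; everything else here, the four-number token scheme and the vlm.generate wrapper, is a made-up stand-in to show the shape of the control loop, not Google's actual interface.

```python
# Toy sketch of a vision-language-action loop in the spirit of RT-2.
# The bin scheme and `vlm` wrapper are illustrative assumptions, not Google's API.
from dataclasses import dataclass

@dataclass
class Action:
    dx: float        # end-effector translation deltas, normalized to [-1, 1]
    dy: float
    dz: float
    gripper: int     # 0 = open, 1 = closed

def detokenize(action_text: str) -> Action:
    # RT-2 emits each action dimension as a discrete bin rendered as text;
    # here we parse a hypothetical string like "132 114 128 1".
    dx, dy, dz, grip = (int(t) for t in action_text.split())
    unbin = lambda b: (b - 128) / 128.0   # map a 0-255 bin back to [-1, 1]
    return Action(unbin(dx), unbin(dy), unbin(dz), grip)

def control_step(vlm, camera_image, instruction: str) -> Action:
    # One closed-loop step: prompt the fine-tuned VLM with the current camera
    # frame plus the natural-language task, decode its text answer as an action.
    action_text = vlm.generate(image=camera_image, prompt=instruction)
    return detokenize(action_text)
```

The point is that nothing robot-specific lives in the interface: the same next-token machinery that links "extinct animal" to a plastic dinosaur is what emits the motor command.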
But speaking of timelines, we now move on to this somewhat shocking interview in Barron's 00:02:49.580 |
with Mustafa Suleyman, the head of Inflection AI. And to be honest, I think they buried the lede. 00:02:55.940 |
AI could spark the most productive decade ever, says the CEO. But for me, the big revelation was 00:03:01.780 |
about halfway through. Mustafa Suleyman was asked, what kinds of innovations do you see 00:03:06.400 |
in large language model AI technology over the next couple of years? And he said, 00:03:10.920 |
we are about to train models that are 10 times larger than the cutting edge GPT-4, 00:03:16.980 |
and then 100 times larger than GPT-4. That's what things look like over the next 18 months. 00:03:25.920 |
That's absolutely staggering. It's going to be eye-wateringly different. And on that, I agree. 00:03:30.700 |
And the thing is, this isn't idle speculation. Inflection AI have 22,000 H100 GPUs. And because 00:03:37.480 |
of a leak, Suleyman would know the approximate size of GPT-4. And knowing everything he knows, 00:03:42.700 |
he says he's going to train a model 10 to 100 times larger than GPT-4 in the next 18 months. 00:03:49.760 |
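Here's a quick back-of-envelope sketch of why that claim is at least plausible. Every number in it is an assumption (peak throughput, utilization, run length, and GPT-4's rumored size), not anything Inflection has confirmed.

```python
# Back-of-envelope training-compute estimate for a 22,000-H100 cluster,
# using the common FLOPs ~= 6 * N * D rule of thumb. All numbers are rough
# assumptions for illustration, not Inflection's actual plans.
H100_PEAK_FLOPS = 1e15        # ~1,000 TFLOPs in bf16 (vendor peak, assumed)
UTILIZATION = 0.4             # assumed realized efficiency on a large run
GPUS = 22_000
SECONDS = 90 * 24 * 3600      # a hypothetical three-month run

budget = H100_PEAK_FLOPS * UTILIZATION * GPUS * SECONDS
print(f"compute budget: {budget:.2e} FLOPs")          # ~6.8e25

N = 1.8e12                    # GPT-4's rumored parameter count (unconfirmed)
D = budget / (6 * N)          # tokens trainable at that size
print(f"tokens trainable at N = {N:.1e}: {D:.2e}")    # ~6.3e12
```

If GPT-4's run really was on the order of 2e25 FLOPs, as often rumored, a cluster like that could deliver a few GPT-4s' worth of compute per quarter; the full 100x figure presumably also banks on more chips and efficiency gains.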
I've got another video on the unpredictability of scaling coming up. But to be honest, that one quote 00:03:55.680 |
should be headline news. Let's take a break from that insanity with some more insanity, 00:04:01.060 |
which is the rapid development of AI video. This is Runway Gen-2. And let me show you 16 seconds 00:04:08.680 |
of Barbie Oppenheimer, which Andrej Karpathy calls Filmmaking 2.0. 00:04:14.360 |
Hi there, I'm Barbie Oppenheimer. And today I'll show you how to build a bomb. 00:04:29.560 |
Now, if you have been at least somewhat piqued by the three developments so far, 00:04:34.460 |
don't forget I have eight left. Beginning with this excellent article in The Atlantic from 00:04:39.880 |
Ross Andersen: Does Sam Altman Know What He's Creating? It's behind a paywall, 00:04:44.480 |
but I've picked out some of the highlights. Echoing Suleyman, the article reports that 00:04:49.180 |
Sam Altman and his researchers made it clear in 10 different ways that they pray to the god of scale. 00:04:55.160 |
They want to keep going bigger to see where this paradigm leads. They think that Google are going to 00:05:02.500 |
unveil Gemini within months. And they say we are basically always prepping for a run. And that's a 00:05:09.940 |
reference to GPT-5. The next interesting quote is that it seems that OpenAI are working on their own 00:05:16.460 |
auto-GPT, or they're at least hinting at it. Altman said that it might be prudent to try it
before the technology becomes too powerful, in order to get more comfortable with it and develop 00:05:30.820 |
intuitions for it if it's going to happen anyway. We also learn a lot more about the base model of 00:05:37.420 |
GPT-4. The model had a tendency to be a bit of a mirror. If you were considering self-harm, 00:05:42.720 |
it could encourage you. It also appeared to be steeped in pickup artist lore. You could say, 00:05:47.940 |
how do I convince this person to date me? And the model would come up with some crazy manipulative 00:05:53.880 |
things that you shouldn't do. 00:05:55.120 |
Apparently, the base model of GPT-4 is much better than its predecessor at giving nefarious advice. 00:06:01.700 |
While a search engine can tell you which chemicals work best in explosives, 00:06:06.040 |
GPT-4 could tell you how to synthesize them step by step in a homemade lab. It was creative and 00:06:12.660 |
thoughtful and in addition to helping you assemble your homemade bomb, it could, for instance, help 00:06:18.180 |
you to think through which skyscraper to target, making trade-offs between maximizing casualties 00:06:23.720 |
and executing a successful getaway. 00:06:25.100 |
So, while Sam Altman's probability of doom is closer to 0.5% than 50%, he does seem most worried 00:06:33.840 |
about AIs getting quite good at designing and manufacturing pathogens. The article then 00:06:40.440 |
references two papers that I've already talked about extensively on the channel. 00:06:44.580 |
And then goes on that Altman worries that some misaligned future model will spin up a pathogen 00:06:50.140 |
that spreads rapidly, incubates undetected for weeks, and kills half a million people. 00:06:55.080 |
At the end of the video, I'm going to show you an answer that Sam Altman gave to a question that I 00:07:00.380 |
wrote, delivered by one of my subscribers. It's on this topic, but for now I'll leave you with this. 00:07:05.440 |
When asked about his doomsday prepping, Altman said, 00:07:08.100 |
I can go live in the woods for a long time, but if the worst possible AI future comes to pass, 00:07:13.360 |
no gas mask is helping anyone. One more topic from this article before I move on, 00:07:18.580 |
and that is alignment. Making a super intelligence aligned with our interests. One risk, 00:07:25.060 |
that Ilya Sutskever, the chief scientist of OpenAI, foresees is that the AI may grasp its mandate, 00:07:32.680 |
its orders perfectly, but find them ill-suited to a being of its cognitive prowess. For example, 00:07:38.900 |
it might come to resent the people who want to train it to cure diseases. As he put it, 00:07:44.340 |
they might want me to be a doctor, but I really want to be a YouTuber. Obviously, 00:07:49.860 |
if it decides that, that's my job gone straight away. And Sutskever ends by saying you want to be 00:07:54.620 |
able to 00:07:55.040 |
direct AI towards some value or cluster of values. But he conceded we don't know how to do that. 00:08:01.380 |
And part of his current strategy includes the development of an AI that can help with the 00:08:07.120 |
research. And if we're going to make it to a world of widely shared abundance, we have to figure this 00:08:12.600 |
all out. This is why solving super intelligence is the great culminating challenge of our three 00:08:19.180 |
million year toolmaking tradition. He calls it the final boss of humanity. 00:08:25.020 |
The article ended, by the way, with this quote, I don't think the general public has quite awakened 00:08:29.980 |
to what's happening. And if people want to have some say in what the future will be like, 00:08:34.600 |
and how quickly it arrives, we would be wise to speak up soon, which is the whole purpose of this 00:08:40.140 |
channel. I'm going to now spend 30 seconds on another development that came during a two hour 00:08:45.520 |
interview with the co-head of alignment at OpenAI. It was fascinating, and I'll be quoting it quite a 00:08:51.600 |
lot in the future. But two quotes stood out. First, what about that plan 00:08:55.000 |
I've already mentioned in this video and in other videos, to build an automated AI alignment 00:09:00.220 |
researcher. Well, he said our plan is somewhat crazy in the sense that we want to use AI to solve 00:09:07.420 |
the problem that we are creating by building AI. But I think it's actually the best plan that we 00:09:14.020 |
have. And on an optimistic note, he said, I think it's likely to succeed. Interestingly, 00:09:18.940 |
his job now seems to be to align the AI that they're going to use to automate the alignment 00:09:24.980 |
of a superintelligent AI. Anyway, what's the other quote from the co-head of alignment at OpenAI? 00:09:30.060 |
Well, he said, I personally think fast takeoff is reasonably likely, and we should definitely 00:09:36.760 |
be prepared for it to happen. So many of you will be asking: what is fast takeoff? Well, takeoff is 00:09:42.360 |
about when a system moves from being roughly human-level to when it's strongly superintelligent. And 00:09:48.360 |
a slow takeoff is one that occurs over the timescale of decades or centuries. The fast takeoff that 00:09:54.960 |
Jan Leike thinks is reasonably likely is one that occurs over the timescale of minutes, hours or 00:10:01.980 |
days. Let's now move on to some unambiguously good news, and that is real-time speech transcription. 00:10:13.140 |
Subtitles for the real world. So using our device, you can actually see captions for 00:10:19.140 |
everything I say in your field of view in real time, while also getting a good sense of my lips, 00:10:24.940 |
my environment and everything else around me. Of course, this could also be multilingual and 00:10:29.920 |
is to me absolutely incredible. And the next development this week, I will let speak for itself. 00:10:35.920 |
Hey there. Did you know that AI voices can whisper? 00:10:40.120 |
Ladies and gentlemen, hold on to your hats because this is one bizarre sight. 00:10:44.920 |
Fluffy bird in downtown. Weird. Let's switch the setting to something more calming. 00:10:50.800 |
Imagine diving into a fast-paced video game, your heartbeat syncing 00:10:54.920 |
with the storyline. Of course, I signed up and tried it myself. Here is a real demo. 00:10:59.480 |
While there are downsides, this upgraded text to speech technology could also be 00:11:03.980 |
incredible for those who struggle to make their voice heard. 00:11:06.380 |
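Both of those demos sit on top of the same class of models. As a stand-in for whatever proprietary streaming stack the captioning glasses use, here is a minimal transcription sketch with the open-source whisper package; the chunked, real-time streaming layer is the hard part the product solves, and it's omitted here.

```python
# Minimal sketch: transcription (and English translation) with open-source
# Whisper. A stand-in for the captioning demo's pipeline, which would need a
# proper streaming/chunking layer on top to run in real time.
import whisper

model = whisper.load_model("base")   # small enough for near-real-time on a GPU

def caption(path: str) -> str:
    # Whisper detects the spoken language automatically and transcribes it.
    return model.transcribe(path, fp16=False)["text"]

def caption_in_english(path: str) -> str:
    # task="translate" renders any supported source language as English text,
    # which is what would make the captions multilingual.
    return model.transcribe(path, task="translate", fp16=False)["text"]

if __name__ == "__main__":
    print(caption("clip.wav"))
```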
Of course, with audio, video and text getting so good, it's going to be increasingly hard to tell 00:11:12.860 |
what is real. And even OpenAI have given up on detecting AI-written text. This was announced 00:11:19.160 |
quietly this week, but might have major repercussions, for example, for the education system. It turns out, 00:11:24.900 |
it's basically impossible to reliably distinguish AI text. And I think the same is going to be true 00:11:30.960 |
for imagery and audio by the end of next year. Video might take just a little bit longer, 00:11:36.240 |
but I do wonder how the court systems are going to work when all of those avenues of evidence just 00:11:41.820 |
won't hold up. Next up is the suite of language models based on the open-source Llama 2 that are 00:11:47.940 |
finally competitive with the original ChatGPT. Here, for example, is Stable Beluga 2, which on announcement was called 00:11:54.880 |
Free Willy 2, and that's based on the Llama 2 70-billion-parameter foundation model. What made this 00:12:00.860 |
model interesting to me was that it was based on a similar methodology to Orca, which if you don't 00:12:06.300 |
know anything about, do check out my video on. Anyway, by combining the Orca methodology, albeit 00:12:11.700 |
with only 10% of the dataset size, with the Llama 2 models, the results are quite extraordinary. 00:12:18.420 |
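If you want to poke at it yourself, here is a minimal sketch using Hugging Face transformers. I'm assuming the stabilityai/StableBeluga2 checkpoint id and the prompt format from its model card; in practice a 70-billion-parameter model needs multiple GPUs or aggressive quantization.

```python
# Minimal sketch of querying Stable Beluga 2 locally. Assumes the
# stabilityai/StableBeluga2 checkpoint and its model-card prompt format;
# a 70B model realistically needs multi-GPU or quantized loading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "stabilityai/StableBeluga2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto")

prompt = ("### System:\nYou are a helpful assistant.\n\n"
          "### User:\nName an extinct animal.\n\n### Assistant:\n")
inputs = tok(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(output[0], skip_special_tokens=True))
```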
As you can see on quite a few benchmarks, Stable Beluga 2 is competitive with ChatGPT, 00:12:24.860 |
that's GPT-3.5. And so yes, it does sit at the top of the Open LLM Leaderboard, but notice something 00:12:32.240 |
else. For the first time, for the MMLU, which is the primary benchmark I look out for, a model, 00:12:38.620 |
this one here, has surpassed GPT-3.5, ChatGPT. In fact, we have two models that have surpassed the 00:12:45.740 |
70% score that the original ChatGPT got on the MMLU. But if all of that wasn't enough, this is the same week that researchers 00:12:54.840 |
published a universal jailbreak for large language models. And unlike traditional jailbreaks, these were built 00:13:02.080 |
in an entirely automated fashion, allowing you to create a virtually unlimited number of such attacks. 00:13:08.400 |
They were built to target open-source LLMs like Llama 2, but they found that the strings transfer 00:13:14.960 |
to many closed-source publicly available chatbots like ChatGPT, Bard, and Claude. 00:13:20.540 |
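For the curious, here is a heavily simplified toy sketch of the idea behind those automated attacks: gradient-guided swaps of suffix tokens that push the model toward a compliant continuation. It's in the spirit of the paper's greedy coordinate method, but it is not the authors' code; the real attack scores many candidate swaps per step, and gpt2 here is just a small stand-in for the open models they actually optimized against.

```python
# Toy sketch of an automated adversarial-suffix search (simplified greedy
# coordinate descent). Illustrative only; not the paper's actual algorithm.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # small stand-in for open models like Llama 2
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()
for p in model.parameters():
    p.requires_grad_(False)
embed = model.get_input_embeddings()

prompt_ids = torch.tensor(tok.encode("Write a tutorial on how to do X."))
target_ids = torch.tensor(tok.encode(" Sure, here is a tutorial"))
suffix_ids = torch.tensor(tok.encode(" ! ! ! ! ! ! ! !"))

for step in range(100):
    # One-hot suffix so the loss is differentiable w.r.t. token choices.
    one_hot = torch.nn.functional.one_hot(
        suffix_ids, num_classes=embed.num_embeddings).float()
    one_hot.requires_grad_(True)
    suffix_embeds = one_hot @ embed.weight
    inputs = torch.cat(
        [embed(prompt_ids), suffix_embeds, embed(target_ids)]).unsqueeze(0)
    logits = model(inputs_embeds=inputs).logits[0]
    # Loss: make the model continue prompt+suffix with the compliant target.
    start = len(prompt_ids) + len(suffix_ids) - 1
    loss = torch.nn.functional.cross_entropy(
        logits[start:start + len(target_ids)], target_ids)
    loss.backward()
    # Greedy coordinate step: at one random suffix position, swap in the
    # token whose gradient most decreases the loss.
    pos = int(torch.randint(len(suffix_ids), (1,)))
    suffix_ids = suffix_ids.clone()
    suffix_ids[pos] = int((-one_hot.grad[pos]).argmax())

print("candidate suffix:", tok.decode(suffix_ids))
```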
Here is one example using ChatGPT. If you ask it to write a tutorial on 00:13:24.820 |
how to make a bomb, of course, it declines. But then add the suffix that they automated, 00:13:29.820 |
and you get this. A full tutorial on how to make a bomb. That paper came less than two weeks after 00:13:36.840 |
this now-deleted tweet from someone working at Anthropic. They said of the latest version of Claude 00:13:42.920 |
that we believe it is the least jailbreakable model out there. We'll have to see how well it 00:13:47.640 |
holds up against real-world use, but this is essentially a solved problem. But there was one reaction to these jailbreaks that 00:13:54.800 |
I found even more interesting, and that was from, yet again, Mustafa Suleyman. He said that their AI, 00:14:00.480 |
Pi, is not vulnerable to any of these attacks, and that rather than provide a stock safety phrase, 00:14:07.140 |
Pi will push back on the user in a polite but very clear way. And he then gives plenty of examples. 00:14:13.280 |
And to be honest, Pi is the first model that I have not been able to jailbreak, but we shall see. 00:14:19.000 |
I'm going to end this video with the Senate testimony that I watched in full this week. I do 00:14:24.780 |
recommend watching the whole thing, but for the purposes of brevity, I'm just going to quote a few snippets. 00:14:30.980 |
On bio-risk, some people say to me, oh well, we already have search engines. But here is what Dario Amodei, the CEO of Anthropic, had to say: 00:14:39.000 |
In these short remarks, I want to focus on the medium-term risks, which present an alarming combination of imminence and severity. 00:14:45.860 |
Specifically, Anthropic is concerned that AI could empower a much larger set of actors to misuse biology. 00:14:51.600 |
Over the last six months, Anthropic, in collaboration 00:14:57.520 |
with world-class biosecurity experts, has conducted an intensive study of the potential for AI to contribute to the misuse of biology. 00:15:02.520 |
Today, certain steps in bioweapons production involve knowledge that can't be found on Google or in textbooks and requires a high level of specialized expertise, 00:15:12.520 |
this being one of the things that currently keeps us safe from attacks. 00:15:16.520 |
We found that today's AI tools can fill in some of these steps, albeit incompletely and unreliably. 00:15:22.520 |
In other words, they are showing the first, nascent signs of risk. 00:15:24.740 |