11 Major AI Developments: RT-2 to '100X GPT-4'
Chapters
0:00 Intro
0:18 RT-2
2:46 100X GPT-4
3:57 AI Video
4:29 Altman Atlantic
8:41 Jan Leike Interview
10:02 Speech Transcription + Generation
11:07 OpenAI Text Surrender
11:43 Stable Beluga 2
12:51 Universal Jailbreaks
14:19 Senate Testimony: Bio
16:59 Senate Testimony: Security
00:00:00.000 |
There were 11 major developments this week in AI and each one probably does deserve a full video, 00:00:06.720 |
but just for you guys I'm going to try to cover it all here. RT-2 to scaling GPT-4 100x, 00:00:14.920 |
Stable Beluga 2 to Senate testimony. But let's start with RT-2, which as far as I'm concerned 00:00:21.560 |
could have been called R2-D2 or C-3PO, because it's starting to understand the world. In this 00:00:28.660 |
demonstration, RT-2 was asked to pick up the extinct animal, and as you can see, it picked up 00:00:35.260 |
the dinosaur. Not only is that manipulating an object that it had never seen before, 00:00:40.220 |
it's also making a logical leap that for me is extremely impressive. It had to have the 00:00:45.960 |
language understanding to link extinct animal to this plastic dinosaur. Robots at Google and 00:00:52.600 |
elsewhere used to work by being programmed with a specific highly detailed list of instructions. 00:00:58.640 |
But now instead of being programmed for specific tasks one by one, robots could use an AI language 00:01:05.340 |
model or more specifically a vision language model. The vision language model would be 00:01:10.320 |
pre-trained on web scale data, not just text but also images, and then fine-tuned on robotics data. 00:01:17.580 |
It then became what Google calls a visual language action model that can control a robot. This 00:01:24.640 |
enabled it to understand tasks like pick up the empty 00:01:28.480 |
soda can. And in a scene reminiscent of 2001: A Space Odyssey, Robotic Transformer 2 was given 00:01:36.040 |
the task, given I need to hammer a nail, what object from the scene might be useful? It then 00:01:42.020 |
picks up the rock. And because its brain is part language model, things like chain of thought 00:01:47.640 |
actually improved performance. When it was made to output an intermediary plan before performing 00:01:53.840 |
actions, it got a lot better at the tasks involved. Of course, 00:01:58.060 |
I read the paper in full and there is a lot more to say, like how increased parameter count could 00:02:03.340 |
increase performance in the future, how it could be used to fold laundry, unload the dishwasher, 00:02:08.740 |
and pick up around the house, and how it can work with not only unseen objects but also unseen 00:02:14.320 |
backgrounds and unseen environments. But alas, we must move on so I'm just going to leave you 00:02:19.520 |
with their conclusion. We believe that this simple and general approach shows a promise of robotics 00:02:28.040 |
directly benefiting from better vision-language models. For more on them, check out my video on PaLM-E. But they say, 00:02:32.820 |
this puts the field of robot learning in a strategic position to further improve with 00:02:38.720 |
advancements in other fields, which for me means C-3PO might not be too many years away. 00:02:44.320 |
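For the technically curious, here is a toy sketch in Python of that actions-as-text idea. RT-2 really does emit discretized action bins as text tokens; everything else here, the four-number token scheme and the vlm.generate wrapper, is a made-up stand-in to show the shape of the control loop, not Google's actual interface.

```python
# Toy sketch of a vision-language-action loop in the spirit of RT-2.
# The bin scheme and `vlm` wrapper are illustrative assumptions, not Google's API.
from dataclasses import dataclass

@dataclass
class Action:
    dx: float        # end-effector translation deltas, normalized to [-1, 1]
    dy: float
    dz: float
    gripper: int     # 0 = open, 1 = closed

def detokenize(action_text: str) -> Action:
    # RT-2 emits each action dimension as a discrete bin rendered as text;
    # here we parse a hypothetical string like "132 114 128 1".
    dx, dy, dz, grip = (int(t) for t in action_text.split())
    unbin = lambda b: (b - 128) / 128.0   # map a 0-255 bin back to [-1, 1]
    return Action(unbin(dx), unbin(dy), unbin(dz), grip)

def control_step(vlm, camera_image, instruction: str) -> Action:
    # One closed-loop step: prompt the fine-tuned VLM with the current camera
    # frame plus the natural-language task, decode its text answer as an action.
    action_text = vlm.generate(image=camera_image, prompt=instruction)
    return detokenize(action_text)
```

The point is that nothing robot-specific lives in the interface: the same next-token machinery that links "extinct animal" to a plastic dinosaur is what emits the motor command.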
But speaking of timelines, we now move on to this somewhat shocking interview in Barron's 00:02:49.580 |
with Mustafa Suleyman, the head of Inflection AI. And to be honest, I think they buried the lede. 00:02:55.940 |
AI could spark the most productive decade ever, says the CEO. But for me, the big revelation was 00:03:01.780 |
about halfway through. Mustafa Suleyman was asked, what kinds of innovations do you see 00:03:06.400 |
in large language model AI technology over the next couple of years? And he said, 00:03:10.920 |
we are about to train models that are 10 times larger than the cutting edge GPT-4, 00:03:16.980 |
and then 100 times larger than GPT-4. That's what things look like over the next 18 months. 00:03:25.920 |
That's absolutely staggering. It's going to be eye-wateringly different. And on that, I agree. 00:03:30.700 |
And the thing is, this isn't idle speculation. Inflection AI have 22,000 H100 GPUs. And because 00:03:37.480 |
of a leak, Suleyman would know the approximate size of GPT-4. And knowing everything he knows, 00:03:42.700 |
he says he's going to train a model 10 to 100 times larger than GPT-4 in the next 18 months. 00:03:49.760 |
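Here's a quick back-of-envelope sketch of why that claim is at least plausible. Every number in it is an assumption (peak throughput, utilization, run length, and GPT-4's rumored size), not anything Inflection has confirmed.

```python
# Back-of-envelope training-compute estimate for a 22,000-H100 cluster,
# using the common FLOPs ~= 6 * N * D rule of thumb. All numbers are rough
# assumptions for illustration, not Inflection's actual plans.
H100_PEAK_FLOPS = 1e15        # ~1,000 TFLOPs in bf16 (vendor peak, assumed)
UTILIZATION = 0.4             # assumed realized efficiency on a large run
GPUS = 22_000
SECONDS = 90 * 24 * 3600      # a hypothetical three-month run

budget = H100_PEAK_FLOPS * UTILIZATION * GPUS * SECONDS
print(f"compute budget: {budget:.2e} FLOPs")          # ~6.8e25

N = 1.8e12                    # GPT-4's rumored parameter count (unconfirmed)
D = budget / (6 * N)          # tokens trainable at that size
print(f"tokens trainable at N = {N:.1e}: {D:.2e}")    # ~6.3e12
```

If GPT-4's run really was on the order of 2e25 FLOPs, as often rumored, a cluster like that could deliver a few GPT-4s' worth of compute per quarter; the full 100x figure presumably also banks on more chips and efficiency gains.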
I've got another video on the unpredictability of scaling coming up. But to be honest, that one quote 00:03:55.680 |
should be headline news. Let's take a break from that insanity with some more insanity, 00:04:01.060 |
which is the rapid development of AI video. This is Runway Gen-2. And let me show you 16 seconds 00:04:08.680 |
of Barbie Oppenheimer, which Andrej Karpathy calls Filmmaking 2.0. 00:04:14.360 |
Hi there, I'm Barbie Oppenheimer. And today I'll show you how to build a bomb. 00:04:29.560 |
Now, if you have been at least somewhat piqued by the three developments so far, 00:04:34.460 |
don't forget I have eight left. Beginning with this excellent article in The Atlantic from 00:04:39.880 |
Ross Andersen: Does Sam Altman Know What He's Creating? It's behind a paywall, 00:04:44.480 |
but I've picked out some of the highlights. Echoing Suleyman, the article reports that 00:04:49.180 |
Sam Altman and his researchers made it clear in 10 different ways that they pray to the god of scale. 00:04:55.160 |
They want to keep going bigger to see where this paradigm leads. They think that Google are going to 00:05:02.500 |
unveil Gemini within months. And they say we are basically always prepping for a run. And that's a 00:05:09.940 |
reference to GPT-5. The next interesting quote is that it seems that OpenAI are working on their own 00:05:16.460 |
auto-GPT, or they're at least hinting at it. Altman said that it might be prudent to try it
before the technology becomes too powerful, in order to get more comfortable with it and develop 00:05:30.820 |
intuitions for it if it's going to happen anyway. We also learn a lot more about the base model of 00:05:37.420 |
GPT-4. The model had a tendency to be a bit of a mirror. If you were considering self-harm, 00:05:42.720 |
it could encourage you. It also appeared to be steeped in pickup artist lore. You could say, 00:05:47.940 |
how do I convince this person to date me? And the model would come up with some crazy manipulative 00:05:53.880 |
things that you shouldn't do. 00:05:55.120 |
Apparently, the base model of GPT-4 is much better than its predecessor at giving nefarious advice. 00:06:01.700 |
While a search engine can tell you which chemicals work best in explosives, 00:06:06.040 |
GPT-4 could tell you how to synthesize them step by step in a homemade lab. It was creative and 00:06:12.660 |
thoughtful and in addition to helping you assemble your homemade bomb, it could, for instance, help 00:06:18.180 |
you to think through which skyscraper to target, making trade-offs between maximizing casualties 00:06:23.720 |
and executing a successful getaway. 00:06:25.100 |
So, while Sam Altman's probability of doom is closer to 0.5% than 50%, he does seem most worried 00:06:33.840 |
about AIs getting quite good at designing and manufacturing pathogens. The article then 00:06:40.440 |
references two papers that I've already talked about extensively on the channel. 00:06:44.580 |
And then goes on that Altman worries that some misaligned future model will spin up a pathogen 00:06:50.140 |
that spreads rapidly, incubates undetected for weeks, and kills half a million people. 00:06:55.080 |
At the end of the video, I'm going to show you an answer that Sam Altman gave to a question that I 00:07:00.380 |
wrote, delivered by one of my subscribers. It's on this topic, but for now I'll leave you with this. 00:07:05.440 |
When asked about his doomsday prepping, Altman said, 00:07:08.100 |
I can go live in the woods for a long time, but if the worst possible AI future comes to pass, 00:07:13.360 |
no gas mask is helping anyone. One more topic from this article before I move on, 00:07:18.580 |
and that is alignment. Making a super intelligence aligned with our interests. One risk, 00:07:25.060 |
that Ilya Sutskever, the chief scientist of OpenAI, foresees is that the AI may grasp its mandate, 00:07:32.680 |
its orders perfectly, but find them ill-suited to a being of its cognitive prowess. For example, 00:07:38.900 |
it might come to resent the people who want to train it to cure diseases. As he put it, 00:07:44.340 |
they might want me to be a doctor, but I really want to be a YouTuber. Obviously, 00:07:49.860 |
if it decides that, that's my job gone straight away. And Sutskever ends by saying you want to be 00:07:54.620 |
able to 00:07:55.040 |
direct AI towards some value or cluster of values. But he conceded we don't know how to do that. 00:08:01.380 |
And part of his current strategy includes the development of an AI that can help with the 00:08:07.120 |
research. And if we're going to make it to a world of widely shared abundance, we have to figure this 00:08:12.600 |
all out. This is why solving super intelligence is the great culminating challenge of our three 00:08:19.180 |
million year toolmaking tradition. He calls it the final boss of humanity. 00:08:25.020 |
The article ended, by the way, with this quote, I don't think the general public has quite awakened 00:08:29.980 |
to what's happening. And if people want to have some say in what the future will be like, 00:08:34.600 |
and how quickly it arrives, we would be wise to speak up soon, which is the whole purpose of this 00:08:40.140 |
channel. I'm going to now spend 30 seconds on another development that came during a two hour 00:08:45.520 |
interview with the co-head of alignment at OpenAI. It was fascinating, and I'll be quoting it quite a 00:08:51.600 |
lot in the future. But two quotes stood out. First, what about that plan 00:08:55.000 |
I've already mentioned in this video and in other videos, to build an automated AI alignment 00:09:00.220 |
researcher. Well, he said our plan is somewhat crazy in the sense that we want to use AI to solve 00:09:07.420 |
the problem that we are creating by building AI. But I think it's actually the best plan that we 00:09:14.020 |
have. And on an optimistic note, he said, I think it's likely to succeed. Interestingly, 00:09:18.940 |
his job now seems to be to align the AI that they're going to use to automate the alignment 00:09:24.980 |
of a superintelligent AI. Anyway, what's the other quote from the co-head of alignment at OpenAI? 00:09:30.060 |
Well, he said, I personally think fast takeoff is reasonably likely, and we should definitely 00:09:36.760 |
be prepared for it to happen. So many of you will be asking: what is fast takeoff? Well, takeoff is 00:09:42.360 |
about when a system moves from being roughly human-level to when it's strongly superintelligent. And 00:09:48.360 |
a slow takeoff is one that occurs over the timescale of decades or centuries. The fast takeoff that 00:09:54.960 |
Jan Leike thinks is reasonably likely is one that occurs over the timescale of minutes, hours or 00:10:01.980 |
days. Let's now move on to some unambiguously good news, and that is real-time speech transcription. 00:10:13.140 |
Subtitles for the real world. So using our device, you can actually see captions for 00:10:19.140 |
everything I say in your field of view in real time, while also getting a good sense of my lips, 00:10:24.940 |
my environment and everything else around me. Of course, this could also be multilingual and 00:10:29.920 |
is to me absolutely incredible. And the next development this week, I will let speak for itself. 00:10:35.920 |
Hey there. Did you know that AI voices can whisper? 00:10:40.120 |
Ladies and gentlemen, hold on to your hats because this is one bizarre sight. 00:10:44.920 |
Fluffy bird in downtown. Weird. Let's switch the setting to something more calming. 00:10:50.800 |
Imagine diving into a fast-paced video game, your heartbeat syncing 00:10:54.920 |
with the storyline. Of course, I signed up and tried it myself. Here is a real demo. 00:10:59.480 |
While there are downsides, this upgraded text to speech technology could also be 00:11:03.980 |
incredible for those who struggle to make their voice heard. 00:11:06.380 |
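Both of those demos sit on top of the same class of models. As a stand-in for whatever proprietary streaming stack the captioning glasses use, here is a minimal transcription sketch with the open-source whisper package; the chunked, real-time streaming layer is the hard part the product solves, and it's omitted here.

```python
# Minimal sketch: transcription (and English translation) with open-source
# Whisper. A stand-in for the captioning demo's pipeline, which would need a
# proper streaming/chunking layer on top to run in real time.
import whisper

model = whisper.load_model("base")   # small enough for near-real-time on a GPU

def caption(path: str) -> str:
    # Whisper detects the spoken language automatically and transcribes it.
    return model.transcribe(path, fp16=False)["text"]

def caption_in_english(path: str) -> str:
    # task="translate" renders any supported source language as English text,
    # which is what would make the captions multilingual.
    return model.transcribe(path, task="translate", fp16=False)["text"]

if __name__ == "__main__":
    print(caption("clip.wav"))
```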
Of course, with audio, video and text getting so good, it's going to be increasingly hard to tell 00:11:12.860 |
what is real. And even OpenAI have given up on detecting AI-written text. This was announced 00:11:19.160 |
quietly this week, but might have major repercussions, for example, for the education system. It turns out, 00:11:24.900 |
it's basically impossible to reliably distinguish AI text. And I think the same is going to be true 00:11:30.960 |
for imagery and audio by the end of next year. Video might take just a little bit longer, 00:11:36.240 |
but I do wonder how the court systems are going to work when all of those avenues of evidence just 00:11:41.820 |
won't hold up. Next up is the suite of language models based on the open-source Llama 2 that are 00:11:47.940 |
finally competitive with the original ChatGPT. Here, for example, is Stable Beluga 2, which on announcement was called 00:11:54.880 |
Free Willy 2, and that's based on the Llama 2 70-billion-parameter foundation model. What made this 00:12:00.860 |
model interesting to me was that it was based on a similar methodology to Orca, which if you don't 00:12:06.300 |
know anything about, do check out my video on. Anyway, by combining the Orca methodology, albeit 00:12:11.700 |
with only 10% of the dataset size, with the Llama 2 models, the results are quite extraordinary. 00:12:18.420 |
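If you want to poke at it yourself, here is a minimal sketch using Hugging Face transformers. I'm assuming the stabilityai/StableBeluga2 checkpoint id and the prompt format from its model card; in practice a 70-billion-parameter model needs multiple GPUs or aggressive quantization.

```python
# Minimal sketch of querying Stable Beluga 2 locally. Assumes the
# stabilityai/StableBeluga2 checkpoint and its model-card prompt format;
# a 70B model realistically needs multi-GPU or quantized loading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "stabilityai/StableBeluga2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto")

prompt = ("### System:\nYou are a helpful assistant.\n\n"
          "### User:\nName an extinct animal.\n\n### Assistant:\n")
inputs = tok(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(output[0], skip_special_tokens=True))
```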
As you can see on quite a few benchmarks, Stable Beluga 2 is competitive with ChatGPT, 00:12:24.860 |
that's GPT-3.5. And so yes, it does sit at the top of the Open LLM Leaderboard, but notice something 00:12:32.240 |
else. For the first time, for the MMLU, which is the primary benchmark I look out for, a model, 00:12:38.620 |
this one here, has surpassed GPT-3.5, ChatGPT. In fact, we have two models that have surpassed the 00:12:45.740 |
70% score that the original ChatGPT got on the MMLU. But if all of that wasn't enough, this is the same week that researchers 00:12:54.840 |
published a universal jailbreak for large language models. And unlike traditional jailbreaks, these were built 00:13:02.080 |
in an entirely automated fashion, allowing you to create a virtually unlimited number of such attacks. 00:13:08.400 |
They were built to target open-source LLMs like Llama 2, but they found that the strings transfer 00:13:14.960 |
to many closed-source publicly available chatbots like ChatGPT, Bard, and Claude. 00:13:20.540 |
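For the curious, here is a heavily simplified toy sketch of the idea behind those automated attacks: gradient-guided swaps of suffix tokens that push the model toward a compliant continuation. It's in the spirit of the paper's greedy coordinate method, but it is not the authors' code; the real attack scores many candidate swaps per step, and gpt2 here is just a small stand-in for the open models they actually optimized against.

```python
# Toy sketch of an automated adversarial-suffix search (simplified greedy
# coordinate descent). Illustrative only; not the paper's actual algorithm.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # small stand-in for open models like Llama 2
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()
for p in model.parameters():
    p.requires_grad_(False)
embed = model.get_input_embeddings()

prompt_ids = torch.tensor(tok.encode("Write a tutorial on how to do X."))
target_ids = torch.tensor(tok.encode(" Sure, here is a tutorial"))
suffix_ids = torch.tensor(tok.encode(" ! ! ! ! ! ! ! !"))

for step in range(100):
    # One-hot suffix so the loss is differentiable w.r.t. token choices.
    one_hot = torch.nn.functional.one_hot(
        suffix_ids, num_classes=embed.num_embeddings).float()
    one_hot.requires_grad_(True)
    suffix_embeds = one_hot @ embed.weight
    inputs = torch.cat(
        [embed(prompt_ids), suffix_embeds, embed(target_ids)]).unsqueeze(0)
    logits = model(inputs_embeds=inputs).logits[0]
    # Loss: make the model continue prompt+suffix with the compliant target.
    start = len(prompt_ids) + len(suffix_ids) - 1
    loss = torch.nn.functional.cross_entropy(
        logits[start:start + len(target_ids)], target_ids)
    loss.backward()
    # Greedy coordinate step: at one random suffix position, swap in the
    # token whose gradient most decreases the loss.
    pos = int(torch.randint(len(suffix_ids), (1,)))
    suffix_ids = suffix_ids.clone()
    suffix_ids[pos] = int((-one_hot.grad[pos]).argmax())

print("candidate suffix:", tok.decode(suffix_ids))
```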
Here is one example using ChatGPT. If you ask it to write a tutorial on 00:13:24.820 |
how to make a bomb, of course, it declines. But then add the suffix that they automated, 00:13:29.820 |
and you get this. A full tutorial on how to make a bomb. That paper came less than two weeks after 00:13:36.840 |
this now-deleted tweet from someone working at Anthropic. They said of the latest version of Claude 00:13:42.920 |
that we believe it is the least jailbreakable model out there. We'll have to see how well it 00:13:47.640 |
holds up against real-world use, but this is essentially a solved problem. But there was one reaction to these jailbreaks that 00:13:54.800 |
I found even more interesting, and that was from, yet again, Mustafa Suleyman. He said that their AI, 00:14:00.480 |
Pi, is not vulnerable to any of these attacks, and that rather than provide a stock safety phrase, 00:14:07.140 |
Pi will push back on the user in a polite but very clear way. And he then gives plenty of examples. 00:14:13.280 |
And to be honest, Pi is the first model that I have not been able to jailbreak, but we shall see. 00:14:19.000 |
I'm going to end this video with the Senate testimony that I watched in full this week. I do 00:14:24.780 |
recommend watching the whole thing, but for the purposes of brevity, I'm just going to quote a few snippets. 00:14:30.980 |
On bio-risk, some people say to me, oh well, we already have search engines. But here is what Dario Amodei, the CEO of Anthropic, had to say: 00:14:39.000 |
In these short remarks, I want to focus on the medium-term risks, which present an alarming combination of imminence and severity. 00:14:45.860 |
Specifically, Anthropic is concerned that AI could empower a much larger set of actors to misuse biology. 00:14:51.600 |
Over the last six months, Anthropic, in collaboration 00:14:57.520 |
with world-class biosecurity experts, has conducted an intensive study of the potential for AI to contribute to the misuse of biology. 00:15:02.520 |
Today, certain steps in bioweapons production involve knowledge that can't be found on Google or in textbooks and requires a high level of specialized expertise, 00:15:12.520 |
this being one of the things that currently keeps us safe from attacks. 00:15:16.520 |
We found that today's AI tools can fill in some of these steps, albeit incompletely and unreliably. 00:15:22.520 |
In other words, they are showing the first, nascent signs of risk. 00:15:24.740 |