Why Does OpenAI Need a 'Stargate' Supercomputer? Ft. Perplexity CEO Aravind Srinivas
00:00:00.000 |
Why does OpenAI need Microsoft to build a $100 billion Stargate supercomputer? 00:00:07.520 |
I'm going to try to give you the answer to that question in this video, 00:00:11.120 |
which in turn will give you insight into the next one to four years of AI development. 00:00:16.160 |
I'll also draw on a discussion I had last night with the Perplexity founder and former OpenAI 00:00:21.920 |
researcher Aravind Srinivas about what kind of breakthroughs this will lead to 00:00:26.480 |
and what AGI timelines he now has. And no, this is not just about Sora 00:00:31.600 |
and the OpenAI Voice Engine. This is about manufacturing intelligence at hard-to-imagine 00:00:37.440 |
scales. This report, by the way, from The Information, came from three different sources, 00:00:42.560 |
one of whom spoke to Sam Altman and another who viewed some of Microsoft's initial cost estimates. 00:00:48.720 |
And to give you some context, if that data center's cost were a country's GDP, it would rank as 00:00:55.040 |
the 64th richest country in the world. This supercomputer would likely be based 00:01:00.160 |
in the desert somewhere in the US and would launch around 2028. 00:01:04.800 |
Some other stages of the wider plan, though, will come online as soon as this year. 00:01:10.480 |
And again, before we get to why they're doing this, let me give you a sense of the scale. 00:01:15.760 |
The Stargate supercomputer would produce orders of magnitude more computing power 00:01:21.440 |
than what Microsoft currently supplies to OpenAI. Notice the plural orders of magnitude. 00:01:27.040 |
An order of magnitude is a 10x increase, so orders of magnitude would be at least a 100x increase. 00:01:34.320 |
And to give you one little spoiler, more computing power more or less directly 00:01:37.840 |
correlates to increased capabilities for the frontier AI models. 00:01:42.400 |
In even simpler terms, a hundred times more is a lot. 00:01:46.480 |
But why did that sentence begin with an if? If Stargate moves forward? 00:01:50.640 |
Well, the previous paragraph said this. Microsoft's willingness to go ahead with 00:01:55.040 |
the Stargate plan depends in part on OpenAI's ability to meaningfully improve the capabilities 00:02:02.000 |
of its AI. Whether that hinges on GPT-4.5, likely coming in the spring, or GPT-5, 00:02:08.960 |
which many people are now agreeing with me will come at the end of this year or possibly the 00:02:13.520 |
beginning of next, we don't know. My prediction, by the way, is that OpenAI will meaningfully 00:02:18.400 |
improve the capabilities of its AI, and part of my proof is in this video, and therefore Stargate 00:02:23.440 |
will go ahead. One source said that such a project is absolutely required for artificial 00:02:28.800 |
general intelligence. That's the kind of intelligence that you would feel comfortable 00:02:32.400 |
hiring for most jobs. And the timelines for this data center dovetail quite nicely with my own 00:02:38.400 |
prediction for the first demonstration of an artificial general intelligence system. 00:02:44.000 |
Now, I know many of you will react to that and say AGI is definitely coming this year. Of course, 00:02:48.880 |
it depends on definitions, but let me give you a word from Aravind Srinivas, 00:02:53.360 |
the founder of the newly minted unicorn Perplexity. 00:02:57.280 |
That's why you should always ask, okay, if you are actually really close to AGI, 00:03:00.880 |
if it is the case that AGI is five years away, why are you hiring so many people right now? 00:03:05.040 |
If we are really truly getting close to AGI, why are you not benefiting from AGI yourself? 00:03:09.360 |
What, like OpenAI's hiring 30 people or 50 people a month, 100 people a month, 00:03:14.240 |
at that rate, they're going to hire like thousands a year. And over five years, 00:03:17.120 |
they would have had a company with 5,000, 10,000 employees. So why couldn't you do it with 100 if AGI 00:03:22.160 |
is truly there? How many people do you really need anymore? These are the kind of questions 00:03:25.600 |
you should ask. And honestly, like someone has to physically go and maintain the cluster, 00:03:29.680 |
make these decisions on which GPUs to use, what happens when these nodes fail, 00:03:33.520 |
like systems crash, and write all these heuristic rules to deal with all these things. 00:03:37.360 |
If something goes wrong in production code, like who has to go and work on the backend servers, 00:03:41.840 |
can all these be done by an AI now? Obviously not. Every time, the definition of AGI gets 00:03:46.320 |
narrower and narrower, and it feels like narrow AI and not AGI. You see my point? 00:03:50.400 |
You should ask, then, when will we not have an executive assistant? And maybe that day, 00:03:55.120 |
we can say we have something like an AGI. Back to the article though, and let me do my 00:03:58.800 |
first mini detour. I noticed a slight mathematical discrepancy in that this data center, Stargate, 00:04:04.880 |
will produce orders of magnitude, as I said, at least 100x more computing power. But in terms of actual 00:04:10.240 |
energy, it will need the same amount of watts as what's needed to run several large data centers 00:04:16.320 |
today. Now, of course, that's a lot, but wouldn't you need even more power than that to run 00:04:20.880 |
something that's going to give us at least 100x more computing power? Well, just for a few seconds, 00:04:26.560 |
let me bring you this chart from the chairman of TSMC. That's the company that makes around 90% 00:04:32.720 |
of the world's most advanced chips. And one key number comes at the top: energy-efficient 00:04:37.760 |
performance improves 3x every two years. So straight from TSMC, we get the projection 00:04:44.320 |
that in four years, 2028, chips will be almost 10 times more energy efficient. I thought that's 00:04:50.480 |
super interesting, but in case you're getting a little bit bored, where did the name Stargate 00:04:54.400 |
come from? Well, the codename originated with OpenAI named for the sci-fi film in which scientists 00:05:00.240 |
develop a device for traveling between galaxies. And I actually agree that the arrival of AGI will 00:05:05.920 |
be like humanity stepping through a portal, can't go back and the world will be changed forever. 00:05:11.840 |
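A quick sanity check on that TSMC projection, by the way: 3x energy-efficient performance every two years compounds over four years to 3^2 = 9, which is where the "almost 10 times" figure comes from. A minimal calculation (this is my own extrapolation of the stated trend, not an official TSMC formula):

```python
# Compound a per-period efficiency gain over a span of years.
# Assumes TSMC's stated trend (3x per two years) simply continues --
# a rough extrapolation, not an official roadmap calculation.
def efficiency_gain(years: float, factor: float = 3.0, period_years: float = 2.0) -> float:
    return factor ** (years / period_years)

print(efficiency_gain(4.0))  # 2024 -> 2028: 3^2 = 9, i.e. "almost 10x"
```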
But I know some of you are thinking, didn't Philip promise to say why they're building Stargate, 00:05:16.720 |
not just describe how they're building it. So let me get to the first reason, they're doing it 00:05:22.160 |
to match Google. Sam Altman has said privately that Google, one of OpenAI's biggest rivals, 00:05:27.600 |
will have more computing capacity than OpenAI in the near term. And he's also complained publicly 00:05:33.120 |
about not having as many AI server chips as he'd like. This insider chart from Semianalysis gives 00:05:39.440 |
us a glimpse of the scale of that discrepancy. Here we are newly arriving into quarter two of 2024, 00:05:46.880 |
and apparently the discrepancy is pretty stark between Google's capacity and OpenAI's. In the 00:05:52.800 |
words of Dylan Patel, Google's compute capabilities make everyone else look silly. Indeed, I remember 00:05:59.040 |
around a year ago when I said that it's likely Google who are on course to create AGI first, 00:06:04.880 |
many people laughed and said, just look at Bard. But I likened Google, and Google DeepMind specifically, 00:06:10.800 |
to an awakened giant. We have started to glimpse the power of Gemini 1.5, and Gemini 2 is 00:06:18.240 |
likely coming in June. And if you didn't realize how dependent OpenAI are on Microsoft to compete 00:06:23.600 |
with Google, how about this? The CEO of Microsoft, Satya Nadella, recently boasted that it would not 00:06:29.600 |
matter if OpenAI disappeared tomorrow. We have all of the intellectual property rights and all of the 00:06:35.680 |
capability. We have the people, we have the compute, and we have the data, we have everything. 00:06:41.280 |
We are below them, above them, and around them. It isn't only about personnel and clever algorithms, 00:06:46.960 |
it's about supercomputers, it's about Stargate. Okay, so it's to match Google, but what is the 00:06:52.080 |
next reason for building Stargate? Well, it would be to build models like GPT-7, 7.5, and 8. And yes, 00:06:59.920 |
I am well aware that we don't even have GPT-4.5, so why am I even talking about GPT-7? Well, 00:07:06.000 |
GPT-5, according to my own research, which I published in a video, is likely training around 00:07:11.120 |
now. In fact, probably finished around now. Of course, that doesn't mean we're going to get it 00:07:15.280 |
around now. They're going to release smaller versions like GPT-4.5 and they're going to do 00:07:19.520 |
safety testing. But that's the full GPT-5 likely coming at the end of this year or the beginning 00:07:24.320 |
of next. That's trained on current generation hardware, I would say maybe a hundred thousand 00:07:29.200 |
H100s. But this year and next year, the report says, Microsoft has planned to provide OpenAI 00:07:35.440 |
with servers housing hundreds of thousands of GPUs in total. And one former Googler and director 00:07:42.080 |
at Y Combinator leaked this. He spoke to a Microsoft engineer on the GPT-6 training cluster project. 00:07:49.360 |
That engineer apparently complained about the pain they were having essentially setting up 00:07:53.360 |
links between GPUs in different regions. And naturally he asked, why not just locate the 00:07:58.320 |
cluster in one region? And the Microsoft employee said, oh yeah, we tried that first. We can't put 00:08:04.320 |
more than a hundred thousand H100s in a single state without bringing down the power grid. So 00:08:10.480 |
clearly it's going to be multiple hundred thousand H100s or B100s. Check out my previous video on 00:08:17.120 |
GPT-6. But then we have a smaller phase four supercomputer for OpenAI that aims to launch 00:08:22.960 |
around 2026. Now, of course, the naming schemes might go out the window by this point, but you 00:08:28.320 |
can see why I think that the Stargate supercomputer for 2028 might be for GPT-7.5 or GPT-8. And it's not 00:08:36.160 |
like OpenAI aren't repeatedly telling us that scale is the way to get to AGI. Here's one of 00:08:42.400 |
their star researchers, Noam Brown saying recently that he wished every AI startup founder would read 00:08:48.320 |
the bitter lesson. Now I might do a video on that essay someday, but basically it says that it's not 00:08:53.440 |
about encoding human expert knowledge into the model. It's about building relatively simple 00:08:58.160 |
algorithms and then just scaling them up as much as you can. It's a bitter lesson because human 00:09:03.200 |
expertise and data become progressively less relevant to the model's performance. Just like 00:09:08.240 |
our bitter experience of seeing AlphaGo, which was trained in part on human expert performance 00:09:13.440 |
in Go, being superseded by AlphaZero, which wasn't, likewise for human data on the path to AGI. 00:09:21.200 |
Here's Andrej Karpathy, until fairly recently, a star OpenAI researcher speaking about a week ago. 00:09:26.800 |
Because the current models are just like not good enough. And I think there are big rocks to be 00:09:30.720 |
turned here. And I think people still haven't really seen what's possible in this space at all. 00:09:36.720 |
And roughly speaking, I think we've done step one of AlphaGo. We've done the imitation learning part. 00:09:41.280 |
There's step two of AlphaGo, which is the RL. And people haven't done that yet. And I think it's 00:09:46.800 |
going to fundamentally, this is the part that actually made it work and made something super 00:09:50.240 |
human. But I think we just haven't done step two of AlphaGo, long story short. And we've just done 00:09:54.560 |
imitation. And I don't think that people appreciate, number one, how terrible the data collection is 00:09:58.880 |
for things like ChatGPT. Say you have a problem, like some prompt is some kind of a mathematical 00:10:02.960 |
problem. A human comes in and gives the ideal solution to that problem. The problem is that 00:10:08.560 |
the human psychology is different from the model psychology. What's easy or hard for the human are 00:10:13.760 |
different to what's easy or hard for the model. And so human kind of fills out some kind of a 00:10:18.320 |
trace that comes to the solution. But some parts of that are trivial to the model. And some parts 00:10:23.280 |
of that are a massive leap that the model doesn't understand. You're kind of just losing it. And 00:10:26.480 |
then everything else is polluted by that later. And so fundamentally what you need is the model 00:10:31.360 |
needs to practice itself how to solve these problems. It needs to figure out what works for 00:10:37.280 |
it or does not work for it. But it needs to learn that for itself based on its own capability and 00:10:41.120 |
its own knowledge. So that's number one. That's totally broken, I think. It's a good initializer, 00:10:45.280 |
though, for something agent-like. And then the other thing is we're doing reinforcement learning 00:10:48.560 |
from human feedback. But that's like a super weak form of reinforcement learning. It doesn't 00:10:52.560 |
even count as reinforcement learning, I think. So RLHF is like nowhere near, I would say, RL. 00:10:57.280 |
It's like silly. And the other thing is imitation learning is super silly. RLHF is a nice improvement, 00:11:02.480 |
but it's still silly. And I think people need to look for better ways of training these models 00:11:07.280 |
so that it's in the loop with itself and its own psychology. And I think there will probably be 00:11:11.920 |
unlocks in that direction. This echoes, again, Noam Brown, who I believe is working on OpenAI's 00:11:17.040 |
Q* system, who said, "You don't get superhuman performance by doing better imitation learning 00:11:22.640 |
on human data." And that brings me nicely to the third reason for building Stargate, 00:11:28.240 |
doing longer inference, aka letting the models think for longer before they output a response. 00:11:34.800 |
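One concrete, simplified way to picture "letting the model think for longer" is to spend extra inference compute sampling many independent reasoning chains and taking a majority vote over their answers, the self-consistency idea. The sketch below is purely illustrative: `solve` is a hypothetical stand-in for sampling one chain of thought from a model, not any real API.

```python
import random
from collections import Counter

def solve(correct_answer: int = 42, accuracy: float = 0.6,
          rng: random.Random = random.Random(0)) -> int:
    """Hypothetical noisy solver: right 60% of the time, else a random guess."""
    if rng.random() < accuracy:
        return correct_answer
    return rng.randint(0, 100)

def majority_vote(n_samples: int) -> int:
    """Spend more inference compute (more samples) to boost reliability."""
    votes = Counter(solve() for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(majority_vote(1))   # one quick sample: often wrong
print(majority_vote(51))  # many samples plus a vote: almost always the right answer
```

The point of the sketch is the trade-off: each extra sample costs compute at answer time, which is exactly the kind of demand a Stargate-scale cluster would serve.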
In the case of AlphaGo, allowing the models to ponder or think for a minute improved the systems 00:11:40.240 |
by the equivalent of scaling those systems by 100,000x. Or in other words, GPT-5 might be 00:11:47.120 |
reminiscent of GPT-6 if we let it think for a minute. Let it think for hours and hours, or even 00:11:53.680 |
days, and we might get a new cancer drug. And before you immediately say he's just getting 00:11:58.560 |
silly now, well, check out this article from The Economist, "AI is taking over drug development." 00:12:03.920 |
Of course, there is way more detail and nuance than I can get to in this video, 00:12:07.680 |
but the conclusion was this. Generative AI and systems like AlphaFold are already significantly 00:12:14.160 |
accelerating biotechnology. And we will see in the next few years whether that will bring us 00:12:19.200 |
usable drugs. Analysts at Boston Consulting Group, they say, see signs of a fast-approaching 00:12:24.960 |
AI-enabled wave of new drugs. Indeed, drug regulators will need to up their game to 00:12:29.680 |
meet the challenge. It would be a good problem for the world to have. Of course, I asked the 00:12:34.240 |
Perplexity CEO about Q* and his predictions of the impacts of that system this year. But first, 00:12:41.120 |
a 30-second plug for AI Insiders. That's my Patreon, link in the description, where first 00:12:46.400 |
of all, you get exclusive videos. This one from a few days ago, I am particularly proud of. I 00:12:51.360 |
analyzed a new 44-page report on the so-called AI jobs apocalypse, and within 36 hours, I had 00:12:57.440 |
interviewed the author and produced this video. Trust me, I definitely dig beyond the headlines. 00:13:02.400 |
On Insiders, you can also ask questions of my forthcoming guests, and I used many of the 00:13:07.040 |
questions from Insiders when I interviewed Aravind. Our Discord, I'm proud to say, 00:13:11.120 |
also has a ton of professional best practice sharing across dozens of professions and fields. 00:13:17.600 |
Just a few hours ago, we got a new expert-led forum on semiconductors and hardware. And just 00:13:23.440 |
a few days before that, a new forum on alignment led by a Googler. We also have regional networking 00:13:29.840 |
across Europe and North America. But here's Aravind on what he believes Q* is and how soon 00:13:35.680 |
it's coming. So if you just clean up all the internet data and teach these models to go 00:13:40.240 |
through reasoning chains before writing an answer, they're going to get a lot more reliable. And then 00:13:45.840 |
you can think of models that can search over the chain of thought before giving you an answer, 00:13:49.680 |
rather than decoding a single chain of thought, this whole tree of thought concept. And then you 00:13:54.400 |
can extend that to thinking of models that will have a search over a tree and identify several 00:13:59.680 |
chains and look at the most plausible explanation based on the probabilities. Almost like how a 00:14:05.280 |
player in a Go or Chess match reasons through several different branches of moves and picks 00:14:10.720 |
the one that has the highest odds of success at winning the game. You can think of the inference 00:14:14.400 |
time itself going up. Right now, you use a system like ChatGPT, it just responds in a few seconds. 00:14:19.520 |
What if AIs are decoding with these really giant models, even bigger than GPT-4, going through 00:14:25.760 |
several chains of reasoning, several layers of depth in it, and come back to you 00:14:31.040 |
after an hour with something that feels incredibly insightful. Now, this could be called an AGI by 00:14:37.040 |
some people. I'm sure Demis or Sam would call this an AGI if it works, because the definition 00:14:37.040 |
that they would use here is something that truly surprises humans, does marvelous things. It feels like 00:14:47.520 |
AlphaGo or something where it's not something most humans would be able to come up with. It requires 00:14:52.480 |
several hours of thinking. So maybe we'll go far along those dimensions, might not replace our 00:14:57.600 |
executive assistants or sales and marketing or designers or programmers, but might feel like a 00:15:02.960 |
10x programmer, might feel like a 10x marketer. And I think that could happen. And that could 00:15:07.680 |
be a dimension where we see AGI progress in the near term. And I see maybe some breakthrough like 00:15:12.800 |
that happening in '24. So far, nothing. But at least by this time next year, I think something 00:15:19.200 |
like that will be possible. You'll see a demo where it doesn't respond immediately, but it thinks 00:15:23.200 |
for quite a long time and gets back with a really cool response. And I can't help but point out that 00:15:27.520 |
if you watch my Q* video, I said one of the stars of the new system was Lukasz Kaiser, one of the 00:15:33.440 |
co-authors of the original Transformers paper. And I would note that in this week's Wired interview, 00:15:38.160 |
when he was asked about Q*, the OpenAI PR person almost leapt across the table to silence him. 00:15:44.800 |
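The search-over-a-tree-of-thoughts idea Aravind described, expanding several candidate reasoning chains, scoring them, and keeping the most plausible, looks a lot like beam search. Here's a toy sketch of that shape; `propose_steps` is a hypothetical stand-in for a language model proposing scored next steps, not anything confirmed about Q* or OpenAI's systems.

```python
import math

def propose_steps(chain: list[str]) -> list[tuple[str, float]]:
    """Hypothetical stand-in for an LM proposing next reasoning steps
    with probabilities (here: three fixed candidates per step)."""
    depth = len(chain)
    return [(f"step{depth}-{i}", p) for i, p in enumerate([0.6, 0.3, 0.1])]

def tree_of_thought(depth: int = 3, beam_width: int = 2) -> tuple[list[str], float]:
    # Each beam entry is (chain of thought, cumulative log-probability).
    beams = [([], 0.0)]
    for _ in range(depth):
        candidates = []
        for chain, score in beams:
            for step, p in propose_steps(chain):
                candidates.append((chain + [step], score + math.log(p)))
        # Prune to the most plausible chains, like a Go player discarding weak lines.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]  # the highest-probability chain found

best_chain, best_logp = tree_of_thought()
print(best_chain)  # ['step0-0', 'step1-0', 'step2-0']
```

Note how the inference cost grows with both the search depth and the beam width, which is the connection back to compute: thinking harder at answer time is a multiplier on GPU demand.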
I definitely think I was onto something. And believe it or not, there is actually a fourth 00:15:49.120 |
reason for a Stargate-like supercomputer dominating different modalities, whether 00:15:54.640 |
that's audio, video, or even embedded in robotics, as we saw in my last video. But let's just take 00:16:00.240 |
audio and video. We learned a few days ago that OpenAI have had their Voice Engine system 00:16:05.920 |
since 2022. Basically, you can feed it 15 seconds of someone's voice and it can then 00:16:12.080 |
imitate that voice with high fidelity. Now, if you have lost your voice due to illness, 00:16:17.440 |
this is simply incredible. And I've already demonstrated what ElevenLabs can do before on this 00:16:22.720 |
channel. But of course, a system like this comes with risks. This is how good the system was at 00:16:28.320 |
imitating your voice two years ago. Here's the real person's voice. "Force is a push or pull 00:16:35.120 |
that can make an object move, stop, or change direction. Imagine you're riding a bike down a 00:16:41.440 |
hill. First, the push you give off the ground is the force that gets you going." And here is the 00:16:48.000 |
generated audio of that person saying whatever you'd like. In this case, let's take biology. 00:16:54.080 |
"Some of the most amazing habitats on Earth are found in the rainforest. A rainforest is a place 00:16:59.520 |
with a lot of precipitation, and it has many kinds of animals, trees, and other plants. 00:17:05.840 |
Tropical rainforests are usually not too far from the equator and are warm all year." 00:17:10.960 |
And as Noam Brown has said, yes him again, if you haven't disabled voice authentication for 00:17:15.920 |
your bank account and had a conversation with your family about AI voice impersonation yet, 00:17:20.720 |
now would be a good time. My only question is what are banks going to use? Not your voice and 00:17:26.400 |
definitely not your handwriting. As I talked about back in January, AI can mimic your handwriting 00:17:31.680 |
perfectly. And not your face, right? Because we all know about deepfakes. Well, maybe a video of 00:17:37.360 |
you, but I think almost all of us know about the progress that's being made in photorealistic 00:17:43.120 |
text to video. I'm going to show you an extract from what I think is actually quite a beautiful 00:17:48.160 |
video prompted by an artist, but generated by Sora from OpenAI. "Literally filled with hot air. 00:17:54.640 |
Yeah, living like this has its challenges. Uh, windy days for one are particularly troublesome. 00:18:01.040 |
Well, there was a one time my girlfriend insisted I go to the cactus store to get my uncle Jerry a 00:18:06.240 |
wedding present. What do I love most about my predicament? The perspective it gives me, 00:18:13.760 |
you know, I get to see the world differently. I float above the mundane and the ordinary. I 00:18:18.640 |
see things a different way from everyone else." The creators, if that's the right word to use, 00:18:23.360 |
of that clip were Shy Kids. The company, not the children, of course. They said, 00:18:28.000 |
"As great as Sora is at generating things that appear real, what excites us is its ability to 00:18:33.600 |
make things that are totally surreal." And that's a tough one, isn't it? Because I think that clip 00:18:37.920 |
really showcases how you can be creative with AI. I would indeed call that art, but I can easily see 00:18:44.400 |
the risks to the economic value of artists' work at the same time. Here's how the term AI went down 00:18:50.720 |
at one recent artist conference and festival. But let me know what you think, not only about 00:19:14.800 |
AI's impact on art, but whether you agree with me that we are, in a sense, going through a Stargate. 00:19:21.440 |
We don't know how the world will be transformed by AGI. And when it's created and we've stepped 00:19:27.120 |
through the portal, it's hard to see a way back. I'd love to know your thoughts and as ever, 00:19:32.480 |
thank you so much for watching to the end and have a wonderful day.