Why Does OpenAI Need a 'Stargate' Supercomputer? Ft. Perplexity CEO Aravind Srinivas


Transcript

Why does OpenAI need Microsoft to build a $100 billion Stargate supercomputer? I'm going to try to give you the answer to that question in this video, which in turn will give you insight into the next one to four years of AI development. I'll also draw on a discussion I had last night with the Perplexity founder and former OpenAI researcher Aravind Srinivas about what kind of breakthroughs this will lead to and what AGI timelines he now has.

And no, this is not just about Sora and the OpenAI Voice Engine. This is about manufacturing intelligence at hard-to-imagine scales. This report, by the way, from The Information came from three different sources, one of whom spoke to Sam Altman and another who viewed some of Microsoft's initial cost estimates.

And to give you some context: if that data center's cost were a country's GDP, it would rank as the 64th richest country in the world. This supercomputer would likely be based in the desert somewhere in the US and would launch around 2028. Some other stages of the wider plan, though, will come online as soon as this year.

And again, before we get to why they're doing this, let me give you a sense of the scale. If it moves forward, the Stargate supercomputer would produce orders of magnitude more computing power than what Microsoft currently supplies to OpenAI. Notice the plural: orders of magnitude. An order of magnitude is a 10x increase, so orders of magnitude means at least a 100x increase.

And to give you one little spoiler, more computing power more or less directly correlates to increased capabilities for the frontier AI models. In even simpler terms, a hundred times more is a lot. But why did that sentence begin with an if? If Stargate moves forward? Well, the previous paragraph said this.

Microsoft's willingness to go ahead with the Stargate plan depends in part on OpenAI's ability to meaningfully improve the capabilities of its AI. Whether that hinges on GPT-4.5, likely coming in the spring, or GPT-5, which many people now agree with me will come at the end of this year or possibly the beginning of next, we don't know.

My prediction, by the way, is that OpenAI will meaningfully improve the capabilities of its AI, and part of my proof is in this video, and therefore Stargate will go ahead. One source said that such a project is absolutely required for artificial general intelligence. That's the kind of intelligence that you would feel comfortable hiring for most jobs.

And the timelines for this data center dovetail quite nicely with my own prediction for the first demonstration of an artificial general intelligence system. Now, I know many of you will react to that and say AGI is definitely coming this year. Of course, it depends on definitions, but let me give you a word from Aravind Srinivas, the founder of the newly minted unicorn Perplexity.

That's why you should always ask, okay, if you are actually really close to AGI, if it is the case that AGI is five years away, why are you hiring so many people right now? If we are really truly getting close to AGI, why are you not benefiting from AGI yourself?

What, like OpenAI hiring 30 people or like 50 people a month, 100 people a month? At that rate, they're going to hire thousands a year. And over five years, they would have a company with 5,000, 10,000 employees. So why couldn't you do it with 100 if AGI is truly there?

How many people do you really need anymore? These are the kind of questions you should ask. And honestly, like someone has to physically go and maintain the cluster, make these decisions on which GPUs to use, what happens when these nodes fail, like systems crash, and write all these heuristic rules to deal with all these things.

If something goes wrong in production code, like who has to go and work on the backend servers, can all these be done by an AI now? Obviously not. Every time, the definition of AGI gets narrower and narrower, and it feels like narrow AI and not AGI. You see my point?

You should ask, then: when will we not have an executive assistant? And maybe that day, we can say we have something like an AGI. Back to the article though, and let me do my first mini detour. I noticed a slight mathematical discrepancy, in that this data center, Stargate, will produce orders of magnitude (as I said, at least 100x) more computing power.

But in terms of actual energy, it will need about as much power as is needed to run several large data centers today. Now, of course, that's a lot, but wouldn't you need even more power than that to run something that's going to give us at least 100x more computing power?

Well, just for a few seconds, let me bring you this chart from the chairman of TSMC. That's the company that makes around 90% of the world's most advanced chips. And one key number comes at the top: energy-efficient performance improves 3x every two years. So straight from TSMC, we get the projection that in four years, by 2028, chips will be roughly 3 x 3 = 9 times, so almost 10 times, more energy efficient.
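
To make that arithmetic explicit, here is a minimal sketch of the compounding, assuming the 3x-every-two-years rate from the TSMC chart simply holds steady through 2028 (that constancy is my assumption, not the chart's):

```python
# Compound TSMC's claimed 3x-per-two-years gain in energy-efficient performance.
# Assumption: the rate stays constant from 2024 to 2028.
base_year, target_year = 2024, 2028
gain_per_period = 3.0   # 3x better energy-efficient performance per period
period_years = 2        # each period is two years

periods = (target_year - base_year) / period_years   # 2 periods
projected_gain = gain_per_period ** periods          # 3^2 = 9

print(f"Projected efficiency gain by {target_year}: ~{projected_gain:.0f}x")
# -> roughly 9x, i.e. "almost 10 times more energy efficient"
```

That back-of-the-envelope figure goes some way toward reconciling a 100x jump in computing power with "only" several data centers' worth of power draw.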

I thought that was super interesting, but in case you're getting a little bit bored, where did the name Stargate come from? Well, the codename originated with OpenAI, named for the sci-fi film in which scientists develop a device for traveling between galaxies. And I actually agree that the arrival of AGI will be like humanity stepping through a portal: you can't go back, and the world will be changed forever.

But I know some of you are thinking, didn't Philip promise to say why they're building Stargate, not just describe how they're building it? So let me get to the first reason: they're doing it to match Google. Sam Altman has said privately that Google, one of OpenAI's biggest rivals, will have more computing capacity than OpenAI in the near term.

And he's also complained publicly about not having as many AI server chips as he'd like. This insider chart from SemiAnalysis gives us a glimpse of the scale of that discrepancy. Here we are, newly arrived in quarter two of 2024, and apparently the gap between Google's capacity and OpenAI's is pretty stark.

In the words of Dylan Patel, Google's compute capabilities make everyone else look silly. Indeed, I remember around a year ago, when I said that it's likely Google who are on course to create AGI first, many people laughed and said, just look at Bard. But I likened Google, and Google DeepMind specifically, to a waking giant.

We have started to glimpse the power of Gemini 1.5, and Gemini 2 is likely coming in June. And if you didn't realize how dependent OpenAI are on Microsoft to compete with Google, how about this? The CEO of Microsoft, Satya Nadella, recently boasted that it would not matter if OpenAI disappeared tomorrow.

"We have all of the intellectual property rights and all of the capability. We have the people, we have the compute, and we have the data, we have everything. We are below them, above them, and around them." It isn't only about personnel and clever algorithms; it's about supercomputers, it's about Stargate.

Okay, so it's to match Google, but what is the next reason for building Stargate? Well, it would be to build models like GPT-7, 7.5, and 8. And yes, I am well aware that we don't even have GPT-4.5, so why am I even talking about GPT-7? Well, GPT-5, according to my own research, which I published in a video, is likely training around now.

In fact, probably finished around now. Of course, that doesn't mean we're going to get it around now. They're going to release smaller versions like GPT-4.5 and they're going to do safety testing. But that's the full GPT-5 likely coming at the end of this year or the beginning of next.

That's trained on current-generation hardware, I would say maybe a hundred thousand H100s. But this year and next year, the report says, Microsoft has planned to provide OpenAI with servers housing hundreds of thousands of GPUs in total. And one former Googler and Y Combinator director leaked this: he spoke to a Microsoft engineer on the GPT-6 training cluster project.

That engineer apparently complained about the pain they were having essentially setting up links between GPUs in different regions. And naturally he asked, why not just locate the cluster in one region? And the Microsoft employee said, oh yeah, we tried that first. We can't put more than a hundred thousand H100s in a single state without bringing down the power grid.

So clearly it's going to be multiple hundreds of thousands of H100s or B100s for GPT-6; check out my previous video on that. But then we have a smaller, phase-four supercomputer for OpenAI that aims to launch around 2026. Now, of course, the naming schemes might go out the window by this point, but you can see why I think that the Stargate supercomputer for 2028 might be for GPT-7.5 or GPT-8.

And it's not like OpenAI aren't repeatedly telling us that scale is the way to get to AGI. Here's one of their star researchers, Noam Brown, saying recently that he wished every AI startup founder would read The Bitter Lesson. Now, I might do a video on that essay someday, but basically it says that it's not about encoding human expert knowledge into the model.

It's about building relatively simple algorithms and then just scaling them up as much as you can. It's a bitter lesson because human expertise and data become progressively less relevant to the model's performance. Just as AlphaGo, which was trained in part on human expert games of Go, was superseded by AlphaZero, which wasn't, so it may go for human data on the path to AGI.

Here's Andrej Karpathy, until fairly recently a star OpenAI researcher, speaking about a week ago. Because the current models are just like not good enough. And I think there are big rocks to be turned here. And I think people still haven't really seen what's possible in this space at all.

And roughly speaking, I think we've done step one of AlphaGo. We've done the imitation learning part. There's step two of AlphaGo, which is the RL. And people haven't done that yet. And I think it's going to fundamentally, this is the part that actually made it work and made something super human.

But I think we just haven't done step two of AlphaGo, long story short. And we've just done imitation. And I don't think that people appreciate, number one, how terrible the data collection is for things like ChatGPT. Say you have a problem, like some prompt is some kind of a mathematical problem.

A human comes in and gives the ideal solution to that problem. The problem is that the human psychology is different from the model psychology. What's easy or hard for the human are different to what's easy or hard for the model. And so human kind of fills out some kind of a trace that comes to the solution.

But some parts of that are trivial to the model. And some parts of that are a massive leap that the model doesn't understand. You're kind of just losing it. And then everything else is polluted by that later. And so fundamentally what you need is the model needs to practice itself how to solve these problems.

It needs to figure out what works for it or does not work for it. But it needs to learn that for itself based on its own capability and its own knowledge. So that's number one. That's totally broken, I think. It's a good initializer, though, for something agent-like. And then the other thing is we're doing reinforcement learning from human feedback.

But that's like a super weak form of reinforcement learning. It doesn't even count as reinforcement learning, I think. So RLHF is like nowhere near, I would say, RL. It's like silly. And the other thing is imitation learning is super silly. RLHF is a nice improvement, but it's still silly.

And I think people need to look for better ways of training these models so that it's in the loop with itself and its own psychology. And I think there will probably be unlocks in that direction. This echoes Noam Brown again, who I believe is working on OpenAI's Q* system, and who said, "You don't get superhuman performance by doing better imitation learning on human data." And that brings me nicely to the third reason for building Stargate: doing longer inference, aka letting the models think for longer before they output a response.

In the case of AlphaGo, allowing the model to ponder or think for a minute improved the system by the equivalent of scaling it up by 100,000x. Or in other words, GPT-5 might be reminiscent of GPT-6 if we let it think for a minute; let it think for hours and hours or even days, and we might get a new cancer drug.
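
OpenAI haven't published how that extra "thinking" would actually be spent, but the simplest way to trade inference compute for quality is best-of-N sampling: draw many candidate chains of thought and keep the highest-scoring one. Here is a minimal, purely illustrative sketch; generate_answer and score_answer are hypothetical stand-ins for a model call and a verifier, not any real OpenAI API:

```python
import random

def generate_answer(prompt: str) -> str:
    # Hypothetical stand-in for sampling one chain of thought from a model.
    return f"candidate solution #{random.randint(0, 999_999)}"

def score_answer(prompt: str, answer: str) -> float:
    # Hypothetical stand-in for a verifier or reward model scoring the answer.
    return random.random()

def best_of_n(prompt: str, n: int) -> str:
    """Spend n model calls of 'thinking time' and return the best-scoring candidate."""
    candidates = [generate_answer(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score_answer(prompt, ans))

# More thinking time = larger n = more inference compute spent per question.
print(best_of_n("Propose a mechanism for this drug target...", n=64))
```

The point is only that inference-time search is a dial you can keep turning up, which is one reason a 2028-scale cluster matters even after training is done.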

And before you immediately say he's just getting silly now, well, check out this article from The Economist, "AI is taking over drug development." Of course, there is way more detail and nuance than I can get to in this video, but the conclusion was this. Generative AI and systems like AlphaFold are already significantly accelerating biotechnology.

And we will see in the next few years whether that will bring us usable drugs. Analysts at Boston Consulting Group, they say, see signs of a fast-approaching AI-enabled wave of new drugs. Indeed, drug regulators will need to up their game to meet the challenge. It would be a good problem for the world to have.

Of course, I asked the Perplexity CEO about Q* and his predictions of the impacts of that system this year. But first, a 30-second plug for AI Insiders. That's my Patreon, link in the description, where first of all, you get exclusive videos. This one from a few days ago, I am particularly proud of.

I analyzed a 44-page new report on the so-called AI jobs apocalypse, and within 36 hours, I had interviewed the author and produced this video. Trust me, I definitely dig beyond the headlines. On Insiders, you can also ask questions of my forthcoming guests, and I used many of the questions from Insiders when I interviewed Aravind.

Our Discord, I'm proud to say, also has a ton of professional best practice sharing across dozens of professions and fields. Just a few hours ago, we got a new expert-led forum on semiconductors and hardware. And just a few days before that, a new forum on alignment led by a Googler.

We also have regional networking across Europe and North America. But here's Aravind on what he believes Q* is and how soon it's coming. So if you just clean up all the internet data and teach these models to go through reasoning chains before writing an answer, they're going to get a lot more reliable.

And then you can think of models that can search over the chain of thought before giving you an answer, rather than decoding a single chain of thought, this whole tree of thought concept. And then you can extend that to thinking of models that will have a search over a tree and identify several chains and look at the most plausible explanation based on the probabilities.

Almost like how a player in a Go or chess match reasons through several different branches of moves and picks the one that has the highest odds of success at winning the game. You can think of the inference time itself going up. Right now, you use a system like ChatGPT and it just responds in a few seconds.

What if AIs are decoding with these really giant models, even bigger than GPT-4, going through several chains of reasoning, several layers of depth, and coming back to you after an hour with something that feels incredibly insightful? Now, this could be called an AGI by some people.

I'm sure Demis or Sam would call this an AGI if it works, because the definition that they would use here is something that truly surprises humans, marvelous things. It feels like AlphaGo or something, where it's not something most humans would be able to come up with. It requires several hours of thinking.

So maybe we'll go far along those dimensions, might not replace our executive assistants or sales and marketing or designers or programmers, but might feel like a 10x programmer, might feel like a 10x marketer. And I think that could happen. And that could be a dimension where we see AGI progress in the near term.

And I see maybe some breakthrough like that happening in '24. So far, nothing. But at least by this time next year, I think something like that will be possible. You'll see a demo where it doesn't respond immediately, but it thinks for quite a long time and gets back with a really cool response.
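
To make the tree-of-thought idea Aravind describes a little more concrete, here is a minimal beam-search sketch over partial reasoning chains. propose_steps and estimate_value are hypothetical placeholders for model calls; this is my illustration of the general concept, not OpenAI's unpublished Q* method:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Chain:
    steps: list = field(default_factory=list)  # partial chain of thought
    value: float = 0.0                          # estimated plausibility so far

def propose_steps(prompt: str, chain: Chain, k: int) -> list:
    # Hypothetical: ask the model for k candidate next reasoning steps.
    return [f"step {len(chain.steps) + 1}, option {i}" for i in range(k)]

def estimate_value(prompt: str, chain: Chain) -> float:
    # Hypothetical: a value model scoring how promising the partial chain looks,
    # much like a position evaluator in Go or chess. Random here as a placeholder.
    return random.random()

def tree_of_thought(prompt: str, depth: int = 3, beam: int = 4, branch: int = 4) -> Chain:
    frontier = [Chain()]
    for _ in range(depth):
        expanded = []
        for chain in frontier:
            for step in propose_steps(prompt, chain, branch):
                new_chain = Chain(chain.steps + [step])
                new_chain.value = estimate_value(prompt, new_chain)
                expanded.append(new_chain)
        # Keep only the most plausible chains, like a player pruning bad branches.
        frontier = sorted(expanded, key=lambda c: c.value, reverse=True)[:beam]
    return frontier[0]

best = tree_of_thought("Sketch a proof of this conjecture...")
print(best.steps)
```

The deeper and wider you let a search like this run, the more inference compute you burn per query, which is the "responds after an hour" scenario he's describing.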

And I can't help but point out that if you watch my Q* video, I said one of the stars of the new system was Łukasz Kaiser, one of the co-authors of the original transformer paper. And I would note that in this week's Wired interview, when he was asked about Q*, the OpenAI PR person almost leapt across the table to silence him.

I definitely think I was onto something. And believe it or not, there is actually a fourth reason for a Stargate-like supercomputer: dominating different modalities, whether that's audio, video, or even models embedded in robotics, as we saw in my last video. But let's just take audio and video. We learned a few days ago that OpenAI have had their Voice Engine system since 2022.

Basically, you can feed it 15 seconds of someone's voice and it can then imitate that voice with high fidelity. Now, if you have lost your voice due to illness, this is simply incredible. And I've already demonstrated what ElevenLabs can do before on this channel. But of course, a system like this comes with risks.

This is how good the system was at imitating your voice two years ago. Here's the real person's voice. "Force is a push or pull that can make an object move, stop, or change direction. Imagine you're riding a bike down a hill. First, the push you give off the ground is the force that gets you going." And here is the generated audio of that person saying whatever you'd like.

In this case, let's take biology. "Some of the most amazing habitats on Earth are found in the rainforest. A rainforest is a place with a lot of precipitation, and it has many kinds of animals, trees, and other plants. Tropical rainforests are usually not too far from the equator and are warm all year." And as Noam Brown has said, yes him again, if you haven't disabled voice authentication for your bank account and had a conversation with your family about AI voice impersonation yet, now would be a good time.

My only question is: what are banks going to use? Not your voice, and definitely not your handwriting; as I talked about back in January, AI can mimic your handwriting perfectly. And not your face, right? Because we all know about deepfakes. Well, maybe a video of you, but I think almost all of us know about the progress that's being made in photorealistic text-to-video.

I'm going to show you an extract from what I think is actually quite a beautiful video, prompted by an artist but generated by Sora from OpenAI. "Literally filled with hot air. Yeah, living like this has its challenges. Uh, windy days, for one, are particularly troublesome. Well, there was one time my girlfriend insisted I go to the cactus store to get my uncle Jerry a wedding present.

What do I love most about my predicament? The perspective it gives me, you know, I get to see the world differently. I float above the mundane and the ordinary. I see things a different way from everyone else." The creators, if that's the right word to use, of that clip were Shy Kids.

The company, not the children, of course. They said, "As great as Sora is at generating things that appear real, what excites us is its ability to make things that are totally surreal." And that's a tough one, isn't it? Because I think that clip really showcases how you can be creative with AI.

I would indeed call that art, but I can easily see the risks to the economic value of artists' work at the same time. Here's how the term AI went down at one recent artist conference and festival. But let me know what you think, not only about AI's impact on art, but whether you agree with me that we are, in a sense, going through a Stargate.

We don't know how the world will be transformed by AGI. And when it's created and we've stepped through the portal, it's hard to see a way back. I'd love to know your thoughts and as ever, thank you so much for watching to the end and have a wonderful day.