Why Does OpenAI Need a 'Stargate' Supercomputer? Ft. Perplexity CEO Aravind Srinivas
00:00:00.000 |
Why does OpenAI need Microsoft to build a $100 billion Stargate supercomputer? 00:00:07.520 |
I'm going to try to give you the answer to that question in this video, 00:00:11.120 |
which in turn will give you insight into the next one to four years of AI development. 00:00:16.160 |
I'll also draw on a discussion I had last night with the Perplexity founder and former OpenAI 00:00:21.920 |
researcher Aravind Srinivas about what kind of breakthroughs this will lead to 00:00:26.480 |
and what AGI timelines he now has. And no, this is not just about Sora 00:00:31.600 |
and the OpenAI Voice Engine. This is about manufacturing intelligence at hard-to-imagine 00:00:37.440 |
scales. This report, by the way, from The Information, came from three different sources, 00:00:42.560 |
one of whom spoke to Sam Altman and another who viewed some of Microsoft's initial cost estimates. 00:00:48.720 |
And to give you some context, if that data center's cost were a country's GDP, it would rank as 00:00:55.040 |
the 64th richest country in the world. This supercomputer would likely be based 00:01:00.160 |
in the desert somewhere in the US and would launch around 2028. 00:01:04.800 |
Some other stages of the wider plan, though, will come online as soon as this year. 00:01:10.480 |
And again, before we get to why they're doing this, let me give you a sense of the scale. 00:01:15.760 |
The Stargate supercomputer would produce orders of magnitude more computing power 00:01:21.440 |
than what Microsoft currently supplies to OpenAI. Notice the plural orders of magnitude. 00:01:27.040 |
An order of magnitude is a 10x increase, so orders of magnitude would be at least a 100x increase. 00:01:34.320 |
And to give you one little spoiler, more computing power more or less directly 00:01:37.840 |
correlates to increased capabilities for the frontier AI models. 00:01:42.400 |
In even simpler terms, a hundred times more is a lot. 00:01:46.480 |
But why did that sentence begin with an if? If Stargate moves forward? 00:01:50.640 |
Well, the previous paragraph said this. Microsoft's willingness to go ahead with 00:01:55.040 |
the Stargate plan depends in part on OpenAI's ability to meaningfully improve the capabilities 00:02:02.000 |
of its AI. Whether that hinges on GPT-4.5, likely coming in the spring, or GPT-5, 00:02:08.960 |
which many people are now agreeing with me will come at the end of this year or possibly the 00:02:13.520 |
beginning of next, we don't know. My prediction, by the way, is that OpenAI will meaningfully 00:02:18.400 |
improve the capabilities of its AI, and part of my proof is in this video, and therefore Stargate 00:02:23.440 |
will go ahead. One source said that such a project is absolutely required for artificial 00:02:28.800 |
general intelligence. That's the kind of intelligence that you would feel comfortable 00:02:32.400 |
hiring for most jobs. And the timelines for this data center dovetail quite nicely with my own 00:02:38.400 |
prediction for the first demonstration of an artificial general intelligence system. 00:02:44.000 |
Now, I know many of you will react to that and say AGI is definitely coming this year. Of course, 00:02:48.880 |
it depends on definitions, but let me give you a word from Aravind Srinivas, 00:02:53.360 |
the founder of the newly minted unicorn Perplexity. 00:02:57.280 |
That's why you should always ask, okay, if you are actually really close to AGI, 00:03:00.880 |
if it is the case that AGI is five years away, why are you hiring so many people right now? 00:03:05.040 |
If we are really truly getting close to AGI, why are you not benefiting from AGI yourself? 00:03:09.360 |
What, like OpenAI's hiring 30 people or 50 people a month, 100 people a month, 00:03:14.240 |
at that rate, they're going to hire like thousands a year. And over five years, 00:03:17.120 |
they would have had a company with 5,000, 10,000 employees. So why couldn't you do it with 100 if AGI 00:03:22.160 |
is truly there? How many people do you really need anymore? These are the kind of questions 00:03:25.600 |
you should ask. And honestly, like someone has to physically go and maintain the cluster, 00:03:29.680 |
make these decisions on which GPUs to use, what happens when these nodes fail, 00:03:33.520 |
like systems crash, and write all these heuristic rules to deal with all these things. 00:03:37.360 |
If something goes wrong in production code, like who has to go and work on the backend servers, 00:03:41.840 |
can all these be done by an AI now? Obviously not. Every time, the definition of AGI gets 00:03:46.320 |
narrower and narrower, and it feels like narrow AI and not AGI. You see my point? 00:03:50.400 |
You should ask, then, when will we not have an executive assistant? And maybe that day, 00:03:55.120 |
we can say we have something like an AGI. Back to the article though, and let me do my 00:03:58.800 |
first mini detour. I noticed a slight mathematical discrepancy in that this data center, Stargate, 00:04:04.880 |
will produce orders of magnitude, as I said, at least 100x more computing power. But in terms of actual 00:04:10.240 |
energy, it will need the same amount of watts as what's needed to run several large data centers 00:04:16.320 |
today. Now, of course, that's a lot, but wouldn't you need even more power than that to run 00:04:20.880 |
something that's going to give us at least 100x more computing power? Well, just for a few seconds, 00:04:26.560 |
let me bring you this chart from the chairman of TSMC. That's the company that makes around 90% 00:04:32.720 |
of the world's most advanced chips. And one key number comes at the top: energy-efficient 00:04:37.760 |
performance improves 3x every two years. So straight from TSMC, we get the projection 00:04:44.320 |
that in four years, 2028, chips will be almost 10 times more energy efficient. I thought that's 00:04:50.480 |
super interesting, but in case you're getting a little bit bored, where did the name Stargate 00:04:54.400 |
come from? Well, the codename originated with OpenAI named for the sci-fi film in which scientists 00:05:00.240 |
develop a device for traveling between galaxies. And I actually agree that the arrival of AGI will 00:05:05.920 |
be like humanity stepping through a portal, can't go back and the world will be changed forever. 00:05:11.840 |
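A quick sanity check on that TSMC projection, by the way: 3x energy-efficient performance every two years compounds over four years to 3^2 = 9, which is where the "almost 10 times" figure comes from. A minimal calculation (this is my own extrapolation of the stated trend, not an official TSMC formula):

```python
# Compound a per-period efficiency gain over a span of years.
# Assumes TSMC's stated trend (3x per two years) simply continues --
# a rough extrapolation, not an official roadmap calculation.
def efficiency_gain(years: float, factor: float = 3.0, period_years: float = 2.0) -> float:
    return factor ** (years / period_years)

print(efficiency_gain(4.0))  # 2024 -> 2028: 3^2 = 9, i.e. "almost 10x"
```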
But I know some of you are thinking, didn't Philip promise to say why they're building Stargate, 00:05:16.720 |
not just describe how they're building it. So let me get to the first reason, they're doing it 00:05:22.160 |
to match Google. Sam Altman has said privately that Google, one of OpenAI's biggest rivals, 00:05:27.600 |
will have more computing capacity than OpenAI in the near term. And he's also complained publicly 00:05:33.120 |
about not having as many AI server chips as he'd like. This insider chart from Semianalysis gives 00:05:39.440 |
us a glimpse of the scale of that discrepancy. Here we are newly arriving into quarter two of 2024, 00:05:46.880 |
and apparently the discrepancy is pretty stark between Google's capacity and OpenAI's. In the 00:05:52.800 |
words of Dylan Patel, Google's compute capabilities make everyone else look silly. Indeed, I remember 00:05:59.040 |
around a year ago when I said that it's likely Google who are on course to create AGI first, 00:06:04.880 |
many people laughed and said, just look at Bard. But I likened Google, and Google DeepMind specifically, 00:06:10.800 |
to an awakened giant. We have started to glimpse the power of Gemini 1.5, and Gemini 2 is 00:06:18.240 |
likely coming in June. And if you didn't realize how dependent OpenAI are on Microsoft to compete 00:06:23.600 |
with Google, how about this? The CEO of Microsoft, Satya Nadella, recently boasted that it would not 00:06:29.600 |
matter if OpenAI disappeared tomorrow. We have all of the intellectual property rights and all of the 00:06:35.680 |
capability. We have the people, we have the compute, and we have the data, we have everything. 00:06:41.280 |
We are below them, above them, and around them. It isn't only about personnel and clever algorithms, 00:06:46.960 |
it's about supercomputers, it's about Stargate. Okay, so it's to match Google, but what is the 00:06:52.080 |
next reason for building Stargate? Well, it would be to build models like GPT-7, 7.5, and 8. And yes, 00:06:59.920 |
I am well aware that we don't even have GPT-4.5, so why am I even talking about GPT-7? Well, 00:07:06.000 |
GPT-5, according to my own research, which I published in a video, is likely training around 00:07:11.120 |
now. In fact, probably finished around now. Of course, that doesn't mean we're going to get it 00:07:15.280 |
around now. They're going to release smaller versions like GPT-4.5 and they're going to do 00:07:19.520 |
safety testing. But that's the full GPT-5 likely coming at the end of this year or the beginning 00:07:24.320 |
of next. That's trained on current generation hardware, I would say maybe a hundred thousand 00:07:29.200 |
H100s. But this year and next year, the report says, Microsoft has planned to provide OpenAI 00:07:35.440 |
with servers housing hundreds of thousands of GPUs in total. And one former Googler and director 00:07:42.080 |
at Y Combinator leaked this. He spoke to a Microsoft engineer on the GPT-6 training cluster project. 00:07:49.360 |
That engineer apparently complained about the pain they were having essentially setting up 00:07:53.360 |
links between GPUs in different regions. And naturally he asked, why not just locate the 00:07:58.320 |
cluster in one region? And the Microsoft employee said, oh yeah, we tried that first. We can't put 00:08:04.320 |
more than a hundred thousand H100s in a single state without bringing down the power grid. So 00:08:10.480 |
clearly it's going to be multiple hundred thousand H100s or B100s. Check out my previous video on 00:08:17.120 |
GPT-6. But then we have a smaller phase four supercomputer for OpenAI that aims to launch 00:08:22.960 |
around 2026. Now, of course, the naming schemes might go out the window by this point, but you 00:08:28.320 |
can see why I think that the Stargate supercomputer for 2028 might be for GPT-7.5 or GPT-8. And it's not 00:08:36.160 |
like OpenAI aren't repeatedly telling us that scale is the way to get to AGI. Here's one of 00:08:42.400 |
their star researchers, Noam Brown saying recently that he wished every AI startup founder would read 00:08:48.320 |
the bitter lesson. Now I might do a video on that essay someday, but basically it says that it's not 00:08:53.440 |
about encoding human expert knowledge into the model. It's about building relatively simple 00:08:58.160 |
algorithms and then just scaling them up as much as you can. It's a bitter lesson because human 00:09:03.200 |
expertise and data become progressively less relevant to the model's performance. Just like 00:09:08.240 |
our bitter experience of seeing AlphaGo, which was trained in part on human expert performance 00:09:13.440 |
in Go, being superseded by AlphaZero, which wasn't, likewise for human data on the path to AGI. 00:09:21.200 |
Here's Andrej Karpathy, until fairly recently, a star OpenAI researcher speaking about a week ago. 00:09:26.800 |
Because the current models are just like not good enough. And I think there are big rocks to be 00:09:30.720 |
turned here. And I think people still haven't really seen what's possible in this space at all. 00:09:36.720 |
And roughly speaking, I think we've done step one of AlphaGo. We've done the imitation learning part. 00:09:41.280 |
There's step two of AlphaGo, which is the RL. And people haven't done that yet. And I think it's 00:09:46.800 |
going to fundamentally, this is the part that actually made it work and made something super 00:09:50.240 |
human. But I think we just haven't done step two of AlphaGo, long story short. And we've just done 00:09:54.560 |
imitation. And I don't think that people appreciate, number one, how terrible the data collection is 00:09:58.880 |
for things like ChatGPT. Say you have a problem, like some prompt is some kind of a mathematical 00:10:02.960 |
problem. A human comes in and gives the ideal solution to that problem. The problem is that 00:10:08.560 |
the human psychology is different from the model psychology. What's easy or hard for the human are 00:10:13.760 |
different to what's easy or hard for the model. And so human kind of fills out some kind of a 00:10:18.320 |
trace that comes to the solution. But some parts of that are trivial to the model. And some parts 00:10:23.280 |
of that are a massive leap that the model doesn't understand. You're kind of just losing it. And 00:10:26.480 |
then everything else is polluted by that later. And so fundamentally what you need is the model 00:10:31.360 |
needs to practice itself how to solve these problems. It needs to figure out what works for 00:10:37.280 |
it or does not work for it. But it needs to learn that for itself based on its own capability and 00:10:41.120 |
its own knowledge. So that's number one. That's totally broken, I think. It's a good initializer, 00:10:45.280 |
though, for something agent-like. And then the other thing is we're doing reinforcement learning 00:10:48.560 |
from human feedback. But that's like a super weak form of reinforcement learning. It doesn't 00:10:52.560 |
even count as reinforcement learning, I think. So RLHF is like nowhere near, I would say, RL. 00:10:57.280 |
It's like silly. And the other thing is imitation learning is super silly. RLHF is a nice improvement, 00:11:02.480 |
but it's still silly. And I think people need to look for better ways of training these models 00:11:07.280 |
so that it's in the loop with itself and its own psychology. And I think there will probably be 00:11:11.920 |
unlocks in that direction. This echoes, again, Noam Brown, who I believe is working on OpenAI's 00:11:17.040 |
Q* system, who said, "You don't get superhuman performance by doing better imitation learning 00:11:22.640 |
on human data." And that brings me nicely to the third reason for building Stargate, 00:11:28.240 |
doing longer inference, aka letting the models think for longer before they output a response. 00:11:34.800 |
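One concrete, simplified way to picture "letting the model think for longer" is to spend extra inference compute sampling many independent reasoning chains and taking a majority vote over their answers, the self-consistency idea. The sketch below is purely illustrative: `solve` is a hypothetical stand-in for sampling one chain of thought from a model, not any real API.

```python
import random
from collections import Counter

def solve(correct_answer: int = 42, accuracy: float = 0.6,
          rng: random.Random = random.Random(0)) -> int:
    """Hypothetical noisy solver: right 60% of the time, else a random guess."""
    if rng.random() < accuracy:
        return correct_answer
    return rng.randint(0, 100)

def majority_vote(n_samples: int) -> int:
    """Spend more inference compute (more samples) to boost reliability."""
    votes = Counter(solve() for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(majority_vote(1))   # one quick sample: often wrong
print(majority_vote(51))  # many samples plus a vote: almost always the right answer
```

The point of the sketch is the trade-off: each extra sample costs compute at answer time, which is exactly the kind of demand a Stargate-scale cluster would serve.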
In the case of AlphaGo, allowing the models to ponder or think for a minute improved the systems 00:11:40.240 |
by the equivalent of scaling those systems by 100,000x. Or in other words, GPT-5 might be 00:11:47.120 |
reminiscent of GPT-6 if we let it think for a minute. Let it think for hours and hours, or even 00:11:53.680 |
days, and we might get a new cancer drug. And before you immediately say he's just getting 00:11:58.560 |
silly now, well, check out this article from The Economist, "AI is taking over drug development." 00:12:03.920 |
Of course, there is way more detail and nuance than I can get to in this video, 00:12:07.680 |
but the conclusion was this. Generative AI and systems like AlphaFold are already significantly 00:12:14.160 |
accelerating biotechnology. And we will see in the next few years whether that will bring us 00:12:19.200 |
usable drugs. Analysts at Boston Consulting Group, they say, see signs of a fast-approaching 00:12:24.960 |
AI-enabled wave of new drugs. Indeed, drug regulators will need to up their game to 00:12:29.680 |
meet the challenge. It would be a good problem for the world to have. Of course, I asked the 00:12:34.240 |
Perplexity CEO about Q* and his predictions of the impacts of that system this year. But first, 00:12:41.120 |
a 30-second plug for AI Insiders. That's my Patreon, link in the description, where first 00:12:46.400 |
of all, you get exclusive videos. This one from a few days ago, I am particularly proud of. I 00:12:51.360 |
analyzed a new 44-page report on the so-called AI jobs apocalypse, and within 36 hours, I had 00:12:57.440 |
interviewed the author and produced this video. Trust me, I definitely dig beyond the headlines. 00:13:02.400 |
On Insiders, you can also ask questions of my forthcoming guests, and I used many of the 00:13:07.040 |
questions from Insiders when I interviewed Aravind. Our Discord, I'm proud to say, 00:13:11.120 |
also has a ton of professional best practice sharing across dozens of professions and fields. 00:13:17.600 |
Just a few hours ago, we got a new expert-led forum on semiconductors and hardware. And just 00:13:23.440 |
a few days before that, a new forum on alignment led by a Googler. We also have regional networking 00:13:29.840 |
across Europe and North America. But here's Aravind on what he believes Q* is and how soon 00:13:35.680 |
it's coming. So if you just clean up all the internet data and teach these models to go 00:13:40.240 |
through reasoning chains before writing an answer, they're going to get a lot more reliable. And then 00:13:45.840 |
you can think of models that can search over the chain of thought before giving you an answer, 00:13:49.680 |
rather than decoding a single chain of thought, this whole tree of thought concept. And then you 00:13:54.400 |
can extend that to thinking of models that will have a search over a tree and identify several 00:13:59.680 |
chains and look at the most plausible explanation based on the probabilities. Almost like how a 00:14:05.280 |
player in a Go or Chess match reasons through several different branches of moves and picks 00:14:10.720 |
the one that has the highest odds of success at winning the game. You can think of the inference 00:14:14.400 |
time itself going up. Right now, you use a system like ChatGPT, it just responds in a few seconds. 00:14:19.520 |
What if AIs are decoding with these really giant models, even bigger than GPT-4, going through 00:14:25.760 |
several chains of reasoning, several layers of depth in it, and come back to you 00:14:31.040 |
after an hour with something that feels incredibly insightful. Now, this could be called an AGI by 00:14:37.040 |
some people. I'm sure Demis or Sam would call this an AGI if it works, because the definition 00:14:37.040 |
that they would use here is something that truly surprises humans, does marvelous things. It feels like 00:14:47.520 |
AlphaGo or something where it's not something most humans would be able to come up with. It requires 00:14:52.480 |
several hours of thinking. So maybe we'll go far along those dimensions, might not replace our 00:14:57.600 |
executive assistants or sales and marketing or designers or programmers, but might feel like a 00:15:02.960 |
10x programmer, might feel like a 10x marketer. And I think that could happen. And that could 00:15:07.680 |
be a dimension where we see AGI progress in the near term. And I see maybe some breakthrough like 00:15:12.800 |
that happening in '24. So far, nothing. But at least by this time next year, I think something 00:15:19.200 |
like that will be possible. You'll see a demo where it doesn't respond immediately, but it thinks 00:15:23.200 |
for quite a long time and gets back with a really cool response. And I can't help but point out that 00:15:27.520 |
if you watch my Q* video, I said one of the stars of the new system was Lukasz Kaiser, one of the 00:15:33.440 |
co-authors of the original Transformers paper. And I would note that in this week's Wired interview, 00:15:38.160 |
when he was asked about Q*, the OpenAI PR person almost leapt across the table to silence him. 00:15:44.800 |
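The search-over-a-tree-of-thoughts idea Aravind described, expanding several candidate reasoning chains, scoring them, and keeping the most plausible, looks a lot like beam search. Here's a toy sketch of that shape; `propose_steps` is a hypothetical stand-in for a language model proposing scored next steps, not anything confirmed about Q* or OpenAI's systems.

```python
import math

def propose_steps(chain: list[str]) -> list[tuple[str, float]]:
    """Hypothetical stand-in for an LM proposing next reasoning steps
    with probabilities (here: three fixed candidates per step)."""
    depth = len(chain)
    return [(f"step{depth}-{i}", p) for i, p in enumerate([0.6, 0.3, 0.1])]

def tree_of_thought(depth: int = 3, beam_width: int = 2) -> tuple[list[str], float]:
    # Each beam entry is (chain of thought, cumulative log-probability).
    beams = [([], 0.0)]
    for _ in range(depth):
        candidates = []
        for chain, score in beams:
            for step, p in propose_steps(chain):
                candidates.append((chain + [step], score + math.log(p)))
        # Prune to the most plausible chains, like a Go player discarding weak lines.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]  # the highest-probability chain found

best_chain, best_logp = tree_of_thought()
print(best_chain)  # ['step0-0', 'step1-0', 'step2-0']
```

Note how the inference cost grows with both the search depth and the beam width, which is the connection back to compute: thinking harder at answer time is a multiplier on GPU demand.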
I definitely think I was onto something. And believe it or not, there is actually a fourth 00:15:49.120 |
reason for a Stargate-like supercomputer dominating different modalities, whether 00:15:54.640 |
that's audio, video, or even embedded in robotics, as we saw in my last video. But let's just take 00:16:00.240 |
audio and video. We learned a few days ago that OpenAI have had their Voice Engine system 00:16:05.920 |
since 2022. Basically, you can feed it 15 seconds of someone's voice and it can then 00:16:12.080 |
imitate that voice with high fidelity. Now, if you have lost your voice due to illness, 00:16:17.440 |
this is simply incredible. And I've already demonstrated what ElevenLabs can do before on this 00:16:22.720 |
channel. But of course, a system like this comes with risks. This is how good the system was at 00:16:28.320 |
imitating your voice two years ago. Here's the real person's voice. "Force is a push or pull 00:16:35.120 |
that can make an object move, stop, or change direction. Imagine you're riding a bike down a 00:16:41.440 |
hill. First, the push you give off the ground is the force that gets you going." And here is the 00:16:48.000 |
generated audio of that person saying whatever you'd like. In this case, let's take biology. 00:16:54.080 |
"Some of the most amazing habitats on Earth are found in the rainforest. A rainforest is a place 00:16:59.520 |
with a lot of precipitation, and it has many kinds of animals, trees, and other plants. 00:17:05.840 |
Tropical rainforests are usually not too far from the equator and are warm all year." 00:17:10.960 |
And as Noam Brown has said, yes him again, if you haven't disabled voice authentication for 00:17:15.920 |
your bank account and had a conversation with your family about AI voice impersonation yet, 00:17:20.720 |
now would be a good time. My only question is what are banks going to use? Not your voice and 00:17:26.400 |
definitely not your handwriting. As I talked about back in January, AI can mimic your handwriting 00:17:31.680 |
perfectly. And not your face, right? Because we all know about deepfakes. Well, maybe a video of 00:17:37.360 |
you, but I think almost all of us know about the progress that's being made in photorealistic 00:17:43.120 |
text to video. I'm going to show you an extract from what I think is actually quite a beautiful 00:17:48.160 |
video prompted by an artist, but generated by Sora from OpenAI. "Literally filled with hot air. 00:17:54.640 |
Yeah, living like this has its challenges. Uh, windy days for one are particularly troublesome. 00:18:01.040 |
Well, there was a one time my girlfriend insisted I go to the cactus store to get my uncle Jerry a 00:18:06.240 |
wedding present. What do I love most about my predicament? The perspective it gives me, 00:18:13.760 |
you know, I get to see the world differently. I float above the mundane and the ordinary. I 00:18:18.640 |
see things a different way from everyone else." The creators, if that's the right word to use, 00:18:23.360 |
of that clip were Shy Kids. The company, not the children, of course. They said, 00:18:28.000 |
"As great as Sora is at generating things that appear real, what excites us is its ability to 00:18:33.600 |
make things that are totally surreal." And that's a tough one, isn't it? Because I think that clip 00:18:37.920 |
really showcases how you can be creative with AI. I would indeed call that art, but I can easily see 00:18:44.400 |
the risks to the economic value of artists' work at the same time. Here's how the term AI went down 00:18:50.720 |
at one recent artist conference and festival. But let me know what you think, not only about 00:19:14.800 |
AI's impact on art, but whether you agree with me that we are, in a sense, going through a Stargate. 00:19:21.440 |
We don't know how the world will be transformed by AGI. And when it's created and we've stepped 00:19:27.120 |
through the portal, it's hard to see a way back. I'd love to know your thoughts and as ever, 00:19:32.480 |
thank you so much for watching to the end and have a wonderful day.