
Why Does OpenAI Need a 'Stargate' Supercomputer? Ft. Perplexity CEO Aravind Srinivas


Whisper Transcript

00:00:00.000 | Why does OpenAI need Microsoft to build a $100 billion Stargate supercomputer?
00:00:07.520 | I'm going to try to give you the answer to that question in this video,
00:00:11.120 | which in turn will give you insight into the next one to four years of AI development.
00:00:16.160 | I'll also draw on a discussion I had last night with the Perplexity founder and former OpenAI
00:00:21.920 | researcher Aravind Srinivas about what kind of breakthroughs this will lead to
00:00:26.480 | and what AGI timelines he now has. And no, this is not just about Sora
00:00:31.600 | and the OpenAI voice engine. This is about manufacturing intelligence at hard-to-imagine
00:00:37.440 | scales. This report, by the way, from The Information, came from three different sources,
00:00:42.560 | one of whom spoke to Sam Altman and another who viewed some of Microsoft's initial cost estimates.
00:00:42.560 | And to give you some context, if that data center were a country, then its cost, treated as GDP, would make
00:00:55.040 | it the 64th richest country in the world. This supercomputer would likely be based
00:01:00.160 | in the desert somewhere in the US and would launch around 2028.
00:01:04.800 | Some other stages of the wider plan, though, will come online as soon as this year.
00:01:10.480 | And again, before we get to why they're doing this, let me give you a sense of the scale.
00:01:15.760 | If it moves forward, the Stargate supercomputer would produce orders of magnitude more computing power
00:01:21.440 | than what Microsoft currently supplies to OpenAI. Notice the plural: orders of magnitude.
00:01:27.040 | An order of magnitude is a 10x increase, so orders of magnitude would be at least a 100x increase.
00:01:34.320 | And to give you one little spoiler, more computing power more or less directly
00:01:37.840 | correlates with increased capabilities for frontier AI models.
00:01:42.400 | In even simpler terms, a hundred times more is a lot.
00:01:46.480 | But why did that sentence begin with an if? If Stargate moves forward?
00:01:50.640 | Well, the previous paragraph said this. Microsoft's willingness to go ahead with
00:01:55.040 | the Stargate plan depends in part on OpenAI's ability to meaningfully improve the capabilities
00:02:02.000 | of its AI. Whether that hinges on GPT-4.5, likely coming in the spring, or GPT-5,
00:02:08.960 | which many people are now agreeing with me will come at the end of this year or possibly the
00:02:13.520 | beginning of next, we don't know. My prediction, by the way, is that OpenAI will meaningfully
00:02:18.400 | improve the capabilities of its AI, and part of my proof is in this video, and therefore Stargate
00:02:23.440 | will go ahead. One source said that such a project is absolutely required for artificial
00:02:28.800 | general intelligence. That's the kind of intelligence that you would feel comfortable
00:02:32.400 | hiring for most jobs. And the timelines for this data center dovetail quite nicely with my own
00:02:38.400 | prediction for the first demonstration of an artificial general intelligence system.
00:02:44.000 | Now, I know many of you will react to that and say AGI is definitely coming this year. Of course,
00:02:48.880 | it depends on definitions, but let me give you a word from Aravind Srinivas,
00:02:53.360 | the founder of the newly minted unicorn Perplexity.
00:02:57.280 | That's why you should always ask, okay, if you are actually really close to AGI,
00:03:00.880 | if it is the case that AGI is five years away, why are you hiring so many people right now?
00:03:05.040 | If we are really truly getting close to AGI, why are you not benefiting from AGI yourself?
00:03:09.360 | What, like, OpenAI is hiring 30 people or like 50 people a month, 100 people a month,
00:03:14.240 | at that rate, they're going to hire like thousands a year. And over five years,
00:03:17.120 | they would have had a company with 5,000, 10,000 employees. So why couldn't you do it with 100 if AGI
00:03:22.160 | is truly there? How many people do you really need anymore? These are the kind of questions
00:03:25.600 | you should ask. And honestly, like someone has to physically go and maintain the cluster,
00:03:29.680 | make these decisions on which GPUs to use, what happens when these nodes fail,
00:03:33.520 | like systems crash, and write all these heuristic rules to deal with all these things.
00:03:37.360 | If something goes wrong in production code, like who has to go and work on the backend servers,
00:03:41.840 | can all these be done by an AI now? Obviously not. Every time, the definition of AGI gets
00:03:46.320 | narrower and narrower, and it feels like narrow AI and not AGI. You see my point?
00:03:50.400 | You should ask, then: when will we not have an executive assistant? And maybe that day,
00:03:55.120 | we can say we have something like an AGI. Back to the article though, and let me do my
00:03:58.800 | first mini detour. I noticed a slight mathematical discrepancy in that this data center, Stargate,
00:04:04.880 | will produce orders of magnitude, as I said, 100x more computing power. But in terms of actual
00:04:10.240 | energy, it will need about as many watts as it takes to run several large data centers
00:04:16.320 | today. Now, of course, that's a lot, but wouldn't you need even more power than that to run
00:04:20.880 | something that's going to give us at least 100x more computing power? Well, just for a few seconds,
00:04:26.560 | let me bring you this chart from the chairman of TSMC. That's the company that makes around 90%
00:04:32.720 | of the world's most advanced chips. And one key number comes at the top: energy-efficient
00:04:37.760 | performance improves 3x every two years. So straight from TSMC, we get the projection
00:04:44.320 | that in four years, by 2028, chips will be almost 10 times more energy efficient.
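A quick sanity check on that compounding, my own arithmetic rather than anything on TSMC's slide:

```latex
% 3x better performance-per-watt every two years compounds as 3^{t/2}
% over t years, so across the four years from 2024 to 2028:
\[
  \frac{E(2028)}{E(2024)} = 3^{4/2} = 3^{2} = 9 \approx 10\times
\]
```

That also helps square the mini detour above: call it roughly 10x from chip efficiency, with the rest of the way toward 100x presumably coming from the sheer size of the site, hence "several large data centers" worth of watts.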
00:04:50.480 | I thought that's super interesting, but in case you're getting a little bit bored, where did the name Stargate
00:04:54.400 | come from? Well, the codename originated with OpenAI, named for the sci-fi film in which scientists
00:05:00.240 | develop a device for traveling between galaxies. And I actually agree that the arrival of AGI will
00:05:05.920 | be like humanity stepping through a portal: we can't go back, and the world will be changed forever.
00:05:11.840 | But I know some of you are thinking, didn't Philip promise to say why they're building Stargate,
00:05:16.720 | not just describe how they're building it? So let me get to the first reason: they're doing it
00:05:22.160 | to match Google. Sam Altman has said privately that Google, one of OpenAI's biggest rivals,
00:05:27.600 | will have more computing capacity than OpenAI in the near term. And he's also complained publicly
00:05:33.120 | about not having as many AI server chips as he'd like. This insider chart from SemiAnalysis gives
00:05:39.440 | us a glimpse of the scale of that discrepancy. Here we are newly arriving into quarter two of 2024,
00:05:46.880 | and apparently the discrepancy is pretty stark between Google's capacity and OpenAI's. In the
00:05:52.800 | words of Dylan Patel, Google's compute capabilities make everyone else look silly. Indeed, I remember
00:05:59.040 | around a year ago when I said that it's likely Google who are on course to create AGI first,
00:06:04.880 | many people laughed and said, just look at Bard. But I likened Google, and Google DeepMind specifically,
00:06:10.800 | to an awakened giant. We have started to glimpse the power of Gemini 1.5, and Gemini 2 is
00:06:18.240 | likely coming in June. And if you didn't realize how dependent OpenAI are on Microsoft to compete
00:06:23.600 | with Google, how about this? The CEO of Microsoft, Satya Nadella, recently boasted that it would not
00:06:29.600 | matter if OpenAI disappeared tomorrow: "We have all of the intellectual property rights and all of the
00:06:35.680 | capability. We have the people, we have the compute, and we have the data, we have everything.
00:06:41.280 | We are below them, above them, and around them." It isn't only about personnel and clever algorithms,
00:06:46.960 | it's about supercomputers, it's about Stargate. Okay, so it's to match Google, but what is the
00:06:52.080 | next reason for building Stargate? Well, it would be to build models like GPT-7, 7.5, and 8. And yes,
00:06:59.920 | I am well aware that we don't even have GPT-4.5, so why am I even talking about GPT-7? Well,
00:07:06.000 | GPT-5, according to my own research, which I published in a video, is likely training around
00:07:11.120 | now. In fact, probably finished around now. Of course, that doesn't mean we're going to get it
00:07:15.280 | around now. They're going to release smaller versions like GPT-4.5 and they're going to do
00:07:19.520 | safety testing. But that's the full GPT-5 likely coming at the end of this year or the beginning
00:07:24.320 | of next. That's trained on current generation hardware, I would say maybe a hundred thousand
00:07:29.200 | H100s. But this year and next year, the report says, Microsoft has planned to provide OpenAI
00:07:35.440 | with servers housing hundreds of thousands of GPUs in total. And one former Googler and director of
00:07:42.080 | Y Combinator leaked this. He spoke to a Microsoft engineer on the GPT-6 training cluster project.
00:07:49.360 | That engineer apparently complained about the pain they were having essentially setting up
00:07:53.360 | links between GPUs in different regions. And naturally he asked, why not just locate the
00:07:58.320 | cluster in one region? And the Microsoft employee said, oh yeah, we tried that first. We can't put
00:08:04.320 | more than a hundred thousand H100s in a single state without bringing down the power grid.
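To put a rough number on that grid claim, here's a back-of-envelope sketch. The 700 W figure is NVIDIA's published TDP for the SXM H100; the overhead and PUE multipliers are illustrative assumptions of mine, not numbers from the report.

```python
# Back-of-envelope power draw of a 100,000-GPU H100 cluster.
# The TDP is NVIDIA's public spec; the two multipliers below are
# assumptions for this sketch, not reported figures.

H100_TDP_WATTS = 700    # NVIDIA H100 SXM thermal design power
NUM_GPUS = 100_000      # the cluster size from the anecdote
NODE_OVERHEAD = 1.3     # assumed: host CPUs, NICs, fans per server
PUE = 1.2               # assumed power usage effectiveness of the site

gpu_mw = H100_TDP_WATTS * NUM_GPUS / 1e6
site_mw = gpu_mw * NODE_OVERHEAD * PUE

print(f"GPUs alone: {gpu_mw:.0f} MW")   # ~70 MW
print(f"Whole site: {site_mw:.0f} MW")  # ~110 MW
```

On those assumptions, a single site draws on the order of 100 megawatts, comparable to the output of a small power station, which makes the engineer's complaint at least plausible.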
00:08:10.480 | So clearly it's going to be multiple hundred thousand H100s or B100s. Check out my previous video for
00:08:17.120 | GPT-6. But then we have a smaller phase four supercomputer for OpenAI that aims to launch
00:08:22.960 | around 2026. Now, of course, the naming schemes might go out the window by this point, but you
00:08:28.320 | can see why I think that the Stargate supercomputer for 2028 might be GPT-7.5, GPT-8. And it's not
00:08:36.160 | like OpenAI aren't repeatedly telling us that scale is the way to get to AGI. Here's one of
00:08:42.400 | their star researchers, Noam Brown, saying recently that he wished every AI startup founder would read
00:08:48.320 | The Bitter Lesson. Now I might do a video on that essay someday, but basically it says that it's not
00:08:53.440 | about encoding human expert knowledge into the model. It's about building relatively simple
00:08:58.160 | algorithms and then just scaling them up as much as you can. It's a bitter lesson because human
00:09:03.200 | expertise and data become progressively less relevant to the model's performance. Just like
00:09:08.240 | our bitter experience of seeing AlphaGo, which was trained in part on human expert performance
00:09:13.440 | in Go, being superseded by AlphaZero, which wasn't, likewise for human data on the path to AGI.
00:09:21.200 | Here's Andrej Karpathy, until fairly recently a star OpenAI researcher, speaking about a week ago.
00:09:26.800 | Because the current models are just like not good enough. And I think there are big rocks to be
00:09:30.720 | turned here. And I think people still haven't really seen what's possible in this space at all.
00:09:36.720 | And roughly speaking, I think we've done step one of AlphaGo. We've done the imitation learning part.
00:09:41.280 | There's step two of AlphaGo, which is the RL. And people haven't done that yet. And I think it's
00:09:46.800 | going to fundamentally, this is the part that actually made it work and made something super
00:09:50.240 | human. But I think we just haven't done step two of AlphaGo, long story short. And we've just done
00:09:54.560 | imitation. And I don't think that people appreciate, number one, how terrible the data collection is
00:09:58.880 | for things like ChatGPT. Say you have a problem, like some prompt is some kind of a mathematical
00:10:02.960 | problem. A human comes in and gives the ideal solution to that problem. The problem is that
00:10:08.560 | the human psychology is different from the model psychology. What's easy or hard for the human are
00:10:13.760 | different from what's easy or hard for the model. And so the human kind of fills out some kind of a
00:10:18.320 | trace that comes to the solution. But some parts of that are trivial to the model. And some parts
00:10:23.280 | of that are a massive leap that the model doesn't understand. You're kind of just losing it. And
00:10:26.480 | then everything else is polluted by that later. And so fundamentally what you need is the model
00:10:31.360 | needs to practice itself how to solve these problems. It needs to figure out what works for
00:10:37.280 | it or does not work for it. But it needs to learn that for itself based on its own capability and
00:10:41.120 | its own knowledge. So that's number one. That's totally broken, I think. It's a good initializer,
00:10:45.280 | though, for something agent-like. And then the other thing is we're doing reinforcement learning
00:10:48.560 | from human feedback. But that's like a super weak form of reinforcement learning. It doesn't
00:10:52.560 | even count as reinforcement learning, I think. So RLHF is like nowhere near, I would say, RL.
00:10:57.280 | It's like silly. And the other thing is imitation learning is super silly. RLHF is a nice improvement,
00:11:02.480 | but it's still silly. And I think people need to look for better ways of training these models
00:11:07.280 | so that it's in the loop with itself and its own psychology. And I think there will probably be
00:11:11.920 | unlocks in that direction.
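To make that concrete, here's a minimal toy sketch of the "practice it yourself" loop Karpathy is gesturing at. Every name in it is a stand-in invented for illustration, not anyone's actual training code: the model samples its own attempts and keeps only the verifiably correct ones, instead of fitting a human-written trace.

```python
import random

# Toy self-practice loop: sample the model's own attempts and keep
# only those a verifier confirms. A real system would then fine-tune
# on the kept traces; this sketch just collects them.

def toy_generate(problem):
    # Stand-in for the model attempting a problem (here, addition),
    # correct only some of the time.
    a, b = problem
    return a + b + random.choice([0, 0, 1, -1])

def verifier(problem, answer):
    # Ground-truth check, e.g. a math grader or a unit test.
    a, b = problem
    return answer == a + b

def self_practice(problem, attempts=16):
    # Keep the attempts that actually worked *for the model*; these,
    # not human traces, become its training data.
    return [ans for ans in (toy_generate(problem) for _ in range(attempts))
            if verifier(problem, ans)]

print(self_practice((2, 3)))   # e.g. [5, 5, 5, 5, 5, 5, 5]
```

That's roughly the flavor of rejection-sampling-style fine-tuning; the RL step Karpathy contrasts with RLHF would go further and reward the model's own trajectories directly.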
00:11:17.040 | This echoes, again, Noam Brown, who I believe is working on OpenAI's Q* system, who said, "You don't get superhuman performance by doing better imitation learning
00:11:22.640 | on human data." And that brings me nicely to the third reason for building Stargate,
00:11:28.240 | doing longer inference, aka letting the models think for longer before they output a response.
00:11:34.800 | In the case of AlphaGo, allowing the models to ponder or think for a minute improved the systems
00:11:40.240 | by the equivalent of scaling those systems by 100,000x. Or in other words, GPT-5 might be
00:11:47.120 | reminiscent of GPT-6 if we let it think for a minute; let it think for hours and hours or even
00:11:53.680 | days, and we might get a new cancer drug. And before you immediately say he's just getting
00:11:58.560 | silly now, well, check out this article from The Economist, "AI is taking over drug development."
00:12:03.920 | Of course, there is way more detail and nuance than I can get to in this video,
00:12:07.680 | but the conclusion was this. Generative AI and systems like AlphaFold are already significantly
00:12:14.160 | accelerating biotechnology. And we will see in the next few years whether that will bring us
00:12:19.200 | usable drugs. The analysts at Boston Consulting Group, they say, see signs of a fast-approaching
00:12:24.960 | AI-enabled wave of new drugs. Indeed, drug regulators will need to up their game to
00:12:29.680 | meet the challenge. It would be a good problem for the world to have. Of course, I asked the
00:12:34.240 | Perplexity CEO about Q* and his predictions of the impacts of that system this year. But first,
00:12:41.120 | a 30-second plug for AI Insiders. That's my Patreon, link in the description, where first
00:12:46.400 | of all, you get exclusive videos. This one from a few days ago, I am particularly proud of. I
00:12:51.360 | analyzed a 44-page new report on the so-called AI jobs apocalypse, and within 36 hours, I had
00:12:57.440 | interviewed the author and produced this video. Trust me, I definitely dig beyond the headlines.
00:13:02.400 | On Insiders, you can also ask questions of my forthcoming guests, and I used many of the
00:13:07.040 | questions from Insiders when I interviewed Aravind. Our Discord, I'm proud to say,
00:13:11.120 | also has a ton of professional best practice sharing across dozens of professions and fields.
00:13:17.600 | Just a few hours ago, we got a new expert-led forum on semiconductors and hardware. And just
00:13:23.440 | a few days before that, a new forum on alignment led by a Googler. We also have regional networking
00:13:29.840 | across Europe and North America. But here's Aravind on what he believes Q* is and how soon
00:13:35.680 | it's coming. So if you just clean up all the internet data and teach these models to go
00:13:40.240 | through reasoning chains before writing an answer, they're going to get a lot more reliable. And then
00:13:45.840 | you can think of models that can search over the chain of thought before giving you an answer,
00:13:49.680 | rather than decoding a single chain of thought, this whole tree of thought concept. And then you
00:13:54.400 | can extend that to thinking of models that will have a search over a tree and identify several
00:13:59.680 | chains and look at the most plausible explanation based on the probabilities. Almost like how a
00:14:05.280 | player in a Go or Chess match reasons through several different branches of moves and picks
00:14:10.720 | the one that has the highest odds of success at winning the game. You can think of the inference
00:14:14.400 | time itself going up. Right now, you use a system like ChatGPT, it just responds in a few seconds.
00:14:19.520 | What if AIs are decoding with these really giant models, even bigger than GPT-4, going through
00:14:25.760 | several chains of reasoning, several layers of depth in it, and coming back to you
00:14:31.040 | after an hour with something that feels incredibly insightful? Now, this could be called an AGI by
00:14:37.040 | some people. I'm sure Demis or Sam would call this an AGI if it works, because the definition
00:14:41.840 | that they would use here is something that truly surprises humans. Marvelous things. It feels like
00:14:47.520 | AlphaGo or something where it's not something most humans would be able to come up with. It requires
00:14:52.480 | several hours of thinking. So maybe we'll go far along those dimensions, might not replace our
00:14:57.600 | executive assistants or sales and marketing or designers or programmers, but might feel like a
00:15:02.960 | 10x programmer, might feel like a 10x marketer. And I think that could happen. And that could
00:15:07.680 | be a dimension where we see AGI progress in the near term. And I see maybe some breakthrough like
00:15:12.800 | that happening in '24. So far, nothing. But at least by this time next year, I think something
00:15:19.200 | like that will be possible. You'll see a demo where it doesn't respond immediately, but it thinks
00:15:23.200 | for quite a long time and gets back with a really cool response.
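Since Aravind sketches the idea only in words, here's a minimal code sketch of that kind of search: a beam search over chains of thought. Both helper functions are hypothetical stand-ins you would back with a language model's sampling and log-probabilities; this illustrates the concept he describes, not OpenAI's actual system.

```python
import heapq

# Minimal tree-of-thought-style search: at each depth, branch every
# partial chain into candidate next steps, score the chains, and keep
# only the most plausible few (a beam search over thoughts).

def propose_steps(chain, k=3):
    # Stand-in for sampling k candidate next thoughts from a model.
    return [f"step{len(chain)}-option{i}" for i in range(k)]

def score_chain(chain):
    # Stand-in for the model's plausibility (log-probability) estimate.
    return (hash(tuple(chain)) % 100) / 100

def tree_of_thought(depth=4, beam_width=2):
    beams = [[]]                        # start from the empty chain
    for _ in range(depth):
        candidates = [chain + [step]
                      for chain in beams
                      for step in propose_steps(chain)]
        beams = heapq.nlargest(beam_width, candidates, key=score_chain)
    return max(beams, key=score_chain)  # most plausible full chain

print(tree_of_thought())
```

The chess analogy in the quote maps on directly: propose_steps supplies the candidate moves, score_chain acts as the value estimate, and depth and beam_width are the knobs you turn up when you let the system "think" for an hour.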
00:15:27.520 | And I can't help but point out that if you watch my Q* video, I said one of the stars of the new system was Łukasz Kaiser, one of the
00:15:33.440 | co-authors of the original Transformers paper. And I would note that in this week's Wired interview,
00:15:38.160 | when he was asked about Q*, the OpenAI PR person almost leapt across the table to silence him.
00:15:44.800 | I definitely think I was onto something. And believe it or not, there is actually a fourth
00:15:49.120 | reason for a Stargate-like supercomputer: dominating different modalities, whether
00:15:54.640 | that's audio, video, or even embedded in robotics, as we saw in my last video. But let's just take
00:16:00.240 | audio and video. We learned a few days ago that OpenAI have had their Voice Engine system
00:16:05.920 | since 2022. Basically, you can feed it 15 seconds of someone's voice and it can then
00:16:12.080 | imitate that voice with high fidelity. Now, if you have lost your voice due to illness,
00:16:17.440 | this is simply incredible. And I've already demonstrated what ElevenLabs can do before on this
00:16:22.720 | channel. But of course, a system like this comes with risks. This is how good the system was at
00:16:28.320 | imitating your voice two years ago. Here's the real person's voice. "Force is a push or pull
00:16:35.120 | that can make an object move, stop, or change direction. Imagine you're riding a bike down a
00:16:41.440 | hill. First, the push you give off the ground is the force that gets you going." And here is the
00:16:48.000 | generated audio of that person saying whatever you'd like. In this case, let's take biology.
00:16:54.080 | "Some of the most amazing habitats on Earth are found in the rainforest. A rainforest is a place
00:16:59.520 | with a lot of precipitation, and it has many kinds of animals, trees, and other plants.
00:17:05.840 | Tropical rainforests are usually not too far from the equator and are warm all year."
00:17:10.960 | And as Noam Brown has said, yes him again, if you haven't disabled voice authentication for
00:17:15.920 | your bank account and had a conversation with your family about AI voice impersonation yet,
00:17:20.720 | now would be a good time. My only question is what are banks going to use? Not your voice and
00:17:26.400 | definitely not your handwriting. As I talked about back in January, AI can mimic your handwriting
00:17:31.680 | perfectly. And not your face, right? Because we all know about deepfakes. Well, maybe a video of
00:17:37.360 | you, but I think almost all of us know about the progress that's being made in photorealistic
00:17:43.120 | text-to-video. I'm going to show you an extract from what I think is actually quite a beautiful
00:17:48.160 | video prompted by an artist, but generated by Sora from OpenAI. "Literally filled with hot air.
00:17:54.640 | Yeah, living like this has its challenges. Uh, windy days for one are particularly troublesome.
00:18:01.040 | Well, there was this one time my girlfriend insisted I go to the cactus store to get my uncle Jerry a
00:18:06.240 | wedding present. What do I love most about my predicament? The perspective it gives me,
00:18:13.760 | you know, I get to see the world differently. I float above the mundane and the ordinary. I
00:18:18.640 | see things a different way from everyone else." The creators, if that's the right word to use,
00:18:23.360 | of that clip were Shy Kids. The company, not the children, of course. They said,
00:18:28.000 | "As great as Sora is at generating things that appear real, what excites us is its ability to
00:18:33.600 | make things that are totally surreal." And that's a tough one, isn't it? Because I think that clip
00:18:37.920 | really showcases how you can be creative with AI. I would indeed call that art, but I can easily see
00:18:44.400 | the risks to the economic value of artists' work at the same time. Here's how the term AI went down
00:18:50.720 | at one recent artist conference and festival. But let me know what you think, not only about
00:19:14.800 | AI's impact on art, but whether you agree with me that we are, in a sense, going through a Stargate.
00:19:21.440 | We don't know how the world will be transformed by AGI. And when it's created and we've stepped
00:19:27.120 | through the portal, it's hard to see a way back. I'd love to know your thoughts and as ever,
00:19:32.480 | thank you so much for watching to the end and have a wonderful day.