Why We Don’t Need More Data Centers - Dr. Jasper Zhang, Hyperbolic

00:00:22.400 |
And so my topic is why we don't need more data centers. 00:00:36.420 |
But just building data centers alone can't solve the problem. 00:00:41.640 |
So wait, before we get started, let me introduce myself. 00:00:50.440 |
I did my math PhD at UC Berkeley, finished my PhD in two years, 00:00:54.200 |
which made me the fastest person to do so in the history of Berkeley. 00:00:59.460 |
So after that, I worked at Citadel Securities, 00:01:02.260 |
trying to use AI and machine learning to predict the market 00:01:12.240 |
Because everyone knows that compute is actually 00:01:21.580 |
will cost you millions of dollars per year. 00:01:26.760 |
be solved by not just building more data centers, 00:01:33.120 |
So let's get started with the problem that we're facing. 00:01:39.460 |
So everyone knows that AI is going to integrate 00:01:46.540 |
So the demand for GPUs as well as data centers is exploding. 00:01:51.380 |
According to McKinsey, by 2030 we'll need 4x more data centers, built 00:01:58.780 |
in one quarter of the time it took us to build the existing ones. 00:02:04.060 |
But what if I tell you that you actually don't need 00:02:15.200 |
Right now, the capacity of data centers is 55 gigawatts. 00:02:22.220 |
In the median scenario, we're going to see 22% annual growth 00:02:29.640 |
So in 2030, we're going to need 219 gigawatts. 00:02:38.300 |
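[Editor's note] The three figures quoted above (55 gigawatts today, 22% annual growth, 219 gigawatts in 2030) can be checked against each other with a short compound-growth calculation. The sketch below assumes the 55-gigawatt baseline refers to 2023, which gives seven years of growth; the talk does not state the baseline year, so `YEARS` is an assumption. Under that assumption the projection comes out close to the quoted 2030 figure and to the "4x" multiple mentioned earlier.

```python
# Compound-growth check of the capacity figures quoted in the talk.
BASE_GW = 55.0   # current capacity, as quoted
GROWTH = 0.22    # median-scenario annual growth, as quoted
YEARS = 7        # ASSUMPTION: baseline year is 2023, target year is 2030


def project(base_gw: float, growth: float, years: int) -> float:
    """Capacity after `years` of constant annual growth."""
    return base_gw * (1.0 + growth) ** years


projected = project(BASE_GW, GROWTH, YEARS)
multiple = projected / BASE_GW

print(f"projected capacity: {projected:.0f} GW")
print(f"multiple of today's capacity: {multiple:.1f}x")
```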
However, there are a lot of challenges in building data centers. 00:02:46.440 |
For the first Stargate data center, 00:02:49.780 |
it takes more than a billion dollars to build. 00:02:52.860 |
And then also, it's very slow to connect a data center 00:02:58.160 |
For example, right now, the wait list is about seven years. 00:03:02.280 |
So you need to wait seven years to connect a 100-megawatt 00:03:05.420 |
facility to the electrical grid in Northern Virginia. 00:03:11.720 |
And then, it also consumes a lot of energy. 00:03:17.360 |
So currently, 4% of the total electricity 00:03:21.320 |
consumption in the US goes to just GPUs and data centers. 00:03:24.700 |
And also, it's not very environmentally sustainable. 00:03:28.500 |
If you look at the numbers, that's a crazy amount of CO2 emissions 00:03:34.520 |
And even if we deliver all the data centers 00:03:39.380 |
on time, there will still be a data center supply deficit 00:03:43.540 |
of more than 15 gigawatts in the US alone by 2030. 00:03:48.720 |
And so it means that just building data centers 00:03:54.740 |
On the other hand, we think the GPU utilization 00:04:01.980 |
So according to Deloitte, GPUs sit idle 80% of the time 00:04:11.920 |
According to SemiAnalysis, there exist 100-plus GPU clouds. 00:04:20.740 |
A lot of you guys need GPUs, but you can't find them. 00:04:25.000 |
Or you are going to pay an extremely high price. 00:04:28.420 |
On the other hand, there are a lot of GPUs sitting idle in data centers 00:04:34.240 |
And so naturally, the solution that we think we should build is 00:04:39.880 |
a GPU marketplace or aggregation layer that 00:04:43.160 |
aggregates different data centers and GPU providers to solve the problem 00:04:50.280 |
So it doesn't necessarily need to be Hyperbolic, 00:04:52.480 |
but I just use Hyperbolic as an example to show you. 00:04:57.760 |
So I can just share what we're trying to solve. 00:05:03.040 |
So we're building this global orchestration layer. 00:05:06.100 |
We invented software called HyperDOS, which is short for Hyperbolic 00:05:17.280 |
So for any cluster, once it installs our software, within five 00:05:22.460 |
minutes the data center becomes a cluster in our network. 00:05:27.280 |
And on the other side, users can rent GPUs in different ways 00:05:41.680 |
And so we see that there are several benefits. 00:05:49.080 |
One, we kind of solve the matching problem of compute. 00:05:59.560 |
So you don't need to spend too much time waiting for a data center. 00:06:04.780 |
And then third, you can have different options. 00:06:13.640 |
I mean, I don't have time to kind of put down the math 00:06:19.120 |
Basically, we can cut costs by 50% to 75%. 00:06:27.340 |
we're running a beta version of our marketplace right now. 00:06:48.900 |
and then have a uniform distribution channel, 00:06:57.020 |
The theory behind that is queuing theory. 00:07:04.180 |
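[Editor's note] The speaker does not spell out the math, so the following is only an illustrative sketch of the standard queuing-theory pooling argument, not the speaker's own model. It assumes an M/M/c queue (Poisson job arrivals, exponential service times) and uses the Erlang C formula to compare two separate 8-GPU clusters with one pooled 16-GPU cluster carrying the same total load. The cluster sizes and loads are made-up numbers for illustration.

```python
# Probability that an arriving job has to wait, in an M/M/c queue (Erlang C).
# ASSUMPTION: M/M/c is a stand-in model; the talk only says "queuing theory".


def erlang_b(servers: int, load: float) -> float:
    """Erlang B blocking probability, via the standard stable recursion."""
    b = 1.0
    for n in range(1, servers + 1):
        b = load * b / (n + load * b)
    return b


def erlang_c(servers: int, load: float) -> float:
    """Probability of waiting; `load` is offered load in Erlangs (arrival rate / service rate)."""
    if load >= servers:
        return 1.0  # unstable queue: every job waits
    b = erlang_b(servers, load)
    rho = load / servers
    return b / (1.0 - rho * (1.0 - b))


# Two isolated clusters of 8 GPUs, each at 75% utilization (hypothetical numbers)...
separate = erlang_c(servers=8, load=6.0)
# ...versus one pooled cluster of 16 GPUs carrying the combined load.
pooled = erlang_c(servers=16, load=12.0)

print(f"P(wait), separate clusters: {separate:.3f}")
print(f"P(wait), pooled cluster:    {pooled:.3f}")
```

At the same utilization, the pooled cluster makes fewer jobs wait, which is the usual argument for a shared distribution channel over isolated providers.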
Probably next time, if you're going to watch my talk, 00:07:21.840 |
So are you frustrated when you are trying to talk to-- 00:07:28.140 |
If you have talked to more than five, raise your hands. 00:07:34.140 |
to have five sales calls and try to know which GPUs are 00:07:43.360 |
So basically, by having this uniform platform, 00:07:50.260 |
no longer need to vet different data centers. 00:07:53.140 |
They just pick the one that has a high rating 00:08:22.860 |
So basically, we can think about a use case example. 00:08:34.200 |
So let's say you are a startup and you want 1,000 GPUs 00:08:39.960 |
So usually, you will just reserve these 1,000 GPUs for a year. 00:08:43.820 |
You think, I might need to use these GPUs for training. 00:08:52.340 |
And then after three months, you realize that, OK, now, 00:08:56.180 |
I have a better idea from running those experiments. 00:09:00.940 |
And now I need 1,000 more GPUs just for a month, right? 00:09:05.040 |
And then after six months, you finish your training job. 00:09:23.900 |
you basically can say, OK, I will rent 1,000 GPUs 00:09:34.280 |
I just rent an extra 1,000 GPUs for just a month. 00:09:44.720 |
I can release my idle GPUs on Hyperbolic and try to sell them 00:09:55.320 |
then you need to rent 1,000 GPUs at the beginning. 00:09:57.920 |
And then in month three, you actually need to rent 1,000 GPUs 00:10:06.660 |
and then also think about the price difference you will have, 00:10:12.340 |
you can reduce the cost from $43.8 million to $6.9 million. 00:10:21.360 |
And you also help other people to get cheaper GPUs too, 00:10:24.520 |
because you can release those idle GPUs to other people. 00:10:27.900 |
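[Editor's note] The per-GPU price behind the $43.8 million and $6.9 million figures is not stated in the transcript. As a sanity check only, the sketch below shows one set of assumptions that happens to reproduce the larger figure: 2,000 GPUs (the original 1,000 plus the extra 1,000) reserved for a full year at $2.50 per GPU-hour. Both `ASSUMED_GPUS` and `ASSUMED_RATE_USD` are guesses, not numbers from the talk. The ratio of the two quoted totals is computed directly from the quoted values.

```python
# Back-of-the-envelope check of the quoted cost figures.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

# ASSUMPTIONS (not stated in the talk): fleet size and hourly price.
ASSUMED_GPUS = 2_000
ASSUMED_RATE_USD = 2.50  # per GPU-hour

reserved_cost = ASSUMED_GPUS * HOURS_PER_YEAR * ASSUMED_RATE_USD

QUOTED_RESERVED_USD = 43_800_000     # quoted in the talk
QUOTED_MARKETPLACE_USD = 6_900_000   # quoted in the talk
savings_ratio = QUOTED_RESERVED_USD / QUOTED_MARKETPLACE_USD

print(f"reserved cost under the assumptions: ${reserved_cost:,.0f}")
print(f"quoted reserved / quoted marketplace: {savings_ratio:.1f}x")
```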
And so this is how we think that we're going to increase 00:10:36.340 |
People only think about saving, but actually, 00:10:42.580 |
By the scaling laws, we know that the more compute you spend, 00:10:49.700 |
So it's not just about saving your cost by 6x. 00:11:05.300 |
only need to rely on OpenAI and Anthropic, those closed AI 00:11:09.540 |
But now suddenly, their money becomes more valuable, 00:11:13.680 |
and they can rent as many GPUs as they want for their training. 00:11:19.080 |
And so the next step that we think a GPU marketplace will usually 00:11:23.040 |
evolve into is an all-in-one platform for different AI 00:11:30.020 |
Because what people really want is not just GPUs. 00:11:33.940 |
They want to run their different AI jobs, right? 00:11:37.060 |
So you will have AI inference, online inference, offline inference, 00:11:47.080 |
And so, yeah, here are some takeaways. 00:11:52.840 |
Basically, we don't think we should just focus on building 00:11:58.980 |
We also need to do smart allocation of the resources. 00:12:03.200 |
And then second, we can reduce your costs by building GPUs 00:12:10.260 |
And lastly, I think just focusing on building data centers 00:12:15.920 |
It costs a lot of energy and takes a lot of land. 00:12:20.420 |
We'd better reuse and recycle that idle compute 00:12:39.280 |
But then we're also launching our business card 00:12:41.100 |
and enterprise card that give you production-ready GPUs 00:12:56.040 |
Can you tell us more about the Hyperbolic OS? 00:13:01.260 |
Because I know a lot of times you have a data center, 00:13:05.260 |
How does it actually work to connect it to Hyperbolic itself? 00:13:24.520 |
But then even for your MacBook or for your other PC, 00:13:40.340 |
We call our Hyperbolic server Monarch. 00:13:50.740 |
So different Barons, they own different compute. 00:13:54.160 |
And then every time a user wants to rent a GPU, 00:14:00.160 |
And the Monarch server will send a request to the Baron. 00:14:04.140 |
And then the Baron will basically provision the machines 00:14:07.540 |
and set up the SSH instance for customers to access.
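[Editor's note] The request flow described here (a user asks to rent GPUs, Monarch forwards the request to a Baron, the Baron provisions machines and hands back SSH access) can be sketched as a toy in-process model. Only the names Monarch and Baron come from the talk. Every class, method, field, and the placeholder host name below are hypothetical and do not reflect Hyperbolic's actual software or API.

```python
# Toy model of the Monarch -> Baron rental flow. All names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lease:
    baron: str
    gpus: int
    ssh_target: str  # placeholder string, not a real endpoint


@dataclass
class Baron:
    """Stands in for a cluster-side agent that owns some compute."""
    name: str
    free_gpus: int

    def provision(self, gpus: int) -> Lease:
        if gpus > self.free_gpus:
            raise ValueError(f"{self.name} has only {self.free_gpus} free GPUs")
        self.free_gpus -= gpus
        return Lease(self.name, gpus, f"user@{self.name}.example.invalid")


@dataclass
class Monarch:
    """Stands in for the central server that routes rental requests."""
    barons: List[Baron] = field(default_factory=list)

    def register(self, baron: Baron) -> None:
        self.barons.append(baron)

    def rent(self, gpus: int) -> Optional[Lease]:
        # ASSUMPTION: first-fit placement; the talk does not describe the policy.
        for baron in self.barons:
            if baron.free_gpus >= gpus:
                return baron.provision(gpus)
        return None  # no single Baron can satisfy the request


monarch = Monarch()
small = Baron("baron-a", free_gpus=4)
large = Baron("baron-b", free_gpus=8)
monarch.register(small)
monarch.register(large)

lease = monarch.rent(gpus=8)
print(lease)
```

A real system would also need authentication, health checks, and release of GPUs when a lease ends, none of which is covered in this part of the transcript.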