
Ep17. Welcome Jensen Huang | BG2 w/ Bill Gurley & Brad Gerstner


Chapters

0:00 Introduction
1:50 The Evolution of AGI and Personal Assistants
6:03 NVIDIA's Competitive Moat
15:51 The Future of Inference and Training in AI
19:01 Building the AI Infrastructure
31:35 Inventing a New Market in an AI Future
38:40 The Impact of OpenAI
43:25 The Future of AI Models
51:21 Distributed Computing and Inference Scaling
55:54 Inference Time Reasoning and Its Importance
60:46 AI's Role in Growing Business and Improving Productivity
68:00 Ensuring Safe AI Development
72:31 The Balance of Open Source and Closed Source AI

Whisper Transcript

00:00:00.000 | what they achieved is singular, never been done before.
00:00:03.740 | Just to put in perspective, 100,000 GPUs,
00:00:06.900 | that's easily the fastest supercomputer on the planet.
00:00:10.300 | That's one cluster.
00:00:11.400 | A supercomputer that you would build
00:00:16.060 | would take normally three years to plan,
00:00:18.860 | and then they deliver the equipment,
00:00:21.780 | and it takes one year to get it all working.
00:00:25.720 | We're talking about 19 days.
00:00:29.260 | (upbeat music)
00:00:31.840 | - Jensen, nice glasses.
00:00:43.420 | - Hey, yeah, you too.
00:00:45.100 | - It's great to be with you.
00:00:46.300 | - Yeah, I got my ugly glasses on just like you.
00:00:48.460 | - Come on, those aren't ugly.
00:00:49.700 | These are pretty good.
00:00:51.060 | Do you like the red ones better?
00:00:52.360 | - There's something only your family could love.
00:00:54.420 | (laughing)
00:00:55.460 | - Well, it's Friday, October 4th.
00:00:57.700 | We're at the NVIDIA headquarters
00:00:59.020 | just down the street from Altimeter.
00:01:00.940 | - Welcome.
00:01:01.780 | - Thank you, thank you.
00:01:03.220 | And we have our investor meeting,
00:01:04.900 | our annual investor meeting on Monday,
00:01:07.220 | where we're gonna debate all the consequences of AI,
00:01:10.100 | how fast we're scaling intelligence.
00:01:11.940 | And I couldn't think of anybody better, really,
00:01:13.680 | to kick it off with than you.
00:01:15.260 | - I appreciate that.
00:01:16.380 | - As both a shareholder, as a thought partner,
00:01:18.980 | kicking ideas back and forth, you really make us smarter.
00:01:22.260 | And we're just grateful for the friendship.
00:01:24.140 | So thanks for being here.
00:01:25.220 | - Happy to be here.
00:01:26.480 | - You know, this year, the theme
00:01:28.340 | is scaling intelligence to AGI.
00:01:31.060 | And it's pretty mind-boggling
00:01:32.340 | that when we did this two years ago,
00:01:33.940 | we did it on the age of AI,
00:01:35.920 | and that was two months before ChatGPT,
00:01:38.300 | and to think about all that's changed.
00:01:39.660 | So I thought we would kick it off with a thought experiment
00:01:42.300 | and maybe a prediction.
00:01:44.020 | If I colloquially think of AGI
00:01:46.740 | as that personal assistant in my pocket.
00:01:49.600 | (laughing)
00:01:51.100 | If I think of AGI as that colloquial assistant in my pocket.
00:01:54.220 | - Oh, getting used to it.
00:01:55.060 | - Exactly.
00:01:55.980 | - Yeah.
00:01:56.820 | - You know, that knows everything about me.
00:01:59.300 | That has perfect memory of me.
00:02:00.740 | That can communicate with me.
00:02:02.940 | That can book a hotel for me,
00:02:04.260 | or maybe book a doctor's appointment for me.
00:02:06.960 | When you look at the rate of change in the world today,
00:02:09.940 | when do you think we're going to have
00:02:11.700 | that personal assistant in our pocket?
00:02:13.980 | - Soon, in some form.
00:02:17.900 | - Yeah.
00:02:18.740 | - Yeah, soon in some form.
00:02:19.940 | And that assistant will get better over time.
00:02:25.780 | That's the beauty of technology as we know it.
00:02:28.260 | So I think in the beginning it'll be quite useful,
00:02:32.220 | but not perfect.
00:02:33.260 | And then it gets more and more perfect over time,
00:02:35.140 | like all technology.
00:02:36.300 | - When we look at the rate of change,
00:02:38.260 | I think Elon has said,
00:02:39.560 | "The only thing that really matters is rate of change."
00:02:42.260 | It sure feels to us like the rate of change
00:02:45.380 | has accelerated dramatically,
00:02:47.140 | is the fastest rate of change we've ever seen
00:02:49.860 | on these questions.
00:02:50.720 | Because we've been around the rim like you
00:02:52.380 | on AI for a decade now.
00:02:55.420 | You even longer.
00:02:57.260 | Is this the fastest rate of change
00:02:59.100 | you've seen in your career?
00:03:00.500 | - It is because we've reinvented computing.
00:03:03.980 | You know, a lot of this is happening
00:03:06.620 | because we drove the marginal cost of computing down
00:03:10.420 | by 100,000X over the course of 10 years.
00:03:14.460 | Moore's law would have been about 100X.
00:03:16.980 | And we did it in several ways.
00:03:19.100 | We did it by one, introducing accelerated computing,
00:03:22.080 | taking work that is not very effective on CPUs
00:03:27.080 | and putting it on top of GPUs.
00:03:29.480 | We did it by inventing new numerical precisions.
00:03:33.320 | We did it by new architectures, inventing a tensor core.
00:03:37.160 | The way systems are formulated, NVLink,
00:03:40.880 | added insanely fast memories, HBM,
00:03:45.880 | and scaling things up with NVLink and InfiniBand,
00:03:51.640 | and working across the entire stack.
00:03:54.000 | Basically, everything that I describe
00:03:57.080 | about how NVIDIA does things
00:03:59.040 | led to a super Moore's law rate of innovation.
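A rough sanity check on that comparison, using my own arithmetic rather than anything said here: transistor scaling at 2x every 18 to 24 months compounds over a decade to roughly

$$
2^{10/2} \approx 32\times \quad\text{to}\quad 2^{10/1.5} \approx 100\times,
$$

which is where the 'about 100X' figure for Moore's law comes from, while a $100{,}000\times$ drop in the marginal cost of computing is about three orders of magnitude beyond what process scaling alone would deliver.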
00:04:03.880 | Now, the thing that's really amazing
00:04:05.920 | is that as a result of that,
00:04:07.840 | we went from human programming to machine learning.
00:04:12.320 | And the amazing thing about machine learning
00:04:13.900 | is that machine learning can learn pretty fast,
00:04:16.840 | as it turns out.
00:04:17.800 | And so as we reformulated the way we distribute computing,
00:04:22.400 | we did a lot of parallelism of all kinds, right?
00:04:26.780 | Tensor parallelism, pipeline parallelism,
00:04:28.360 | parallelism of all kinds.
00:04:30.480 | And we became good at inventing new algorithms
00:04:35.480 | on top of that, and new training methods,
00:04:38.600 | and all of this invention is compounding
00:04:41.900 | on top of each other as a result, right?
00:04:44.280 | And back in the old days,
00:04:45.820 | if you look at the way Moore's law was working,
00:04:48.520 | the software was static.
00:04:50.640 | - Right.
00:04:51.680 | - It was pre-compiled, it was shrink-wrapped,
00:04:53.400 | put into a store, it was static.
00:04:55.480 | And the hardware underneath was growing at Moore's law rate.
00:04:59.640 | Now we've got the whole stack growing, right?
00:05:01.760 | Innovating across the whole stack.
00:05:03.080 | And so I think that that's the,
00:05:04.680 | now all of a sudden we're seeing scaling.
00:05:07.360 | That is extraordinary, of course.
00:05:11.020 | But we used to talk about pre-trained models
00:05:15.820 | and scaling at that level,
00:05:17.820 | and how we're doubling the model size,
00:05:20.620 | and therefore, appropriately,
00:05:21.980 | doubling the data size.
00:05:23.540 | And as a result, the computing capacity necessary
00:05:26.780 | is increasing by a factor of four every year.
00:05:29.500 | - Right.
00:05:30.320 | - That was a big deal.
00:05:31.160 | - Right.
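A quick sketch of the arithmetic behind that factor of four, assuming the common approximation that pre-training compute scales with parameters times training tokens (roughly $C \approx 6ND$ FLOPs for transformers):

$$
C \propto N \cdot D, \qquad N \to 2N,\; D \to 2D \;\Rightarrow\; C \to 4C,
$$

so doubling both the model size and the data size each generation quadruples the compute required.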
00:05:32.000 | - But now we're seeing scaling with post-training,
00:05:35.180 | and we're seeing scaling at inference.
00:05:37.060 | Isn't that right?
00:05:37.900 | - Right.
00:05:38.740 | - And so people used to think that pre-training
00:05:40.660 | was hard and inference was easy.
00:05:43.120 | Now everything is hard.
00:05:44.440 | - Right, right.
00:05:45.280 | - Which is kind of sensible.
00:05:46.520 | The idea that all of human thinking is one shot
00:05:51.020 | is kind of ridiculous.
00:05:52.720 | And so there must be a concept of fast thinking,
00:05:55.280 | and slow thinking, and reasoning, and reflection,
00:05:58.600 | and iteration, and simulation, and all that.
00:06:01.360 | And that now it's coming in.
00:06:02.880 | - Yeah.
00:06:03.800 | I think to that point,
00:06:05.000 | one of the most misunderstood things about NVIDIA
00:06:08.080 | is how deep the true NVIDIA moat is, right?
00:06:11.540 | I think there's a notion out there
00:06:12.860 | that as soon as someone invents a new chip,
00:06:17.060 | a better chip, that they've won.
00:06:19.900 | But the truth is you've been spending the past decade
00:06:22.320 | building the full stack from the GPU, to the CPU,
00:06:25.260 | to the networking, and especially the software
00:06:27.740 | and libraries that enable applications to run on NVIDIA.
00:06:31.460 | - Yeah.
00:06:32.300 | - So I think you spoke to that.
00:06:34.000 | But when you think about NVIDIA's moat today, right?
00:06:39.000 | Do you think NVIDIA's moat today is greater
00:06:42.440 | or smaller than it was three to four years ago?
00:06:45.840 | - Well, I appreciate you recognizing
00:06:49.520 | how computing has changed.
00:06:50.800 | In fact, the reason why people thought,
00:06:53.560 | and many still do, that you designed a better chip,
00:06:57.440 | it has more flops, has more flips, and flops,
00:07:00.440 | and bits, and bytes, you know what I'm saying?
00:07:02.400 | - Yeah.
00:07:03.240 | - You see their keynote slides,
00:07:05.920 | and it's got all these flips and flops,
00:07:07.400 | and bar charts, and things like that.
00:07:09.560 | And that's all good.
00:07:10.480 | I mean, look, horsepower does matter.
00:07:13.640 | - Yes.
00:07:14.460 | - So these things fundamentally do matter.
00:07:17.040 | However, unfortunately, that's old thinking.
00:07:22.040 | It is old thinking in the sense that the software
00:07:25.200 | was some application running on Windows,
00:07:29.120 | and the software is static.
00:07:30.840 | - Right.
00:07:31.660 | - Which means that the best way for you
00:07:33.940 | to improve the system is just making faster and faster chips.
00:07:38.100 | But we realized that machine learning
00:07:41.780 | is not human programming.
00:07:43.400 | Machine learning is not about just the software.
00:07:48.060 | It's about the entire data pipeline.
00:07:50.020 | It's about, in fact, the flywheel of machine learning
00:07:53.740 | is the most important thing.
00:07:54.580 | So how do you think about enabling this flywheel
00:07:59.100 | on the one hand, and enabling data scientists
00:08:02.820 | and researchers to be productive in this flywheel?
00:08:06.340 | And that flywheel starts at the very, very beginning.
00:08:11.220 | A lot of people don't even realize
00:08:13.340 | that it takes AI to curate data to teach an AI.
00:08:18.340 | And that AI alone is pretty complicated.
00:08:20.820 | - And is that AI itself is improving?
00:08:22.740 | Is it also accelerating?
00:08:24.660 | You know, again, when we think about
00:08:26.380 | the competitive advantage, right?
00:08:28.580 | It's combinatorial of all these systems.
00:08:30.500 | - It's exactly, exactly.
00:08:32.140 | And that's exactly what I was gonna lead to.
00:08:34.020 | Because of smarter AIs to curate the data,
00:08:38.180 | we now even have synthetic data generation
00:08:40.740 | and all kinds of different ways of curating data,
00:08:43.920 | presenting data to it, and so before you even get to training,
00:08:47.460 | you've got massive amounts of data processing involved.
00:08:50.900 | And so people think about, oh, PyTorch,
00:08:54.480 | that's the beginning and end of the world,
00:08:55.820 | and it was very important.
00:08:57.580 | But don't forget, before PyTorch, there's a lot of work.
00:09:00.700 | After PyTorch, there's a lot of work.
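A minimal sketch of the kind of before-and-after-PyTorch curation work being described, just to make it concrete. The thresholds and the quality_score() function are hypothetical placeholders, not NVIDIA's actual pipeline:

```python
# Sketch of a data-curation pass that runs before any framework-level training step.
# quality_score() stands in for a model-based scorer ("AI curating data for an AI").
import hashlib

def quality_score(text: str) -> float:
    # Placeholder heuristic: lexical diversity as a crude proxy for quality.
    return min(1.0, len(set(text.split())) / 256)

def curate(raw_docs):
    seen = set()
    for doc in raw_docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest in seen:            # exact-duplicate removal
            continue
        seen.add(digest)
        if len(doc) < 200:            # drop fragments too short to be useful
            continue
        if quality_score(doc) < 0.3:  # drop low-quality documents
            continue
        yield doc                     # only now does data reach the training framework

curated = list(curate(["raw web text ..."] * 1000))
print(len(curated))
```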
00:09:02.780 | And the thing about the flywheel
00:09:05.260 | is really the way you ought to think.
00:09:06.980 | How do I think about this entire flywheel?
00:09:08.980 | And how do I design a computing system,
00:09:11.620 | a computing architecture that helps you take this flywheel
00:09:14.700 | and be as effective as possible?
00:09:16.660 | It's not one slice of an application, training.
00:09:21.460 | Does that make sense?
00:09:22.300 | That's just one step, okay?
00:09:24.420 | Every step along that flywheel is hard.
00:09:27.060 | And so the first thing that you should do,
00:09:30.140 | instead of thinking about, how do I make Excel faster?
00:09:33.260 | How do I make, you know, Doom faster?
00:09:35.580 | That was kind of the old days, isn't that right?
00:09:37.860 | Now you have to think about
00:09:38.700 | how do I make this flywheel faster?
00:09:40.940 | And this flywheel has a whole bunch of different steps.
00:09:43.780 | And there's nothing easy about machine learning,
00:09:45.620 | as you guys know.
00:09:46.440 | There's nothing easy about what OpenAI does, or X does,
00:09:49.060 | or Gemini and the team at DeepMind does.
00:09:51.780 | I mean, there's nothing easy about what they do.
00:09:54.140 | And so we decided, look,
00:09:56.420 | this is really what you ought to be thinking about.
00:09:58.860 | This is the entire process.
00:10:00.660 | You want to accelerate every part of that.
00:10:03.540 | You want to respect Amdahl's Law.
00:10:05.580 | You want to, Amdahl's Law would suggest,
00:10:08.100 | well, if this is 30% of the time,
00:10:10.940 | and I accelerated that by a factor of three,
00:10:13.740 | I didn't really accelerate the entire process by that much.
00:10:17.460 | Does that make sense?
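Worked out with the standard Amdahl's Law formula and the numbers in that example (a step that is 30% of the runtime, accelerated 3x):

$$
S_{\text{overall}} = \frac{1}{(1-p) + p/s} = \frac{1}{0.7 + 0.3/3} = \frac{1}{0.8} = 1.25\times,
$$

so tripling the speed of one 30% step only speeds up the whole flywheel by about 25%, which is why every step has to be accelerated.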
00:10:18.500 | And you really want to create a system
00:10:20.860 | that accelerates every single step of that,
00:10:23.300 | because only in doing the whole thing
00:10:25.060 | can you really materially improve that cycle time.
00:10:29.220 | And that flywheel, that rate of learning
00:10:33.580 | is really, in the end, what causes the exponential rise.
00:10:37.380 | And so what I'm trying to say is that our perspective about,
00:10:41.780 | you know, a company's perspective
00:10:43.140 | about what you're really doing
00:10:45.220 | manifests itself into the product.
00:10:48.100 | And notice, I've been talking about this flywheel--
00:10:50.460 | - The entire cycle, yeah.
00:10:51.580 | - That's right.
00:10:52.660 | And we accelerate everything.
00:10:54.940 | Right now, the main focus is video.
00:10:58.660 | A lot of people are focused on physical AI
00:11:02.620 | and video processing.
00:11:04.420 | Just imagine that front end.
00:11:06.180 | - Right.
00:11:07.020 | - The terabytes per second of data
00:11:10.420 | that are coming into the system.
00:11:12.740 | Give me an example of a pipeline
00:11:14.940 | that is going to ingest all of that data,
00:11:18.660 | prepare it for training in the first place.
00:11:21.020 | So that entire thing is CUDA accelerated.
00:11:23.860 | - And people are only thinking about text models today.
00:11:27.100 | - Yeah.
00:11:27.940 | - But the future is, you know, these video models,
00:11:31.020 | as well as, you know, using, you know,
00:11:33.100 | some of these text models, like O1,
00:11:35.580 | to really process a lot of that data
00:11:37.580 | before we even get there.
00:11:38.820 | - Yeah. - Right?
00:11:39.660 | - Yeah, yeah.
00:11:40.500 | Language models are gonna be involved in everything.
00:11:43.540 | It took the industry enormous technology and effort
00:11:48.020 | to train a language model,
00:11:49.060 | to train these large language models.
00:11:50.580 | Now we're using a large language model
00:11:52.060 | in every single step of the way.
00:11:53.980 | It's pretty phenomenal.
00:11:56.180 | - I don't mean to be overly simplistic about this,
00:11:58.580 | but again, you know,
00:12:00.460 | we hear it all the time from investors, right?
00:12:03.700 | Yes, but what about custom ASICs?
00:12:06.700 | Yes, but their competitive moat
00:12:08.780 | is going to be pierced by this.
00:12:10.260 | What I hear you saying is that in a combinatorial system,
00:12:14.180 | the advantage grows over time.
00:12:16.380 | So I heard you say that our advantage is greater today
00:12:20.140 | than it was three to four years ago
00:12:21.580 | because we're improving every component
00:12:24.620 | and that's combinatorial.
00:12:26.340 | Is that, you know, when you think about, for example,
00:12:29.580 | as a business case study, Intel, right?
00:12:33.220 | Who had a dominant mode, a dominant position in the stack
00:12:36.580 | relative to where you are today.
00:12:38.900 | Perhaps just, you know, again, boil it down a little bit.
00:12:41.980 | You know, compare, contrast your competitive advantage
00:12:45.580 | to maybe the competitive advantage they had
00:12:47.740 | at the peak of their cycle.
00:12:49.940 | Well, Intel is extraordinary.
00:12:53.420 | Intel is extraordinary because they were probably
00:12:56.940 | the first company that was incredibly good
00:13:01.940 | at manufacturing, process engineering, manufacturing,
00:13:07.460 | and that one click above manufacturing,
00:13:12.660 | which is building the chip.
00:13:14.420 | Right.
00:13:15.420 | And designing the chip and architecting the chip
00:13:19.460 | in the x86 architecture
00:13:23.020 | and building faster and faster x86 chips.
00:13:25.820 | That was their brilliance.
00:13:27.140 | And they fused that with manufacturing.
00:13:29.540 | Our company is a little different in the sense that,
00:13:34.220 | and we recognize this, that in fact, parallel processing
00:13:38.820 | doesn't require every transistor to be excellent.
00:13:42.140 | Serial processing requires every transistor to be excellent.
00:13:45.180 | Parallel processing requires lots and lots of transistors
00:13:48.980 | to be more cost-effective.
00:13:50.780 | I'd rather have 10 times more transistors, 20% slower,
00:13:55.780 | than 10 times less transistors, 20% faster.
00:13:59.700 | Does that make sense?
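Reading that trade-off literally, and assuming aggregate parallel throughput scales roughly as transistor count times per-transistor speed (my simplification):

$$
10 \times 0.8 = 8 \qquad\text{versus}\qquad 1 \times 1.2 = 1.2,
$$

so for parallel work the transistor-rich option gives roughly 6 to 7 times the aggregate throughput, while a purely serial workload only ever sees the single-thread clock.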
00:14:00.660 | They were like the opposite.
00:14:02.940 | And so single-threaded performance,
00:14:04.340 | single-threaded processing and parallel processing
00:14:06.780 | was very different.
00:14:07.620 | And so we observed that, in fact, our world
00:14:10.780 | is not about being better going down.
00:14:13.260 | We want to be very good, as good as we can be.
00:14:16.420 | But our world is really about much better going up.
00:14:19.780 | Parallel computing, parallel processing is hard
00:14:22.300 | because every single algorithm requires a different way
00:14:27.300 | of refactoring and re-architecting the algorithm
00:14:30.860 | for the architecture.
00:14:32.500 | What people don't realize is that you can have
00:14:35.540 | three different ISAs, CPU ISAs.
00:14:37.620 | They all have their own C compilers.
00:14:39.060 | You could take software and compile down to the ISA.
00:14:42.140 | That's not possible in accelerated computing.
00:14:43.820 | That's not possible in parallel computing.
00:14:45.580 | The company who comes up with the architecture
00:14:47.340 | has to come up with their own OpenGL.
00:14:50.700 | So we revolutionized deep learning
00:14:52.900 | because of our domain-specific library called cuDNN.
00:14:56.900 | Without cuDNN... nobody talks about cuDNN
00:14:58.660 | because it's one layer underneath PyTorch and TensorFlow
00:15:03.180 | and back in the old days, Caffe and Theano, and now Triton.
00:15:08.180 | There's a whole bunch of different frameworks.
00:15:10.900 | So that domain-specific library, cuDNN,
00:15:14.140 | a domain-specific library called OptiX,
00:15:16.580 | we have a domain-specific library called cuQuantum,
00:15:20.060 | RAPIDS, the list goes on, Aerial for--
00:15:24.380 | - Industry-specific algorithms that sit below
00:15:28.260 | that PyTorch layer that everybody's focused on.
00:15:30.180 | Like I've heard oftentimes, well, if LLMs--
00:15:33.300 | - If we didn't invent that, no application on top could work.
00:15:37.580 | You guys understand what I'm saying?
00:15:38.820 | So the mathematics is really,
00:15:40.860 | what NVIDIA is really good at is algorithm.
00:15:43.020 | That fusion between the science above,
00:15:47.060 | the architecture on the bottom,
00:15:48.940 | that's what we're really good at, yeah.
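To make that layering concrete, a minimal sketch assuming a CUDA build of PyTorch on an NVIDIA GPU: the user-facing call is ordinary PyTorch, but the convolution underneath is dispatched to cuDNN one layer down the stack.

```python
# Ordinary PyTorch code; the convolution is executed by cuDNN kernels underneath.
# Requires a CUDA-enabled PyTorch build and an NVIDIA GPU.
import torch
import torch.nn as nn

print(torch.backends.cudnn.version())   # cuDNN version this PyTorch build links against
torch.backends.cudnn.benchmark = True   # let cuDNN autotune the convolution algorithm

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device="cuda")
y = conv(x)                             # dispatched to a cuDNN convolution kernel
print(y.shape)                          # torch.Size([8, 64, 224, 224])
```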
00:15:50.900 | - There's all this attention now on inference, finally.
00:15:55.420 | But I remember two years ago, Brad and I had dinner with you
00:16:00.100 | and we asked you the question,
00:16:01.740 | "Do you think your moat will be as strong in inference
00:16:06.700 | "as it is in training?"
00:16:08.100 | - Yeah, and I'm sure I said it would be greater.
00:16:11.820 | - Yeah, yeah, and you touched upon
00:16:14.420 | a lot of these elements just now,
00:16:16.100 | just the composability between,
00:16:18.300 | or we don't know the total mix at one point,
00:16:22.220 | and to a customer, it's very important
00:16:24.180 | to be able to be flexible in between.
00:16:26.460 | - That's right.
00:16:27.700 | - But can you just touch upon,
00:16:29.300 | now that we're in this era of inference?
00:16:32.300 | - It was inference; training is inferencing at scale.
00:16:36.660 | I mean, you're right.
00:16:38.060 | And so if you train well,
00:16:42.420 | it is very likely you'll inference well.
00:16:44.540 | If you built it on this architecture
00:16:46.780 | without any consideration,
00:16:48.100 | it will run on this architecture.
00:16:50.260 | You could still go and optimize it for other architectures,
00:16:53.460 | but at the very minimum,
00:16:54.860 | since it's already been architected,
00:16:57.100 | built on NVIDIA, it will run on NVIDIA.
00:16:59.340 | Now, the other aspect, of course,
00:17:01.460 | it's just kind of capital investment aspect,
00:17:05.900 | which is when you're training new models,
00:17:08.500 | you want your best new gear to be used for training,
00:17:13.500 | which leaves behind gear that you used yesterday.
00:17:18.300 | Well, that gear is perfect for inference.
00:17:20.980 | And so there's a trail of free gear.
00:17:25.140 | There's a trail of free infrastructure
00:17:27.780 | behind the new infrastructure that's CUDA compatible.
00:17:30.780 | And so we're very disciplined
00:17:32.820 | about making sure that we're compatible throughout,
00:17:37.220 | so that everything that we leave behind
00:17:39.980 | will continue to be excellent.
00:17:41.380 | Now, we also put a lot of energy
00:17:42.820 | into continuously reinventing new algorithms,
00:17:45.340 | so that when the time comes,
00:17:48.740 | the Hopper architecture is two, three, four times better
00:17:52.580 | than when they bought it,
00:17:54.180 | so that infrastructure continues to be really effective.
00:17:57.980 | And so all of the work that we do,
00:18:00.140 | improving new algorithms, new frameworks,
00:18:02.340 | notice it helps every single install base that we have.
00:18:06.820 | Hopper is better for it, Ampere is better for it,
00:18:09.260 | even Volta is better for it, okay?
00:18:11.460 | And I think Sam was just telling me
00:18:13.380 | that they had just decommissioned the Volta infrastructure
00:18:17.420 | that they have at OpenAI recently.
00:18:18.900 | And so I think we leave behind this trail of install base.
00:18:23.380 | Just like all computing, install base matters.
00:18:25.660 | And NVIDIA's in every single cloud,
00:18:27.500 | we're on-prem and all the way out to the edge.
00:18:31.260 | And so the VILA vision language model
00:18:35.300 | that's been created in the cloud
00:18:37.460 | works perfectly at the edge on the robots,
00:18:40.540 | without modification.
00:18:41.740 | It's all CUDA compatible.
00:18:42.980 | And so I think this idea of architecture compatibility
00:18:47.980 | was important for large...
00:18:50.220 | It's no different for iPhones,
00:18:51.740 | no different for anything else.
00:18:52.900 | I think the install base is really important for inference.
00:18:55.340 | But the thing that we really benefit from
00:19:00.340 | is because we're working on training
00:19:03.540 | these large language models and the new architectures of it,
00:19:07.020 | we're able to think about how do we create architectures
00:19:11.820 | that's excellent at inference someday when the time comes.
00:19:15.260 | And so we've been thinking about iterative models
00:19:18.780 | for reasoning models,
00:19:20.900 | and how do we create very interactive inference experiences
00:19:25.900 | for this personal agent of yours.
00:19:29.300 | You don't want to say something
00:19:30.820 | and have to go off and think about it for a while.
00:19:32.220 | You want it to interact with you quite quickly.
00:19:34.500 | So how do we create such a thing?
00:19:35.620 | And what came out of it was NVLink.
00:19:37.420 | NVLink so that we could take these systems
00:19:40.860 | that are excellent for training,
00:19:42.580 | but when you're done with it,
00:19:43.700 | the inference performance is exceptional.
00:19:47.020 | And so you want to optimize for this time to first token.
00:19:52.020 | And time to first token is insanely hard to do actually,
00:19:57.020 | because time to first token requires a lot of bandwidth.
00:20:01.420 | But if your context is also rich,
00:20:03.380 | then you need a lot of flops.
00:20:07.340 | And so you need an infinite amount of bandwidth,
00:20:09.620 | infinite amount of flops at the same time
00:20:11.860 | in order to achieve just a few millisecond response time.
00:20:15.620 | And so that architecture is really hard to do.
00:20:18.460 | And we invented a Grace Blackwell NVLink for that.
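A back-of-envelope version of that bandwidth-plus-FLOPs argument. All numbers here are illustrative assumptions, not NVIDIA specs, and perfect scaling across the NVLink domain is assumed:

```python
# Rough prefill estimate behind "time to first token needs lots of FLOPs and bandwidth".
params          = 70e9      # hypothetical 70B-parameter model
context_tokens  = 32_000    # a long, "rich context" prompt
bytes_per_param = 2         # BF16 weights

prefill_flops = 2 * params * context_tokens   # ~2*N FLOPs per token rule of thumb
weight_bytes  = params * bytes_per_param      # weights read from memory at least once

def first_token_ms(flops_per_s: float, bytes_per_s: float) -> float:
    # Latency is bounded below by the slower of compute and memory traffic.
    return max(prefill_flops / flops_per_s, weight_bytes / bytes_per_s) * 1e3

one_gpu  = first_token_ms(1e15, 5e12)             # ~1 PFLOP/s, ~5 TB/s (illustrative)
nvl_pool = first_token_ms(72 * 1e15, 72 * 5e12)   # 72 GPUs pooled over NVLink

print(f"single GPU : {one_gpu:8.1f} ms")          # on the order of seconds
print(f"72-GPU pool: {nvl_pool:8.1f} ms")         # approaching interactive latency
```

A single device falls far short of a few-millisecond response on a long prompt; only the aggregate FLOPs and bandwidth of an NVLink-connected pool gets close, which is the design point being described.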
00:20:21.500 | - Right.
00:20:22.500 | In the spirit of time, I have more questions about that,
00:20:25.620 | - Don't worry about the time.
00:20:26.460 | Hey guys, hey, hey, hey, listen, Janine?
00:20:29.340 | - Yeah.
00:20:30.180 | - Look.
00:20:31.020 | - Let's do it until it's right.
00:20:31.860 | - Let's do it until right, there you go.
00:20:33.140 | - I love it, I love it.
00:20:33.980 | So, you know, I was at a dinner with Andy Jassy earlier.
00:20:38.420 | - See, now we don't have to worry about the time.
00:20:40.300 | - With Andy Jassy earlier this week.
00:20:42.900 | And Andy said, you know, we've got Trainium, you know,
00:20:45.620 | coming and Inferentia coming.
00:20:47.340 | And I think most people, again,
00:20:50.220 | view these as a problem for NVIDIA.
00:20:52.300 | But in the very next breath, he said,
00:20:54.900 | NVIDIA is a huge and important partner to us
00:20:57.820 | and will remain a huge and important partner for us.
00:21:00.700 | As far as I can see into the future,
00:21:02.860 | the world runs on NVIDIA, right?
00:21:05.300 | So when you think about the custom ASICs
00:21:08.140 | that are being built,
00:21:09.620 | that are going to go after targeted application,
00:21:12.260 | maybe the inference accelerator at Meta,
00:21:14.300 | maybe, you know, Trainium at Amazon, you know,
00:21:18.060 | or Google's TPUs.
00:21:19.460 | And then you think about the supply shortage
00:21:22.020 | that you have today.
00:21:23.620 | Do any of those things change that dynamic, right?
00:21:28.260 | Or are they complements to the systems
00:21:30.820 | that they're all buying from you?
00:21:33.100 | - We're just doing different things.
00:21:34.340 | - Yes.
00:21:35.820 | - We're trying to accomplish different things.
00:21:38.700 | You know, what NVIDIA is trying to do
00:21:39.820 | is build a computing platform for this new world,
00:21:42.820 | this machine learning world,
00:21:44.060 | this generative AI world, this agentic AI world.
00:21:46.860 | We're trying to create, you know, as you know,
00:21:49.700 | and what's just so deeply profound
00:21:52.860 | is after 60 years of computing,
00:21:54.940 | we reinvented the entire computing stack.
00:21:58.500 | The way you write software from programming
00:22:01.500 | to machine learning,
00:22:02.660 | the way that you process software from CPUs to GPU,
00:22:06.300 | the way that the applications from software
00:22:11.060 | to artificial intelligence, right?
00:22:12.700 | And so software tools to artificial intelligence.
00:22:16.620 | So every aspect of the computing stack
00:22:19.020 | and the technology stack has been changed.
00:22:21.420 | You know, what we would like to do
00:22:23.140 | is to create a computing platform
00:22:26.300 | that's available everywhere.
00:22:27.940 | And this is really the complexity of what we do.
00:22:31.100 | The complexity of what we do
00:22:32.100 | is if you think about what we do,
00:22:33.820 | we're building an entire AI infrastructure
00:22:36.460 | and we think of it as one computer.
00:22:38.380 | I've said before,
00:22:39.540 | the data center is now the unit of computing.
00:22:43.020 | To me, when I think about a computer,
00:22:44.740 | I'm not thinking about that chip.
00:22:46.220 | I'm thinking about this thing.
00:22:47.260 | That's my mental model and all the software
00:22:49.380 | and all the orchestration,
00:22:50.500 | all the machinery that's inside.
00:22:51.900 | That's my computer.
00:22:54.380 | And we're trying to build a new one every year.
00:22:56.780 | - Yeah.
00:22:58.020 | - That's insane.
00:22:59.100 | Nobody has ever done that before.
00:23:01.020 | We're trying to build a brand new one every single year.
00:23:04.140 | And every single year,
00:23:04.980 | we deliver two or three times more performance.
00:23:08.300 | As a result, every single year,
00:23:09.620 | we reduce the cost by two or three times.
00:23:11.660 | Every single year,
00:23:12.580 | we improve the energy efficiency by two or three times.
00:23:15.460 | Right?
00:23:16.300 | And so we ask our customers,
00:23:18.140 | don't buy everything at one time,
00:23:20.060 | buy a little every year.
00:23:21.700 | Okay?
00:23:22.540 | And the reason for that is
00:23:23.380 | we want them cost-averaged into the future.
00:23:25.340 | All of it's architecturally compatible.
00:23:28.700 | Okay?
00:23:29.540 | Now, so that building that alone
00:23:32.260 | at the pace that we're doing is incredibly hard.
00:23:35.740 | Now, the double part,
00:23:36.900 | the double hard part,
00:23:38.340 | is then we take that all of that,
00:23:40.100 | and instead of selling it as a infrastructure,
00:23:43.700 | or selling it as a service,
00:23:45.260 | we disaggregate all of it,
00:23:47.620 | and we integrate it into GCP.
00:23:49.860 | We integrate it into AWS.
00:23:51.980 | We integrate it into Azure.
00:23:53.580 | We integrate it into X.
00:23:55.300 | Does that make sense?
00:23:56.140 | - Yes.
00:23:57.100 | - Everybody's integration is different.
00:23:59.140 | We have to get all of our architectural libraries,
00:24:03.220 | and all of our algorithms,
00:24:04.500 | and all of our frameworks,
00:24:05.580 | and integrate it into theirs.
00:24:07.060 | We get our security system integrated into theirs.
00:24:09.460 | We get our networking integrated into theirs.
00:24:11.420 | Isn't that right?
00:24:12.260 | - Right.
00:24:13.100 | - Then we do basically 10 integrations.
00:24:15.580 | And we do this every single year.
00:24:17.140 | - Right.
00:24:18.380 | - Now, that is the miracle.
00:24:21.060 | That is the miracle.
00:24:22.300 | - Why?
00:24:23.140 | I mean, it's madness.
00:24:24.420 | It's madness that you're trying to do this every year.
00:24:26.540 | - I'm thinking about it.
00:24:27.380 | - So, what drove you to do it every year,
00:24:31.780 | and then related to that,
00:24:33.660 | Clark's just back from Taipei, and Korea, and Japan,
00:24:36.780 | when meeting with all your supply partners,
00:24:39.460 | who you have decade-long relationships with.
00:24:42.460 | How important are those relationships
00:24:45.700 | to, again, the combinatorial math
00:24:48.260 | that builds that competitive moat?
00:24:50.180 | - Yeah, when you break it down systematically,
00:24:55.860 | the more you guys break it down,
00:24:57.260 | the more everybody breaks it down,
00:24:58.780 | the more amazed that they are.
00:25:00.460 | - Yes.
00:25:01.300 | - And how is it possible
00:25:04.060 | that the entire ecosystem of electronics today
00:25:08.340 | is dedicated in working with us
00:25:10.780 | to build, ultimately, this cube of a computer
00:25:14.500 | integrated into all of these different ecosystems,
00:25:17.380 | and the coordination is so seamless?
00:25:20.420 | So, there's obviously APIs, and methodologies,
00:25:24.260 | and business processes, and design rules
00:25:27.340 | that we've propagated backwards,
00:25:29.340 | and methodologies, and architectures,
00:25:31.500 | and APIs that we've propagated forward.
00:25:34.340 | - That have been hardened for decades.
00:25:36.660 | - Hardened for decades, yeah,
00:25:37.940 | and also evolving as we go.
00:25:40.180 | But these APIs have to come together.
00:25:43.180 | - Right, right.
00:25:44.020 | - When the time comes, all these things in Taiwan,
00:25:47.060 | all over the world being manufactured,
00:25:48.820 | they're gonna land somewhere in Azure's data center,
00:25:51.460 | they're gonna come together,
00:25:52.300 | click, click, click, click, click, click.
00:25:54.580 | - Someone just calls an OpenAI API and it just works.
00:25:57.620 | - That's right, yeah, exactly.
00:25:59.820 | - Yeah, there's a whole chain.
00:26:00.660 | - It's kind of craziness, right?
00:26:01.500 | - There's a whole chain.
00:26:02.340 | - And so, that's what we invented,
00:26:03.180 | that's what we invented,
00:26:04.020 | this massive infrastructure of computing.
00:26:07.580 | The whole planet is working with us on it.
00:26:10.180 | It's integrated into everywhere.
00:26:12.540 | It's, you could sell it through Dell,
00:26:14.180 | you could sell it through HPE.
00:26:15.900 | It's hosted in the cloud.
00:26:17.980 | It's all the way out at the edge.
00:26:21.100 | People use it in robotic systems now and human robots.
00:26:25.420 | They're in self-driving cars.
00:26:27.260 | They're all architecturally compatible.
00:26:29.140 | Pretty kind of craziness.
00:26:32.460 | - It's craziness.
00:26:33.420 | - Clark, I don't want you to leave the impression
00:26:35.540 | I didn't answer the question.
00:26:36.860 | In fact, I did.
00:26:38.100 | What I meant by that when relating to your ASIC
00:26:41.340 | is the way to think about,
00:26:45.100 | we're just doing something different.
00:26:46.660 | - Yes.
00:26:48.260 | - As a company, we want to be situationally aware,
00:26:53.260 | and I'm very situationally aware
00:26:55.940 | of everything around our company and our ecosystem.
00:26:59.180 | I'm aware of all the people doing alternative things
00:27:01.540 | and what they're doing,
00:27:03.180 | and sometimes it's adversarial to us, sometimes it's not.
00:27:08.180 | I'm super aware of it.
00:27:10.380 | But that doesn't change what the purpose of the company is.
00:27:14.660 | The singular purpose of the company
00:27:16.620 | is to build an architecture,
00:27:18.220 | that a platform that could be everywhere.
00:27:23.020 | - Right.
00:27:24.500 | - That is our goal.
00:27:26.860 | We're not trying to take any share from anybody.
00:27:28.580 | NVIDIA is a market maker, not share taker.
00:27:32.260 | If you look at our company slides,
00:27:34.140 | not one day does this company talk about market share,
00:27:38.420 | not inside.
00:27:39.260 | All we're talking about is how do we create the next thing?
00:27:43.580 | What's the next problem we can solve?
00:27:45.540 | In that flywheel, how can we do a better job for people?
00:27:49.100 | How do we take that flywheel that used to take about a year,
00:27:52.380 | how do we crank it down to about a month?
00:27:54.620 | - Yes, yes.
00:27:55.740 | - What's the speed of light of that?
00:27:57.180 | Isn't that right?
00:27:58.020 | And so we're thinking about all these different things,
00:28:00.100 | but the one thing we're not,
00:28:02.460 | we're situationally aware of everything,
00:28:05.180 | but we're certain that what our mission is,
00:28:07.820 | is very singular.
00:28:09.620 | The only question is whether that mission is necessary.
00:28:12.380 | Does that make sense?
00:28:13.220 | - Yes.
00:28:14.380 | - And all companies, all great companies,
00:28:16.580 | ought to have that at its core.
00:28:18.660 | It's about what are you doing?
00:28:21.260 | - For sure.
00:28:22.100 | - The only question, is it necessary?
00:28:23.060 | Is it valuable?
00:28:24.020 | - Right.
00:28:24.860 | - Is it impactful?
00:28:25.700 | Does it help people?
00:28:26.980 | And I am certain that you're a developer,
00:28:29.900 | you're a generative AI startup,
00:28:32.180 | and you're about to decide how to become a company.
00:28:35.580 | The one choice that you don't have to make
00:28:38.260 | is which one of the ASICs do I support?
00:28:41.940 | If you just support CUDA,
00:28:43.820 | you know you could go everywhere.
00:28:45.660 | You could always change your mind later.
00:28:47.500 | - Right.
00:28:48.340 | - But we're the on-ramp to the world of AI.
00:28:50.860 | Isn't that right?
00:28:51.700 | Once you decide to come onto our platform,
00:28:54.180 | the other decisions you could defer.
00:28:56.020 | You could always build your own ASICs later.
00:28:59.220 | - Right.
00:29:00.060 | - You know, we're not against that.
00:29:00.900 | We're not offended by any of that.
00:29:02.220 | When I work with,
00:29:03.060 | when we work with all the GCPs,
00:29:05.700 | the GCPs Azure,
00:29:06.980 | we present our roadmap to them years in advance.
00:29:10.260 | They don't present their ASIC roadmap to us,
00:29:12.580 | and it doesn't ever offend us.
00:29:14.740 | Does that make sense?
00:29:15.820 | We create,
00:29:16.660 | we're in a,
00:29:17.500 | if you have a sole purpose,
00:29:18.940 | and your purpose is meaningful,
00:29:21.060 | and your mission is dear to you,
00:29:23.300 | and is dear to everybody else,
00:29:25.260 | then you could be transparent.
00:29:27.060 | Notice my roadmap is transparent at GTC.
00:29:30.180 | My roadmap goes way deeper
00:29:32.740 | to our friends at Azure,
00:29:34.500 | and AWS,
00:29:35.340 | and others.
00:29:36.860 | We have no trouble doing any of that,
00:29:38.340 | even as they're building their own ASICs.
00:29:40.460 | - I think,
00:29:41.460 | you know,
00:29:42.300 | when people observe the business,
00:29:45.260 | you said recently that the demand for Blackwell is insane.
00:29:48.940 | You said one of the hardest parts of your job
00:29:51.220 | is the emotional toll of saying no to people
00:29:54.740 | in a world that has a shortage
00:29:58.180 | of the compute that you,
00:29:59.620 | that you can produce and have on offer.
00:30:02.020 | But critics say this is just a moment in time, right?
00:30:04.540 | They say this is just like Cisco in 2000,
00:30:08.260 | we're overbuilding fiber.
00:30:09.980 | It's gonna be boom and bust.
00:30:12.300 | You know,
00:30:13.140 | I think about the start of '23 when we were having dinner.
00:30:17.140 | The forecast for NVIDIA at that dinner in January of '23
00:30:22.140 | was that you would do 26 billion of revenue
00:30:25.420 | for the year 2023.
00:30:27.020 | You did 60 billion,
00:30:28.500 | right?
00:30:29.340 | The 25 people-
00:30:30.180 | - Let's just,
00:30:31.020 | let the truth be known.
00:30:32.300 | That is the single greatest failure
00:30:35.060 | of forecasting the world has ever seen.
00:30:36.980 | - Right, right, right.
00:30:37.980 | - Can we all,
00:30:38.820 | can we all at least admit that?
00:30:40.340 | - What, what, what, what?
00:30:41.180 | To me, to me-
00:30:42.020 | - That was my takeaway.
00:30:42.860 | I just go-
00:30:43.700 | (laughing)
00:30:44.540 | - And that was,
00:30:45.380 | and that was,
00:30:46.220 | we got so excited in November 22
00:30:47.780 | because we had folks like Mustafa from Inflection
00:30:51.540 | and Noah from Character coming in our office
00:30:54.180 | talking about investing in their companies.
00:30:56.580 | And they said,
00:30:57.420 | "Well, if you can't pencil out investing in our companies,
00:31:00.180 | then buy NVIDIA."
00:31:01.100 | Because everybody in the world
00:31:03.180 | is trying to get NVIDIA chips
00:31:04.540 | to build these applications
00:31:05.900 | that are gonna change the world.
00:31:07.300 | And of course,
00:31:08.140 | the Cambrian moment occurred with ChatGPT,
00:31:10.980 | and notwithstanding that fact,
00:31:12.980 | these 25 analysts were so focused on the crypto winter
00:31:16.540 | that they couldn't get their head around an imagination
00:31:19.180 | of what was happening in the world, okay?
00:31:22.060 | So it ended up being way bigger.
00:31:24.540 | You say in very plain English,
00:31:26.580 | the demand is insane for Blackwell,
00:31:28.940 | that it's going to be that way for as far as you can,
00:31:31.660 | you know, for as far as you can see.
00:31:33.140 | Of course, the future is unknown and unknowable,
00:31:35.780 | but why are the critics so wrong
00:31:38.220 | that this isn't going to be the Cisco-like situation
00:31:42.700 | of overbuilding in 2000?
00:31:45.540 | - Yeah.
00:31:46.380 | The best way to think about the future
00:31:50.940 | is reason about it from first principles.
00:31:53.340 | - Correct.
00:31:54.180 | - Okay, so the question is,
00:31:55.660 | what are the first principles of what we're doing?
00:31:57.420 | Number one, what are we doing?
00:31:58.980 | What are we doing?
00:32:01.660 | The first thing that we are doing
00:32:03.060 | is we are reinventing computing.
00:32:05.060 | Do we not?
00:32:05.900 | We just said that.
00:32:06.940 | The way that computing will be done in the future
00:32:09.020 | will be highly machine-learned.
00:32:11.460 | - Yes.
00:32:12.300 | - Highly machine-learned, okay?
00:32:13.140 | Almost everything that we do,
00:32:14.540 | almost every single application,
00:32:16.380 | Word, Excel, PowerPoint, Photoshop, Premiere, you know,
00:32:21.380 | AutoCAD, you give me your favorite application
00:32:27.100 | that was all hand-engineered,
00:32:29.020 | I promise you it will be highly machine-learned
00:32:31.820 | in the future, isn't that right?
00:32:33.460 | And so all these tools will be,
00:32:34.620 | and on top of that, you're gonna have machines,
00:32:37.060 | agents that help you use them.
00:32:38.940 | - Right.
00:32:39.780 | - Okay?
00:32:40.620 | And so we know this for a fact at this point, right?
00:32:43.140 | Isn't that right?
00:32:43.980 | We've reinvented computing, we're not going back.
00:32:46.020 | The entire computing technology stack
00:32:47.660 | is being reinvented.
00:32:48.500 | Okay, so now that we've done that,
00:32:50.300 | we said that software is gonna be different.
00:32:52.620 | What software can write is gonna be different.
00:32:54.620 | How we use software will be different.
00:32:56.460 | So let's now acknowledge that.
00:32:59.060 | So those are my ground truth now.
00:33:01.180 | - Yes.
00:33:02.220 | - Now the question, therefore, is what happens?
00:33:05.220 | And so let's go back and let's just take a look
00:33:07.140 | at how's computing done in the past.
00:33:09.100 | So we have a trillion dollars worth of computers
00:33:10.700 | in the past.
00:33:11.540 | We look at it, just open the door,
00:33:12.860 | look at the data center,
00:33:13.700 | and you look at it and say,
00:33:14.740 | are those the computers you want doing that,
00:33:16.900 | doing that future?
00:33:17.740 | And the answer is no.
00:33:18.780 | - Right.
00:33:19.620 | - Right, you got all these CPUs back there.
00:33:20.780 | We know what it can do and what it can't do.
00:33:23.380 | And we just know that we have a trillion dollars
00:33:25.020 | worth of data centers that we have to modernize.
00:33:26.900 | And so right now, as we speak,
00:33:28.340 | if we were to have a trajectory over the next four
00:33:31.180 | or five years to modernize that old stuff,
00:33:33.820 | that's not unreasonable.
00:33:35.860 | - Right.
00:33:36.700 | - Sensible.
00:33:37.540 | So we have a trillion--
00:33:38.380 | - And you're having those conversations
00:33:39.340 | with the people who have to modernize it.
00:33:40.780 | - Yeah.
00:33:41.620 | - And they're modernizing it on GPU.
00:33:42.820 | - That's right.
00:33:43.660 | I mean, well, let's make another test.
00:33:46.340 | You have $50 billion of CapEx you'd like to spend.
00:33:50.580 | Option A, option B, build CapEx for the future.
00:33:53.940 | - Right.
00:33:54.780 | - Or build CapEx like the past.
00:33:56.340 | - Right.
00:33:57.180 | - Now you already have the CapEx of the past.
00:34:00.300 | - Right, right.
00:34:01.460 | - It's sitting right there.
00:34:02.500 | It's not getting much better anyways.
00:34:04.020 | Moore's law has largely ended.
00:34:05.380 | And so why rebuild that?
00:34:07.380 | Let's just take $50 billion, put it into generative AI.
00:34:09.780 | Isn't that right?
00:34:10.900 | And so now your company just got better.
00:34:12.700 | - Right.
00:34:13.780 | - Now, how much of that 50 billion would you put in?
00:34:15.900 | Well, I would put in 100% of the 50 billion
00:34:18.300 | because I've already got four years
00:34:19.860 | of infrastructure behind me that's of the past.
00:34:23.180 | And so now you just, I just reasoned about it
00:34:26.900 | from the perspective of somebody thinking about it
00:34:28.620 | from first principles and that's what they're doing.
00:34:30.420 | Smart people are doing smart things.
00:34:32.260 | Now the second part is this.
00:34:34.060 | So now we have a trillion dollars worth of capacity
00:34:35.860 | to go build, right?
00:34:36.700 | Trillion dollars worth of infrastructure.
00:34:37.660 | We're about, you know, call it $150 billion into it.
00:34:40.060 | - Right.
00:34:41.180 | - Okay.
00:34:42.020 | So we have a trillion dollars in infrastructure
00:34:44.980 | to go build over the next four or five years.
00:34:46.780 | Well, the second thing that we observe
00:34:48.700 | is that the way that software is written is different
00:34:53.580 | but how software is gonna be used is different.
00:34:56.420 | In the future, we're gonna have agents.
00:34:57.780 | Isn't that right?
00:34:58.620 | - Correct.
00:34:59.460 | - We're gonna have digital employees in our company.
00:35:01.140 | In your inbox, you have all these little dots
00:35:03.860 | and these little faces.
00:35:04.860 | In the future, there's gonna be icons of AIs.
00:35:08.180 | Isn't that right?
00:35:09.100 | I'm gonna be sending them.
00:35:10.660 | I'm gonna be, I'm no longer gonna program computers
00:35:13.580 | with C++.
00:35:14.740 | I'm gonna program AIs with prompting.
00:35:18.660 | Isn't that right?
00:35:19.580 | Now this is no different than me talking to my,
00:35:21.660 | you know, this morning, I wrote a bunch of emails
00:35:23.780 | before I came here.
00:35:24.660 | I was prompting my teams.
00:35:26.500 | - Of course.
00:35:27.340 | Yeah.
00:35:28.180 | - And I would describe the context.
00:35:29.460 | I would describe the fundamental constraints
00:35:32.380 | that I know of.
00:35:33.740 | And I would describe the mission for them.
00:35:35.460 | I would leave it sufficiently,
00:35:36.900 | I would be sufficiently directional
00:35:40.180 | so that they understand what I need.
00:35:41.740 | And I wanna be clear about what the outcome should be,
00:35:44.100 | as clear as I can be.
00:35:45.460 | But I leave enough ambiguous space on, you know,
00:35:48.180 | a creativity space so they can surprise me.
00:35:50.020 | Isn't that right?
00:35:50.860 | - Absolutely.
00:35:51.700 | - It's no different than how I prompt an AI today.
00:35:53.020 | - Yeah.
00:35:53.860 | - It's exactly how I prompt an AI.
00:35:55.140 | And so what's gonna happen is,
00:35:56.740 | is on top of this infrastructure of IT
00:35:59.660 | that we're gonna modernize,
00:36:01.220 | there's gonna be a new infrastructure.
00:36:03.660 | This new infrastructure is going to be AI factories
00:36:07.060 | that operate these digital humans.
00:36:10.380 | And they're gonna be running all the time, 24/7.
00:36:13.180 | - Right.
00:36:14.020 | - We're gonna have 'em for all of our companies
00:36:16.020 | all over the world.
00:36:17.380 | We're gonna have 'em in factories.
00:36:18.980 | We're gonna have 'em in autonomous systems.
00:36:20.860 | Isn't that right?
00:36:21.740 | So there's a whole layer of computing fabric,
00:36:24.860 | a whole layer of what I call AI factories
00:36:27.220 | that the world has to make
00:36:28.580 | that doesn't exist today at all.
00:36:30.300 | - Right.
00:36:31.140 | - So the question is, how big is that?
00:36:32.260 | - Right.
00:36:33.100 | - Unknowable at the moment.
00:36:34.460 | Probably a few trillion dollars.
00:36:36.180 | - Right.
00:36:37.020 | - Unknowable at the moment,
00:36:38.380 | but as we're sitting here building in,
00:36:40.580 | the beautiful thing is the architecture
00:36:42.580 | for this modernizing this new data center
00:36:45.420 | and the architecture for the AI factory is the same.
00:36:48.860 | - Right.
00:36:49.700 | - That's the nice thing.
00:36:50.540 | - And you made this clear.
00:36:52.620 | You've got a trillion of old stuff.
00:36:54.100 | You've got to modernize.
00:36:55.060 | You at least have a trillion of new AI workloads coming on.
00:36:58.380 | - Yeah.
00:36:59.220 | - Give or take, you'll do 125 billion in revenue this year.
00:37:02.860 | You know, there was, at one point somebody told you
00:37:04.820 | the company would never be worth more than a billion.
00:37:07.100 | As you sit here today, is there any reason, right,
00:37:10.540 | if you're only 125 billion out of a multi-trillion TAM,
00:37:14.580 | that you're not going to have 2X the revenue,
00:37:16.780 | 3X the revenue in the future that you have today?
00:37:20.340 | Is there any reason your revenue doesn't?
00:37:23.420 | - No.
00:37:24.260 | - Yeah.
00:37:25.100 | - Yeah.
00:37:25.940 | As you know, it's not about,
00:37:27.740 | everything is, you know, companies are only limited
00:37:33.940 | by the size of the fish pond, you know?
00:37:36.580 | - Yes, yes.
00:37:37.420 | - A goldfish can only be so big.
00:37:39.580 | And so the question is, what is our fish pond?
00:37:42.940 | What is our pond?
00:37:44.540 | And that requires a little imagination.
00:37:46.380 | And this is the reason why market makers think
00:37:50.180 | about that future, creating that new fish pond.
00:37:54.180 | It's hard to figure this out looking backwards
00:37:57.620 | and try to take share.
00:37:58.900 | - Right.
00:37:59.740 | - You know, share takers can only be so big.
00:38:01.340 | - For sure.
00:38:02.180 | - Market makers can be quite large.
00:38:03.900 | - For sure.
00:38:04.740 | - Yeah, and so, you know, I think the good fortune
00:38:07.660 | that our company has is that since the very beginning
00:38:09.900 | of our company, we had to invent the market
00:38:12.340 | for us to go swim in.
00:38:13.860 | That market, and people don't realize this back then,
00:38:16.140 | but anymore, but, you know, we were at ground zero
00:38:20.380 | of creating the 3D gaming PC market.
00:38:22.540 | - Right, right.
00:38:23.620 | - We largely invented this market and all the ecosystem
00:38:27.220 | and all the graphics card ecosystem, we invented all that.
00:38:30.380 | And so the need to invent a new market to go serve it later
00:38:35.380 | is something that's very comfortable for us.
00:38:38.900 | - Exactly, exactly.
00:38:40.060 | And speaking to somebody who's invented a new market,
00:38:42.860 | you know, let's shift gears a little bit to models
00:38:45.100 | and open AI, open AI raised, as you know,
00:38:47.860 | six and a half billion dollars this week,
00:38:50.900 | at like $150 billion valuation.
00:38:54.460 | We both participated.
00:38:56.180 | - Yeah, really happy for them,
00:38:58.420 | really happy they came together.
00:38:59.700 | - Right.
00:39:00.540 | - Yeah, they did a great stand
00:39:01.380 | and the team did a great job, yeah.
00:39:03.300 | - Reports are that they'll do 5 billion-ish of revenue
00:39:07.020 | or run rate revenue this year,
00:39:08.780 | maybe going to 10 billion next year.
00:39:11.260 | If you look at the business today,
00:39:13.220 | it's about twice the revenue as Google was
00:39:16.220 | at the time of its IPO.
00:39:18.100 | They have 250 million-- - Is that right?
00:39:19.700 | - Yeah, 250 million weekly average users,
00:39:22.700 | which we estimate is twice the amount Google had
00:39:25.420 | at the time of its IPO. - Is that right, okay, wow.
00:39:27.260 | - And if you look at the multiple of the business,
00:39:29.380 | if you believe 10 billion next year,
00:39:31.140 | it's about 15 times the forward revenue,
00:39:33.780 | which is about the multiple of Google and Meta
00:39:35.540 | at the time of their IPO, right?
00:39:37.660 | When you think about a company that had zero revenue,
00:39:41.700 | zero weekly average users 22 months ago--
00:39:44.780 | - Brad has an incredible command of history.
00:39:47.540 | - When you think about that,
00:39:49.220 | talk to us about the importance of OpenAI
00:39:53.580 | as a partner to you and OpenAI as a force
00:39:57.620 | in kind of driving forward, you know,
00:39:59.620 | kind of public awareness and usage around AI.
00:40:02.740 | - Well, this is one of the most consequential companies
00:40:07.500 | of our time.
00:40:10.380 | The, a pure play AI company
00:40:15.380 | pursuing the vision of AGI
00:40:24.180 | and whatever its definition.
00:40:28.300 | I almost don't think it matters fully
00:40:31.620 | what the definition is, nor do I,
00:40:34.420 | you know, really believe that the timing matters.
00:40:40.220 | The one thing that I know is that AI is gonna have
00:40:45.060 | a roadmap of capabilities over time.
00:40:48.300 | And that roadmap of capabilities over time
00:40:50.740 | is gonna be quite spectacular.
00:40:53.020 | And along the way, long before it even gets
00:40:57.340 | to anybody's definition of AGI,
00:40:59.660 | we're gonna put it to great use.
00:41:01.260 | All you have to do is right now as we speak,
00:41:05.540 | go talk to digital biologists,
00:41:09.460 | climate tech researchers, material researchers,
00:41:13.540 | physical sciences, astrophysicists, quantum chemists.
00:41:19.620 | You go ask video game designers,
00:41:23.580 | manufacturing engineers,
00:41:27.780 | roboticists, pick your favorite,
00:41:30.980 | whatever industry you wanna go pick.
00:41:33.540 | And you go deep in there and you talk to the people
00:41:35.780 | that matter and you ask them,
00:41:37.620 | has AI revolutionized the way you work?
00:41:41.860 | And you take those data points and you come back
00:41:44.340 | and you then get to ask yourself,
00:41:46.900 | how skeptical do you wanna be?
00:41:50.780 | Because they're not talking about AI
00:41:53.460 | as a conceptual benefit someday.
00:41:56.820 | They're talking about using AI right now.
00:42:00.220 | Right now, ag tech, material tech, climate tech,
00:42:04.700 | you pick your tech, you pick your field of science.
00:42:08.220 | They are advancing, AI is helping them advance their work
00:42:12.300 | right now as we speak.
00:42:13.860 | Every single industry, every single company,
00:42:16.540 | every university, unbelievable, isn't that right?
00:42:20.020 | - Right.
00:42:20.900 | - It is absolutely going to somehow transform business.
00:42:25.900 | We know that.
00:42:28.460 | - Right.
00:42:29.540 | - I mean, it's so tangible, you could--
00:42:32.180 | - It's happening today.
00:42:33.180 | - It's happening today.
00:42:34.020 | - It's happening today.
00:42:34.860 | - Yeah, yeah.
00:42:35.700 | And so I think that the awakening of AI,
00:42:40.700 | that ChatGPT triggered, it's completely incredible.
00:42:46.900 | And I love their velocity and their singular purpose
00:42:52.980 | of advancing this field.
00:42:56.300 | And so really, really consequential.
00:42:58.860 | And they build an economic engine
00:43:01.020 | that can finance the next frontier of models, right?
00:43:04.860 | And I think there's a growing consensus in Silicon Valley
00:43:08.580 | that the whole model layer is commoditizing.
00:43:11.140 | Llama is making it very cheap
00:43:14.980 | for lots of people to build models.
00:43:16.460 | And so early on here, we had a lot of model companies,
00:43:19.300 | Character and Inflection and Cohere and Mistral
00:43:23.100 | and go through the list.
00:43:25.180 | And a lot of people question whether or not
00:43:27.580 | those companies can build the escape velocity
00:43:31.140 | on the economic engine that can continue funding
00:43:34.420 | those next generation.
00:43:35.860 | My own sense is that there's gonna be,
00:43:38.380 | that's why you're seeing the consolidation, right?
00:43:40.820 | OpenAI clearly has hit that escape velocity.
00:43:43.180 | They can fund their own future.
00:43:45.180 | It's not clear to me that many of these other companies can.
00:43:48.820 | Is that a fair kind of review of the state of things
00:43:52.300 | in the model layer that we're going to have
00:43:53.940 | this consolidation like we have in lots of other markets
00:43:56.940 | to market leaders who can afford,
00:43:58.860 | who have an economic engine, an application
00:44:01.220 | that allows them to continue to invest?
00:44:03.860 | - First of all, there's a fundamental difference
00:44:09.860 | between a model and artificial intelligence, right?
00:44:14.180 | - Yeah.
00:44:15.020 | - A model is an essential ingredient
00:44:17.740 | for artificial intelligence.
00:44:19.100 | It's necessary, but not sufficient.
00:44:20.540 | - Correct.
00:44:21.620 | - And so, and artificial intelligence is a capability,
00:44:26.060 | but for what?
00:44:27.300 | - Right.
00:44:28.140 | - Then what's the application?
00:44:29.100 | - Right.
00:44:29.940 | - The artificial intelligence for self-driving cars
00:44:32.140 | is related to the artificial intelligence
00:44:35.100 | for humanoid robots, but it's not the same,
00:44:37.660 | which is related to the artificial intelligence
00:44:40.020 | for a chatbot, but not the same.
00:44:41.620 | - Correct.
00:44:42.460 | - And so, you have to understand the taxonomy of--
00:44:46.220 | - Stack.
00:44:47.060 | - Yeah, of the stack.
00:44:48.060 | And at every layer of the stack,
00:44:49.780 | there will be opportunities,
00:44:51.860 | but not infinite opportunities for everybody
00:44:53.580 | at every single layer of the stack.
00:44:55.460 | Now, I just said something,
00:44:57.540 | all you have to do is replace the word model with GPU.
00:45:01.340 | In fact, this was the great observation
00:45:04.700 | of our company 32 years ago,
00:45:06.900 | that there's a fundamental difference between GPU,
00:45:10.140 | graphics chip or GPU, versus accelerated computing.
00:45:15.140 | And accelerated computing is a different thing
00:45:18.460 | than the work that we do with AI infrastructure.
00:45:22.140 | It's related, but it's not exactly the same.
00:45:24.220 | It's built on top of each other.
00:45:25.540 | It's not exactly the same.
00:45:26.820 | And each one of these layers of abstraction
00:45:29.380 | requires fundamental different skills.
00:45:33.380 | Somebody who's really, really good at building GPUs
00:45:35.740 | has no clue how to be an accelerated computing company.
00:45:38.620 | There are a whole lot of people who build GPUs.
00:45:42.140 | And I don't know which one came,
00:45:44.980 | we invented the GPU,
00:45:45.900 | but you know that we're not the only company
00:45:48.620 | that makes GPUs today.
00:45:49.660 | - Correct.
00:45:50.500 | - And so, there are GPUs everywhere,
00:45:52.860 | but they're not accelerated computing companies.
00:45:55.980 | And there are a lot of people who,
00:45:57.740 | you know, they're accelerators,
00:46:00.620 | accelerators that do application acceleration,
00:46:03.980 | but that's different than an accelerated computing company.
00:46:06.300 | And so for example, a very specialized AI application.
00:46:10.500 | - Right.
00:46:11.340 | - Could be a very successful thing.
00:46:12.940 | - Correct.
00:46:13.780 | And that is MTIA.
00:46:14.780 | - That's right.
00:46:15.620 | But it might not be the type of company
00:46:17.500 | that had broad reach and broad capabilities.
00:46:20.860 | And so, you've got to decide where you want to be.
00:46:23.620 | There's opportunities probably in all these different areas,
00:46:25.900 | but like building companies,
00:46:27.420 | you have to be mindful of the shifting of the ecosystem
00:46:30.380 | and what gets commoditized over time.
00:46:32.740 | Recognizing what's a feature versus a product.
00:46:37.220 | - Right.
00:46:38.060 | - Versus a company.
00:46:38.900 | - For sure.
00:46:39.740 | - Okay.
00:46:40.580 | I just went through, okay.
00:46:41.540 | And there's a lot of different ways
00:46:42.620 | you can think about this.
00:46:44.020 | - Of course, there's one new entrant
00:46:46.140 | that has the money, the smarts, the ambition.
00:46:49.300 | That's xAI.
00:46:51.020 | - Yeah.
00:46:51.860 | - Right?
00:46:52.700 | And well, there are reports out there
00:46:54.660 | that you and Larry and Elon had dinner.
00:46:56.780 | They talked you out of 100,000 H100s.
00:47:00.060 | They went to Memphis and built a large coherent super cluster
00:47:03.660 | in a matter of months.
00:47:05.380 | - You know.
00:47:07.580 | - So first, three points don't make a line, okay.
00:47:11.660 | Yes, I had dinner with them.
00:47:13.340 | (laughing)
00:47:15.940 | Causality is there.
00:47:18.100 | What do you think about their ability
00:47:19.620 | to stand up that super cluster?
00:47:21.980 | And there's talk out there
00:47:23.220 | that they want another 100,000 H200s, right?
00:47:26.700 | To expand the size of that super cluster.
00:47:29.300 | You know, first talk to us a little bit about X
00:47:32.060 | and their ambitions and what they've achieved.
00:47:33.580 | But also, are we already at the age of clusters
00:47:37.900 | of 200,000 and 300,000 GPUs?
00:47:40.740 | - The answer is yes.
00:47:43.660 | And then the, first of all,
00:47:46.540 | acknowledgement of achievement where it's deserved.
00:47:51.860 | From the moment of concept to a data center
00:47:56.860 | that's ready for NVIDIA to have our gear there,
00:48:01.900 | to the moment that we powered it on,
00:48:05.420 | had it all hooked up, and it did its first training.
00:48:08.740 | - Yeah.
00:48:09.580 | - Okay?
00:48:10.420 | - Correct.
00:48:11.780 | - That first part, just building a massive factory,
00:48:16.780 | liquid cooled, energized, permitted,
00:48:21.500 | in the short time that was done.
00:48:23.820 | I mean, that is like superhuman.
00:48:26.940 | - Right.
00:48:27.780 | - Yeah, and as far as I know,
00:48:30.380 | there's only one person in the world who could do that.
00:48:32.420 | - Right.
00:48:33.260 | - I mean, Elon is singular in this understanding
00:48:35.860 | of engineering and construction and large systems
00:48:39.820 | and marshaling resources.
00:48:44.820 | - Incredible.
00:48:45.780 | - Yeah, just, it's unbelievable.
00:48:47.340 | And then, and of course,
00:48:50.220 | then his engineering team is extraordinary.
00:48:52.260 | I mean, the software team's great,
00:48:54.100 | the networking team's great,
00:48:55.060 | the infrastructure team is great.
00:48:56.300 | You know, Elon understands this deeply.
00:48:58.580 | And from the moment that we decided to go,
00:49:02.580 | the planning with our engineering team,
00:49:06.300 | our networking team, our infrastructure computing team,
00:49:08.940 | the software team, all of the preparation in advance,
00:49:12.140 | then all of the infrastructure, all of the logistics
00:49:17.540 | and the amount of technology and equipment that came in
00:49:21.020 | on that day, NVIDIA's infrastructure
00:49:23.740 | and computing infrastructure and all that technology,
00:49:26.420 | to training, 19 days.
00:49:28.460 | Hang on, you just, you know what?
00:49:32.540 | - Did anybody sleep 24/7?
00:49:34.980 | - No question that nobody slept.
00:49:36.860 | But first of all, 19 days is incredible,
00:49:41.700 | but it's also kind of nice to just take a step back
00:49:43.820 | and just, do you know how many days 19 days is?
00:49:46.780 | It's just a couple of weeks.
00:49:48.500 | And the amount of technology, if you were to see it,
00:49:51.700 | is unbelievable.
00:49:52.580 | All of the wiring and the networking and, you know,
00:49:55.700 | networking NVIDIA gear is very different
00:49:58.020 | than networking hyperscale data centers, okay?
00:50:00.620 | The number of wires that goes in one node,
00:50:03.500 | the back of a computer is all wires.
00:50:05.900 | And just getting this mountain of technology integrated
00:50:09.100 | and all the software, incredible.
00:50:11.060 | Yeah, so I think what Elon and the X team did,
00:50:14.700 | and I'm really appreciative that he acknowledges
00:50:19.180 | the engineering work that we did with him
00:50:21.580 | and the planning work and all that stuff.
00:50:24.300 | But what they achieved is singular, never been done before.
00:50:28.620 | Just to put in perspective, 100,000 GPUs,
00:50:31.740 | that's easily the fastest supercomputer on the planet
00:50:35.180 | as one cluster.
00:50:36.260 | A supercomputer that you would build
00:50:40.940 | would take normally three years to plan.
00:50:43.980 | - Right.
00:50:44.820 | - And then they deliver the equipment
00:50:46.620 | and it takes one year to get it all working.
00:50:51.340 | - Yes.
00:50:52.180 | - We're talking about 19 days.
00:50:53.980 | - Wow.
00:50:54.820 | - That's a credit to the NVIDIA platform, right?
00:50:57.180 | That the whole process is hardened.
00:50:59.420 | - That's right, yeah.
00:51:00.780 | Everything's already working.
00:51:02.460 | And of course there's a whole bunch of X algorithms
00:51:05.380 | and X framework and X stack and things like that.
00:51:08.580 | And we got a ton of integration we have to do,
00:51:11.580 | but the planning of it was extraordinary.
00:51:13.420 | Just pre-planning of it to, you know.
00:51:15.700 | - N of one is right.
00:51:16.860 | Elon is an N of one.
00:51:18.500 | But you answered that question by starting off saying,
00:51:20.940 | yes, 200,000 to 300,000 GPU clusters are here, right?
00:51:25.940 | Does that scale to 500,000?
00:51:31.100 | Does it scale to a million?
00:51:34.260 | And does the demand for your products
00:51:38.340 | depend on it scaling to millions?
00:51:41.980 | - That part, the last part is no.
00:51:46.220 | My sense is that distributed training will have to work.
00:51:50.140 | - Right.
00:51:50.980 | - And my sense is that distributed computing
00:51:53.500 | will be invented.
00:51:54.660 | - Right.
00:51:55.500 | - And some form of federated learning
00:51:58.340 | and distributed, asynchronous distributed computing
00:52:03.340 | is going to be discovered.
00:52:06.180 | And I'm very enthusiastic and very optimistic about that.
00:52:09.540 | Of course, the thing to realize is that
00:52:17.260 | the scaling law used to be about pre-training.
00:52:20.220 | Now we've gone to multimodality,
00:52:22.060 | we've gone to synthetic data generation.
00:52:24.020 | - Right.
00:52:25.220 | - Post-training has now scaled up incredibly.
00:52:29.740 | Synthetic data generation, reward systems,
00:52:32.140 | reinforcement learning based.
00:52:33.820 | And then now inference scaling has gone through the roof.
00:52:37.980 | - Right.
00:52:38.820 | - The idea that a model, before it answers your question,
00:52:42.460 | had already done internal inference 10,000 times,
00:52:47.460 | is probably not unreasonable.
00:52:49.460 | And it's probably done tree search,
00:52:50.980 | it's probably done reinforcement learning on that,
00:52:52.660 | it's probably done some simulations,
00:52:55.860 | surely done a lot of reflection,
00:52:57.260 | it probably looked up some data,
00:52:58.500 | it looked up some information, isn't that right?
00:53:00.420 | And so its context is probably fairly large.
00:53:02.860 | I mean, this type of intelligence is,
00:53:06.540 | well, that's what we do.
00:53:08.500 | - Right.
00:53:09.340 | - That's what we do, isn't that right?
00:53:10.740 | And so the ability, this scaling,
00:53:14.500 | if you just did that math
00:53:18.900 | and you compound that with 4X per year
00:53:21.900 | on model size and computing size.
00:53:25.340 | And then on the other hand,
00:53:26.460 | demand continues to grow in usage.
00:53:28.540 | Do we think that we need millions of GPUs?
00:53:32.060 | No doubt.
00:53:33.060 | - Yeah.
00:53:33.900 | - Yeah, that is a certainty now.
00:53:35.820 | - Yeah.
00:53:36.660 | - And so the question is,
00:53:37.580 | how do we architect it from a data center perspective?
00:53:39.860 | And that has a lot to do with,
00:53:42.020 | are there data centers that are gigawatts at a time,
00:53:45.620 | or are they 250 megawatts at a time?
00:53:47.380 | And my sense is that you're gonna get both.
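To make the compounding Jensen describes concrete, here is a minimal back-of-envelope sketch in Python. Only the 4X-per-year figure comes from the conversation; the usage and reasoning growth rates below are assumptions chosen purely to illustrate how quickly aggregate demand stacks up.

```python
# Illustrative sketch, not from the conversation: how aggregate compute demand
# grows if model/compute scale ~4x per year (the figure Jensen cites) while
# usage and inference-time reasoning also grow. The other rates are assumptions.

MODEL_COMPUTE_GROWTH = 4.0      # per year, per the "4X per year" comment
USAGE_GROWTH = 2.0              # assumed annual growth in queries served
REASONING_GROWTH = 3.0          # assumed annual growth in inference steps per query

demand = 1.0                    # normalized inference demand today
for year in range(1, 6):
    demand *= USAGE_GROWTH * REASONING_GROWTH
    training = MODEL_COMPUTE_GROWTH ** year
    print(f"year {year}: inference demand ~{demand:,.0f}x, training compute ~{training:,.0f}x")
```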
00:53:51.660 | - I think analysts always focus on
00:53:54.060 | the current architectural bet.
00:53:56.060 | But I think one of the biggest takeaways
00:53:58.060 | from this conversation is that
00:53:59.700 | you're thinking about the entire ecosystem
00:54:02.700 | and many years out.
00:54:04.140 | So the idea is that
00:54:07.620 | NVIDIA is scaling up or scaling out
00:54:10.220 | to meet the future.
00:54:12.940 | It's not such that you're only dependent on a world
00:54:17.140 | where there's a 500,000 or a million GPU cluster.
00:54:21.060 | By the time there's distributed training,
00:54:24.740 | you'll have written the software to enable that.
00:54:27.860 | - That's right.
00:54:28.700 | Remember, without Megatron that we developed
00:54:32.580 | some seven years ago now,
00:54:34.500 | the scaling of these large training jobs
00:54:36.700 | wouldn't have happened.
00:54:38.060 | And so we invented Megatron,
00:54:39.700 | we invented NCCL, GPUDirect,
00:54:42.820 | all of the work that we did with RDMA,
00:54:45.300 | that made it possible to easily do
00:54:47.660 | pipeline parallelism, right?
00:54:52.180 | And so all the model parallelism that's being done,
00:54:56.300 | all the breaking of the distributed training
00:54:58.460 | and all the batching and all that,
00:54:59.980 | all of that stuff is because we did the early work.
00:55:04.540 | And now we're doing the early work
00:55:05.780 | for the future generation.
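For readers who want to see what the pipeline parallelism mentioned here means in practice, below is a minimal, self-contained sketch of the idea (a GPipe/Megatron-style schedule) written as a plain-Python simulation. The stage functions and micro-batch counts are invented for illustration; this is not actual Megatron or NCCL code.

```python
# Pipeline-parallelism sketch: split a model's layers into stages and stream
# micro-batches through them so every stage stays busy once the pipeline fills.
from typing import Callable, List, Optional

def make_stage(stage_id: int) -> Callable[[float], float]:
    """Stand-in for one pipeline stage (a contiguous slice of the model's layers)."""
    return lambda x: x * 2 + stage_id  # placeholder math, not a real layer

stages: List[Callable[[float], float]] = [make_stage(i) for i in range(4)]
micro_batches = [float(i) for i in range(8)]      # one mini-batch split into micro-batches

in_flight: List[Optional[float]] = [None] * len(stages)  # activation sitting at each stage
outputs: List[float] = []
for tick in range(len(micro_batches) + len(stages)):
    if in_flight[-1] is not None:                 # last stage finished a micro-batch
        outputs.append(in_flight[-1])
    for s in range(len(stages) - 1, 0, -1):       # advance back-to-front so nothing is overwritten
        prev = in_flight[s - 1]
        in_flight[s] = stages[s](prev) if prev is not None else None
    nxt = micro_batches[tick] if tick < len(micro_batches) else None
    in_flight[0] = stages[0](nxt) if nxt is not None else None

print(f"processed {len(outputs)} micro-batches through {len(stages)} pipeline stages")
```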
00:55:07.580 | - So let's talk about Strawberry and O1.
00:55:11.060 | I wanna be respectful of your time.
00:55:12.700 | - I got all the time in the world, guys.
00:55:14.580 | - Well, you're very generous.
00:55:15.980 | - Yeah, I've got all the time in the world.
00:55:17.620 | - But first, I think it's cool that they named O1
00:55:21.140 | after the O-1 visa, right?
00:55:23.460 | Which is about recruiting the world's best and brightest,
00:55:26.940 | you know, and bringing them to the United States.
00:55:29.020 | It's something I know we're both deeply passionate about.
00:55:32.380 | So I love the idea that building a model that thinks
00:55:35.980 | and that takes us to the next level
00:55:37.700 | of scaling intelligence, right?
00:55:40.100 | Is an homage to the fact that it's these people
00:55:43.900 | who come to the United States by way of immigration
00:55:47.700 | that have made us what we are,
00:55:49.340 | bring their collective intelligence to the United States.
00:55:51.220 | - Surely an alien intelligence.
00:55:53.660 | - Certainly.
00:55:54.500 | - Yeah.
00:55:55.340 | - You know, it was spearheaded by our friend,
00:55:56.540 | Noam Brown, of course.
00:55:57.900 | He worked on Pluribus and Cicero when he was at Meta.
00:56:01.140 | How big a deal is inference time reasoning
00:56:04.180 | as a totally new vector of scaling intelligence,
00:56:08.140 | separate and distinct from, right,
00:56:10.660 | just building larger models?
00:56:12.580 | - It's a huge deal.
00:56:13.580 | It's a huge deal.
00:56:14.420 | I think the,
00:56:15.260 | a lot of intelligence can't be done a priori.
00:56:21.020 | - Right.
00:56:21.940 | - You know, and a lot of computing,
00:56:24.780 | even a lot of computing can't be reordered.
00:56:28.300 | I mean, just, you know, out of order execution
00:56:30.820 | can't be done a priori, you know?
00:56:32.500 | And so a lot of things that can only be done in runtime.
00:56:35.740 | - Right.
00:56:36.580 | - And so whether you think about it
00:56:38.940 | from a computer science perspective,
00:56:40.340 | or you think about it from an intelligence perspective,
00:56:45.140 | too much of it requires context.
00:56:47.580 | - Right.
00:56:48.620 | - The circumstance.
00:56:49.860 | - Right.
00:56:50.820 | - The type of answer you're looking for.
00:56:54.420 | Sometimes just a quick answer is good enough.
00:56:56.340 | - Right.
00:56:57.380 | - Depends on the consequential, you know,
00:57:02.180 | impact of the answer.
00:57:03.580 | - Right.
00:57:04.420 | - You know, depending on the nature of the usage
00:57:05.700 | of that answer.
00:57:06.540 | So some answers, "Please take a night."
00:57:11.020 | Some answers, "Take a week."
00:57:12.180 | - Yes.
00:57:13.020 | - Is that right?
00:57:13.860 | So I could totally imagine me sending off a prompt
00:57:16.980 | to my AI and telling it, you know,
00:57:20.420 | "Think about it for a night."
00:57:21.540 | - Right.
00:57:22.380 | - "Think about it overnight.
00:57:23.220 | "Don't tell me right away."
00:57:24.300 | - Right.
00:57:25.140 | - "I want you to think about it all night.
00:57:26.780 | "And then come back and tell me tomorrow
00:57:27.860 | "what's your best answer and reason about it for me."
00:57:30.580 | And so I think the segmentation of intelligence
00:57:36.580 | from now, from a product perspective,
00:57:39.060 | there's going to be one-shot versions of it.
00:57:40.700 | - Right, for sure.
00:57:41.700 | - Yeah.
00:57:42.540 | And then there'll be some that take five minutes, you know.
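A hypothetical sketch of the product segmentation Jensen describes: mapping how consequential a request is to how much inference-time "thinking" it gets. The tier names, budgets, and the answer_with_budget stub are all invented here for illustration; this is not any vendor's actual API.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    importance: str  # "quick", "careful", or "overnight"

BUDGETS = {
    "quick": 1,          # one-shot answer
    "careful": 64,       # a few minutes of internal reasoning samples
    "overnight": 10_000, # the "think about it all night" class of request
}

def answer_with_budget(prompt: str, reasoning_steps: int) -> str:
    # Placeholder for a model call whose inference-time compute scales with
    # reasoning_steps (re-sampling, tree search, reflection, and so on).
    return f"answer to {prompt!r} after {reasoning_steps} reasoning steps"

def route(req: Request) -> str:
    return answer_with_budget(req.prompt, BUDGETS[req.importance])

print(route(Request("book a hotel", "quick")))
print(route(Request("plan next quarter's chip floorplan", "overnight")))
```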
00:57:46.940 | - And the intelligence layer that routes those questions
00:57:49.580 | to the right model.
00:57:50.660 | - Yeah.
00:57:51.500 | - For the right use case.
00:57:52.340 | I mean, we were using advanced voice mode
00:57:53.780 | and O1 preview last night.
00:57:55.900 | I was coaching my son for his AP history test.
00:58:00.740 | And it was like having the world's best AP history teacher.
00:58:04.540 | - Yeah, right.
00:58:05.380 | - Right next to you.
00:58:06.220 | - Yeah.
00:58:07.060 | - Thinking about these questions.
00:58:08.100 | It was truly extraordinary.
00:58:09.860 | Again, they're--
00:58:11.140 | - My tutor's an AI today.
00:58:12.700 | - Right, right.
00:58:13.540 | - I'm serious.
00:58:14.380 | - Of course, they're here today.
00:58:15.220 | - Yeah.
00:58:16.060 | - Which comes back to this, you know,
00:58:17.020 | over 40% of your revenue today is inference.
00:58:19.780 | But inference is about ready
00:58:21.380 | because of chain of reasoning.
00:58:24.180 | - Yeah.
00:58:25.020 | - Right?
00:58:25.860 | It's about ready--
00:58:26.700 | - It's about to go up by a billion times.
00:58:27.540 | - Right, by a million X, by a billion X.
00:58:30.700 | - That's right.
00:58:31.540 | That's the part that most people have, you know,
00:58:34.660 | haven't completely internalized.
00:58:36.500 | This is that industry we were talking about,
00:58:38.380 | but this is the industrial revolution.
00:58:40.540 | - Right.
00:58:41.860 | That's the production of intelligence.
00:58:43.980 | - That's right.
00:58:44.820 | - Right?
00:58:45.660 | - Yeah.
00:58:46.500 | It's going to go up a billion times.
00:58:47.340 | - Right.
00:58:48.180 | And so, you know, everybody's so hyper-focused on NVIDIA
00:58:51.740 | as kind of like doing training on bigger models.
00:58:55.380 | - Yeah.
00:58:56.220 | - Right?
00:58:57.060 | Isn't it the case that your revenue, if it's 50/50 today,
00:59:00.420 | you're going to do way more inference in the future.
00:59:02.940 | - Yeah.
00:59:03.780 | - And then, I mean, training will always be important,
00:59:06.180 | but just the growth of inference is going to be way larger
00:59:09.260 | than the growth in training.
00:59:10.380 | - We hope, we hope.
00:59:11.220 | - It's almost impossible to conceive otherwise.
00:59:12.780 | - Yeah, we hope.
00:59:13.620 | That's right, that's right.
00:59:14.460 | - Right.
00:59:15.300 | - Yeah, I mean, it's good to go to school.
00:59:17.220 | - Yes.
00:59:18.060 | - But the goal is so that you can be productive
00:59:19.620 | in society later.
00:59:20.700 | And so it's good that we train these models,
00:59:22.540 | but the goal is to inference them, you know?
00:59:24.740 | - Are you already using chain of reasoning and, you know,
00:59:30.100 | tools like O1 in your own business
00:59:32.380 | to improve your own business?
00:59:33.860 | - Yeah, our cybersecurity system today
00:59:35.580 | can't run without our own agents.
00:59:38.780 | - Okay.
00:59:39.620 | - We have agents helping us design chips.
00:59:42.340 | Hopper wouldn't be possible.
00:59:43.420 | Blackwell wouldn't be possible.
00:59:44.420 | Rubin, don't even think about it.
00:59:46.220 | We have digital, we have AI chip designers,
00:59:49.420 | AI software engineers, AI verification engineers.
00:59:52.100 | And we build them all inside because, you know,
00:59:55.340 | we have the ability and we'd rather use it,
00:59:59.140 | use the opportunity to explore the technology ourselves.
01:00:01.660 | - You know, when I walked into the building today,
01:00:03.380 | somebody came up to me and said, you know,
01:00:05.380 | "Ask Jensen about the culture.
01:00:06.820 | "It's all about the culture."
01:00:08.020 | I look at the business, you know,
01:00:09.740 | we talk a lot about fitness and efficiency,
01:00:12.140 | flat organizations that can execute quickly, smaller teams.
01:00:16.380 | You know, NVIDIA is in a league of its own, really,
01:00:20.580 | you know, at about $4 million of revenue per employee,
01:00:23.860 | about $2 million of profits or free cash flow per employee.
01:00:28.620 | You've built a culture of efficiency
01:00:30.660 | that really has unleashed creativity and innovation
01:00:35.660 | and ownership and responsibility.
01:00:37.780 | You've broken the mold on kind of functional management.
01:00:40.140 | Everybody likes to talk about all of your direct reports.
01:00:44.420 | Is the leveraging of AI the thing
01:00:49.220 | that's going to continue to allow you to be hyper-creative
01:00:53.220 | while at the same time being efficient?
01:00:55.500 | - No question.
01:00:56.900 | I'm hoping that someday,
01:00:59.420 | NVIDIA has 32,000 employees today.
01:01:02.180 | And we have 4,000 families in Israel.
01:01:05.140 | I hope they're well.
01:01:06.060 | I'm thinking of you guys.
01:01:07.260 | And I'm hoping that NVIDIA someday
01:01:12.060 | will be a 50,000 employee company
01:01:14.660 | with a hundred million, you know, AI assistants.
01:01:20.860 | And they're in every single group.
01:01:25.540 | We'll have a whole directory of AIs
01:01:29.340 | that are just generally good at doing things.
01:01:31.820 | We'll also have our inboxes full of directories
01:01:34.620 | of AIs that we work with that we know are really good,
01:01:37.540 | specialized in their skills.
01:01:39.580 | And so AIs will recruit other AIs to solve problems.
01:01:43.340 | AIs will be in, you know, Slack channels with each other.
01:01:46.340 | - And with humans.
01:01:47.180 | - Right, and with humans.
01:01:48.620 | And so we'll just be one large, you know,
01:01:52.020 | employee base, if you will.
01:01:54.220 | Some of 'em are digital and AI, some of 'em are biological.
01:01:57.460 | And I'm hoping some of 'em are even mechatronic.
01:02:00.540 | - I think from a business perspective,
01:02:03.180 | it's something that's greatly misunderstood.
01:02:04.980 | You just described a company that's producing the output
01:02:09.980 | of a company with 150,000 people,
01:02:13.060 | but you're doing it with 50,000 people.
01:02:14.860 | Now, you didn't say I was gonna get rid of all my employees.
01:02:17.900 | You're still growing the number of employees
01:02:20.100 | in the organization, but the output of that organization,
01:02:24.060 | right, is gonna be dramatically more.
01:02:26.020 | - This is often misunderstood.
01:02:28.380 | AI will change every job.
01:02:33.380 | AI will have a seismic impact
01:02:37.780 | on how people think about work.
01:02:39.820 | Let's acknowledge that.
01:02:41.780 | AI has the potential to do incredible good.
01:02:44.420 | It has the potential to do harm.
01:02:46.180 | We have to build safe AI.
01:02:49.220 | Let's just make that foundational, okay?
01:02:51.420 | The part that is overlooked is
01:02:54.300 | when companies become more productive
01:02:58.220 | using artificial intelligence,
01:03:00.140 | it is likely that it manifests itself
01:03:02.860 | into either better earnings or better growth or both.
01:03:07.860 | - Right.
01:03:08.860 | - And when that happens, the next email from the CEO
01:03:13.180 | is likely not a layoff announcement.
01:03:16.180 | - Of course, 'cause you're growing.
01:03:17.900 | - Yeah, and the reason for that is
01:03:19.420 | because we have more ideas than we can explore,
01:03:21.940 | and we need people to help us think through it
01:03:24.660 | before we automate it.
01:03:25.980 | And so the automation part of it, AI can help us do.
01:03:30.660 | Obviously, it's gonna help us think through it as well,
01:03:33.540 | but it's still gonna require us to go figure out
01:03:35.740 | what problems do I wanna solve?
01:03:37.740 | There are a trillion things we can go solve.
01:03:39.340 | What problems does this company have to go solve?
01:03:41.660 | And select those ideas and figure out a way
01:03:44.340 | to automate and scale.
01:03:46.740 | And so as a result, we're gonna hire more people
01:03:49.180 | as we become more productive.
01:03:50.700 | People forget that, you know?
01:03:52.780 | And if you go back in time,
01:03:55.780 | obviously we have more ideas today than 200 years ago.
01:03:58.860 | That's the reason why GDPs are larger
01:04:00.300 | and more people are employed,
01:04:01.420 | and even though we're automating like crazy underneath.
01:04:04.540 | - It's such an important point
01:04:07.220 | of this period that we're entering.
01:04:09.500 | One, almost all human productivity,
01:04:13.300 | almost all human prosperity is the byproduct
01:04:16.460 | of the automation and the technology of the last 200 years.
01:04:20.580 | I mean, you can look at, you know,
01:04:22.500 | from Adam Smith and Schumpeter's creative destruction,
01:04:26.700 | you can look at charted GDP growth per person
01:04:29.740 | over the course of the last 200 years,
01:04:31.460 | and it's just accelerated.
01:04:33.140 | Which leads me to this question.
01:04:35.100 | If you look at the '90s,
01:04:36.300 | our productivity growth in the United States
01:04:38.660 | was about 2 1/2 to 3% a year, okay?
01:04:41.820 | And then in the 2000s, it slowed down to about 1.8%.
01:04:46.100 | And then the last 10 years
01:04:47.340 | has been the slowest productivity growth.
01:04:49.660 | So that's the amount of labor and capital,
01:04:51.820 | or the amount of output we have
01:04:53.140 | for a fixed amount of labor and capital.
01:04:54.780 | The slowest we've had on record, actually.
01:04:58.100 | And a lot of people have debated the reasoning for this,
01:05:00.540 | but if the world is as you just described,
01:05:02.980 | and we're going to leverage and manufacture intelligence,
01:05:06.500 | then isn't it the case that we're on the verge
01:05:08.660 | of a dramatic expansion in terms of human productivity?
01:05:12.100 | - That's our hope.
01:05:13.020 | - Right. - That's our hope.
01:05:14.060 | And of course, you know, we live in this world,
01:05:17.100 | so we have direct evidence of it.
01:05:18.620 | - Right.
01:05:19.460 | - We have direct evidence of it,
01:05:21.300 | either as isolated a case as an individual researcher.
01:05:25.900 | - For sure.
01:05:26.740 | - Who is able to, with AI,
01:05:28.540 | now explore science at such an extraordinary scale
01:05:33.380 | that is unimaginable.
01:05:35.940 | That's productivity.
01:05:36.940 | - Right, 100%.
01:05:37.780 | - Measure of productivity.
01:05:39.260 | Or that we're designing chips that are so incredible
01:05:44.260 | at such a high pace and the chip complexities
01:05:48.620 | and the computer complexities we're building
01:05:51.140 | are going up exponentially
01:05:52.660 | while the company's employee base
01:05:54.340 | is not. That's a measure of productivity.
01:05:57.540 | - Correct.
01:05:58.500 | - The software that we're developing
01:06:00.060 | is getting better and better and better
01:06:01.940 | because we're using AI and supercomputers to help us.
01:06:05.340 | The number of employees is growing barely linearly.
01:06:08.220 | - Okay, okay, okay.
01:06:10.260 | Another demonstration of productivity.
01:06:13.260 | So whether it's, I can go into,
01:06:14.940 | I can spot check it in a whole bunch of different industries.
01:06:17.380 | - Yes.
01:06:18.220 | - I could gut check it myself.
01:06:19.860 | - Yes, you're in business.
01:06:21.380 | - That's right.
01:06:22.220 | And so I can, you know, and of course,
01:06:24.820 | we could be overfit,
01:06:28.620 | but the artistry of it, of course,
01:06:30.620 | is to generalize what is it that we're observing
01:06:33.740 | and whether this could manifest in other industries.
01:06:36.020 | And there's no question that intelligence
01:06:39.340 | is the single most valuable commodity
01:06:41.860 | the world's ever known.
01:06:43.180 | And now we're gonna manufacture it at scale.
01:06:45.420 | And we, all of us have to get good at,
01:06:49.420 | you know, what would happen
01:06:51.220 | if you're surrounded by these AIs
01:06:53.820 | and they're doing things so incredibly well
01:06:58.060 | and so much better than you?
01:06:59.420 | - Right.
01:07:00.260 | - And when I reflect on that, that's my life.
01:07:04.420 | I have 60 direct reports.
01:07:06.060 | - Right.
01:07:06.900 | - The reason why they're on eStaff
01:07:09.220 | is because they're world-class at what they do.
01:07:10.980 | And they do it better than I do.
01:07:12.660 | - Right.
01:07:13.500 | - Much better than I do.
01:07:14.380 | - Right.
01:07:15.460 | - I have no trouble interacting with them.
01:07:17.700 | And I have no trouble prompt engineering them.
01:07:21.100 | - Right, totally.
01:07:22.300 | - I have no trouble programming them.
01:07:24.540 | - Right, right.
01:07:25.380 | - And so I think that that's the thing
01:07:27.780 | that people are going to learn,
01:07:30.580 | is that they're all gonna be CEOs.
01:07:32.220 | - Right.
01:07:33.060 | - They're all gonna be CEOs of AI agents.
01:07:34.780 | - Right.
01:07:35.820 | - And their ability to have the creativity,
01:07:40.820 | the will,
01:07:43.140 | and some knowledge on how to reason,
01:07:49.540 | break problems down,
01:07:51.100 | so that you can program these AIs
01:07:55.580 | to help you achieve something like I do.
01:07:57.620 | - Right.
01:07:58.460 | - You know, it's called running companies.
01:07:59.300 | - Right, now it's, you mentioned something,
01:08:01.500 | this alignment and the safe AI.
01:08:03.380 | You mentioned the tragedy going on in the Middle East.
01:08:08.780 | You know, we have a lot of autonomy
01:08:11.820 | and a lot of AI that's being used
01:08:14.020 | in different parts of the world.
01:08:15.580 | So let's talk for a second about bad actors,
01:08:18.300 | about safe AI, about coordination with Washington.
01:08:23.220 | How do you feel today?
01:08:24.460 | Are we on the right path?
01:08:25.820 | Do we have a sufficient level of coordination?
01:08:28.540 | You know, I think Mark Zuckerberg has said,
01:08:30.340 | the way we beat the bad AIs is we make the good AIs better.
01:08:33.380 | How would you characterize your view
01:08:38.300 | of how we make sure
01:08:41.140 | that this is a positive net benefit for humanity,
01:08:45.500 | as opposed to, you know,
01:08:47.140 | leaving us in this dystopian world without purpose?
01:08:50.020 | - The conversation about safety
01:08:52.380 | is really important and good.
01:08:53.780 | - Yes.
01:08:54.980 | - The abstracted view,
01:08:57.500 | this conceptual view of AI
01:08:59.460 | being a large giant neural network,
01:09:02.460 | not so good.
01:09:03.580 | - Right, right.
01:09:04.420 | - Okay.
01:09:05.300 | And the reason for that is because as we know,
01:09:08.620 | artificial intelligence and large language models
01:09:10.460 | are related and not the same.
01:09:11.900 | There are many things that are being done
01:09:16.180 | that I think are excellent.
01:09:17.260 | One, open sourcing models
01:09:20.340 | so that the entire community of researchers
01:09:23.420 | and every single industry
01:09:24.540 | and every single company can engage AI
01:09:27.420 | and go learn how to harness this capability
01:09:30.140 | for their application, excellent.
01:09:31.780 | Number two, it is under-celebrated
01:09:35.420 | the amount of technology that is dedicated
01:09:38.420 | to inventing AI to keep AI safe.
01:09:41.740 | - Yes.
01:09:42.580 | - AIs to curate data, to curate information,
01:09:45.820 | to train an AI, AI created to align AI,
01:09:49.060 | synthetic data generation,
01:09:50.420 | AI to expand the knowledge of AI,
01:09:52.820 | to cause it to hallucinate less.
01:09:54.980 | All of the AIs that are being created
01:09:57.020 | for vectorization or graphing or whatever it is,
01:10:03.340 | to inform an AI, guard railing AI,
01:10:06.900 | AIs to monitor other AIs,
01:10:08.820 | that the system of AIs to create safe AI
01:10:12.900 | is under-celebrated.
01:10:14.060 | - Right.
01:10:14.900 | That we've already built.
01:10:16.300 | - That we're building everybody all over the industry,
01:10:19.940 | the methodologies, the red teaming, the process,
01:10:23.500 | the model cards, the evaluation systems,
01:10:28.500 | the benchmarking systems.
01:10:30.260 | All of that, all of the harnesses that are being built
01:10:34.100 | at the velocity that's been built is incredible.
01:10:36.460 | - I wonder if the--
01:10:37.300 | - Under-celebrated, do you guys understand?
01:10:39.100 | - Yes.
01:10:39.940 | - The world still think--
01:10:40.780 | - And there's no government regulation
01:10:42.580 | saying you have to do this.
01:10:43.860 | - Yeah, right.
01:10:44.700 | - The actors in the space today
01:10:46.500 | who are building these AIs
01:10:48.540 | are taking this seriously and coordinating
01:10:51.100 | around best practices with respect to these critical matters.
01:10:55.540 | - That's right, exactly.
01:10:56.460 | And so that's under-celebrated, under-understood.
01:10:59.420 | - Yes.
01:11:00.260 | - Somebody needs to, well, everybody needs
01:11:03.540 | to start talking about AI as a system of AIs
01:11:06.860 | and system of engineered systems,
01:11:09.500 | engineered systems that are well-engineered,
01:11:12.740 | built from first principles,
01:11:14.540 | well-tested, so on and so forth.
01:11:16.780 | Regulation, remember, AI is a capability
01:11:21.140 | that can be applied.
01:11:23.700 | And it's necessary to have regulation
01:11:28.700 | for important technologies,
01:11:32.620 | but also don't overreach.
01:11:37.020 | Most of the regulation
01:11:40.380 | ought to be done at the application level.
01:11:42.780 | - Right.
01:11:43.620 | - FAA, NHTSA, FDA, you name it, right?
01:11:47.180 | All of the different, all of the different ecosystems
01:11:50.540 | that already regulate applications of technology.
01:11:53.380 | - Right.
01:11:54.220 | - Now have to regulate the application of technology
01:11:57.260 | that is now infused with AI.
01:11:58.660 | - Right.
01:11:59.500 | - And so I think,
01:12:04.500 | don't misunderstand,
01:12:07.140 | don't overlook the overwhelming amount of regulation
01:12:10.540 | in the world that are going to have to be activated
01:12:13.260 | for AI, and don't rely on just one universal,
01:12:17.500 | galactic, you know, AI council
01:12:20.500 | that's gonna possibly be able to do this,
01:12:22.460 | because there's a reason why
01:12:23.820 | all of these different agencies were created.
01:12:25.780 | There was, there's a reason why
01:12:27.220 | all these different regulatory bodies were created.
01:12:30.780 | We'll go back to first principles again.
01:12:32.180 | - I'd get in trouble by my partner, Bill Gurley,
01:12:34.420 | if I didn't go back to the open source point.
01:12:37.540 | You guys launched a very important, very large,
01:12:39.980 | very capable open source model.
01:12:41.860 | - Yeah.
01:12:42.900 | - Recently.
01:12:43.740 | - Yeah.
01:12:44.580 | - Recently.
01:12:45.420 | - Yeah.
01:12:46.260 | - Obviously, meta is making significant contributions
01:12:50.620 | to open source.
01:12:51.940 | I find when I read Twitter, you know,
01:12:53.740 | you have this kind of open versus closed,
01:12:56.460 | a lot of, a lot of chatter about it.
01:12:59.340 | - Yeah.
01:13:00.180 | - How do you feel about open source,
01:13:02.660 | your own open source models' ability
01:13:05.620 | to keep up with frontier?
01:13:07.900 | That would be the first question.
01:13:08.980 | The second question would be,
01:13:10.580 | is that, you know, having that open source model
01:13:13.300 | and also having closed source models, you know,
01:13:16.740 | that are powering commercial operations,
01:13:18.700 | is that what you see into the future?
01:13:20.740 | And do those two things,
01:13:21.980 | does that create the healthy tension for safety?
01:13:25.380 | - Mm-hmm.
01:13:26.220 | Open source versus closed source is related to safety,
01:13:31.260 | but not only about safety.
01:13:32.300 | - Yes.
01:13:33.140 | - You know, and so, so for example,
01:13:35.020 | there's absolutely nothing wrong
01:13:37.460 | with having closed source models
01:13:39.740 | that are, that are the engines of an economic model.
01:13:42.900 | - Exactly.
01:13:43.740 | - Necessary to sustain innovation.
01:13:45.300 | - Right.
01:13:46.140 | - Okay, I celebrate that wholeheartedly.
01:13:48.260 | - Right.
01:13:49.860 | - It is, I believe,
01:13:52.340 | wrong-minded to be closed versus open.
01:13:57.780 | - Right.
01:13:58.620 | - It should be closed and open.
01:13:59.620 | - Plus open.
01:14:00.460 | - Yeah, right, because open is necessary
01:14:03.540 | for many industries to be activated.
01:14:05.500 | Right now, if we didn't have open source,
01:14:06.700 | how would all these different fields of science
01:14:08.860 | be able to activate, be activated on AI?
01:14:11.340 | - Right.
01:14:12.180 | - Right, because they have to develop
01:14:13.020 | their own domain-specific AIs,
01:14:14.820 | and they have to,
01:14:16.420 | using open source models, create domain-specific AIs.
01:14:20.980 | They're related, again, not the same.
01:14:22.580 | - Right.
01:14:23.420 | - Just because you have an open source model
01:14:24.580 | doesn't mean you have an AI,
01:14:25.420 | and so you have to have that open source model
01:14:27.660 | to enable the creation of AIs.
01:14:30.340 | So financial services, healthcare, transportation,
01:14:33.020 | the list of industries, fields of science
01:14:35.300 | that have now been enabled
01:14:36.580 | as a result of open source, unbelievable.
01:14:38.860 | - Are you seeing a lot of demand
01:14:40.060 | for your open source models?
01:14:41.860 | - Our open source models, so first of all,
01:14:44.260 | Llama downloads, right, obviously.
01:14:47.660 | Yeah, Mark and the work that they've done,
01:14:49.580 | incredible, off the charts.
01:14:51.420 | - Yes.
01:14:52.260 | - And it completely activated
01:14:53.940 | and engaged every single industry,
01:14:57.180 | every single field of science.
01:14:58.620 | - Right, right, it's terrific.
01:14:59.660 | - The reason why we did Nemotron
01:15:01.220 | was for synthetic data generation.
01:15:05.380 | Intuitively, the idea that one AI
01:15:08.900 | would somehow sit there and loop
01:15:10.620 | and generate data to learn itself,
01:15:12.420 | it sounds brittle.
01:15:15.540 | - Yes.
01:15:16.380 | - And how many times you can go around
01:15:18.660 | that infinite loop, that loop, you know, questionable.
01:15:22.420 | However, it's kind of, my mental image
01:15:25.500 | is kind of like, you get a super smart person,
01:15:28.700 | put him into a padded room,
01:15:31.140 | close the door for about a month.
01:15:33.100 | You know, what comes out is probably not a smarter person.
01:15:36.860 | And so, but the idea that you could have
01:15:40.460 | two or three people sit around
01:15:42.140 | and we have different AIs,
01:15:44.300 | we have different distributions of knowledge
01:15:48.900 | and we can go Q&A back and forth,
01:15:48.900 | all three of us can come out smarter.
01:15:50.740 | - Right.
01:15:51.580 | - And so the idea that you can have AI models
01:15:54.940 | exchanging, interacting, going back and forth,
01:15:58.540 | debating, reinforcement learning,
01:16:01.340 | synthetic data generation, for example,
01:16:04.060 | kind of intuitively suggests it makes sense, yeah.
01:16:08.140 | And so our model, Nemotron 340B,
01:16:11.580 | is the best model in the world for reward systems.
01:16:16.580 | And so it is the best critique.
01:16:18.780 | - Okay, interesting.
01:16:19.860 | - Yeah, and so a fantastic model
01:16:23.420 | for enhancing everybody else's models.
01:16:26.220 | Irrespective of how great somebody else's model is,
01:16:29.740 | I'd heavily recommend using Nemotron 340B
01:16:32.860 | to enhance and make it better.
01:16:34.220 | And we've already seen it make Llama better,
01:16:35.980 | make all the other models better.
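A minimal sketch of the pattern being described, under the assumption that the reward model is used as a critic for best-of-N filtering: one model proposes candidates, a reward model such as Nemotron 340B scores them, and only high-scoring pairs are kept as synthetic training data. The generate() and reward() functions below are placeholder stubs, not the actual NVIDIA or NeMo APIs.

```python
import random
from typing import List, Tuple

def generate(prompt: str, n: int) -> List[str]:
    # Placeholder generator: stands in for sampling n candidate responses from an LLM.
    return [f"candidate {i} for: {prompt}" for i in range(n)]

def reward(prompt: str, response: str) -> float:
    # Placeholder critic: stands in for a reward model scoring helpfulness,
    # correctness, coherence, and so on.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> Tuple[str, float]:
    candidates = generate(prompt, n)
    scored = [(c, reward(prompt, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])

# Keep only high-scoring pairs as synthetic fine-tuning data.
dataset = []
for prompt in ["explain RDMA", "summarize pipeline parallelism"]:
    response, score = best_of_n(prompt)
    if score > 0.5:   # arbitrary quality threshold for this sketch
        dataset.append({"prompt": prompt, "response": response, "score": score})

print(f"kept {len(dataset)} synthetic examples")
```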
01:16:37.620 | - Well, we're coming to the end.
01:16:41.700 | - Thank goodness.
01:16:43.940 | - As somebody who delivered DGX-1 in 2016,
01:16:48.700 | it's really been an incredible journey.
01:16:51.500 | Your journey is unlikely and incredible at the same time.
01:16:56.100 | - Thank you.
01:16:56.940 | - You survived, like just surviving the early days
01:17:00.580 | was pretty extraordinary.
01:17:02.020 | You delivered the first DGX-1 in 2016.
01:17:06.020 | We had this Cambrian moment in 2022.
01:17:10.380 | And so I'm gonna ask you the question
01:17:11.820 | I often get asked, which is,
01:17:16.820 | how long can you sustain what you're doing today?
01:17:21.380 | With 60 direct reports, you're everywhere.
01:17:26.140 | You're driving this revolution.
01:17:28.500 | Are you having fun?
01:17:31.420 | And is there something else that you would rather be doing?
01:17:36.260 | - Is this a question about the last hour and a half?
01:17:42.900 | The answer is I had a great time.
01:17:45.780 | I couldn't imagine anything else I'd rather be doing.
01:17:48.420 | Let's see.
01:17:53.500 | I don't think it's right to leave the impression
01:17:57.260 | that our job is fun all the time.
01:18:02.260 | My job isn't fun all the time,
01:18:05.140 | nor do I expect it to be fun all the time.
01:18:07.500 | Was that ever an expectation that it was fun all the time?
01:18:10.420 | I think it's important all the time.
01:18:14.740 | I don't take myself too seriously.
01:18:16.380 | I take the work very seriously.
01:18:18.060 | I take our responsibility very seriously.
01:18:19.940 | I take our contribution and our moment in time
01:18:22.220 | very seriously.
01:18:23.060 | Is that always fun?
01:18:27.660 | But do I always love it?
01:18:31.860 | Like all things.
01:18:32.900 | Whether it is family, friends, children,
01:18:37.460 | is it always fun?
01:18:39.500 | Do we always love it?
01:18:40.660 | Absolutely, deeply.
01:18:42.460 | And so I think the,
01:18:49.700 | how long can I do this?
01:18:51.460 | The real question is how long can I be relevant?
01:18:56.820 | And that only matters, that piece of information,
01:19:01.620 | that question can only be answered
01:19:03.140 | with how am I gonna continue to learn?
01:19:06.860 | And I am a lot more optimistic today.
01:19:09.140 | I'm not saying this simply because of our topic today.
01:19:12.460 | I'm a lot more optimistic about my ability
01:19:15.020 | to stay relevant and continue to learn because of AI.
01:19:19.620 | I use it, I don't know, but I'm sure you guys do.
01:19:22.060 | I use it literally every day.
01:19:24.660 | There's not one piece of research
01:19:26.100 | that I don't involve AI with.
01:19:28.060 | There's not one question that even if I know the answer,
01:19:31.540 | I double check on it with AI.
01:19:33.620 | And surprisingly, you know,
01:19:35.900 | the next two or three questions I ask it
01:19:38.180 | reveals something I didn't know.
01:19:40.820 | You pick your topic.
01:19:42.260 | You pick your topic.
01:19:43.700 | And I think that AI as a tutor,
01:19:47.180 | AI as an assistant,
01:19:48.540 | AI as a partner to brainstorm with,
01:19:54.780 | double check my work.
01:19:57.980 | You know, boy, you guys, it's completely revolutionary.
01:20:02.460 | And that's just, you know, I'm an information worker.
01:20:05.100 | My output is information.
01:20:06.900 | And so I think the contributions
01:20:10.220 | that I'll have on society are pretty extraordinary.
01:20:13.340 | So I think if that's the case,
01:20:16.460 | if I could stay relevant like this
01:20:18.260 | and I can continue to make a contribution,
01:20:21.860 | I know that the work is important enough
01:20:25.740 | for me to want to continue to pursue it.
01:20:28.180 | And my quality of life is incredible.
01:20:30.580 | So I mean, what's there to complain about?
01:20:32.060 | - I'll say, I can't imagine,
01:20:33.540 | you and I have been at this for a few decades.
01:20:35.260 | I can't imagine missing this moment.
01:20:37.060 | - Yeah, right.
01:20:37.900 | - It's the most consequential moment of our careers.
01:20:40.140 | We're deeply grateful for the partnership.
01:20:42.300 | - Don't miss the next 10 years.
01:20:43.780 | - For the thought partnership.
01:20:45.620 | - You make us smarter.
01:20:46.660 | - Thank you.
01:20:47.660 | - And I think you're really important
01:20:50.020 | as part of the leadership, right?
01:20:51.660 | That's going to optimistically and safely lead this forward.
01:20:55.940 | So thank you for being with us.
01:20:56.780 | - Thank you.
01:20:57.620 | Really enjoyed it.
01:20:58.540 | Thanks, Brad.
01:20:59.380 | Thanks, Clark.
01:21:00.220 | Good job.
01:21:01.060 | (upbeat music)
01:21:03.580 | - As a reminder to everybody,
01:21:11.500 | just our opinions, not investment advice.