Ep17. Welcome Jensen Huang | BG2 w/ Bill Gurley & Brad Gerstner


Chapters

0:00 Introduction
1:50 The Evolution of AGI and Personal Assistants
6:03 NVIDIA's Competitive Moat
15:51 The Future of Inference and Training in AI
19:01 Building the AI Infrastructure
31:35 Inventing a New Market in an AI Future
38:40 The Impact of OpenAI
43:25 The Future of AI Models
51:21 Distributed Computing and Inference Scaling
55:54 Inference Time Reasoning and Its Importance
60:46 AI's Role in Growing Business and Improving Productivity
68:00 Ensuring Safe AI Development
72:31 The Balance of Open Source and Closed Source AI

Transcript

what they achieved is singular, never been done before. Just to put in perspective, 100,000 GPUs, that's easily the fastest supercomputer on the planet. That's one cluster. A supercomputer that you would build would take normally three years to plan, and then they deliver the equipment, and it takes one year to get it all working.

We're talking about 19 days. (upbeat music) - Jensen, nice glasses. - Hey, yeah, you too. - It's great to be with you. - Yeah, I got my ugly glasses on just like you. - Come on, those aren't ugly. These are pretty good. Do you like the red ones better?

- There's something only your family could love. (laughing) - Well, it's Friday, October 4th. We're at the NVIDIA headquarters just down the street from Altimeter. - Welcome. - Thank you, thank you. And we have our investor meeting, our annual investor meeting on Monday, where we're gonna debate all the consequences of AI, how fast we're scaling intelligence.

And I couldn't think of anybody better, really, to kick it off with than you. - I appreciate that. - As both a shareholder, as a thought partner, kicking ideas back and forth, you really make us smarter. And we're just grateful for the friendship. So thanks for being here. - Happy to be here.

- You know, this year, the theme is scaling intelligence to AGI. And it's pretty mind-boggling that when we did this two years ago, we did it on the age of AI, and that was two months before ChatGPT, and to think about all that's changed. So I thought we would kick it off with a thought experiment and maybe a prediction.

If I colloquially think of AGI as that personal assistant in my pocket. (laughing) If I think of AGI as that colloquial assistant in my pocket. - Oh, getting used to it. - Exactly. - Yeah. - You know, that knows everything about me. That has perfect memory of me. That can communicate with me.

That can book a hotel for me, or maybe book a doctor's appointment for me. When you look at the rate of change in the world today, when do you think we're going to have that personal assistant in our pocket? - Soon, in some form. - Yeah. - Yeah, soon in some form.

And that assistant will get better over time. That's the beauty of technology as we know it. So I think in the beginning it'll be quite useful, but not perfect. And then it gets more and more perfect over time, like all technology. - When we look at the rate of change, I think Elon has said, "The only thing that really matters is rate of change." It sure feels to us like the rate of change has accelerated dramatically, is the fastest rate of change we've ever seen on these questions.

Because we've been around the rim like you on AI for a decade now. You even longer. Is this the fastest rate of change you've seen in your career? - It is because we've reinvented computing. You know, a lot of this is happening because we drove the marginal cost of computing down by 100,000X over the course of 10 years.

Moore's law would have been about 100X. And we did it in several ways. We did it by one, introducing accelerated computing, taking what is work that is not very effective on CPUs and put it on top of GPUs. We did it by inventing new numerical precisions. We did it by new architectures, inventing a tensor core.

The way systems are formulated, NVLink, added insanely fast memories, HBM, and scaling things up with NVLink and InfiniBand, and working across the entire stack. Basically, everything that I describe about how NVIDIA does things led to a super Moore's law rate of innovation. Now, the thing that's really amazing is that as a result of that, we went from human programming to machine learning.
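Editor's sketch: the two figures quoted above (about 100,000X versus about 100X over 10 years) can be put on a per-year footing with a little arithmetic. The function names below are illustrative and not from the conversation; the only inputs are the numbers as stated by the speaker.

```python
import math


def annual_factor(total_gain: float, years: float) -> float:
    """Constant yearly improvement factor that compounds to total_gain."""
    return total_gain ** (1.0 / years)


def doubling_time_years(total_gain: float, years: float) -> float:
    """Years per doubling implied by total_gain achieved over `years`."""
    return years / math.log2(total_gain)


# Figures as quoted in the transcript, both over a 10-year span.
for label, gain in [("Moore's law pace", 100.0), ("quoted NVIDIA pace", 100_000.0)]:
    print(
        f"{label}: {annual_factor(gain, 10):.2f}x per year, "
        f"doubling every {doubling_time_years(gain, 10):.2f} years"
    )
```

Under these figures, the Moore's law pace works out to a doubling roughly every year and a half, while the quoted pace implies more than a tripling every year.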

And the amazing thing about machine learning is that machine learning can learn pretty fast, as it turns out. And so as we reformulated the way we distribute computing, we did a lot of parallelism of all kinds, right? Tensor parallelism, pipeline parallelism, parallelism of all kinds. And we became good at inventing new algorithms on top of that, and new training methods, and all of this invention is compounding on top of each other as a result, right?

And back in the old days, if you look at the way Moore's law was working, the software was static. - Right. - It was pre-compiled, it was shrink-wrapped, put into a store, it was static. And the hardware underneath was growing at Moore's law rate. Now we've got the whole stack growing, right?

Innovating across the whole stack. And so I think that that's the, now all of a sudden we're seeing scaling. That is extraordinary, of course. But we used to talk about pre-trained models and scaling at that level, and how we're doubling the model size, and doubling therefore appropriately, and doubling the data size.

And as a result, the computing capacity necessary is increasing by a factor of four every year. - Right. - That was a big deal. - Right. - But now we're seeing scaling with post-training, and we're seeing scaling at inference. Isn't that right? - Right. - And so people used to think that pre-training was hard and inference was easy.
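Editor's sketch: the factor of four mentioned above follows if training compute is assumed to scale with the product of model size and data size. That proportionality is an assumption added here for illustration, not something stated in the conversation, and the function name is hypothetical.

```python
def relative_training_compute(
    years: int, param_growth: float = 2.0, data_growth: float = 2.0
) -> float:
    """Training compute relative to year 0.

    Assumption: compute is proportional to (parameters x training data),
    so doubling both each year multiplies compute by 2 * 2 = 4.
    """
    return (param_growth * data_growth) ** years


for year in range(5):
    print(f"year {year}: {relative_training_compute(year):.0f}x the year-0 compute")
```

Doubling both quantities every year compounds quickly: under this assumption, three years of doubling already requires 64 times the original compute.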

Now everything is hard. - Right, right. - Which is kind of sensible. The idea that all of human thinking is one shot is kind of ridiculous. And so there must be a concept of fast thinking, and slow thinking, and reasoning, and reflection, and iteration, and simulation, and all that.

And that now it's coming in. - Yeah. I think to that point, one of the most misunderstood things about NVIDIA is how deep the true NVIDIA moat is, right? I think there's a notion out there that as soon as someone invents a new chip, a better chip, that they've won.

But the truth is you've been spending the past decade building the full stack from the GPU, to the CPU, to the networking, and especially the software and libraries that enable applications to run on NVIDIA. - Yeah. - So I think you spoke to that. But when you think about NVIDIA's moat today, right?

Do you think NVIDIA's moat today is greater or smaller than it was three to four years ago? - Well, I appreciate you recognizing how computing has changed. In fact, the reason why people thought, and many still do, that you designed a better chip, it has more flops, has more flips, and flops, and bits, and bytes, you know what I'm saying?

- Yeah. - You see their keynote slides, and it's got all these flips and flops, and bar charts, and things like that. And that's all good. I mean, look, horsepower does matter. - Yes. - So these things fundamentally do matter. However, unfortunately, that's old thinking. It is old thinking in the sense that the software was some application running on Windows, and the software is static.

- Right. - Which means that the best way for you to improve the system is just making faster and faster chips. But we realized that machine learning is not human programming. Machine learning is not about just the software. It's about the entire data pipeline. It's about, in fact, the flywheel of machine learning is the most important thing.

So how do you think about enabling this flywheel on the one hand, and enabling data scientists and researchers to be productive in this flywheel? And that flywheel starts at the very, very beginning. A lot of people don't even realize that it takes AI to curate data to teach an AI.

And that AI alone is pretty complicated. - And is that AI itself improving? Is it also accelerating? You know, again, when we think about the competitive advantage, right? It's combinatorial of all these systems. - It's exactly, exactly. And that was exactly gonna lead to that. Because of smarter AIs to curate the data, we now even have synthetic data generation and all kinds of different ways of curating data, presenting data to, and so before you even get to training, you've got massive amounts of data processing involved.

And so people think about, oh, PyTorch, that's the beginning and end of the world, and it was very important. But don't forget before PyTorch, there's amount of work. After PyTorch, there's amount of work. And the thing about the flywheel is really the way you ought to think. How do I think about this entire flywheel?

And how do I design a computing system, a computing architecture that helps you take this flywheel and be as effective as possible? It's not one slice of an application, training. Does that make sense? That's just one step, okay? Every step along that flywheel is hard. And so the first thing that you should do, instead of thinking about, how do I make Excel faster?

How do I make, you know, Doom faster? That was kind of the old days, isn't that right? Now you have to think about how do I make this flywheel faster? And this flywheel has a whole bunch of different steps. And there's nothing easy about machine learning, as you guys know.

There's nothing easy about what OpenAI does, or X does, or Gemini and the team at DeepMind does. I mean, there's nothing easy about what they do. And so we decided, look, this is really what you ought to be thinking about. This is the entire process. You want to accelerate every part of that.

You want to respect Amdahl's Law. You want to, Amdahl's Law would suggest, well, if this is 30% of the time, and I accelerated that by a factor of three, I didn't really accelerate the entire process by that much. Does that make sense? And you really want to create a system that accelerates every single step of that, because only in doing the whole thing can you really materially improve that cycle time.
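Editor's sketch: the Amdahl's Law example above (30% of the time, accelerated by a factor of three) can be worked through directly. The function name is illustrative; the formula is the standard statement of Amdahl's Law.

```python
def amdahl_speedup(fraction_accelerated: float, factor: float) -> float:
    """Overall speedup when only a fraction of the runtime is accelerated.

    Amdahl's Law: speedup = 1 / ((1 - p) + p / s), where p is the fraction
    of the original runtime that is accelerated and s is the factor.
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / factor)


# The case from the transcript: one step that is 30% of the time, made 3x faster.
one_step = amdahl_speedup(0.30, 3.0)
# The contrast the speaker draws: every step of the process made 3x faster.
every_step = amdahl_speedup(1.0, 3.0)

print(f"accelerating 30% of the pipeline by 3x: {one_step:.2f}x overall")
print(f"accelerating the whole pipeline by 3x: {every_step:.2f}x overall")
```

Accelerating only the 30% slice yields roughly a 1.25x gain end to end, which is the point being made: the cycle time improves materially only when every step is accelerated.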

And that flywheel, that rate of learning is really, in the end, what causes the exponential rise. And so what I'm trying to say is that our perspective about, you know, a company's perspective about what you're really doing manifests itself into the product. And notice, I've been talking about this flywheel-- - The entire cycle, yeah.

- That's right. And we accelerate everything. Right now, the main focus is video. A lot of people are focused on physical AI and video processing. Just imagine that front end. - Right. - The terabytes per second of data that are coming into the system. Give me an example of a pipeline that is going to ingest all of that data, prepare it for training in the first place.

So that entire thing is CUDA accelerated. - And people are only thinking about text models today. - Yeah. - But the future is, you know, these video models, as well as, you know, using, you know, some of these text models, like o1, to really process a lot of that data before we even get there.

- Yeah. - Right? - Yeah, yeah. Language models are gonna be involved in everything. It took the industry enormous technology and effort to train a language model, to train these large language models. Now we're using a large language model in every single step of the way. It's pretty phenomenal.

- I don't mean to be overly simplistic about this, but again, you know, we hear it all the time from investors, right? Yes, but what about custom ASICs? Yes, but their competitive moat is going to be pierced by this. What I hear you saying is that in a combinatorial system, the advantage grows over time.

So I heard you say that our advantage is greater today than it was three to four years ago because we're improving every component and that's combinatorial. Is that, you know, when you think about, for example, as a business case study, Intel, right? Who had a dominant moat, a dominant position in the stack relative to where you are today.

Perhaps just, you know, again, boil it down a little bit. You know, compare, contrast your competitive advantage to maybe the competitive advantage they had at the peak of their cycle. - Well, Intel is extraordinary. Intel is extraordinary because they were probably the first company that was incredibly good at manufacturing, process engineering, manufacturing, and that one click above manufacturing, which is building the chip.

- Right. - And designing the chip and architecting the chip in the x86 architecture and building faster and faster x86 chips. That was their brilliance. And they fused that with manufacturing. Our company is a little different in the sense that, and we recognize this, that in fact, parallel processing doesn't require every transistor to be excellent.

Serial processing requires every transistor to be excellent. Parallel processing requires lots and lots of transistors to be more cost-effective. I'd rather have 10 times more transistors, 20% slower, than 10 times less transistors, 20% faster. Does that make sense? They were like the opposite. And so single-threaded performance, single-threaded processing and parallel processing was very different.
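Editor's sketch: the transistor trade-off above can be illustrated with a deliberately simple toy model. It assumes that a perfectly parallel workload scales linearly with transistor count and with per-transistor speed, and that a purely serial workload depends on speed alone. Those assumptions and the function names are added here for illustration and are not from the conversation.

```python
def parallel_throughput(transistor_ratio: float, speed_ratio: float) -> float:
    """Toy model: perfectly parallel work scales with count times speed."""
    return transistor_ratio * speed_ratio


def serial_throughput(speed_ratio: float) -> float:
    """Toy model: purely serial work only benefits from per-transistor speed."""
    return speed_ratio


# Ratios are relative to a common baseline design.
many_slower = parallel_throughput(10.0, 0.8)  # 10x the transistors, 20% slower
few_faster = parallel_throughput(0.1, 1.2)    # one tenth the transistors, 20% faster

print(f"parallel work, many slower transistors: {many_slower:.2f}x baseline")
print(f"parallel work, few faster transistors:  {few_faster:.2f}x baseline")
print(f"serial work, slower vs faster: {serial_throughput(0.8):.1f}x vs {serial_throughput(1.2):.1f}x")
```

In this toy model the many-but-slower design wins by a wide margin on parallel work, while the faster design wins on serial work, which matches the contrast drawn between the two approaches.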

And so we observed that, in fact, our world is not about being better going down. We want to be very good, as good as we can be. But our world is really about much better going up. Parallel computing, parallel processing is hard because every single algorithm requires a different way of refactoring and re-architecting the algorithm for the architecture.

What people don't realize is that you can have three different ISAs, CPU ISAs. They all have their own C compilers. You could take software and compile down to the ISA. That's not possible in accelerated computing. That's not possible in parallel computing. The company who comes up with the architecture has to come up with their own OpenGL.

So we revolutionized deep learning because of our domain-specific library called cuDNN. Without cuDNN, nobody talks about cuDNN because it's one layer underneath PyTorch and TensorFlow and back in the old days, Caffe and Theano and now Triton. There's a whole bunch of different frameworks. So that domain-specific library, cuDNN, a domain-specific library called OptiX, we have a domain-specific library called cuQuantum, RAPIDS, the list, Aerial for-- - Industry-specific algorithms that sit below that PyTorch layer that everybody's focused on.

Like I've heard oftentimes, well, if LLMs-- - If we didn't invent that, no application on top could work. You guys understand what I'm saying? So the mathematics is really, what NVIDIA is really good at is algorithm. That fusion between the science above, the architecture on the bottom, that's what we're really good at, yeah.

- There's all this attention now on inference, finally. But I remember two years ago, Brad and I had dinner with you and we asked you the question, "Do you think your moat will be as strong in inference "as it is in training?" - Yeah, and I'm sure I said it would be greater.

- Yeah, yeah, and you touched upon a lot of these elements just now, just the composability between, or we don't know the total mix at one point, and to a customer, it's very important to be able to be flexible in between. - That's right. - But can you just touch upon, now that we're in this era of inference?

- It was inference, training is inferencing at scale. I mean, you're right. And so if you train well, it is very likely you'll inference well. If you built it on this architecture without any consideration, it will run on this architecture. You could still go and optimize it for other architectures, but at the very minimum, since it's already been architected, built on NVIDIA, it will run on NVIDIA.

Now, the other aspect, of course, it's just kind of capital investment aspect, which is when you're training new models, you want your best new gear to be used for training, which leaves behind gear that you used yesterday. Well, that gear is perfect for inference. And so there's a trail of free gear.

There's a trail of free infrastructure behind the new infrastructure that's CUDA compatible. And so we're very disciplined about making sure that we're compatible throughout, so that everything that we leave behind will continue to be excellent. Now, we also put a lot of energy into continuously reinventing new algorithms, so that when the time comes, the Hopper architecture is two, three, four times better than when they bought it, so that infrastructure continues to be really effective.

And so all of the work that we do, improving new algorithms, new frameworks, notice it helps every single install base that we have. Hopper is better for it, Ampere is better for it, even Volta is better for it, okay? And I think Sam was just telling me that they had just decommissioned the Volta infrastructure that they have at OpenAI recently.

And so I think we leave behind this trail of install base. Just like all computing, install base matters. And NVIDIA's in every single cloud, we're on-prem and all the way out to the edge. And so the VILA vision language model that's been created in the cloud works perfectly at the edge on the robots, without modification.

It's all CUDA compatible. And so I think this idea of architecture compatibility was important for large... It's no different for iPhones, no different for anything else. I think the install base is really important for inference. But the thing that we really benefit from is because we're working on training these large language models and the new architectures of it, we're able to think about how do we create architectures that's excellent at inference someday when the time comes.

And so we've been thinking about iterative models for reasoning models, and how do we create very interactive inference experiences for this personal agent of yours. You don't want to say something and have to go off and think about it for a while. You want it to interact with you quite quickly.

So how do we create such a thing? And what came out of it was NVLink. NVLink so that we could take these systems that are excellent for training, but when you're done with it, the inference performance is exceptional. And so you want to optimize for this time to first token.

And time to first token is insanely hard to do actually, because time to first token requires a lot of bandwidth. But if your context is also rich, then you need a lot of flops. And so you need an infinite amount of bandwidth, infinite amount of flops at the same time in order to achieve just a few millisecond response time.

And so that architecture is really hard to do. And we invented a Grace Blackwell NVLink for that. - Right. In the spirit of time, I have more questions about that, but- - Don't worry about the time. Hey guys, hey, hey, hey, listen, Janine? - Yeah. - Look. - Let's do it until it's right.

- Let's do it until right, there you go. - I love it, I love it. So, you know, I was at a dinner with Andy Jassy earlier. - See, now we don't have to worry about the time. - With Andy Jassy earlier this week. And Andy said, you know, we've got Trainium, you know, coming and Inferentia coming.

And I think most people, again, view these as a problem for NVIDIA. But in the very next breath, he said, NVIDIA is a huge and important partner to us and will remain a huge and important partner for us. As far as I can see into the future, the world runs on NVIDIA, right?

So when you think about the custom ASICs that are being built, that are going to go after targeted application, maybe the inference accelerator at Meta, maybe, you know, Trainium at Amazon, you know, or Google's TPUs. And then you think about the supply shortage that you have today. Do any of those things change that dynamic, right?

Or are they complements to the systems that they're all buying from you? - We're just doing different things. - Yes. - We're trying to accomplish different things. You know, what NVIDIA is trying to do is build a computing platform for this new world, this machine learning world, this generative AI world, this agentic AI world.

We're trying to create, you know, as you know, and what's just so deeply profound is after 60 years of computing, we reinvented the entire computing stack. The way you write software from programming to machine learning, the way that you process software from CPUs to GPU, the way that the applications from software to artificial intelligence, right?

And so software tools to artificial intelligence. So every aspect of the computing stack and the technology stack has been changed. You know, what we would like to do is to create a computing platform that's available everywhere. And this is really the complexity of what we do. The complexity of what we do is if you think about what we do, we're building an entire AI infrastructure and we think of it as one computer.

I've said before, the data center is now the unit of computing. To me, when I think about a computer, I'm not thinking about that chip. I'm thinking about this thing. That's my mental model and all the software and all the orchestration, all the machinery that's inside. That's my computer.

And we're trying to build a new one every year. - Yeah. - That's insane. Nobody has ever done that before. We're trying to build a brand new one every single year. And every single year, we deliver two or three times more performance. As a result, every single year, we reduce the cost by two or three times.

Every single year, we improve the energy efficiency by two or three times. Right? And so we ask our customers, don't buy everything at one time, buy a little every year. Okay? And the reason for that, we want them cost averaged into the future. All of it's architecturally compatible. Okay?

Now, so that building that alone at the pace that we're doing is incredibly hard. Now, the double part, the double hard part, is then we take that all of that, and instead of selling it as an infrastructure, or selling it as a service, we disaggregate all of it, and we integrate it into GCP.

We integrate it into AWS. We integrate it into Azure. We integrate it into X. Does that make sense? - Yes. - Everybody's integration is different. We have to get all of our architectural libraries, and all of our algorithms, and all of our frameworks, and integrate it into theirs. We get our security system integrated into theirs.

We get our networking integrated into theirs. Isn't that right? - Right. - Then we do basically 10 integrations. And we do this every single year. - Right. - Now, that is the miracle. That is the miracle. - Why? I mean, it's madness. It's madness that you're trying to do this every year.

- I'm thinking about it. - So, what drove you to do it every year, and then related to that, Clark's just back from Taipei, and Korea, and Japan, when meeting with all your supply partners, who you have decade-long relationships with. How important are those relationships to, again, the combinatorial math that builds that competitive moat?

- Yeah, when you break it down systematically, the more you guys break it down, the more everybody breaks it down, the more amazed that they are. - Yes. - And how is it possible that the entire ecosystem of electronics today is dedicated in working with us to build, ultimately, this cube of a computer integrated into all of these different ecosystems, and the coordination is so seamless?

So, there's obviously APIs, and methodologies, and business processes, and design rules that we've propagated backwards, and methodologies, and architectures, and APIs that we've propagated forward. - That have been hardened for decades. - Hardened for decades, yeah, and also evolving as we go. But these APIs have to come together.

- Right, right. - When the time comes, all these things in Taiwan, all over the world being manufactured, they're gonna land somewhere in Azure's data center, they're gonna come together, click, click, click, click, click, click. - Someone just calls an OpenAI API and it just works. - That's right, yeah, exactly.

- Yeah, there's a whole chain. - It's kind of craziness, right? - There's a whole chain. - And so, that's what we invented, that's what we invented, this massive infrastructure of computing. The whole planet is working with us on it. It's integrated into everywhere. It's, you could sell it through Dell, you could sell it through HPE.

It's hosted in the cloud. It's all the way out at the edge. People use it in robotic systems now and humanoid robots. They're in self-driving cars. They're all architecturally compatible. Pretty kind of craziness. - It's craziness. - Clark, I don't want you to leave the impression I didn't answer the question.

In fact, I did. What I meant by that when relating to your ASIC is the way to think about, we're just doing something different. - Yes. - As a company, we want to be situationally aware, and I'm very situationally aware of everything around our company and our ecosystem. I'm aware of all the people doing alternative things and what they're doing, and sometimes it's adversarial to us, sometimes it's not.

I'm super aware of it. But that doesn't change what the purpose of the company is. The singular purpose of the company is to build an architecture, that a platform that could be everywhere. - Right. - That is our goal. We're not trying to take any share from anybody. NVIDIA is a market maker, not share taker.

If you look at our company slides, not one day does this company talk about market share, not inside. All we're talking about is how do we create the next thing? What's the next problem we can solve? In that flywheel, how can we do a better job for people? How do we take that flywheel that used to take about a year, how do we crank it down to about a month?

- Yes, yes. - What's the speed of light of that? Isn't that right? And so we're thinking about all these different things, but the one thing we're not, we're situationally aware of everything, but we're certain that what our mission is, is very singular. The only question is whether that mission is necessary.

Does that make sense? - Yes. - And all companies, all great companies, ought to have that at its core. It's about what are you doing? - For sure. - The only question, is it necessary? Is it valuable? - Right. - Is it impactful? Does it help people? And I am certain that you're a developer, you're a generative AI startup, and you're about to decide how to become a company.

The one choice that you don't have to make is which one of the ASICs do I support? If you just support CUDA, you know you could go everywhere. You could always change your mind later. - Right. - But we're the on-ramp to the world of AI. Isn't that right?

Once you decide to come onto our platform, the other decisions you could defer. You could always build your own ASICs later. - Right. - You know, we're not against that. We're not offended by any of that. When I work with, when we work with all the CSPs, GCP, Azure, we present our roadmap to them years in advance.

They don't present their ASIC roadmap to us, and it doesn't ever offend us. Does that make sense? We create, we're in a, if you have a sole purpose, and your purpose is meaningful, and your mission is dear to you, and is dear to everybody else, then you could be transparent.

Notice my roadmap is transparent at GTC. My roadmap goes way deeper to our friends at Azure, and AWS, and others. We have no trouble doing any of that, even as they're building their own ASICs. - I think, you know, when people observe the business, you said recently that the demand for Blackwell is insane.

You said one of the hardest parts of your job is the emotional toll of saying no to people in a world that has a shortage of the compute that you, that you can produce and have on offer. But critics say this is just a moment in time, right? They say this is just like Cisco in 2000, we're overbuilding fiber.

It's gonna be boom and bust. You know, I think about the start of 23 when we were having dinner. The forecast for NVIDIA at that dinner in January of 23 was that you would do 26 billion of revenue for the year 2023. You did 60 billion, right? The 25 people- - Let's just, let the truth be known.

That is the single greatest failure of forecasting the world has ever seen. - Right, right, right. - Can we all, can we all at least admit that? - What, what, what, what? To me, to me- - That was my takeaway. I just go- (laughing) - And that was, and that was, we got so excited in November 22 because we had folks like Mustafa from Inflection and Noah from Character coming in our office talking about investing in their companies.

And they said, "Well, if you can't pencil out investing in our companies, then buy NVIDIA." Because everybody in the world is trying to get NVIDIA chips to build these applications that are gonna change the world. And of course, the Cambrian moment occurred with ChatGPT, and notwithstanding that fact, these 25 analysts were so focused on the crypto winter that they couldn't get their head around an imagination of what was happening in the world, okay?

So it ended up being way bigger. You say in very plain English, the demand is insane for Blackwell, that it's going to be that way for as far as you can, you know, for as far as you can see. Of course, the future is unknown and unknowable, but why are the critics so wrong that this isn't going to be the Cisco-like situation of overbuilding in 2000?

- Yeah. The best way to think about the future is reason about it from first principles. - Correct. - Okay, so the question is, what are the first principles of what we're doing? Number one, what are we doing? What are we doing? The first thing that we are doing is we are reinventing computing.

Do we not? We just said that. The way that computing will be done in the future will be highly machine-learned. - Yes. - Highly machine-learned, okay? Almost everything that we do, almost every single application, Word, Excel, PowerPoint, Photoshop, Premiere, you know, AutoCAD, you give me your favorite application that was all hand-engineered, I promise you it will be highly machine-learned in the future, isn't that right?

And so all these tools will be, and on top of that, you're gonna have machines, agents that help you use them. - Right. - Okay? And so we know this for a fact at this point, right? Isn't that right? We've reinvented computing, we're not going back. The entire computing technology stack is being reinvented.

Okay, so now that we've done that, we said that software is gonna be different. What software can write is gonna be different. How we use software will be different. So let's now acknowledge that. So those are my ground truth now. - Yes. - Now the question, therefore, is what happens?

And so let's go back and let's just take a look at how's computing done in the past. So we have a trillion dollars worth of computers in the past. We look at it, just open the door, look at the data center, and you look at it and say, are those the computers you want doing that, doing that future?

And the answer is no. - Right. - Right, you got all these CPUs back there. We know what it can do and what it can't do. And we just know that we have a trillion dollars worth of data centers that we have to modernize. And so right now, as we speak, if we were to have a trajectory over the next four or five years to modernize that old stuff, that's not unreasonable.

- Right. - Sensible. So we have a trillion-- - And you're having those conversations with the people who have to modernize it. - Yeah. - And they're modernizing it on GPU. - That's right. I mean, well, let's make another test. You have $50 billion of CapEx you'd like to spend.

Option A, option B, build CapEx for the future. - Right. - Or build CapEx like the past. - Right. - Now you already have the CapEx of the past. - Right, right. - It's sitting right there. It's not getting much better anyways. Moore's law has largely ended. And so why rebuild that?

Let's just take $50 billion, put it into generative AI. Isn't that right? And so now your company just got better. - Right. - Now, how much of that 50 billion would you put in? Well, I would put in 100% of the 50 billion because I've already got four years of infrastructure behind me that's of the past.

And so now you just, I just reasoned about it from the perspective of somebody thinking about it from first principles and that's what they're doing. Smart people are doing smart things. Now the second part is this. So now we have a trillion dollars worth of capacity to go build, right?

Trillion dollars worth of infrastructure. We're about, you know, call it $150 billion into it. - Right. - Okay. So we have a trillion dollars in infrastructure to go build over the next four or five years. Well, the second thing that we observe is that the way that software is written is different, and how software is gonna be used is different.

In the future, we're gonna have agents. Isn't that right? - Correct. - We're gonna have digital employees in our company. In your inbox, you have all these little dots and these little faces. In the future, there's gonna be icons of AIs. Isn't that right? I'm gonna be sending them.

I'm gonna be, I'm no longer gonna program computers with C++. I'm gonna program AIs with prompting. Isn't that right? Now this is no different than me talking to my, you know, this morning, I wrote a bunch of emails before I came here. I was prompting my teams. - Of course.

Yeah. - And I would describe the context. I would describe the fundamental constraints that I know of. And I would describe the mission for them. I would leave it sufficiently, I would be sufficiently directional so that they understand what I need. And I wanna be clear about what the outcome should be, as clear as I can be.

But I leave enough ambiguous space on, you know, a creativity space so they can surprise me. Isn't that right? - Absolutely. - It's no different than how I prompt an AI today. - Yeah. - It's exactly how I prompt an AI. And so what's gonna happen is, is on top of this infrastructure of IT that we're gonna modernize, there's gonna be a new infrastructure.

This new infrastructure are going to be AI factories that operate these digital humans. And they're gonna be running all the time, 24/7. - Right. - We're gonna have 'em for all of our companies all over the world. We're gonna have 'em in factories. We're gonna have 'em in autonomous systems.

Isn't that right? So there's a whole layer of computing fabric, a whole layer of what I call AI factories that the world has to make that doesn't exist today at all. - Right. - So the question is, how big is that? - Right. - Unknowable at the moment. Probably a few trillion dollars.

- Right. - Unknowable at the moment, but as we're sitting here building in, the beautiful thing is the architecture for this modernizing this new data center and the architecture for the AI factory is the same. - Right. - That's the nice thing. - And you made this clear. You've got a trillion of old stuff.

You've got to modernize. You at least have a trillion of new AI workloads coming on. - Yeah. - Give or take, you'll do 125 billion in revenue this year. You know, there was, at one point somebody told you the company would never be worth more than a billion. As you sit here today, is there any reason, right, if you're only 125 billion out of a multi-trillion TAM, that you're not going to have 2X the revenue, 3X the revenue in the future that you have today?

Is there any reason your revenue doesn't? - No. - Yeah. - Yeah. As you know, it's not about, everything is, you know, companies are only limited by the size of the fish pond, you know? - Yes, yes. - A goldfish can only be so big. And so the question is, what is our fish pond?

What is our pond? And that requires a little imagination. And this is the reason why market makers think about that future, creating that new fish pond. It's hard to figure this out looking backwards and try to take share. - Right. - You know, share takers can only be so big.

- For sure. - Market makers can be quite large. - For sure. - Yeah, and so, you know, I think the good fortune that our company has is that since the very beginning of our company, we had to invent the market for us to go swim in. That market, and people don't realize this anymore, but, you know, we were at ground zero of creating the 3D gaming PC market.

- Right, right. - We largely invented this market and all the ecosystem and all the graphics card ecosystem, we invented all that. And so the need to invent a new market to go serve it later is something that's very comfortable for us. - Exactly, exactly. And speaking to somebody who's invented a new market, you know, let's shift gears a little bit to models and OpenAI. OpenAI raised, as you know, six and a half billion dollars this week, at like $150 billion valuation.

We both participated. - Yeah, really happy for them, really happy they came together. - Right. - Yeah, they did a great stand and the team did a great job, yeah. - Reports are that they'll do 5 billion-ish of revenue or run rate revenue this year, maybe going to 10 billion next year.

If you look at the business today, it's about twice the revenue as Google was at the time of its IPO. They have 250 million-- - Is that right? - Yeah, 250 million weekly average users, which we estimate is twice the amount Google had at the time of its IPO.

- Is that right, okay, wow. - And if you look at the multiple of the business, if you believe 10 billion next year, it's about 15 times the forward revenue, which is about the multiple of Google and Meta at the time of their IPO, right? When you think about a company that had zero revenue, zero weekly average users 22 months ago-- - Brad has an incredible command of history.

- When you think about that, talk to us about the importance of OpenAI as a partner to you and OpenAI as a force in kind of driving forward, you know, kind of public awareness and usage around AI. - Well, this is one of the most consequential companies of our time.

The, a pure play AI company pursuing the vision of AGI and whatever its definition. I almost don't think it matters fully what the definition is, nor do I, you know, really believe that the timing matters. The one thing that I know is that AI is gonna have a roadmap of capabilities over time.

And that roadmap of capabilities over time is gonna be quite spectacular. And along the way, long before it even gets to anybody's definition of AGI, we're gonna put it to great use. All you have to do is right now as we speak, go talk to digital biologists, climate tech researchers, material researchers, physical sciences, astrophysicists, quantum chemists.

You go ask video game designers, manufacturing engineers, roboticists, pick your favorite, whatever industry you wanna go pick. And you go deep in there and you talk to the people that matter and you ask them, has AI revolutionized the way you work? And you take those data points and you come back and you then get to ask yourself, how skeptical do you wanna be?

Because they're not talking about AI as a conceptual benefit someday. They're talking about using AI right now. Right now, ag tech, material tech, climate tech, you pick your tech, you pick your field of science. They are advancing, AI is helping them advance their work right now as we speak.

Every single industry, every single company, every university, unbelievable, isn't that right? - Right. - It is absolutely going to somehow transform business. We know that. - Right. - I mean, it's so tangible, you could-- - It's happening today. - It's happening today. - It's happening today. - Yeah, yeah.

And so I think that the awakening of AI that ChatGPT triggered, it's completely incredible. And I love their velocity and their singular purpose of advancing this field. And so really, really consequential. - And they've built an economic engine that can finance the next frontier of models, right? And I think there's a growing consensus in Silicon Valley that the whole model layer is commoditizing.

Llama is making it very cheap for lots of people to build models. And so early on here, we had a lot of model companies, Character and Inflection and Cohere and Mistral and go through the list. And a lot of people question whether or not those companies can build the escape velocity on the economic engine that can continue funding those next generation.

My own sense is that there's gonna be, that's why you're seeing the consolidation, right? OpenAI clearly has hit that escape velocity. They can fund their own future. It's not clear to me that many of these other companies can. Is that a fair kind of review of the state of things in the model layer that we're going to have this consolidation like we have in lots of other markets to market leaders who can afford, who have an economic engine, an application that allows them to continue to invest?

- First of all, there's a fundamental difference between a model and artificial intelligence, right? - Yeah. - A model is an essential ingredient for artificial intelligence. It's necessary, but not sufficient. - Correct. - And so, and artificial intelligence is a capability, but for what? - Right. - Then what's the application?

- Right. - The artificial intelligence for self-driving cars is related to the artificial intelligence for humanoid robots, but it's not the same, which is related to the artificial intelligence for a chatbot, but not the same. - Correct. - And so, you have to understand the taxonomy of-- - Stack.

- Yeah, of the stack. And at every layer of the stack, there will be opportunities, but not infinite opportunities for everybody at every single layer of the stack. Now, I just said something, all you have to do is replace the word model with GPU. In fact, this was the great observation of our company 32 years ago, that there's a fundamental difference between GPU, graphics chip or GPU, versus accelerated computing.

And accelerated computing is a different thing than the work that we do with AI infrastructure. It's related, but it's not exactly the same. It's built on top of each other. It's not exactly the same. And each one of these layers of abstraction requires fundamentally different skills. Somebody who's really, really good at building GPUs has no clue how to be an accelerated computing company.

I can, there are a whole lot of people who build GPUs. And I don't know which one came, we invented the GPU, but you know that we're not the only company that makes GPUs today. - Correct. - And so, there are GPUs everywhere, but they're not accelerated computing companies.

And there are a lot of people who, you know, they're accelerators, accelerators that do application acceleration, but that's different than an accelerated computing company. And so for example, a very specialized AI application. - Right. - Could be a very successful thing. - Correct. And that is MTIA. - That's right.

But it might not be the type of company that had broad reach and broad capabilities. And so, you've got to decide where you want to be. There's opportunities probably in all these different areas, but like building companies, you have to be mindful of the shifting of the ecosystem and what gets commoditized over time.

Recognizing what's a feature versus a product. - Right. - Versus a company. - For sure. - Okay. I just went through, okay. And there's a lot of different ways you can think about this. - Of course, there's one new entrant that has the money, the smarts, the ambition. That's xAI.

- Yeah. - Right? And well, there are reports out there that you and Larry and Elon had dinner. They talked to you out of 100,000 H100s. They went to Memphis and built a large coherent super cluster in a matter of months. - You know. - So first, three points don't make a line, okay.

Yes, I had dinner with them. (laughing) - Causality is there. What do you think about their ability to stand up that super cluster? And there's talk out there that they want another 100,000 H200s, right? To expand the size of that super cluster. You know, first talk to us a little bit about X and their ambitions and what they've achieved.

But also, are we already at the age of clusters of 200 and 300,000 GPUs? - The answer is yes. And then the, first of all, acknowledgement of achievement where it's deserved. From the moment of concept to a data center that's ready for NVIDIA to have our gear there, to the moment that we powered it on, had it all hooked up, and it did its first training.

- Yeah. - Okay? - Correct. - That first part, just building a massive factory, liquid cooled, energized, permitted, in the short time that was done. I mean, that is like superhuman. - Right. - Yeah, and as far as I know, there's only one person in the world who could do that.

- Right. - I mean, Elon is singular in this understanding of engineering and construction and large systems and marshaling resources. - Incredible. - Yeah, just, it's unbelievable. And then, and of course, then his engineering team is extraordinary. I mean, the software team's great, the networking team's great, the infrastructure team is great.

You know, Elon understands this deeply. And from the moment that we decided to go, the planning with our engineering team, our networking team, our infrastructure computing team, the software team, all of the preparation in advance, then all of the infrastructure, all of the logistics and the amount of technology and equipment that came in on that day, NVIDIA's infrastructure and computing infrastructure and all that technology, to training, 19 days.

Hang on, you just, you know what? - Did anybody sleep 24/7? - No question that nobody slept. But first of all, 19 days is incredible, but it's also kind of nice to just take a step back and just, do you know how many days 19 days is? It's just a couple of weeks.

And the amount of technology, if you were to see it, is unbelievable. All of the wiring and the networking and, you know, networking NVIDIA gear is very different than networking hyperscale data centers, okay? The number of wires that goes in one node, the back of a computer is all wires.

And just getting this mountain of technology integrated and all the software, incredible. Yeah, so I think what Elon and the X team did, and I'm really appreciative that he acknowledges the engineering work that we did with him and the planning work and all that stuff. But what they achieved is singular, never been done before.

Just to put in perspective, 100,000 GPUs, that's easily the fastest supercomputer on the planet. That's one cluster. A supercomputer that you would build would take normally three years to plan. - Right. - And then they deliver the equipment and it takes one year to get it all working. - Yes.

- We're talking about 19 days. - Wow. - What's the credit of the NVIDIA platform, right? That it's, the whole processes are hardened. - That's right, yeah. Everything's already working. And of course there's a whole bunch of X algorithms and X framework and X stack and things like that.

And we got a ton of integration we have to do, but the planning of it was extraordinary. Just pre-planning of it to, you know. - N of one is right. Elon is an N of one. But you answered that question by starting off saying, yes, 200 to 300,000 GPU clusters are here, right?

Does that scale to 500,000? Does it scale to a million? And does the demand for your products depend on it scaling to millions? - That part, the last part is no. My sense is that distributed training will have to work. - Right. - And my sense is that distributed computing will be invented.

- Right. - And some form of federated learning and distributed, asynchronous distributed computing is going to be discovered. And I'm very enthusiastic and very optimistic about that. Of course, the thing to realize is that the scaling law used to be about pre-training. Now we've gone to multimodality, we've gone to synthetic data generation.

- Right. - Post-training has now scaled up incredibly. Synthetic data generation, reward systems, reinforcement learning based. And then now inference scaling has gone through the roof. - Right. - The idea that a model, before it answers your answer, had already done internal inference 10,000 times, is probably not unreasonable.

And it's probably done tree search, it's probably done reinforcement learning on that, it's probably done some simulations, surely done a lot of reflection, it probably looked up some data, it looked up some information, isn't that right? And so its context is probably fairly large. I mean, this type of intelligence is, well, that's what we do.

- Right. - That's what we do, isn't that right? And so the ability, this scaling, if you just did that math and you compound it with, you compound that with 4X per year on model size and computing size. And then on the other hand, demand continues to grow in usage.

Do we think that we need millions of GPUs? No doubt. - Yeah. - Yeah, that is a fair certainty now. - Yeah. - And so the question is, how do we architect it from a data center perspective? And that has a lot to do with, are there data centers that are gigawatts at a time, or are they 250 megawatts at a time?

And my sense is that you're gonna get both. - I think analysts always focus on the current architectural bet. But I think one of the biggest takeaways from this conversation is that you're thinking about the entire ecosystem and many years out. So the idea that, because NVIDIA is just scaling up or scaling out, it's to meet the future.

It's not such that you're only dependent on a world where there's a 500,000 or a million GPU cluster. By the time there's distributed training, you'll have written the software to enable that. - That's right. Remember, without Megatron that we developed some seven years ago now, the scaling of these large training jobs wouldn't have happened.

And so we invented Megatron, we invented NCCL, GPUDirect, all of the work that we did with RDMA, that made it possible to easily do pipeline parallelism, right? And so all the model parallelism that's being done, all the breaking of the distributed training and all the batching and all that, all of that stuff is because we did the early work.

And now we're doing the early work for the future generation. - So let's talk about Strawberry and o1. I wanna be respectful of your time. - I got all the time in the world, guys. - Well, you're very generous. - Yeah, I've got all the time in the world.

- But first, I think it's cool that they named o1 after the O-1 visa, right? Which is about recruiting the world's best and brightest, you know, and bringing them to the United States. It's something I know we're both deeply passionate about. So I love the idea that building a model that thinks and that takes us to the next level of scaling intelligence, right?

Is an homage to the fact that it's these people who come to the United States by way of immigration that have made us what we are, bring their collective intelligence to the United States. - Surely an alien intelligence. - Certainly. - Yeah. - You know, it was spearheaded by our friend, Noam Brown, of course.

He worked on Pluribus and Cicero when he was at Meta. How big a deal is inference time reasoning as a totally new vector of scaling intelligence, separate and distinct from, right, just building larger models? - It's a huge deal. It's a huge deal. I think the, a lot of intelligence can't be done a priori.

- Right. - You know, and a lot of computing, even a lot of computing can't be reordered. I mean, just, you know, out of order execution can't be done a priori, you know? And so a lot of things that can only be done in runtime. - Right. - And so whether you think about it from a computer science perspective, or you think about it from an intelligence perspective, too much of it requires context.

- Right. - The circumstance. - Right. - The type of answer you're looking for. Sometimes just a quick answer is good enough. - Right. - Depends on the consequential, you know, impact of the answer. - Right. - You know, depending on the nature of the usage of that answer.

So some answers, "Please take a night." Some answers, "Take a week." - Yes. - Is that right? So I could totally imagine me sending off a prompt to my AI and telling it, you know, "Think about it for a night." - Right. - "Think about it overnight. Don't tell me right away." - Right.

- "I want you to think about it all night. And then come back and tell me tomorrow what's your best answer and reason about it for me." And so I think the segmentation of intelligence from now, from a product perspective, there's going to be one-shot versions of it. - Right, for sure.

- Yeah. And then there'll be some that take five minutes, you know. - And the intelligence layer that routes those questions to the right model. - Yeah. - For the right use case. I mean, we were using advanced voice mode and o1-preview last night. I was coaching my son for his AP history test.

And it was like having the world's best AP history teacher. - Yeah, right. - Right next to you. - Yeah. - Thinking about these questions. It was truly extraordinary. Again, they're-- - My tutor's an AI today. - Right, right. - I'm serious. - Of course, they're here today. - Yeah.

- Which comes back to this, you know, over 40% of your revenue today is inference. But inference is about ready because of chain of reasoning. - Yeah. - Right? It's about ready-- - It's about to go up by a billion times. - Right, by a million X, by a billion X.

- That's right. That's the part that most people have, you know, haven't completely internalized. This is that industry we were talking about, but this is the industrial revolution. - Right. That's the production of intelligence. - That's right. - Right? - Yeah. It's going to go up a billion times.

- Right. And so, you know, everybody's so hyper-focused on NVIDIA as kind of like doing training on bigger models. - Yeah. - Right? Isn't it the case that your revenue, if it's 50/50 today, you're going to do way more inference in the future. - Yeah. - And then, I mean, training will always be important, but just the growth of inference is going to be way larger than the growth in training.

- We hope, we hope. - It's almost impossible to conceive otherwise. - Yeah, we hope. That's right, that's right. - Right. - Yeah, I mean, it's good to go to school. - Yes. - But the goal is so that you can be productive in society later. And so it's good that we train these models, but the goal is to inference them, you know?

- Are you already using chain of reasoning and, you know, tools like o1 in your own business to improve your own business? - Yeah, our cybersecurity system today can't run without our own agents. - Okay. - We have agents helping us design chips. Hopper wouldn't be possible. Blackwell wouldn't be possible.

Rubin, don't even think about it. We have digital, we have AI chip designers, AI software engineers, AI verification engineers. And we build them all inside because, you know, we have the ability and we'd rather use it, use the opportunity to explore the technology ourselves. - You know, when I walked into the building today, somebody came up to me and said, you know, "Ask Jensen about the culture.

It's all about the culture." I look at the business, you know, we talk a lot about fitness and efficiency, flat organizations that can execute quickly, smaller teams. You know, NVIDIA is in a league of its own, really, you know, at about 4 million of revenue per employee, about 2 million of profits or free cashflow per employee.

You've built a culture of efficiency that really has unleashed creativity and innovation and ownership and responsibility. You've broken the mold on kind of functional management. Everybody likes to talk about all of your direct reports. Is the leveraging of AI the thing that's going to continue to allow you to be hyper-creative while at the same time being efficient?

- No question. I'm hoping that someday, NVIDIA has 32,000 employees today. And we have 4,000 families in Israel. I hope they're well. I'm thinking of you guys. And I'm hoping that NVIDIA someday will be a 50,000 employee company with a hundred million, you know, AI assistants. And they're in every single group.

We'll have a whole directory of AIs that are just generally good at doing things. We'll also have, our inbox is gonna be full of directories of AIs that we work with that we know are really good, specialized at our skill. And so AIs will recruit other AIs to solve problems.

AIs will be in, you know, Slack channels with each other. - And with humans. - Right, and with humans. And so we'll just be one large, you know, employee base, if you will. Some of 'em are digital and AI, some of 'em are biological. And I'm hoping some of 'em even in mechatronics.

- I think from a business perspective, it's something that's greatly misunderstood. You just described a company that's producing the output of a company with 150,000 people, but you're doing it with 50,000 people. Now, you didn't say I was gonna get rid of all my employees. You're still growing the number of employees in the organization, but the output of that organization, right, is gonna be dramatically more.

- This is often misunderstood. AI is not, it's not, AI will change every job. AI will have a seismic impact on how people think about work. Let's acknowledge that. AI has the potential to do incredible good. It has the potential to do harm. We have to build safe AI.

Let's just make that foundational, okay? The part that is overlooked is when companies become more productive using artificial intelligence, it is likely that it manifests itself into either better earnings or better growth or both. - Right. - And when that happens, the next email from the CEO is likely not a layoff announcement.

- Of course, 'cause you're growing. - Yeah, and the reason for that is because we have more ideas than we can explore, and we need people to help us think through it before we automate it. And so the automation part of it, AI can help us do. Obviously, it's gonna help us think through it as well, but it's still gonna require us to go figure out what problems do I wanna solve?

There are a trillion things we can go solve. What problems does this company have to go solve? And select those ideas and figure out a way to automate and scale. And so as a result, we're gonna hire more people as we become more productive. People forget that, you know?

And if you go back in time, obviously we have more ideas today than 200 years ago. That's the reason why GDPs are larger and more people are employed, and even though we're automating like crazy underneath. - It's such an important point of this period that we're entering. One, almost all human productivity, almost all human prosperity is the byproduct of the automation and the technology of the last 200 years.

I mean, you can look at, you know, from Adam Smith and Schumpeter's creative destruction, you can look at charted GDP growth per person over the course of the last 200 years, and it's just accelerated. Which leads me to this question. If you look at the '90s, our productivity growth in the United States was about 2 1/2 to 3% a year, okay?

And then in the 2000s, it slowed down to about 1.8%. And then the last 10 years has been the slowest productivity growth. So that's the amount of labor and capital, or the amount of output we have for a fixed amount of labor and capital. The slowest we've had on record, actually.

And a lot of people have debated the reasoning for this, but if the world is as you just described, and we're going to leverage and manufacture intelligence, then isn't it the case that we're on the verge of a dramatic expansion in terms of human productivity? - That's our hope.

- Right. - That's our hope. And of course, you know, we live in this world, so we have direct evidence of it. - Right. - We have direct evidence of it, either as isolated of a case as an individual researcher. - For sure. - Who is able to, with AI, now explore science at such an extraordinary scale that is unimaginable.

That's productivity. - Right, 100%. - Measure of productivity. Or that we're designing chips that are so incredible at such a high pace and the chip complexities and the computer complexities we're building are going up exponentially while the company's employee base is not. Measure of productivity. - Correct. - The software that we're developing better and better and better because we're using AI and supercomputers to help us.

The number of employees is growing barely linearly. - Okay, okay, okay. Another demonstration of productivity. So whether it's, I can go into, I can spot check it in a whole bunch of different industries. - Yes. - I could gut check it myself. - Yes, you're in business. - That's right.

And so I can, you know, and of course, you can't, we could be overfit, but the artistry of it, of course, is to generalize what is it that we're observing and whether this could manifest in other industries. And there's no question that intelligence is the single most valuable commodity the world's ever known.

And now we're gonna manufacture it at scale. And we, all of us have to get good at, you know, what would happen if you're surrounded by these AIs and they're doing things so incredibly well and so much better than you? - Right. - And when I reflect on that, that's my life.

I have 60 direct reports. - Right. - The reason why they're on eStaff is because they're world-class at what they do. And they do it better than I do. - Right. - Much better than I do. - Right. - I have no trouble interacting with them. And I have no trouble prompt engineering them.

- Right, totally. - I have no trouble programming them. - Right, right. - And so I think that that's the thing that people are going to learn, is that they're all gonna be CEOs. - Right. - They're all gonna be CEOs of AI agents. - Right. - And their ability to have the creativity, the will, and some knowledge on how to reason, break problems down, so that you can program these AIs to help you achieve something like I do.

- Right. - You know, it's called running companies. - Right, now it's, you mentioned something, this alignment and the safe AI. You mentioned the tragedy going on in the Middle East. You know, we have a lot of autonomy and a lot of AI that's being used in different parts of the world.

So let's talk for a second about bad actors, about safe AI, about coordination with Washington. How do you feel today? Are we on the right path? Do we have a sufficient level of coordination? You know, I think Mark Zuckerberg has said, the way we beat the bad AIs is we make the good AIs better.

How would you characterize your view of how we make sure that this is a positive net benefit for humanity, as opposed to, you know, leaving us in this dystopian world without purpose? - The conversation about safety is really important and good. - Yes. - The abstracted view, this conceptual view of AI being a large giant neural network, not so good.

- Right, right. - Okay. And the reason for that is because as we know, artificial intelligence and large language models are related and not the same. There are many things that are being done that I think are excellent. One, open sourcing models so that the entire community of researchers and every single industry and every single company can engage AI and go learn how to harness this capability for their application, excellent.

Number two, it is under-celebrated the amount of technology that is dedicated to inventing AI to keep AI safe. - Yes. - AIs to curate data, to curate information, to train an AI, AI created to align AI, synthetic data generation, AI to expand the knowledge of AI, to cause it to hallucinate less.

All of the AIs that are being created for vectorization or graphing or whatever it is, to inform an AI, guardrailing AI, AIs to monitor other AIs: the system of AIs to create safe AI is under-celebrated. - Right. That we've already built. - That we're building, everybody all over the industry: the methodologies, the red teaming, the process, the model cards, the evaluation systems, the benchmarking systems.

All of that, all of the harnesses that are being built at the velocity that's been built is incredible. - I wonder if the-- - Under-celebrated, do you guys understand? - Yes. - The world still think-- - And there's no government regulation saying you have to do this. - Yeah, right.

- This is the actors in the space today who are building these AIs, are taking seriously and coordinating around best practices with respect to these critical matters. - That's right, exactly. And so that's under-celebrated, under-understood. - Yes. - Somebody needs to, well, everybody needs to start talking about AI as a system of AIs and system of engineered systems, engineered systems that are well-engineered, built from first principles, well-tested, so on and so forth.

Regulation, remember, AI is a capability that can be applied. And don't, it's necessary to have regulation for important technologies, but it's also, don't overreach to the point where some of the regulation ought to be done, most of the regulation ought to be done at the applications. - Right. - FAA, NHTSA, FDA, you name it, right?

All of the different, all of the different ecosystems that already regulate applications of technology. - Right. - Now have to regulate the application of technology that is now infused with AI. - Right. - And so, and so I think, I think there's, don't, don't, don't misunderstand, don't overlook the overwhelming amount of regulation in the world that are going to have to be activated for AI, and don't rely on just one universal, galactic, you know, AI council that's gonna possibly be able to do this, because there's a reason why all of these different agencies were created.

There was, there's a reason why all these different regulatory bodies were created. We'll go back to first principles again. - I'd get in trouble by my partner, Bill Gurley, if I didn't go back to the open source point. You guys launched a very important, very large, very capable open source model.

- Yeah. - Recently. - Yeah. - Recently. - Yeah. - Obviously, Meta is making significant contributions to open source. I find when I read Twitter, you know, you have this kind of open versus closed, a lot of, a lot of chatter about it. - Yeah. - How do you feel about open source, your own open source models' ability to keep up with frontier?

That would be the first question. The second question would be, is that, you know, having that open source model and also having closed source models, you know, that are powering commercial operations, is that what you see into the future? And do those two things, does that create the healthy tension for safety?

- Mm-hmm. Open source versus closed source is related to safety, but not only about safety. - Yes. - You know, and so, so for example, there's absolutely nothing wrong with having closed source models that are, that are the engines of an economic model. - Exactly. - Necessary to sustain innovation.

- Right. - Okay, I celebrate that wholeheartedly. - Right. - It is, it is, it is, I believe, wrong-minded to be closed versus open. - Right. - It should be closed and open. - Plus open. - Yeah, right, because open is necessary for many industries to be activated. Right now, if we didn't have open source, how would all these different fields of science be able to activate, be activated on AI?

- Right. - Right, because they have to develop their own domain-specific AIs, and they have to develop their own, using open source models, create domain-specific AIs. They're related, again, not the same. - Right. - Just because you have an open source model doesn't mean you have an AI, and so you have to have that open source model to enable the creation of AIs.

So financial services, healthcare, transportation, the list of industries, fields of science that has now been enabled as a result of open source, unbelievable. - Are you seeing a lot of demand for your open source models? - Our open source models, so first of all, Llama downloads, right, obviously. Yeah, Mark and the work that they've done, incredible, off the charts.

- Yes. - And it completely activated and engaged every single industry, every single field of science. - Right, right, it's terrific. - The reason why we did Nemotron was for synthetic data generation. Intuitively, the idea that one AI would somehow sit there and loop and generate data to learn itself, it sounds brittle.

- Yes. - And how many times you can go around that infinite loop, that loop, you know, questionable. However, it's kind of, my mental image is kind of like, you get a super smart person, put him into a padded room, close the door for about a month. You know, what comes out is probably not a smarter person.

And so, but the idea that you could have two or three people sit around and we have different AIs, we have different distributions of knowledge and we can go QA back and forth, all three of us can come out smarter. - Right. - And so the idea that you can have AI models exchanging, interacting, going back and forth, debating, reinforcement learning, synthetic data generation, for example, kind of intuitively suggests it makes sense, yeah.

And so our model, Nemotron 350B is, 340B is the best model in the world for reward systems. And so it is the best critic. - Okay, interesting. - Yeah, and so a fantastic model for enhancing everybody else's models. Irrespective of how great somebody else's model is, I'd heavily recommend using Nemotron 340B to enhance and make it better.

And we've already seen, made Llama better, made all the other models better. - Well, we're coming to the end. - Thank goodness. - As somebody who delivered DGX-1 in 2016, it's really been an incredible journey. Your journey is unlikely and incredible at the same time. - Thank you. - You survived, like just surviving the early days was pretty extraordinary.

You delivered the first DGX-1 in 2016. We had this Cambrian moment in 2022. And so I'm gonna ask you the question I often get asked, which is, how long can you sustain what you're doing today? With 60 direct reports, you're everywhere. You're driving this revolution. Are you having fun?

And is there something else that you would rather be doing? - Is this a question about the last hour and a half? The answer is I had a great time. I couldn't imagine anything else I'd rather be doing. Let's see. I don't think it's right to leave the impression that our job is fun all the time.

My job isn't fun all the time, nor do I expect it to be fun all the time. Was that ever an expectation that it was fun all the time? I think it's important all the time. I don't take myself too seriously. I take the work very seriously. I take our responsibility very seriously.

I take our contribution and our moment in time very seriously. Is that always fun? No. But do I always love it? Yes. Like all things. Whether it is family, friends, children, is it always fun? No. Do we always love it? Absolutely, deeply. And so I think the, how long can I do this?

The real question is how long can I be relevant? And that only matters, that piece of information, that question can only be answered with how am I gonna continue to learn? And I am a lot more optimistic today. I'm not saying this simply because of our topic today. I'm a lot more optimistic about my ability to stay relevant and continue to learn because of AI.

I use it, I don't know, but I'm sure you guys do. I use it literally every day. There's not one piece of research that I don't involve AI with. There's not one question that even if I know the answer, I double check on it with AI. And surprisingly, you know, the next two or three questions I ask it reveals something I didn't know.

You pick your topic. You pick your topic. And I think that AI as a tutor, AI as an assistant, AI as a partner to brainstorm with, double check my work. You know, boy, you guys, it's completely revolutionary. And that's just, you know, I'm an information worker. My output is information.

And so I think the contributions that AI will have on society is pretty extraordinary. So I think if that's the case, if I could stay relevant like this and I can continue to make a contribution, I know that the work is important enough for me to want to continue to pursue it.

And my quality of life is incredible. So I mean, what's there to complain about? - I'll say, I can't imagine, you and I have been at this for a few decades. I can't imagine missing this moment. - Yeah, right. - It's the most consequential moment of our careers. We're deeply grateful for the partnership.

- Don't miss the next 10 years. - For the thought partnership. - You make us smarter. - Thank you. - And I think you're really important as part of the leadership, right? That's going to optimistically and safely lead this forward. So thank you for being with us. - Thank you.

Really enjoyed it. Thanks, Brad. Thanks, Clark. Good job. (upbeat music) - As a reminder to everybody, just our opinions, not investment advice.