
Ep8. AI Models, Data Scaling, Enterprise & Personal AI | BG2 with Bill Gurley & Brad Gerstner


Chapters

0:00 Intro
1:10 META Llama 3
20:40 Enterprise AI
36:50 Personal AI
50:07 TSLA Earnings Call & Rideshare
52:10 Tech Check

Transcript

Elon is building a much, much bigger cluster to train a much, much bigger model, as is OpenAI, as is Zuckerberg. I mean, well, Sam just said the bigger models aren't the best. Well, I mean, he may be playing the same game that everybody else is playing, Bill, and trying to throw everybody off the scent.

Great to have you guys. What a week it's been. It's been nuts. There's so much to talk about, and we have our good buddy Sonny, Sundeep Madra, in the house. How are you doing? Sonny is somebody Bill and I go to often when we're talking through all things AI. He's currently at Groq, working on the inference cloud.

So you're deep in thinking about all these things, all these models, AI, and we're going to talk a lot about models today with the release of Llama 3. So it's good to have you, Sonny. Good to be here. Thanks guys. Bill, why are you in town? Board meetings, a couple of board meetings.

Good to have you. Yeah. I always like doing this in person. Yeah, I know you do. Okay. So let's, so let's roll. They're my favorite episodes when you do them live. Okay. There we go. There we go. So models, models, models, models. If AI is the next big thing, then this felt like another really important week.

I mean, we got models being dropped by Meta with Llama 3. That was the one that was really, you know, the category-five earthquake. Microsoft, Snowflake, everybody seems to be out with a new model, but let's start with Zuck. Huge Llama 3 unveiling: three distinct models, an 8 billion, a 70 billion, and a 405 billion parameter model, which is still training and still learning, they're telling us, which is pretty fascinating. But what seems to have, you know, shocked the market is that Meta could pack so much intelligence into such a small model.

And so both smaller models quickly shot up the rankings this week. We have, you know, a screenshot here of that. Of course the 405 is still training, and there've been some hints out of the recent podcast with Zuck and Dwarkesh that it may in fact kind of come in at the top of the polls.

We'll see, it's probably going to train for another couple of months, but I'd love to hear from both of you guys: what were your big takeaways from the launch of Llama 3? And maybe start with you, Sonny. Walk us through kind of just the what and the how of Llama 3 and why it really kind of shook things up.

Yeah, I would say, you know, the biggest impact of Llama 3 is its capabilities at its size. And what, you know, Zuck shared in that interview was that they basically took the model and kept training it past the Chinchilla point. And so really by doing that, which is generally considered, like, sort of the point of diminishing returns, they were able to pack much more information and much more capability into this model with the same data set.

So just for everybody listening: the Chinchilla point, if I understand it correctly, is the by-product of this paper out of Google DeepMind, which basically talked about the relationship between the optimal amount of data to use for a certain amount of compute.
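As an aside, the scaling law being referenced here can be written down compactly. A minimal sketch following the Chinchilla paper (Hoffmann et al., 2022); the fitted constants are omitted, and the 20-tokens-per-parameter figure is the commonly cited rule of thumb rather than an exact result:

```latex
% Chinchilla models training loss as a function of parameter count N
% and training tokens D, with fitted constants E, A, B, alpha, beta:
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
% For a fixed compute budget C \approx 6ND, minimizing L(N, D) gives a
% compute-optimal mix of roughly D \approx 20N, i.e. about 20 tokens
% per parameter. Llama 3 8B trained on ~15T tokens is ~1,875 tokens per
% parameter, far past that point, which is what "kept training it past
% the Chinchilla point" means here.
```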

Yep. But in the case of Meta, when they were training Llama 3, they basically continued with these forward passes of the data. So they were curating the data, refining the data, pushing it back into the model. And I think several people who are working on pre-training at Meta said they were even surprised that it was still learning when they took it offline on that data.

Yeah, and they only took it offline to reallocate the resources to, you know, 405 and other efforts. And I think he said Llama 4. And Llama 4. Right. So the rate of innovation is certainly not slowing down there. So, a 15 trillion parameter model. 15 trillion tokens used to train it.

Oh yeah, yeah, 15 trillion tokens used to train, you know, the model. I know at Groq you guys are deploying Llama 3. I think you deployed it the same day that it came out. So how important is this? How important a development is it in the world of models?

Well, really, you know, Zuck came out and threw down for the entire world of folks that are building models. And it's really disruptive, because when you look at the rankings, you have a model that's much smaller, so much easier to run on all different types of hardware, and much faster.

And so those two things are like catnip for developers. And for us, we saw within the first 48 hours it become the most popular model that we run on Groq. And so, really. Replacing what? Replacing Mixtral 8x7B for us. Interesting. Which was, you know, generally considered the best open source model at that point.

And beyond us sort of running it, the capabilities, the developers that use it, the use cases we've seen it in are incredible. And people are doing a direct replacement of OpenAI across the board. They come to us, or they come to, you know, all the different providers, and they swap out OpenAI, and they don't really see any performance impact or any reasoning impact, which is incredible.

And why replace? What is being optimized in the switch? Price performance, right? Coming from GPT-4, you're probably more than 10 times cheaper. Right. And you're, yeah, 10 times cheaper. Well, let me just tell you: GPT-4 is $10 per million tokens input and $30 per million tokens output.

And Llama 3 70B is $0.60 per million tokens input and $0.70 per million tokens output. I mean, Bill, this seems to be playing right into your thesis around kind of just these models generally. Commoditization. Yeah, you've been skeptical about the amount of dollars it's taking to train some of these venture-backed models and the business models that would come out the other side.
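To make the price gap just quoted concrete, a quick back-of-the-envelope comparison; the per-million-token prices are the ones cited above and will certainly have drifted since:

```python
# Per-million-token prices (USD) as quoted in the conversation.
PRICES = {
    "gpt-4":       {"input": 10.00, "output": 30.00},
    "llama-3-70b": {"input": 0.60,  "output": 0.70},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
for model in PRICES:
    per_request = request_cost(model, 2_000, 500)
    print(f"{model}: ~${per_request:.5f}/request, ~${per_request * 1e6:,.0f} per 1M requests")

# GPT-4 works out to $0.035 per request ($35,000 per million requests);
# Llama 3 70B to about $0.00155 ($1,550 per million), roughly 23x cheaper
# on this mix, which is the "more than 10 times cheaper" in the discussion.
```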

Now we have a business in Meta, right, that just announced they're going to spend $40 billion this year on CapEx, that just trained a model that is 10x less expensive than, you know, the most performant model on the market. I mean, what does this mean for everybody else? Well, there's a couple of things that I put into the mix as I analyze this and answer your question.

You know, first, he made Meta AI free, and he didn't proclaim that this was temporary or that he might pull it back later. And so, you know, that, combined with, I think, Perplexity claiming they're going to have an ad portion and OpenAI hinting at that, at least for the time being, I think the $20 concept is gone.

Yep. And that was a big part of OpenAI's revenue, apparently, or rumored to be. Yeah, over 50% of their revenue is, I think, from consumer. And so that's gone. As long as Meta's free, I don't think anyone pays the $20. And I will say that is as things sit today.

If some crazy feature comes along, you know, we've talked about personal memory, maybe that comes back. But for now, it feels dead. Right. And then I thought the podcast that you mentioned was just incredibly, like, disclosive, transparent, thoughtful. You're talking about Zuckerberg on Dwarkesh? Yes. Yes. I thought it was incredible.

And it's funny, because it came out at the exact same time that Sam and Brad were on Harry Stebbings' 20VC, and Dario was on Ezra Klein. And I would encourage people to listen to all three of them, but Dario and Sam talk in these high-level platitudes about how this stuff's going to cure cancer and we're all going to not have to work anymore.

And Zuck was down in the weeds, in the meat, being super transparent. And I was just like, "Holy sh*t, maybe this guy's in charge now." Right, right. You know? I mean, I saw a lot of people on Twitter saying this was checkmate on all of these closed models that have gotten started and venture-backed over the course of last year.

And certainly, you know, I'm not cheering for that. I'm not sure that I would go so far as to declare it. But I think if you're in the business of producing a closed model, right, we've talked a lot on this pod, there's one of two ways that you can build a business.

You either have to sell it to a consumer for 20 bucks a month, or advertising, you have to get a billion consumers to use your product, or you have to sell it to an enterprise. Somebody has to pay you. And now you have a disruptor coming along and saying, "It's open, it's cheap or free, and I'm not going to charge for it." That's hard to compete with.

And I use them all because I'm curious and I love playing. I'll do the same query on four of them. But right now, it's hard to believe that any of them, including the Google one, are going to have escape velocity because of differentiation. I'm not seeing, maybe you guys are, I'm not seeing an element of differentiation on the consumer-facing tool that's radically different.

Nothing that's going to cause 80% market share. The differentiation is happening on the infrastructure side. And there's another thing that, you know, Zuck said in that, which was, I think, a big throwdown, which is: we're going to spend $100 billion on this. But if the community, because he's made it open, makes it 10% better, right, in some parameter, that's $10 billion of savings for us.

Right. If you just look at what they did this week, they pushed AI Search across their entire family of applications. They have 3 billion people using those every day. He said on his earnings call this week, they already have tens of millions of people running AI Searches on their applications.

And he was even honest in the podcast where he's like, I don't know if this is where people want to do these searches. And it just earned so much credibility with me when someone kind of, you know, comes clean that way. I think one of the other interesting developments, again, just getting back, if I had one big takeaway, it was this large versus small.

Okay. And so I think we have to... Before you go there, I got a question for you. Yeah. Could the CapEx thing be a throwdown? Like, could it be a signal to the rest of the community: this is where we're going to be? Could it be a move to tell everyone else, if you want to stay in this game, you've got to play at this level?

It's also two birds with one stone, right? I think there is that, but there's the second one, which was, you know, also talked about in another set of tweets this week: by putting more effort and resources towards the training, they reduce the inference costs. Yeah. Right, for sure. And for them, you can imagine everything that you're talking about: making it free everywhere, putting it inside all the products, millions of people using it.

That's a huge impact. And I would remind people, you know, something we talked about several episodes ago, but there was a podcast I listened to where Amazon was talking about their Alexa product. And they said, you know, inference was way more of the cost than the training, like night and day.

And so there's a real-world application that's been live. And if that's, do you believe that's true for almost all development projects? Oh, yeah. Yeah. And so, you know, to your question, is capital a signal? Of course it is. Yeah. I mean, in 2021, we talked about capital as being the kingmaker, like who would win the game?
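The economics behind that exchange, spend more on training so each inference is cheaper, can be sketched with toy numbers. These figures are invented for illustration, not Meta's or Amazon's actual costs:

```python
# Toy model of the "overtrain the model to cut inference cost" tradeoff.
# All dollar figures below are hypothetical, chosen only to show the shape.

def lifetime_cost(train_usd: float, serve_usd_per_1m: float,
                  queries_millions: float) -> float:
    """One-time training cost plus serving cost over the model's lifetime."""
    return train_usd + serve_usd_per_1m * queries_millions

# Strategy A: compute-optimal bigger model, cheap to train, costly to serve.
# Strategy B: heavily overtrained smaller model, costly to train, cheap to serve.
cost_a = lambda q: lifetime_cost(100e6, 500.0, q)  # $100M train, $500 per 1M queries
cost_b = lambda q: lifetime_cost(300e6,  50.0, q)  # $300M train,  $50 per 1M queries

# Break-even volume: (300e6 - 100e6) / (500 - 50), in millions of queries.
breakeven_millions = (300e6 - 100e6) / (500.0 - 50.0)
print(f"break-even at ~{breakeven_millions:,.0f} million queries")
# ~444,444 million (~444 billion) queries. At the scale of billions of
# users running searches daily, you cross that quickly, which is why
# inference, not training, dominates the bill.
```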

I think there are four important ingredients to compete in this market. Number one, you have to have capital, and the leaders are spending $40 billion a year. There aren't many sovereigns on the planet that can afford to spend $40 billion a year. Second, data. The big, you know, innovation this week is that data is scaling in the way that compute is scaling.

So you need a lot of data. They have a massive amount of data. Third, you need compute. That's not just about capital. You have to have people who know how to build and stand up infrastructure. You have to have relationships with the entire supply chain, right? And fourth, you have to have distribution, right?

And so they're touching 3 billion consumers. They have a business model. I think he said 50% of the content on Instagram was AI generated in the quarter. Right? So not AI generated, it was AI suggested, right? So it was no longer about your friends looking at something. It was something that, you know, they did.

So that gives them a huge advantage. But I do think there's, you know, when I get back to this bifurcation and why I think this is important: we have these smaller models that are going to be specialized and have specialized use cases. You know, Microsoft is out with Phi, and I want you to talk a little bit about that, Sonny. But we don't see any slowing down on the push to bigger models as well.

So we really see both of these things happening simultaneously. And an analog I was discussing with our team was if you think about different use cases, you might build a small rocket to get a satellite into space and you might build a big rocket to try to get to Mars.

Okay. Now those are both rockets. But they have radically different use cases and radically different cost structures. And I think that the cost structures that are going to be associated with frontier level models, there are going to be very few companies on the planet that are going to be able to build those models.

Because I think the latest discussions, whether it's, you know, Stargate out of Microsoft, a hundred-thousand-GPU cluster; Elon's talking about a hundred-thousand-GPU cluster; Mark is talking about that. I just don't know many companies that are going to be able to compete with that. Yeah, I'll take maybe just a slightly tangential view to that, which is: if you think about, you know, Meta's history in open source, the Open Compute Project, PyTorch, React.js, these are just infrastructure components for them, right?

And they put the investment in so that they can drive improvements in the supply chain. They can drive the ecosystem to make it better. And I think they've really taken that approach with this technology and said, "Hey, this is an infrastructure-level component that, you know, we want the ecosystem to make better." And everyone else is in the business of models, whether you're a hyperscaler or whether, you know, you're one of these model companies.

And I think that's a distinctly different approach for them that puts them at an advantage. But by the way, I think this is worth drilling in on. So, unless one of you corrects me, they are not in the cloud hosting business, and they remain not in the cloud. You're talking about Meta.

Meta. Correct. And so the people that they're up against have businesses they're running based on these things. They're developing this thing, spending some number of billions, and putting it out as open source. I think it's a little different than the Open Compute part, where I don't think they felt the differentiation of their architecture had any impact on the strategic execution of their company; almost the opposite, like, it's a commodity, so let's exploit it like a commodity.

Here, this feels more kind of like a badass throwdown where there's a very intentional element of burning, you know, the strategic ground out there for everyone. You know, similar, I think, to what maybe Google did with Android when they came out, like just protect all around me by making it very hard to have differentiated products built on AI that you might come after me with.

I mean, one of the- Is that fair or is that not? Yeah, I think it's fair, but I just want to make a couple of points that Mark Zuckerberg talked about on the Dwarkesh podcast. One was, he said that they do in fact have revenue-sharing relationships with the hyperscalers, such that when they use their models, they ought to get compensated something for that.

Now, he didn't go into it; I think he said it wasn't a very big number, but relative to $165 billion in revenue, nothing is a big- Any color on that? Yeah, no, nothing is a big number. You know, the hyperscalers have definitely been squeezing all the model makers, right? And they have a really interesting position, especially the ones that are creating their own models, because they have to create a marketplace and they have to ensure that they're operating sort of in a free-market capacity.

But it's difficult, right, when you have your own models, because there's obviously a lot of interest to drive that. I can definitely confirm that the data clouds are paying a revenue share to the open models. I don't know what the revenue share is, but there will be some compensation.

And listen, that compensation can change over time. So that's one bit of it. The second thing is super important for all of us to listen to this again. Zuck said, "We believe in open source, but there may come a time where we have a discovery in our largest model, perhaps, that is fundamental and economic to our business, where we will elect to no longer open source said model." So you can see a world where they will always open source the 7B or, you know, he said he wants to build a 1B or a 500 million parameter model or the 70B, but you can also see a world where their most sophisticated model is not open source because he says, "Listen, I want to build the best personal AI in the world.

It's central to what our business is about. We want to have the advantage associated with that." So the strategy for me, it feels like the reason the earth shook this week is that this felt like the most significant development and disruptive element in the model marketplace. I think it's going to be very difficult for new entrants to be venture-backed, because, you know, OpenAI will continue to get funding because they have this incredible team.

They have a hundred million people using the product and paying them for the product, but I think for all the other closed models, they're going to have trouble getting follow-on financing. And I think for any new models that come along, you would have to have something so different, such an orthogonal angle of attack, in order to get funding.

So I think, to your point, by throwing down on the CapEx that you're going to spend, you are clearing the market of potential competitors, right? It's a very quickly depreciating asset. Boy, I mean, what's just so, like, unbelievable is the steepness of the price curve on a slightly older model.

Like, and if people are maximizing ROI on an inference basis, they're going to take advantage of that like crazy. I mean, it's going to- We took Llama 2 out of GroqCloud. It's not even available. We just took it out and replaced it with Llama 3, and all the developers went to Llama 3.

But it's already, Llama 3 is already one-twentieth the price, whatever. Yeah. So there's no reason. Well, I mean, it seems to me where the value is, again, coming back to, maybe we'll switch to, you know, this is the right transition to talk about enterprise AI, because the value is not in the model, right?

Just like the value is not in storage, right? You could say storage is a part of the AWS cloud, but there's not a lot of value in that thing unto itself. The value is in the enterprise relationship. The value is in, right, the number of services that you're offering to your customers.

So Microsoft and Google are out tonight. Both clouds accelerated their growth on the back of AI. 64% of Fortune 500 companies are now Azure OpenAI customers, which I thought was pretty extraordinary. Big numbers. That's a- GitHub Copilot growing 35% quarter over quarter. And the number of use cases seems absolutely wild.

And what's even crazier is Satya said on the call the revenue growth would be even higher, but they're GPU constrained. Yes, you heard me say it, Bill. They're GPU constrained. We'll come back to that. So I look at this and, you know, GCP is accelerating, Azure's accelerating. We heard it out of ServiceNow; their demand, you know, is accelerating.

So clearly enterprises are finding value, you know, in this. So Sonny, talk to us a little bit about what you're seeing. I know you have a hundred thousand developers in the long tail, and I think a lot of big enterprises as well, using GroqCloud.

But what are these enterprise use cases? And are you surprised when you see these hyperscalers racking up these numbers? I'm not surprised. And let me level up the question for a quick second into, like, where is that spend coming from? And right now, this is, you know, even verified by this report that Andreessen Horowitz put out a couple of weeks ago around enterprise AI.

And what they really showed is, like, the distribution of use is coming from IT to the business units to support. And it's not in these innovation arms, because usually when you see these technologies there, you understand the budgets are limited. So that's awesome. Just as a related point, they also showed that folks are tripling their AI spend this year, right?

And so that kind of lines up to what we're seeing there. Right. And we'll show these slides. Yeah, we'll show these charts there. And, you know, I think the most interesting thing, and I'll get into the use cases, is that 82 percent of the respondents said they either are already on open source or will move to open source.

So that's the interesting fact that's happening there. The use cases, really. And, you know, maybe we got a little bit of alpha from Michael Dell a couple of weeks ago when, you know, he really talked to us about this use case for enterprise RAG, right, where there's all this data.

And I want to be able to reason over that data with a model. Right. And so, you know, his interest, obviously, is what he's selling alongside, you know, his partners. But I think in the cloud, you're seeing that heavily happen right now. Customer support is number two. I know you guys just financed a company in this space.

So congratulations on that deal, which is really interesting. And then I think content moderation and content generation. I think we don't really talk about it enough, but if you think about a business, this is happening all over the place, all the time. Right. And we see a ton of use cases still there, whether it's a daily report or whether it's something you send out to your customers, all of that coming out of those enterprise systems and being sent out.
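Stepping back to the enterprise RAG pattern Michael Dell was describing, here's a deliberately minimal sketch of the idea: retrieve the most relevant internal documents, then have the model reason over only those. The word-overlap scorer is a stand-in for a real embedding model, the sample documents are invented, and the actual LLM call is left as a placeholder:

```python
# Minimal retrieval-augmented generation (RAG) sketch. A production
# system would use vector embeddings and a vector store; the crude
# word-overlap scorer below is a stand-in so the example runs anywhere.
from collections import Counter

DOCS = [
    "Refund policy: enterprise customers may cancel within 30 days.",
    "On-call rotation: the platform team rotates weekly, starting Monday.",
    "Expense policy: travel over $500 requires VP approval.",
]

def score(query: str, doc: str) -> int:
    """Crude relevance score: count of words shared by query and doc."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(DOCS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Stuff retrieved context into the prompt so the model can reason
    over private enterprise data it was never trained on."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund policy for enterprise customers?"))
# The resulting prompt would then be sent to whatever hosted or on-prem
# model the enterprise uses.
```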

I mean, Bill, do you remember? 18 months ago, pre-ChatGPT, I imagine less than five percent of enterprises, you know, were building AI production use cases. Today, I don't know an enterprise that's not at least running a test use case. Well, you said there was a percentage in there; it wasn't 100.

It wasn't 100. No, no. But that was 64 percent using Azure. But I think it's probably close to 100 percent. I can't imagine a company in the S&P 500 that's not at least testing AI. Right. You would really have to be asleep. Do you remember any other technologies that went from zero to ubiquity this fast?

I mean, maybe the Internet itself. People said, oh my God, I got to get on the Internet. Mobile, mobile. I think mobile. Although with this one, I don't think that's a secret: we've talked about how the incumbents moved very quickly here. And I think you can give OpenAI a lot of credit, because they were out selling the mission and out talking to the customer base and doing everything they could to promote.

It's also easy to use. Like, when you talk about some of these other technologies, going to cloud was, like, a real effort. Right. Right. You had to migrate your entire database. Exactly. You had to do a lot of real work. This is an API call. Right. Right. And again, credit to OpenAI.
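Since "it's an API call" is doing a lot of work in that exchange, here is roughly what it looks like in practice with the openai Python SDK. The base URL and model name below are illustrative placeholders; every provider documents its own:

```python
# Most providers expose OpenAI-compatible endpoints, so moving a workload
# off OpenAI is often just a base_url, api_key, and model-name change.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # illustrative URL
    api_key="your-provider-key",
)

resp = client.chat.completions.create(
    model="llama3-70b",  # provider-specific model name
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
)
print(resp.choices[0].message.content)
```

Point the same few lines back at OpenAI's endpoint and the code is unchanged, which is exactly the low switching cost, and the convention, that OpenAI established.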

They're the ones that led everyone down that path. And everyone else now is OpenAI compatible or has a similar looking API. It's very easy to use. And part of the reason, you know, one of the things, you know, you mentioned, Michael Dell, he tweeted the other day this Barclays survey that I thought was really fascinating.

So this is among enterprise CIOs moving back to hybrid and on-prem. The number was that 83 percent of respondents said that they were going to repatriate at least some of their workloads back to on-prem. And that was up from 49 percent, or 43 percent, in 2020.

Right. And so I think it's an interesting case that you're moving back. My sense is it's because they don't trust certain data in the cloud. Right. So they don't want to run, maybe, code generation, you know, tools in the cloud. And the other one is just data gravity.

Maybe they have on-prem databases and they don't want the cost and the headache associated with moving that to the cloud. Do you see this in other parts of your world, Sonny? Yeah, you know, definitely a lot. A third one, which is, I think there's still a lack of trust.

And this gets expanded every time. You know, we had that interview with the OpenAI CTO where they asked her, hey, have you trained this on, you know, data? And she didn't answer the question quite well. And so, and I've heard this, you know, in conversations with hyperscalers, customers will not trust them, and the hyperscalers will legally sign that they will not train.

They still will not trust. Right. They just believe all these stories around the data making these models better, that everyone just wants a way to get access to that data to make the models better. So I think the combination of those three factors is 100 percent what we see.

And so what happens, you know, with us, which is just, you know, basically maybe a pattern: people come and try something in the cloud, make sure that it works, and then immediately want to get on the phone with you and say, hey, can I get this on-prem?

Interesting. Or at least sequestered. Or sequestered, yeah; like, I use on-prem as, like, something virtual. I thought there were two things in these podcasts that we keep referencing that relate to the enterprise decision-making. One, you know, Zuck said something that kind of makes sense to me.

He just said, like, you know, cramming data in the context window feels a little hacky, or, I don't know what his exact words were. And so I think there's still this future in front of us where data gets more deeply integrated into the model, and the trust issues there.

And we don't quite know how that's all going to come together. It's still TBD. Yes. Yeah. And you guys probably haven't tried it, because it's just not, you know, feasible if you're not a developer, but using, like, a context window of, like, a million tokens.

It's, like, really hard. Yes. You can't use it up, you're saying. Well, you can use it up. But the amount of gathering and work you have to do to get to a million tokens, you know, think about it, it's like several books, you know. And so, you know, people talk about it like it's this wonderful thing, but it's not, you know, overly usable.
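The "several books" comparison checks out with quick arithmetic, using the common rough conversion of about three-quarters of a word per token (an approximation that varies by tokenizer):

```python
# How much text does a one-million-token context window actually hold?
TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75    # rough rule of thumb; tokenizer-dependent
WORDS_PER_NOVEL = 90_000  # a typical full-length novel

words = TOKENS * WORDS_PER_TOKEN
print(f"~{words:,.0f} words, roughly {words / WORDS_PER_NOVEL:.0f} novels")
# ~750,000 words, roughly 8 novels, which is why assembling enough
# material to fill the window is real work for a non-developer.
```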

And then the other one, I thought the most interesting thing out of the Sam podcast was he talked about whether or not developers were kind of going wholesale on top of OpenAI, or whether they were just using it in a lightweight way and then doing a bunch of stuff externally.

And he implied that most people are doing the latter. But then he said, if you do that, we're going to steamroll you, and you need to bet on us being successful. Which would mean dumping in your data and trusting OpenAI more fully. I don't know. What was your interpretation of what he was trying to say?

It was exactly that, plus the following. If you take him for what he was saying, which is the models are going to get better, well, what room does it leave for anything else? Because you shouldn't be taking a model and wrapping it with your own, you know, your own code or your own technology or framework, if you're going to assume the model gets better.

Well, why do you need whatever I'm building if the model can just do everything? I actually thought Sam and Brad were really articulate on this point, whether you believe them or not. And I think it was consistent with the tweet that Aaron Levie sent out yesterday, which is, people are not thinking ambitiously enough as to where these things are going.

And, you know, today we're really in the land of answers, right? We're running some RAG over some, you know, HR data that we have in our company and building a little chatbot so it can answer questions more efficiently than my HR group can. But they're saying you really need to think about agentic thinking.

Like, what is the multistep reasoning that can be done in the business? And, you know, I know how big your HR group is. Well, here it's pretty easy to do. Here it's pretty easy to do. But, you know, my sense is that I'm kind of in this Aaron Levie camp when you look out two or three years.

I mean, listen, every week we're blown away by, you know, how these models are progressing. It's hard for me to think, in three years, at the rate of progress and the amount of investment that's going into this, that we're not going to be a lot further down the path in terms of this reasoning.

And when we get there, I think people are going to want that to be more proprietary, because I think the advantages that are going to inure to the enterprise are even greater. Let me throw one other thing in here. You know, I was sitting with my team this week and we were trying to figure out who are the winners and losers, not of the providers of the arms, but the buyers of the arms.

OK. So if every Fortune 500 company is buying AI, one of the things that Bill often reminds me is: fine, it will give a little improvement to an airline that starts using AI. But airlines are a competitive industry, and they're just going to compete away all the profits. And so that's a defensive move, right?

You don't actually improve the business model because all the earnings get competed away. So what you want to find is a market leader, somebody who has 70 or 80 percent of a given market who gets to hang on, right, to all of this. Or compound their lead. Right. Or compound their lead.

And so, you know, there's a company coming public in a few weeks called Lineage, which is in the cold storage business. So they basically are an integral part of the food supply chain, you know, any refrigerated storage of food. And I think they have a huge percentage of the market.

And I think they have 50 data analysts and scientists now in San Francisco, because if they can turn the screw a quarter of an inch on spoilage, a quarter of an inch on the energy consumption to keep this food cold, it all falls to their bottom line.

And by the way, it doesn't get competed away. Yeah. Right. And so they're looking at leveraging, you know, I happen to know because they were a Snowflake customer, and they were using some Snowflake AI, you know, to improve these use cases. And so I think there are going to be a whole host of businesses, Bill, industrial businesses, that capture some of these profits and get to hold on to them.

I'd be interested in Sonny's reaction to your question, and maybe to my answer, which is: when I meet a company and see them using AI in a way that feels ultra compelling to us, an improvement of their own strategic business position, it's almost always a more traditional AI model that's running a very particular optimization problem.

It's not an LLM application. And this stuff's all happening simultaneously. You know, I think that's true. I don't think it particularly matters, because what generative AI has done, what the chat moment has done, is it's caused every enterprise to get off their ass and get all their data organized, because that's a condition required to benefit from any of this stuff.

But then I think what they do figure out along the way is some basic, you know, machine learning around time series or forecasting or things that have been around for quite a while, Bill, is where they get the most bang for the buck, maybe not from the generative AI, but they might get there because they got into the pool because they were motivated by generative AI.

I certainly think it's an accelerant, based on everything we're seeing. Yeah. You know, I'll disagree with you. I think, you know, what this technology really enables is, you know, we get spoiled in Silicon Valley because we can get the best engineers to build, like, the most difficult things.

But for the average business, doing most of these problems pre-generative-AI was very, very difficult. Now you can basically take a generative AI model and have it do one of the most advanced things in the world. And, you know, we've shared an example in our chat, right, where you can take a picture of a plate of food and tell it to return to you what's in that food, and how many calories it might be, and what's the portion size.

Right. That's done sort of, again, with one prompt. And so now you've given that ability to every business, every small business. Right. It's like this business you're talking about. They can do a lot of improvements without having to have 50 people in San Francisco. So I think that's where the improvements are really going to come.

Although I could push back on you and use your own statement about OpenAI: if they achieve everything, then what you're building may be commoditized just the same. Well, someone still has to take it and apply it to that business. Right. And it may just be the one, you know, the one tech person in that business.

I was on a walk last weekend with a great economist over at Stanford, and we were talking about the amount of productivity improvement that would be unleashed into the economy because of AI. And what was interesting is, you know, productivity has actually been under assault in this country: because we've limited immigration, which was a huge source of productivity; because de-globalization is actually hurting productivity, since we're not moving the production of goods and services to the lowest-cost places; you know, anything that's causing friction.

So it's like all the goodness to come out of AI, we need it just to replace the headwinds that we have on productivity in other places. But I digress. Let's move on. Well, can I add one thing to that? Just building on the point that you said, like, Aaron Levie was talking about, we're not thinking about it big enough.

And, you know, someone shared this on Twitter and I can't find the original author, but if we can, at some point we'll share it, which is: in the Industrial Revolution, you saw, you know, car making go from something bespoke, one car per day, to a factory making a thousand.

Same for clothing, same for farms. And, you know, we've looked at technology as this huge accelerant, but we really haven't had the industrial revolution for technology. It's still pretty bespoke, you know, one developer writing code. And now you have this idea where, you know, go back to a place where you spent a lot of time: travel search.

Right. You could have one agent, or an agent and a thousand instances of it, do a thousand searches for you and find what you're looking for. We haven't seen that in technology yet. And I think that's the era we're really about to go into, which ties back to, you know, the conversation you were having on your walk around efficiency for society.

Yeah, I mean, I think about it in the context of what we've called business intelligence, right? We've been investors in companies like Tableau, you know, obviously Snowflake, et cetera, over the years. And, you know, it's not really business intelligence, right? Issuing me a report that tells me how many black T-shirts I sold yesterday, right, is nice, but it's not all that informative.

What you would like is an agent to scour all of your data, compare it to all the data of other companies, and say: here's something that is anomalous, or we can predict something or suggest something based upon patterns we're seeing in other businesses. That's all stuff, you know, we've been talking about for a decade, right?

I actually think we're getting a lot closer to that moment where we're going to be able to have these resources, because what do these things do really well, Bill? They devour data, they spot patterns, and they predict. Yes. Right. And take what you said times a thousand: it shouldn't be a single agent, it could be a thousand of them doing it on your data.

It could be a thousand of them doing it on your data. We beat up what it's going to do in the enterprise. But, you know, one of the areas that I'm even more excited about as these models get smaller is what it's going to mean for personal search and personal AI.

So when we think about that, you know, Google reported tonight they had billions of what they call their SGE searches. So these are, you know, their AI searches. They talked about dramatically driving down the cost of inference of those searches, which you can probably tell us a little about.

You know, Meta has rolled out search across all of their apps. There's a search bar on Facebook, on IG, on WhatsApp. And you can search any topic. You can go there and say, hey, show me the recipe for fried chicken or show me how to, you know, play a guitar or show me where I should stay at a hotel, you know, when I'm visiting Milan.

And Zuck did say in his announcement, kind of as a shot across the bow at Claude and at ChatGPT, that they had the most capable free personal assistant, right, you know, that you could get out there. You know, we had Apple announce OpenELM, which were these models from 270 million parameters to three billion parameters.

You know, it seems like the next step that everybody's looking at is really the smaller models that can get us to, you know, a personal assistant on device, whether it's on phone, whether it's on glasses, et cetera. So when you looked at the announcements this week, you can go to either of you.

It felt to me like the disruption caused by Llama 3 was almost more impactful to what we're going to see along the lines of consumer AI and search than it was in the enterprise. Any thoughts about that? Yeah, I think, you know, it ties back to a point we touched on earlier, right?

As we make smaller models more capable, and we make even smaller and smaller models that can maybe reference those larger models, we're on to a place where it becomes more affordable. Right. What we don't really think about, you know, if you think about the larger models, is even how crazy it is.

It's like a year ago, right? You know, all the way back a year ago, you were using a, you know, thirty-thousand-dollar-plus unit of compute to run this thing, with, you know, hundreds of gigabytes of memory. Now, whether you look at the Apple stuff or Phi that came out of Microsoft, you can run that on your phone.

People are already running it on their phone. I saw a demo of some folks running it on Apple Vision Pro. Right. No specialized hardware. Right. And the key is, you know, if we're going to run it on the phone, we've got to compress all of that intelligence into a smaller and smaller model that's less power-consumptive.
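A rough sizing exercise shows why compression is the whole ballgame on device. Weights-only memory is roughly parameter count times bytes per parameter; activations and KV cache add more on top, so these are optimistic floors:

```python
# Approximate weights-only memory footprint of a model.
def footprint_gb(params_billions: float, bits_per_param: int) -> float:
    """Gigabytes needed just to hold the weights at a given precision."""
    return params_billions * bits_per_param / 8  # GB, since params are in billions

for name, params in [("70B", 70), ("8B", 8), ("3B", 3)]:
    for bits in (16, 4):
        print(f"{name} model @ {bits}-bit: ~{footprint_gb(params, bits):.1f} GB")
# A 70B model at 16-bit needs ~140 GB, server territory. A 3B model
# quantized to 4-bit needs ~1.5 GB, which plausibly fits a phone's
# memory and power budget, hence the push to smaller models.
```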

Right. If you put one of these larger models on, it burns up the battery, burns up the phone, too much heat, you know, generated by that. Bill, you referenced a quote, you know, Zuck from Dwarkesh, where he said, I don't think in the future we're going to be primarily shoving all these things in the context window to ask more complicated questions.

There will be different stores of memory or different custom models that are more personalized to people. One of the things that I was most intrigued by in that interview was his focus on the personalization to people. He went so far as to say understanding the content around emotions is a different modality unto itself, which got me thinking: you know, not only are they producing smaller models, but they probably have the largest store of human emotions.

Reactions to one another, emoticons to one another? You know, certainly the biggest social graph on the planet, which seems to put them in a really good position when it comes to this personal assistant that we all talk about. I know your view is we're not going to get anywhere close until we get memory, and we haven't solved memory.

Well, I mean, he hinted at it, but everyone hints at it. It comes up a lot. And, I mean, I'd push it to Sonny, but it's unclear whether you can accomplish what people hope to achieve in a personal assistant with RAG, with fine-tuning, or if you really need a model to be actually, you know, trained on my data.

And that latter part, no one knows how to do that cost-effectively. Yeah. So I don't know what pieces have to fall in place for us to get to that place. Yeah. And there's no secret. Like, everyone seems to be aware that that's the end goal.

But I think there are a few breadcrumbs that were dropped out there, both by Apple and by Zuckerberg in this regard; I'll suggest some of them, and maybe you can add a few. I mean, I'll just kick it off by saying what he said in that podcast is, like: listen, in the first instance, what we do is we build software around the model that kind of hacks this stuff together, and we see kind of what works.

And so, yes, in the first instance, it may in fact be you have a really small model and you do some RAG on it. Maybe in certain instances it communicates with a more sophisticated model. But in that RAG can be a lot of personal information. I think Apple has said the same thing.

But then what he importantly said is, if that works, then on the next go-around we figure out how to build that into the model itself. Yeah, I think, building on that, if you look at the breadcrumbs from, you know, all the major folks, I think there was, like, a Wired article that came out where Sam said, you know, the next model won't necessarily be bigger.

I think he did say that. I thought that was interesting. Yeah. And the reason is, you know, you had a thing, Brad, last year at the barn where you had Brad Lightcap speaking, right. And the general message that keeps coming out of the OpenAI contingent is customization and memory.

And I don't have anything beyond this, but I would say my guess is that's what they focused on with GPT-5. That's an important point. Like, I think with GPT-5, it's not going to be the final state, but I think you're going to see the beginnings of memory and the beginnings of actions.

Right. And this is, you know, months away. And you and I have a bet on this. I know. Well, yeah, but that could be another major, you know, tremor. But here's one interpretation of the statement that the models aren't going to get bigger. One, it could be a mea culpa to the thing.

Like, OK, I don't want to play this game anymore. But two, it could mean that the LLM training has kind of just run its course and you've got to go do this next thing. But the next thing is not necessarily an exponential leap. It may be like an early alpha or beta, and it may be a little more stumbly.

I see little evidence that the scaling has run its course. I mean, like, the smartest people on the planet are putting their own money, real money, up against this. Elon is building a much, much bigger cluster to train a much, much bigger model, as is OpenAI, as is Zuckerberg.

I mean, Sam just said the bigger models aren't the best. Well, I mean, he may be playing the same game that everybody else is playing, trying to throw everybody off the scent of building a bigger model. Why is he trying to build his own chips, nuclear power plants, and everything else if he's not going to build big models?

I mean, you just don't take him at his word. Well, I'm just saying that I think the world, as I said earlier, is bifurcating into two: a world of specialized models, and we are going to have very large frontier models. There will be a point at which you hit diminishing returns.

Yann LeCun has said we're going to need a different architecture to get to AGI. He speculated that it's probably two or three generations more of scaling before we get to that point where it no longer makes economic sense to continue to scale it. But, I mean, if it continues apace...

That's a lot of developments over the course of the next two or three generations before we hit the upper limits of that. And by the way, I think we're already seeing some creative things, like the data scaling that we saw, you know, past the Chinchilla point. Those are really creative innovations to get around, or to augment, kind of, the compute problem.

So to me, I come back to this and I, you know, it makes me really excited again about where we are in this state of consumer search, you know, and personal assistance. Google's probably innovating better than they ever have because they're pushed out of their monopoly position by everybody else.

Now, it sounds like, you know, they're seeing some great results come out of that. You know, I thought it was really interesting when you see, you know, Dolly and David Woodland, who's the product lead on the Meta glasses, talk about what they announced this week.

Now it has Meta AI with vision. It's, you know, now available to everybody. You know, not only can you use these things to call and to message using WhatsApp, but it has all these integrations and these overlays. So, I mean, we haven't seen this kind of shakeup in the world of search and in the world of consumer products in a while.

And now, you know, there was all this noise this week about Humane. Yep. Right. The startup up in San Francisco. And, you know, it got panned in a consumer review. And one of the biggest challenges with that product, because I use the product as well, right, is the models weren't small enough.

It can't run the inference on device. So it has to go out to the cloud to do it. And the second you have to go out to the cloud, it ruins the experience, because now you have latency. We're a year away, probably max, from that thing being able to have a billion-parameter or 500-million-parameter model that basically has all the capability you need it to have.

Totally agree. And we're also compressing the amount of time that it takes to go out to the cloud because we'll get those models to start running faster. So we're going to see a convergence there on two fronts, the local and then the ability for that model to reach out in the cloud and get a faster response out of the cloud.

That's what I think, you know, is being underestimated. You know, just swinging back around to model size, right. The smaller models run faster just naturally. And so that gets you to faster responses. And we know the Internet's been on a huge push for lower latency across whether it's loading web pages or search results or whatever it is.

And so I think we're starting to see a push in that direction. We all got kind of comfortable with the pace of ChatGPT. But if you kind of go away for a second and try one of these smaller models somewhere else and then go back to ChatGPT, you'll really feel it.

Like, we all had that moment for a bit between dial-up and high-speed Internet, where we maybe had dial-up at home and still had high speed at work. That's the feeling that you get when you switch between those two things. One of the debates I know, Sonny, you've been having, you know, with our team, and I'm firmly, you know, in your camp on, is this idea, dating back 20 years, that when it comes to consumer products, even, you know, speed improvements that are barely perceptible at Google have pretty important implications for their revenue.

And so I think what we're seeing with these smaller models and all of these other developments, and you guys are helping, you know, certainly to lead the way at Groq, you're just seeing massive improvements in tokens per second. And I think, you know, when you start having agents talk to agents, you take humans out of the loop, right?

Now, computers can talk really fast to one another, but we have to have low-cost, fast inference, you know, that's able to support that. We do. And think of the use cases that we all like. I think we all love Perplexity. But you think about, you know, what happens behind the scenes when you type, like, a small request: it shoots off something into a couple of different places, including search results, pictures, and all those kinds of things, and all of that has to be processed by the LLM really quickly.
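To see why tokens per second compounds once agents call agents, a sketch with hypothetical round-number rates (not any provider's benchmark):

```python
# End-to-end time for a chain of agent-to-agent LLM calls. Each step
# must finish generating before the next can start, so per-call latency
# adds up instead of averaging out.
def chain_seconds(steps: int, output_tokens: int, tokens_per_sec: float,
                  overhead_sec: float = 0.2) -> float:
    per_call = output_tokens / tokens_per_sec + overhead_sec
    return steps * per_call

for rate in (30, 300):  # hypothetical decode rates in tokens/sec
    total = chain_seconds(steps=10, output_tokens=400, tokens_per_sec=rate)
    print(f"10-step agent chain at {rate} tok/s: ~{total:.0f} s")
# ~135 s at 30 tok/s versus ~15 s at 300 tok/s: the difference between
# an unusable workflow and an interactive one once humans leave the loop.
```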

Did you play with the Meta AI picture generator? I did. Where you just add another little word or add to it? Yes. What do you think of it? That speed is insane. Yeah. And compared to, like, a year ago, when you'd wait, you know, 15 seconds to get one.

Right, for the next image. This gets back to Sonny's point as well. When you're doing that, just think if the cost of inference was really high, there's no way he could roll that out to three billion people, right? Because all of a sudden people would start playing with it.

And his OpEx on, you know, on the business would blow up in his face. Part of the reason he's, I think, pushing toward these smaller models, opening these models, is, he said in that podcast interview, they help me lower the cost of inference. You know, we eat the cost of training.

So, you know, we can lower the cost of inference. Well, maybe, you know, just to wrap, I want to hit on a few topics, Sonny, that we've covered over the course of the last few weeks. Of course, Bill and I have been doing a couple of deep dives on full self-driving at Tesla, as well as, you know, their ride-share project that's now moved front and center because of the breakthroughs they've had on FSD.

And on the Tesla earnings call this week, they answered some of our questions. So a couple of the questions Bill and I had is, is this going to occur within the Tesla app? Well, you can see here, you know, this beautiful depiction, you know, of a ride share within the Tesla app.

There are a lot of Tesla app holders. A question we had was whether or not it was going to be owned and operated, or whether it was going to leverage the millions and millions of cars that are out there in the fleet. And Elon, I thought, elegantly put it: you know, we're going to be both Uber and Airbnb.

You know, we're going to, you know, own some of the fleet. We're also going to let those people who buy cars from us, you know, put their cars into the fleet. My own hunch is that it will also be distributed both 1P and 3P, although he didn't go so far as saying that.

And what I mean by that is not only distributed in the Tesla app. My hunch is that as this scales, it'll make sense to do a partnership with Uber. And frankly, I wouldn't be surprised to see some of the people who operate on the Uber platform become fleet operators of Teslas for Tesla.

And so I think there's a really interesting opportunity for an integration there. But I thought that was pretty consistent. We weren't too far off in terms of our estimation there. And I mean, they obviously have already made it clear they're going to be talking about this for a long time.

But this is kind of a first draft, if you will. I think there's a lot to see as this stuff rolls out. You know, Waymo's had to apply for these licenses to get these cars on the street. Yes. We don't have Teslas; one, they haven't even applied for those things, and we don't have them driverless on the road yet.

Right. Which would be a step that would need to take place before this was rolled out. But it intersects with that really big purchase of, you know, H100s that they talked about as well. You know, we now have a lot of companies that have reported, and there's not one of them yet that has not raised their CapEx guidance to buy more GPUs.

I mean, Elon himself, in the past four weeks, the incremental purchases they've signaled are in, and not just Tesla. There's data on the internets today that xAI has raised six billion. Presumably most of that's going into infrastructure as well. And, you know, you and I have a couple of bets going.

But, you know, when it comes to whether or not GPUs are undersupplied or oversupplied, you know, what I've stipulated is every supply shortage does ultimately result in a glut. But people have been calling for this glut now for, you know, 12 months anyway. And they're calling for it again this year.

We're not going to see it again this year. There's, you know, and- So you bought the dip? We own plenty. And, you know, you just see it. In fact, you know, Meta was down 15 percent, or I think it ended up down 10 or 11 percent. And one of the major reasons it was down is Zuckerberg said, I'm going to put the, you know, the accelerator to the floor.

He increased the midpoint of his CapEx guide by three or four billion dollars. You know, I had a lot of people inbound to me and say, hey, you know, they're no longer being efficient, or they're no longer being fit. To which I responded by saying: in two years, that company has gone from $22 billion in net income to $55 billion in net income.

They've reduced their headcount from 85,000 people to 69,000 people. What they are demonstrating is what you can do when you're efficient. You can redeploy all of that incremental profitability into investing, not in some 10 year project that we don't know what the payback is, but directly into GPUs and AI, where you can see the payback in a pretty short period of time, leveraging it in their core business.

And, you know, while we're on it, well, he went and bought gym equipment. I did see that, too. I did see that, too. Bill, you and I talked about IPOs. You know, Dan Primack, you know, came out with this article that was pretty controversial, I think, among VCs.

I saw a lot of people responding to it. It was entitled "VCs, you're blowing it." And there was one line in there that caught my attention, you know, where he said VCs let startups stay private too long, often well past their hyper-growth phase that justified sky-high valuations.

You and I debated this last week. How much revenue do you have to have to go public? I think you and I are both in the camp that if you have 100 million of trailing revenue, you're growing well. You have great unit economics. You can certainly go public if you price it right.

You know, I've said it on Twitter, I say it in boardrooms: I think being in the public markets is a great place for companies to be. I think it, you know, puts them in the big league. And, you know, there's plenty of room to innovate there.

However, what I would say to Dan is, you know, when we sit on the board, we can advise, but ultimately we're not the decision maker. Right. It's got to be a collaboration with the founder of the company. And ultimately, I think the company should go public when it's the right time for the company to go public.

And for some companies, that is at that earlier phase. I think a lot more could go public at that earlier phase. But I also think there are certain situations, you know, take SpaceX, for example, where I think it's behooved them to stay private longer. Right. And they've had plenty of access to the private markets to raise capital.

So I didn't know if you had any reactions to the Dan piece. Well, I mean, I think in a lot of ways I agree with what you're saying, and I disagree maybe with the way it was positioned. But keep in mind, Dan's one of the very few analysts and writers that focuses on LPs; most of these writers focus on VCs or the founders themselves or whatever.

And he's constantly talking to LPs. And I think there is a very real situation, especially where we came out of ZIRP, where there's a vast amount of paper marks that are sitting on these LP books that are aging out, that are exposed to dilution, you know, on an annual basis.

And I suspect they're very nervous, and I suspect they're talking to him, and that that's where he's building this thesis. And I think that's probably right. I also think that a number of people, you know, that invest in late stage, and people that we know, have built a business model where they kind of like companies staying private longer.

They're the ones behind the argument everyone talks about: Amazon went public at this price, and then the public captured the upside. They kind of view their game as capturing that growth themselves instead of leaving it to the public markets. And the third thing I would say is our business has gotten nothing but more competitive from the minute I entered it to today.

And I think that's going to keep happening. And that competition forces people to be very founder friendly, to say what founders want to hear, to support secondaries, which we've talked about. When you support massive secondaries, you're taking the number one pressure out of the system that used to lead founders to want to go public, because their employees are saying, I need liquidity, I need liquidity.

So you open a release valve and you take that away. And I do think there will be a lot of those situations. And then one last thing to mention, just because of where we came from, from a valuation perspective: we know a lot of people are sitting there afraid that they can't meet their last mark.

For sure. And so then you're kind of in Never-Never Land. How do you get out of this? And what are the odds you're going to grow back to that? And there's kind of a lack of... So anyway, on those dimensions, I think he's hitting at it right. I agree with you.

And this is where I think Dan got it wrong: no single VC is going to stand up and make a company go public, right? That's not going to happen. I mean, I do see the market evolving. Listen, you raise 10-year funds.

You need to get liquidity. You're in the business of providing returns and liquidity to your partners. If I look at the private equity business, they evolved in such a way that they didn't have to take companies public; they would just sell to another private equity firm. And it may very well go that way in the VC landscape, Bill.

And I see this more and more. I know a big company right now raising at over 10 billion, and I know a lot of early-stage VCs who are selling into that round. And so... I thought there was one. Didn't Rippling say 600 million was going to people that were in early?

So that's not the one I was referencing, but Rippling may in fact be one of those. And so the market may be responding to some of these imperatives by LPs to get liquidity. But what I would say to Dan is, certainly with the venture capitalists sitting around this table, you have two people who think that the public markets are a great place for companies to innovate, to thrive, to raise capital, to recruit, to build brands, et cetera.

And we know we're in the business of liquidity. Speaking as an entrepreneur, though, can I add one thing to that? I think we're going to see a bifurcation, and this builds on our conversation from earlier. There are going to be companies that are cheaper and cheaper to build and run.

And, you know, that has one impact. But there are companies that are going to get a lot more capital to build and run. And so that may force folks into the public markets, sort of like the way it was in the late 90s with a lot of those businesses, because they needed a lot more capital.

What do you guys think about that? There was an interesting article that Tim O'Reilly wrote about a month ago. I put the link in there. He also attacked the VCs and said they were going about it all wrong, but he implied that AI was now in its Uber phase, you know, talk about Uber and Lyft and DoorDash all raising billions and it spilling out on the floor.

And I personally think part of it is simply a recognition by the investment community writ large that network effects exist and increasing returns exist. Yes. And so when they think this is the next big thing and they see OpenAI take a lead, their gut response is, well, if I had invested early in Amazon or Google or whatever, I'd have gotten paid almost no matter what the price.

And so it's the institutionalization of a belief in network effects that's leading to the money pouring in. And then it's a competitive dynamic: once a company raises, you know, 200 million, a billion, if you're in that market, you raise it, too. And it does create chaos.

I do think it creates chaos. Maybe we'll end with kind of the volatile week it's been in markets. You asked me what I thought was going to happen this week, and you told me you'd give me a scorecard. But the reason it's so volatile this week is not just because we've had some mixed earnings reports, or at least mixed reactions.

But I think the economic backdrop is unsettling. GDP came in a lot weaker than expected this morning. At the same time, the PCE for Q1 came in a little bit higher. Now we've got the monthly PCE report coming out tomorrow; we'll see where that shakes out.

But this idea that we could have a slowing economy at the same time that inflation continues to go up, right, that's this very fearful place called stagflation that nobody wants to be in. The market has now pushed out the rate-cut forecast to December of this year.

So this higher-for-longer is now in place. Remember, when we started the year, we thought we were going to have six rate cuts, a very accommodating, Goldilocks environment. What's surprising to me, to be perfectly honest, is how well technology stocks have performed, outside of probably software, notwithstanding this fact.

And the only reason they've been able to do that is the reacceleration caused by AI. So if you look at what happened tonight, Google beat. They issued a dividend for the first time. They announced a buyback. So their margins are expanding; they're finding efficiencies in that business.

They're listening to the markets. But I think, impressively, what they're doing in that core business, the things they announced around SGE, you've got to give that management team a lot of credit. Where's the stock? The stock was up, I think, 10 or 15 percent after hours.

So I think it's at an all-time high and doing incredibly well. Meta, you know, missed, is down 10 percent, but still up 30 percent on the year. So I imagine those two companies are in about the same area year to date. We talked about why they got hammered: because they're going to invest even more aggressively in AI, which, as a long-term investor,

I'm pretty thrilled about. And then if you look at Azure and Microsoft's quarter, it was pretty blockbuster, and we'll show this chart by Jamin. Azure AI contributed seven points of growth this quarter, which now translates into about a four-billion-dollar run-rate business that didn't even get broken out until five quarters ago.
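Just to make that run-rate math concrete, here's a rough back-of-the-envelope sketch. The seven points of growth is the figure cited above; the roughly $55 billion Azure revenue base is an assumption for illustration, since Microsoft doesn't break Azure revenue out precisely.

```python
# Back-of-the-envelope sketch of the Azure AI run-rate math.
# The ~$55B annual Azure revenue base is an assumption (Microsoft
# doesn't disclose Azure revenue precisely); the 7 points of growth
# attributed to AI is the figure cited in the conversation.

azure_annual_revenue_b = 55.0   # assumed Azure revenue base, $B/year
ai_growth_points = 0.07         # AI's contribution to Azure growth

# If AI added ~7 points of growth on that base, the implied
# annualized run rate of the AI business is roughly:
ai_run_rate_b = azure_annual_revenue_b * ai_growth_points
print(f"Implied Azure AI run rate: ~${ai_run_rate_b:.1f}B/year")  # ~$3.9B
```

On those assumptions, the implied run rate lands right around the "about four billion" figure mentioned above; a different estimate of the Azure base would move it proportionally.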

So, again, I think if you look at technology generally, it's performing really well despite this kind of volatile economic backdrop. And we'll see where PCE rolls in on Friday. We'll see where the rest of technology comes in. My hunch, back to your network effects and your scale advantages, is that the largest companies in technology are benefiting most.

I'm not sure that smaller technology companies are seeing the benefits that the largest technology companies are. Certainly, that's what it looks like, so we'll see as the rest of it reports. But I think the largest data platforms and hyperscalers continue to benefit. Boys, this has been fun. Yeah, great. Let's do it again.

Thanks for being on, Sonny. Until next time. Bye.