(upbeat music) - Hey, everyone. Welcome to the Latent Space Podcast. - This is Alessio, partner and CTO at Decibel Partners. And I'm joined by my co-host swyx, founder of Smol AI. - Hey, and today we are delighted to be, I think, the first podcast in the new Codeium office. So thanks for having us, and welcome Varun and Anshul.
- Thanks for having us. Yeah, thanks for having us. - This is the Silicon Valley office? - Yeah. - So like, what's the story behind this? - The story is that the office was previously, so we used to be on Castro Street. So this is in Mountain View. And I think a lot of the people at the company previously, you know, were in SF or still in SF.
And actually one thing, if you notice about the office, is it's actually like a two-minute walk from the Caltrain. And we didn't want to move the office very far away from the Caltrain. That would probably, you know, piss off a lot of the people that lived in San Francisco, this guy included.
So we were like scouting a lot of spaces in the nearby area. And this area popped up. It previously was being leased by, I think, Facebook/WhatsApp. And then immediately after that, Ghost Autonomy. And then now here we are. And we also, you know, I guess one of the things that the landlord told us was this was the place that they shot all the scenes for Silicon Valley, at least like externally and stuff like that.
So that just became a meme. Trust me, that wasn't like the main reason why we did it. But we've leaned into it. - It doesn't hurt. - Yeah. - Yeah. And obviously that played a little bit into your launch with Windsurf as well. So, let's get caught up. You were guest number four?
- I think I was two. - Maybe it was two. - Might have been two. - So a lot has happened since then. You've raised a huge round and also just launched your IDE. Like, what's been the progress over the last year or so since the Latent Space audience last saw you?
- Yeah. So I think the biggest things that have happened are Codeium's extensions have continued to gain a lot of popularity. You know, we have over 800,000 developers that use that product. Lots of large enterprises also use the product. We were recently awarded JP Morgan Chase's Hall of Innovation Award, which is usually not something a company gets, you know, within a year of deploying an enterprise product.
And then large companies like Dell and stuff use the product. So I think we've seen a lot of traction in the enterprise space. But I think one of the most exciting things we've launched recently is actually this IDE called Windsurf. And I think for us, one of the things that we've always thought about is how do we build the most powerful AI system for developers everywhere?
The reason why we started out with the extension system was we felt that there were lots of developers that were not going to be on one platform. And that still is true, by the way. Outside of Silicon Valley, a lot of people don't use GitHub, right? This is like a very surprising finding, but most people use GitLab, Bitbucket, Gerrit, Perforce, CVS, Harvest, Mercurial.
I could keep going down the list, but there's probably 10 of them. GitHub might have less than 10% full penetration of the Fortune 500. It's very small. And then also on top of that, GitHub has very high switching costs for source code management tools, right? Because you actually need to switch over all the dependent systems on this workflow software.
It's much harder than even switching off of a database. So because of that, we actually found ways in which we could be better partners to our customers, regardless of where they started their source code. And then more specifically on the IDE category, a lot of developers, surprise, surprise, don't just write TypeScript and Python, right?
They write Java, they write Golang, they write a lot of different languages. And then high-quality language servers and debuggers matter. Very honestly, JetBrains has the best debugger for Java. It's not even close, right? These are extremely complex pieces of software. We have customers where over 70% of their developers use JetBrains.
And because of that, we wanted to provide a great experience wherever the developer was. But one thing that we found was lacking was, you know, we were running into the limitations of building within the VS Code ecosystem on the VS Code platform. And I think we felt that there was an opportunity for us to build a premier sort of experience.
And that was within the reach of the team, right? The team has done all the work, all the infrastructure work, to build the best possible experience and plug it into every IDE. Why don't we just build our own IDE that is by far the best experience? And as these agentic products become more and more possible, all the research we've done on retrieval and just reasoning about code bases came more and more to life.
We were like, hey, if we launch this agentic product on top of a system that we didn't have a lot of control over, it's just gonna limit the value of the product and we're just not gonna be able to build the best tool. That's why we were super excited to launch Windsurf.
I do think it is the most powerful IDE system out there right now, in terms of capability, right? And this is just the beginning. I think we suspect that there's much, much more we can do, more than just the auto-complete side, right? When we originally talked, auto-complete was probably the only piece of functionality the product actually had.
And we've come a long way since then, right? These systems can now reason about large code bases without you @-ing everything, right? Like, when you use Google, do you say like @NewYorkTimesPost, blah, blah, blah, and ask it a question? No. We want it to be a magical experience where you don't need to do that.
We want it to actually go out and execute code. We think code execution is a really, really important piece. And when you write software, you not only just kind of come up with an idea, the way software kind of gets created is software is originally this amorphous blob. And as time goes on and you have an idea, the blob and the cloud sort of disappear and you see this mountain.
And we want it to be the case that as soon as you see the mountain, the AI helps you get to the mountain, and eventually the AI just creates the mountain for you, right? And that's why we don't believe in this sort of modality where you just write a task and it just goes out and does it, right?
It's good for zero-to-one apps. And I think people have been seeing Windsurf is capable of doing that, and I'll let Anshul talk about that a little bit. But we've been seeing real value in real software development. This is not to say that current tools can't, but I think more in the process of actually evolving code from a very basic idea.
Code is not really built as, you have a PRD and then you get some output out. It's more like you have a general vision, and as you write the code, you get more and more clarity on approaches that don't work and do work. You're killing ideas and creating ideas constantly.
And we think Windsurf is the right paradigm for that. - Can you spell out what you couldn't do in VS Code? Because I think when we did the Cursor episode and explained that, everybody on Hacker News was like, oh, why did you fork? Why? You could have done it in an extension.
Like, can you maybe just explain more of those limitations? - I mean, I think a lot of the limitations around, like, APIs are pretty well-documented. I don't know if we need to necessarily go down that rabbit hole. I think it was when we started thinking, okay, what are the pieces that we actually need to give the AI to get to that kind of emergent behavior that Varun talked about, right?
And yes, we were talking about all the knowledge retrieval systems that we've been building for the enterprise all this time. Like, that's obviously a component of that. You know, we were talking about all the different tools that we could give it access to, so it can go do that kind of terminal execution, things like that.
And the third main category that we realized would be kind of that magical thing, where you're not out there writing out a PRD, you're not scoping the problem for the AI, is if we're actually able to understand the trajectory of what developers are doing within the editor, right?
If we're actually able to see, like, oh, the developer just went and opened up this part of the directory and tried to view it, then they made these kinds of edits, and they tried to run some kind of commands in the terminal. If we actually understand that trajectory, then the AI's ability to just immediately be like, oh, I understand your intent.
This is what you want to do, without you having to spell it all out for it. That is when that kind of magic would really happen. I think that was kind of the intuition. So you have the restrictions of the APIs that are well-documented. We have the kind of vision of what we actually need to be able to hook into to really expose this.
And I think it was the combination of those two where we were like, I think it's about time to do the editor. The editor was not necessarily a new idea. I think we've been talking about the editor for a very long time. Of course, we just pulled it all together in the last couple of months, but it was always something in the back of the mind.
And it's only when we started realizing, okay, the models are now capable of doing this, we actually can look at this data, we have a really good context awareness system. We're like, I think now's the time. And we went on and executed on it. - So it's basically not like one action you couldn't do, but it's how you brought it all together.
It's like the VS Code is kind of like sandbox, so to speak. - Yeah, let me maybe like even just to go one step deeper on each of the aspects that Anshul talked about. Let's go with the API aspect. So right now, I'll give you an example. Supercomplete is actually a feature that I think is like very exciting about the product, right?
It can suggest refactors of the code. I think you can do it quickly and very powerfully. On VS Code, actually the problem for us wasn't actually being able to implement the feature. We had the feature for a while. Problem was actually even to show the feature, VS Code would not expose an API for us to do this.
So what we actually ended up doing was dynamically generating PNGs to actually go out and showcase this. That was not really ideal. We actually ended up doing it ourselves, and it took us a couple hours to actually go out and implement this, right? And that wasn't because we were bad engineers.
No, our good engineering time was being spent fighting against the system rather than building a good system. Another example is we needed to go out and find ways to refactor the code. The VS Code API would constantly keep breaking on us. We'd constantly need to show a worse and worse experience.
This actually comes down to the second point which Anshul brought up, which is, we can come up with great work and great research. All the work we have here, the research on Cascade, is not like a couple-month thing. This is like a nine-months-to-a-year thing that we've been investigating as a company.
Investing in evals, right? Even the evals for this are a lot of effort, right? A lot of actual systems work to actually go out and do it. But ultimately, this needs to be a product that developers actually use. And let's even go to Cascade, for example, and look at the trajectory.
- And can you define Cascade because that's the first time you brought it up? - Yeah, so Cascade is the product that is the actual agentic part of the product, right? That is capable of taking information from both these human trajectories and these AI trajectories, what the human ended up doing, what the AI ended up doing, to actually propose changes and actually execute code to finally get you the final work output, right?
I'll even talk about something very basic. Cascade gives you a bunch of code. We want developers to very easily be able to review this code, okay? We could show developers a hideous UI that they don't want to look at, and no one's gonna really use this product. And we think that this is a fundamental building block for us to make the product materially better.
If people are not even willing to use the building block, where does this go, right? And we just felt our ceiling was capped on what we could deliver as an experience. Interestingly, JetBrains is a much more configurable paradigm than VS Code is. But we just felt so limited on both the sort of directions that Anshul said, that we were just like, "Hey, if we actually remove these limitations, "we can move substantially faster." And we believe that this was a necessary step for us.
- I'm curious more about the evals side of it, 'cause you brought it up. And we have to ask about evals anytime anyone brings up evals. How do you evaluate a thing like this that is so multi-step and so spanning so much context? - So what you can imagine we can sort of do, and this is like one of the beautiful things about code is code can be executed.
We could go take a bunch of open source code, we can find a bunch of commits, right? And we can actually see if some of these commits have tests associated with them. We can start stripping the commits, and the approach of stripping the commits is good because it tests the fact that the code is in an incomplete state, right?
When you're writing the commit, the goal is not that the commit has already been written for you; you're given the code in a state where the entire thing has not been written. And can we go out and actually retrieve the right snippets and actually come up with a cohesive plan and iterative loop that gets you to a state where the code actually passes?
So you can actually break down and decompose this complex problem into a planning, retrieval, and multi-step execution problem. And you can see, on every single one of these axes, is it getting better? And if you do this across enough repositories, you've turned this highly discontinuous and discrete problem of make-a-PR-work versus make-it-not-work into a continuous problem.
And now that's a hill you can actually climb, and that's a way that you can actually apply research where it's like, "Hey, my retrieval got way better. "This made my eval get better," right? And then notice how the way the eval works is I'm not that interested in the eval where purely it's a commit message and you finish the entire thing.
I'm more interested in: the code is in an incomplete state, and the commit message isn't even given to you, because that's another thing about developers. They are not willing to tell you exactly what's in their head. That's the actual important piece of this problem. We believe that developers will never completely pose the problem statement, right?
Because the problem statement lives in their head. Conversations that you and I have had at the coffee area, conversations that I've had over Slack, conversations I've had over Jira, right? Maybe not Jira, let's say Linear, right? That's the cool thing now. - Don't talk about Jira. - Yeah, so conversations I've had on Linear, and all of these things come together to actually finally propose sort of a solution there, which is why we want to test the incomplete code.
What happens if the code is in an incomplete state, and am I actually able to make this pass without the commit message? And can I actually guess your commit well? Now you can convert the problem into a masked prediction problem, where you want to guess both the high-level intent as well as the remainder of changes to make the actual test pass.
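To make the eval recipe concrete, here is a minimal sketch of the commit-stripping idea in Python. The `agent` interface, the `tests/` layout, and the use of pytest are illustrative assumptions, not Codeium's actual harness.

```python
import shutil
import subprocess
import tempfile


def eval_stripped_commit(repo_path: str, commit_sha: str, agent) -> bool:
    """Roll a repo back to the parent of a test-carrying commit, withhold the
    commit message, and check whether the agent's retrieve-plan-execute loop
    makes that commit's tests pass. `agent` is a hypothetical callable."""
    workdir = tempfile.mkdtemp()
    try:
        shutil.copytree(repo_path, workdir, dirs_exist_ok=True)
        # Put the code in an incomplete state: everything *before* the commit.
        subprocess.run(["git", "checkout", f"{commit_sha}~1"], cwd=workdir, check=True)
        # Restore only the tests the commit introduced; they define "done".
        test_files = subprocess.run(
            ["git", "diff", "--name-only", f"{commit_sha}~1", commit_sha, "--", "tests/"],
            cwd=workdir, capture_output=True, text=True, check=True,
        ).stdout.split()
        for path in test_files:
            subprocess.run(["git", "checkout", commit_sha, "--", path], cwd=workdir, check=True)
        # No commit message is provided: intent must be inferred, masked-prediction style.
        agent.run(workdir)
        return subprocess.run(["pytest", *test_files], cwd=workdir).returncode == 0
    finally:
        shutil.rmtree(workdir)
```

Averaged over many commits and repositories, that boolean becomes the continuous pass rate you can hill-climb on.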
And you can imagine if you build up all of these, now you can see, hey, my systems are getting better. Retrieval quality is getting better. And you can actually start testing this on larger and larger code bases, right? And I guess that's one thing that we, honestly, to be honest, we could have done a little faster.
We had the technology to go out and build these zero-to-one apps very quickly. And I think people are using Windsurf to actually do that. And it's extremely impressive. But the real value, I think, is actually much deeper than that. It's actually that you take a large code base, and it's actually a really good first pass.
And I'm not saying it's perfect, but it's only going to keep getting better. We have deep infrastructure that actually is validating that we are getting better on this dimension. - We mentioned the end-to-end evals that we have for this system, which I think are super cool.
But I think you can even decompose each of those steps, right? Just take retrieval, for example, right? Like, how can we make the eval for retrieval really good? And I think this is just a general thing that's been true about us as a company: most evals and benchmarks that exist out there for software development are kind of bogus.
There's not really a better way of putting it. Like, okay, you have SWE-bench, that's cool. No actual professional work looks like SWE-bench. HumanEval, same thing. These things are just a little kind of broken. So when you're trying to optimize against a metric that's a little bit broken, you end up making kind of suboptimal decisions.
So something that we're always very keen on is like, okay, what is the actual metric that we want to test for this part of the system? And so take retrieval, for example. A lot of the benchmarks for these embedding-based systems are like needle-in-the-haystack problems. Like, I want to find this one particular piece of information out of all this potential context.
That's not really what actually is necessary for doing software engineering, because code is a super distributed knowledge store. You actually want to pull in snippets from a lot of different parts of the code base in order to do the work, right? And so we built systems where, instead of looking at retrieval at 1, you're looking at retrieval at, like, 50.
What are the 50 highest-ranked things that you can actually retrieve, and are you capturing all of the necessary pieces with that? And what are all the necessary pieces? Well, you can look back at old commits and see what were all the different files that were edited together to make a commit, because those are semantically similar things that might not actually show up if you try to map out a code graph, right?
And so we can actually build these kinds of golden sets. We can do this evaluation even for sub-problems in the overall task. And so now we have an engineering team that can iterate on all of these things and still make sure that the end goal that we're building toward is really, really strong, so that we have confidence in what we're pushing out.
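A rough sketch of that decomposed retrieval eval, assuming a simple retriever interface: the golden set is mined from files that were historically edited together in one commit.

```python
import subprocess


def golden_set(repo: str, commit_sha: str) -> set[str]:
    """Files touched by one commit: a 'golden set' of semantically related
    code that a retriever should surface together, even when no static
    code graph connects them."""
    out = subprocess.run(
        ["git", "show", "--name-only", "--pretty=format:", commit_sha],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}


def recall_at_k(retrieved_files: list[str], gold: set[str], k: int = 50) -> float:
    """Unlike needle-in-a-haystack benchmarks, this asks: of everything that
    had to change together, how much did the top-k results capture?"""
    return len(set(retrieved_files[:k]) & gold) / max(len(gold), 1)
```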
- And by the way, just to say one more thing about the SWE-bench thing, just to showcase these existing metrics: I think benchmarks are not a bad thing. You do want benchmarks. Actually, I would prefer if there were benchmarks versus, let's say, everything was just vibes, right? But vibes are also very important, by the way, because they showcase where the benchmark is not valuable. Vibes sometimes show you where critical issues exist in the benchmark.
But you look at some of the ways in which people have optimized SWE-bench. It's like, make sure to run pytest every time X happens. And it's like, yeah, sure. You can start prompting it in every single possible way. And if you remove that, suddenly it's not good at it.
It's like, what really matters here? What really matters here is, across a broad set of tasks, you're producing high-quality suggestions for people, and people love using the product. And I think the way these things work is, yes, I actually think the benchmark is valuable up to a certain point.
But once it starts hitting the peak of these benchmarks, getting that last 10% is probably counterproductive to the actual goal of what the benchmark was. Like, you probably should find a new hill to climb rather than p-hacking or really optimizing for how you can get higher on the benchmark.
- Yeah, we did an episode with Anthropic about their recent SWE-agent, SWE-bench results. And we talked about HumanEval versus SWE-bench. HumanEval is kind of like a greenfield benchmark; you need to be good at that. SWE-bench is more existing code. But it sounds like your eval creation is similar to SWE-bench as far as using GitHub commits and kind of that history.
But then it's more like masking at the commit level versus just testing the output of the thing. - Cool. We have some listener questions, actually, about the Windsurf launch. And obviously, I also want to give you the chance to just respond to Hacker News. (laughing) - Oh, man. (laughing) - Hey, let me tell you something very, very interesting.
I love Hacker News as much as the next person. But the moment we launched our product, the first comment, like, this was a year ago, the first comment was, "This product is a virus." And we were like-- - This was the original Codeium launch, like two years ago. - This is the original.
- Like, "I am analyzing the binary as we speak. "We'll report back." - And then he was like, "It's a virus." And I was like, "Dude, like, it's not a virus." (laughing) - We just want to give autocomplete suggestions. That's all we want to do. - Yeah. - Okay.
- Wow, I didn't expect that. - And then there was, like, TO drama. There's enough drama on the launch to cover, but I don't know if we want to just make this a Cascade piece. But we had a bunch of people in our Discord try out the product, give a lot of feedback.
One question people have is, like, to them, Cascade already felt pretty agentic. Like, is that something you want to do more of? You know, obviously, since you just launched on IDE, you're kind of like, you're focusing on having people write the code. But maybe this is kind of like the Trojan horse to just do more full-on, end-to-end, like, code creation.
- Devin-style. - Yeah, I think it's, like, how do you get there in a real, principled manner, right? We have, obviously, enterprise asking us all the time, like, "Oh, when's it going to be end-to-end work?" The reality is, like, okay, well, if we have something in the IDE that, again, can see your entire actions and get a lot of intent that you can't actually get if you're not in the IDE, if the agent there has to always get human involvement to keep on fixing itself, it's probably not ready to become a full end-to-end automated system, 'cause then we're just going to turn into a linter where, like, it produces a bunch of things and no one looks at any of it.
Like, that's not the great end state. But if we start seeing, like, oh, yeah, there's common patterns that people do that never require human involvement, just end-to-end just totally works without any intent-based information, sure, that can become fully agentic, and we'll learn what those tasks are pretty quickly, 'cause we have a lot of data.
- Maybe to add on to that: I think if the answer is, like, fully agentic is Devin, then yes. The answer is this product should become fully agentic, and limited human interaction is 100% the goal. And I think, honestly, of all usable products right now, I think we're the closest, of all usable products in an IDE.
Now, let me caveat this by saying, I think there are lots of hard problems that have yet to be solved that we need to go out and solve to actually make this happen. Like, for instance, I think one of the most annoying parts about the product is the fact that you need to accept every command that kind of gets run.
It's actually fairly annoying. I would like it to go out and run it. Unfortunately, me going out and running arbitrary binaries has some problems, in that if it, like, rm -rf's my hard disk, I'm not gonna be-- - It's actually a virus. - I'm not gonna actually-- - It's a virus, yeah.
- Then the Hacker News commenter would be right. Yeah, it does become a virus. I think this is solvable with, like, complex systems. I think we love working on complex systems infrastructure. I think we'll solve it. Now, the simpler way to go about solving this is, don't run it on the user's machine, run it somewhere else, because then if you bork that machine, you're kind of totally fine.
Now, I think, though, maybe there's a little bit of trade-off of, like, running it locally versus remotely, and I think we might change our mind on this, but I think the goal for this is not for this to be the final state. I think the goal for this is, A, it's actually able to do very complex tasks with limited human interaction, but it needs to know when to actually go back to the human.
Also, on top of that, compress every cycle that the agent is running. Actually, I even feel like the product is too slow for me sometimes right now. Even with it running really fast, and it's objectively pretty fast, I would still want it to be faster, right? So there is systems work and probably modeling work that needs to happen there to make the product even faster, on both the retrieval side and the generation side, right?
And then finally speaking, I think another key piece here that's, like, really important is I actually think asking people to do things explicitly is probably going to be more of an anti-pattern if we can actually go and passively suggest the entire change for the user. So almost imagine, as the user is using the product, that we're gonna suggest the remainder of the PR without the user kind of, like, even asking us for it.
I think this is sort of the beginning for it, but yeah, these are hard problems. I can't give a particular deadline for this. I think this is a big step up from what we had in the past. But I think what Anshul said is 100% true, and the goal is for us to get better at this.
- I mean, the remote execution thing is interesting. You wrote a post about the end of localhost. - Yeah. - And that's almost like, then we were kind of like, well, no, maybe we do need the internet and, like, people want to run things. But now it's like, okay, no, actually, I don't really care.
Like, I want the model to do the thing. And if you were, like, you can do a task end to end, but it needs to run remotely, not on your computer. I'm sure most people would say, yeah. - No, I agree with that. - I'm cool with it. - I actually agree with it running remotely.
That's not a security issue. I totally agree with you that it's possible that everything could run remotely. - That's how it is at most, like, big companies. Like, at Facebook, nobody runs things locally. - No one does. In fact, you connect to a remote machine. - It's essentially back to the mainframe.
- You're right on that. Maybe the one thing that I do think is kind of important for these systems that is more than just running remotely is basically, like, you know, when you look at these agents, there's kind of, like, a rollout of a trajectory. And I kind of want to roll this trajectory back, right?
In some ways, I want, like, a snapshot of the system that I can, like, constantly checkpoint and move back and forth. And then also on top of that, I might want to do multiple rollouts of this. So basically, I think there needs to be a way to almost, like, move forward and move backwards the system.
And whether that's locally or remotely, I think that's necessary. But if every time you move the system forward, it destroys your machine, or potentially destroys your machine, that's just not a workable solution. So I think for local versus remote, you still need to solve the problem of, this thing is not going to destroy your machine on every execution, if that makes sense.
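One cheap way to approximate the checkpoint-and-rewind behavior described here is to snapshot the agent's working tree before each step. This sketch uses git; a real system might prefer filesystem or VM snapshots, and the class and method names are invented for illustration.

```python
import subprocess


class TrajectoryCheckpoints:
    """Snapshot an agent's working tree so a rollout can move forward and
    backward without any single step destroying the machine's state.
    Sketch only: files created after a checkpoint are not deleted on rollback."""

    def __init__(self, workdir: str):
        self.workdir = workdir
        self.shas: list[str] = []

    def _git(self, *args: str) -> str:
        return subprocess.run(
            ["git", *args], cwd=self.workdir,
            capture_output=True, text=True, check=True,
        ).stdout.strip()

    def checkpoint(self) -> str:
        self._git("add", "-A")
        # `git stash create` records the tree as a commit without moving HEAD.
        sha = self._git("stash", "create") or self._git("rev-parse", "HEAD")
        self.shas.append(sha)
        return sha

    def rollback(self, steps: int = 1) -> None:
        sha = self.shas[-steps]
        self._git("checkout", sha, "--", ".")
        self.shas = self.shas[: len(self.shas) - steps + 1]
```

Because each checkpoint is just a commit object, you can also branch multiple rollouts from the same snapshot, which is exactly the multiple-trajectories idea that comes up later in the conversation.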
- Yeah, yeah. There is a category of emerging infrastructure providers that are working on time-travel VMs. - And if Varun's first episode on this podcast was an indication, we like infrastructure problems. - Yeah, okay. All right, oh, so you're going there. All right, okay. - Well, that's funny, right?
It's like, when we first had you, you were doing so much on, like, actual model inference, optimization, all these things. And today, it's almost like-- - It's Claude, it's 4o. - It's like, you know, people are, like, forgetting about the model. You know, and now it's all about a higher level of abstraction.
- Yeah. So maybe I can say, like, a little bit about how our strategy on this has, like, evolved. Because it objectively has, right? I think I would be lying if I said it hasn't. The things, like, autocomplete and supercomplete that run on every keystroke are entirely, like, our own models.
And by the way, that is still because properties like FIM, fill-in-the-middle capabilities, are still quite bad with the current models. - Non-existent. - They're very bad. Non-existent, they're not good, actually, at it. - Because FIM is actually, like, how you order the tokens. - It's how you order the tokens, actually, in some ways.
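For readers unfamiliar with FIM: it really is a token-ordering trick. A sketch using the prefix-suffix-middle format popularized by FIM-trained code models; the sentinel spellings below follow StarCoder's and vary by model family.

```python
def fim_prompt(prefix: str, suffix: str) -> str:
    """Rearrange the document so a left-to-right model conditions on the code
    both before and after the cursor, then generates the missing middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Training pairs are (prefix, suffix, middle) triples reordered this way, so at
# inference the tokens the model emits after <fim_middle> are the completion.
prompt = fim_prompt("def add(a, b):\n    ", "\n    return result")
```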
And this is sort of, if you look at what these products have sort of become, and this is great, a lot of the Claudes and the OpenAIs have focused on kind of the chat, assistant-style API, where it's, like, complete piece of work, message, another complete piece of work.
So multi-turn, kind of, back-and-forth systems. In fact, even these systems are not that good at making point changes. When they make point changes, they kind of are off here and there by a little bit. Because, yeah, when you are doing multi-point, kind of, conversations, you know, exact diffs getting applied is not even a perfect science, still, yet.
So we care about that. The second piece where we've actually, sort of, trained our models is actually on the retrieval system. And this is not even for embedding, but, like, actually being able to use high-powered LLMs to be able to do much higher-quality retrieval across the code base, right?
So this is actually what Anshul said. For a lot of the systems, we do believe embeddings work, but for complex questions, we don't believe embeddings can encapsulate all the granularity of a particular query. Like, imagine I have a question on a code base of, find me all quadratic-time algorithms in this code base.
Do we genuinely believe the embedding can encapsulate the fact that this function is a quadratic-time function? No, I don't think it does. So you are going to get extremely poor precision recall at this task. So we need to apply something a little more high-powered to actually go out and do that.
So we've actually built large distributed systems to actually go out and run these at scale, run custom models at scale across large code bases. So I think it's more a question of that. The planning models right now, undoubtedly, I think the Claudes and the OpenAIs have the best products.
I think Llama 4, depending on where it goes, could be materially better. It's very clear that they're willing to invest a similar amount of compute as the OpenAIs and the Anthropics. So we'll see. I would be very happy if they got really good, but it's unclear so far. - Don't forget Grok.
- Hey, dude, I think Grok is also possible. - Yeah. - Right? I think, don't doubt Elon, yeah. - Okay, so I didn't actually know, it's not obvious when I use Cascade. I should also mention that, you know, I was part of the preview, thanks for letting me in, and I've been maining Windsurf for a long time.
It's not actually obvious. You don't make it obvious that you are running your own models. - Yeah. - I feel like you should, so that, like, I feel like it has more differentiation. Like, I only have exclusive access to your models via your IDE, versus having the dropdown that says Claude and 4o, 'cause I actually thought that was what you did.
- No, so actually, the way it works is the high-level planning that is going on in the model is actually getting done with models like Claude, but the extremely fast retrieval, as well as the ability to take the high-level plan and actually apply it to the code base, is proprietary systems that are running internally.
- And then the stuff that you said about embeddings not being enough, are you familiar with the concept of late interaction? - No, I actually have never-- - Yeah, so this is ColBERT. The guy, Omar Khattab, from, I think, Stanford, has been promoting this a lot. It is basically what you've done.
- Okay. (laughs) - Sort of embedding on retrieval rather than pre-embedding. - Okay, that's a-- - In a very loose sense. - I think that sounds like a very good idea that is very similar to what we're doing. - Sounds like a very good idea. (laughs) I don't think we'd say that.
- That's like the meme of Obama giving himself a medal right there. (laughs) - Well, I mean, there might be something to learn from contrasting the ideas and seeing where-- - Absolutely. - Like, the subtle opinion differences. It's also been applied very effectively to vision understanding. Because vision models tend to just consume the whole image, if you are able to sort of focus on images based on the query, I think that can get you a lot of extra performance.
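For reference, the late-interaction scoring being discussed keeps one embedding per token and defers the query-document interaction to search time; a minimal numpy sketch of ColBERT-style MaxSim, offered only to contrast the ideas:

```python
import numpy as np


def maxsim_score(query_tok_embs: np.ndarray, doc_tok_embs: np.ndarray) -> float:
    """ColBERT 'MaxSim': each query token finds its best-matching document
    token, and those best similarities are summed. Contrast with single-vector
    retrieval, which must compress the whole document before seeing the query.
    Shapes: (num_query_tokens, dim) and (num_doc_tokens, dim), rows L2-normalized."""
    sims = query_tok_embs @ doc_tok_embs.T   # (q, d) cosine similarity matrix
    return float(sims.max(axis=1).sum())     # best doc match per query token
```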
- The basic idea of using compute in a distributed manner to do operations over a whole set of raw data, rather than a materialized view, is not anything new, right? I think it's just, what does that look like for LLMs? - When I hear you say you build large distributed systems: you have a very strange product strategy of going down to the individual developer but also to the large enterprise.
Is it the same infra that serves everything? - I think the answer to that is yes. The answer to that is yes. And there's a reason why, for us, the answer is yes. And to be honest, our company is a lot more complex than, I think, if we just wanted to serve the individual.
And I'll tell you why: we don't really pay other providers to do things for our indexing. We don't pay other providers to do the serving of our own custom models, right? And I think that's a core competency within our company that we have decided to build. But that's also enabled us to go and make sure that when we're serving these products in an environment that works for these large enterprises, we're not going out and being like, we need to build this custom system for you guys.
This is the same system that serves our entire user base. So that is a very unique decision we've taken as a company. And we admit that there are probably faster ways that we could have done this. - I was thinking, you know, when I was working with you for your enterprise piece, I was thinking like this philosophy of go slow to go fast, like build deliberately for the right level of abstraction that can serve the market that you really are going after.
- Yeah, I mean, I would say like I was writing, when writing that piece, you're like looking back and reading it back, it sounds so like almost obvious and not all of those are really conscious decisions we made, like I'll be the first to admit that. But like, it does help, right?
When we go to an enterprise that has tens of thousands of developers, and they're like, oh, wow, you know, we have tens of thousands of developers; does your infrastructure work for tens of thousands of developers? We can turn around and be like, well, we have hundreds of thousands of developers on an individual plan that we're serving.
Like, I think we'll be able to support you, right? So like, being able to do those things, like we started off by just like, let's give it to individuals, let's see what people like and what they don't like and learn. But then those become value propositions when we go to the enterprise.
- And to recap, when you first came on the pod, it was like, auto-completion is free, and Copilot was 10 bucks a month. And you said, look, what we care about is building things on top of code completion. How did you decide to just not focus on short-term growth, monetization of the individual developer, and instead build some of this? Because the alternative would have been, hey, all these people are using it.
Let's make this other, like, five-bucks-a-month plan and monetize it. - I think this might be a little bit of a commercial instinct that the company has, and it's unclear if the commercial instinct is right. I think that right now, optimizing for making money off of individual developers is probably the wrong strategy.
Largely because I think individual developers can switch off of products like very quickly. And unless we have like a very large lead, trying to optimize for making a lot of profit off of individual developers, it's probably something that someone else could just vaporize very quickly. And then they move to another product.
And I'm going to say this very honestly, right? Like, when you use a product like Codeium on the individual side, there's not much that prevents you from switching to another product. I think that will change with time as the products get better and better and deeper and deeper. I constantly say this: there's a book in business called 7 Powers.
And I think one of the powers that a business like ours needs to have is real switching costs. But you first need something in the product that makes people switch on and stay on before you think about what stops people from switching off. And I think for us, we believe that there's probably much more differentiation we can derive in the enterprise by working with these large companies in a way that is interesting and scalable for them.
Like I'll be maybe more concrete here. Individual developers are much more sort of tuned towards small price changes. They care a lot more, right? Like if our product is 10, 20 bucks a month instead of 50 or 100 bucks a month, that matters to them a lot. And for a large company where they're already spending billions of dollars on software, this is much less important.
So you can actually solve maybe deeper problems for them and you can actually kind of provide more differentiation on that angle. Whereas I think individual developers could be churny as long as we don't have the best product. So focus on being the best product, not trying to like take price and make a lot of money off of people.
And I don't think we will, for the foreseeable future, try to be a company that tries to make a lot of money off individual developers. - I mean, that makes sense. So why $10 a month for Windsurf? - Why $10? - $10 a month was actually the pro plan.
So we launched our individual pro plan before Windsurf existed. Because, I mean, we all said we have to be financially responsible. (laughing) - Yeah, yeah, we can't run out of money. - You raised $150 million, you're good. - There's a lot of things, because of our infrastructure background, we can give away for essentially free, like unlimited auto-complete, unlimited chat on our faster models.
We give a lot of things actually out for free. But yeah, when we started doing things like the super-completes and really large amounts of indexing and all these things, there are real COGS here. We can't ignore that. And so we just created the $10-a-month pro plan mostly just to cover the cost.
We're not really operating, I think, on much of a margin there either. But like, okay, just to cover us there. So for Windsurf, it just ended up being the same thing. And everyone who downloads Windsurf in the first, I forget, a couple of weeks, they get two weeks for free.
Let's just have people try it out, let us know what they like, what they don't like. And that's how we've always operated. - I've talked to a lot of CTOs in the Fortune 100 where most of the engineers they have, they don't really do much anyway. The problem is not that the developer costs 200K and you're saving 8K.
It's like that developer should not be paid 200K. But that's kind of like the base price. But then you have developers getting paid 200K, they should be paid 500K. So it's almost like you're averaging out the price because most people are actually not that productive anyway. So if you make them 20% more productive, they're still not very productive.
And I don't know, in the future, is it that the junior developer salary is like 50K? And it's like the bottom end gets kind of squeezed out and then the top end gets squeezed up? - Yeah, maybe, Alessio, one thing that I think about a lot, because I do think about this per-seat stuff a good deal.
Let's take a product like Office 365. I will say, a lawyer at Codeium uses Microsoft Word way more than I do. I'm still footing the same bill. But the amount of value that he's deriving from Office 365 is probably tens of thousands of dollars. By the way, everyone, Google Docs, great product.
Microsoft Word is a crazy product. It made it so that the moment you review anything in Microsoft Word, the only way you can review it is with other people in Microsoft Word. It's like this virus that penetrates everything. And it not only penetrates it within the company, it penetrates it across company too.
The amount of value it's driving is way higher for him. So for these kinds of products, there's always going to be this variance between who gets value from these products. And you're right, it's almost like a blended rate. 'Cause you're actually totally right. Probably this company should be paying that one developer maybe four times as much.
But in a weird way, software is enough of a team activity that there's a bunch of blended outcomes. But hey, 20% of the 4x value, across the four people, is still going to cover the cost across the four individuals, right? And that's roughly how these products kind of get priced out.
- I mean, more than about pricing, this is about the future of the software engineer. - We could be very wrong also. - Yeah, I think nobody knows. - Reserve the right to be incredibly off. - Yeah. - I mean, business model does impact the product, product does impact the user experience.
So it's all of a piece, I don't mind. We are as concerned about the business of tech as the tech itself. - That's cool. - Speaking of which, there's other listener questions. Shout out to Daniel Infeld, who's pretty active in our Discord, just asking all these things.
Multi-agent, very, very hot and popular, especially from, like, the Microsoft Research point of view. Have you made any explorations there? - I think we have. I don't think we've called it multi-agent. It's more so this notion of having many trajectories that you can spawn off, that kind of validate some different hypotheses, and you can kind of pick the most interesting one.
This is stuff that we've actually analyzed internally at the company. By the way, the reason why we have not put these things in, actually, is partially because we can't go out and execute some random stuff in parallel in the meantime. In the meantime-- - Because of the side effects.
- Because of the side effects, right? So there are some things that are a little bit dependent on us unlocking more and more functionality internally. And then the other thing is, in the short term, I think there is like also a latency component. And I think all of these things can kind of be solved.
I actually believe all of these things are solvable problems. They're not unsolvable problems. And if you want to run all of them in parallel, you probably don't want N machines to go out and do it. I think that's unnecessary, especially if most of them are I/O-bound kinds of operations, where all you're doing is reading a little bit of data and writing out a little bit of data.
It's not extremely compute intensive. I think that it's a good idea and probably something we will pursue and is going to be in the product. - I'm still processing what you just said about things being I/O bound. So you can, so for a certain class of concurrency, you can actually just run it all in one machine.
- Yeah, why not? Because if you look at the changes that are made, right? For some of these, it's writing out, what, a couple of thousand bytes? Maybe tens of thousands of bytes on every step. It's not a lot; very small. - What's next for Cascade or Windsurf?
- Oh, there's a lot. I don't know. We did an internal poll and we were just like, are you more excited about this launch or the launch that's happening in a month? Or, like, what we're going to come out with in a month? And it was almost uniformly, in a month.
I think, like, you know, there's some obvious ones. I don't know how much of it you want to say; I don't want to speak out of turn. But I think you'd look at all the same axes of the system, right? Like, how can we improve the knowledge retrieval? We'll always keep on figuring out how to improve knowledge retrieval.
In our launch video, we even showed some of the early explorations we have about looking into other data sources. That might not be the coolest thing to the individual developer building a zero-to-one app, but you can bet that the enterprise customers really think that's very cool, right?
I think on the tool side, there's a whole lot more that we can do. I mean, of course, everyone's talked about not just suggesting the terminal commands but actually executing them. I think that's going to be a huge unlock. And you look at the actions that people are taking, right?
Like the human actions, the trajectories that we can build, how can we make that even more detailed? And all of those things, plus an even cleaner UI. Like the idea of looking at future trajectories, trying a few different things, and suggesting potential next actions to be taken.
That doesn't really exist yet, but it's pretty obvious, I think, what that would look like, right? You open up Cascade, and instead of starting to type, it's just, here's a bunch of things that you might want to do. We kind of joke that Clippy's coming back, but maybe now's the time for Clippy to really shine, right?
So I think there's a lot of ways that we can take this, which I think is like the very exciting part. We're calling each of our launches waves, I believe, because we want to really double down on the aquatic themes. - Oh yeah, does someone actually windsurf for the company?
Is that? - I don't think so. (laughing) - We're living out our dream of being cool enough to windsurf through the rocks. - Yeah, I don't think we can. - Yeah, all right. (laughing) - That's something we learned, 'cause I don't think any of us are windsurfers. Like, in our launch video, we have someone, like, using Windsurf on a windsurf board.
That was like-- - You saw that? - You saw that in the beginning of the video. Someone has a computer. And we didn't realize that now apparently is the time of the year where there's not enough wind to windsurf. So we were trying to figure out how to do this launch video with Windsurf on the windsurf board.
Every windsurfer we talked to was like, yeah, it's not possible. And there was, like, one crazy guy who was like, yeah, I think we can do this. And we made it happen. - Oh, okay. - That's funny. Is there anything that you want feedback on? Like, maybe there's a fork in the road, you want feedback, you want people to respond to this podcast and tell you what they want?
- Yeah, I think there's a lot of things that I think could be more polished about the product that we'd like to improve. Lots of different environments that we're going to improve performance on. And I think we would love to hear from folks across the gamut. Like, hey, like, if you have this environment, you use Windows and X version, it didn't work or this language.
- Oh yeah. - It was like very poor. I think we would like to hear it. - Yeah, I gave Prep and Kevin a lot of shit for my Python issues. - Yeah, yeah, yeah. And I think there's a lot to kind of improve on the environment side. I think like, for instance, even just a dumb example, and I think sort of this was a common one.
It's like, yeah, like the virtual environment, where's the terminal running? What is all this stuff? These are all basic things that like, to be honest, this is not rocket science, but we need to just fix it, right? We need to fix it. So we would love to hear like all the feedback from the product.
Like, was it too slow? Where was it too slow? What kind of environments could it work way more in? There's a lot of things that we don't know. We, luckily, we're daily users of the product internally. So we're getting a lot of feedback inside, but I will say like, there's a little bit of Silicon Valley-ism in that a lot of us develop on Mac.
A lot of people, once again, over 80% of developers, are on Windows. So yeah, there's a lot to learn and probably a lot of improvements down the line. - Are you personally tempted, as the CEO of the company, to switch to Windows just to feel something? (laughing) - Feel the Windows.
- You know what? - To feel the pain. - You know what? Maybe I should. Actually, I think I will. - Your customers, everyone says, are 80, 90% on Windows. If you live in Windows, you will never not see something that's missed. - So I think in the beginning, part of the reason why we were hesitant to do that was a lot of our architectural decisions to work across every IDE were because we built a platform-agnostic way of running the system on the user's local machine that was only buildable, easily buildable, on dev containers that lived on a particular type of platform.
So Mac was nice for that. But now, there's not really an excuse if I can also make changes to the UI and stuff like that. And yeah, WSL also exists. That's actually something that we need to add to the product. That's how early it is that we have not actually added that.
- We don't have remote. - Anything else about Codeium at large? You still have your core business of the enterprise Codeium. Anything moving there or anything that people should know about? - Anshul, you want to take that? - I think a lot is still moving there. I think it would be a little bit egotistical of us to be like, oh, we have Windsurf now.
All of our enterprise customers are going to switch to Windsurf and this is the only thing. Like, no, we still support the other IDEs. - You just talked about your Java guys loving JetBrains. They're never going to leave JetBrains. - They're not. Like, I mean, forget JetBrains. There's still tons and tons of enterprise people on Eclipse.
Like, we're still the only code assistant that has an extension in Eclipse. That's still true years in, right? And, like, that's because those are our enterprise customers. And the way that we always think about it is, how do we still maximize the value of AI for every developer?
And I don't think that part of who we are has changed since the beginning, right? And there's a lot of like, meeting the developers where they are. So I think on the enterprise side, we're still pretty invested in doing that. We have like a team of engineers dedicated just to making enterprise successful and thinking about the enterprise problems.
But really, if we think about it from the really macro perspective, it's like, if we can solve all the problems for the enterprise, and we have products that developers themselves just truly, truly love, then we're solving the problem from both sides. And I think it's one of those things where, when we started working with the enterprise, we started building dev tools, right?
We started as an infrastructure company; now we're building dev tools for developers. You really quickly understand and realize just how much developers loving the tool makes us successful in an enterprise. There's a lot of enterprise software that developers hate. - I want to draw this flywheel. - But, like, we're giving a tool to people where they're doing their most important work.
They have to love it. And it's not like we're trying to convince anyone; the executives at these companies also ask their developers a lot, do you love this? Like, that is almost always a key aspect of whether or not Codeium is accepted into the organization.
I don't think we go from zero to 10 million ARR in less than a year in an enterprise product if we don't have a product that developers love. So I think that's why we're just, the IDE is more of a developer love kind of play. It will eventually make it to the enterprise.
We still solve the enterprise problems. And again, we could be completely wrong about this, but we hope we're solving the right problems. - It's interesting, I asked you this before we started rolling, but like, it's the same team. That's the same engine team. Like I, in any normal company or like, you know, my normal mental model of company construction, if you were to have like effectively two products like this, like you would have two different teams serving two different needs, but it's the same team.
- Yeah, I think one of the things that's maybe unique about our company is like, this has not been one company the whole time, right? Like we were first, like this GPU virtualization company pivoted to this. And then after that, we're making some changes. And like, I think there's like a versatility of the company and like this ability to move where we think the instinct, we have this instinct where, and by the way, the instinct could be wrong, but if we smell something, we're going to move fast.
And I think it's more a testament to I think the engineering team rather than any one of us. - I'm sure you had December 19th, 2022. You had one of our guests post, "What building Copilot for Rx really takes?" Estimate inference to figure out latency quality. Build first party instead of using third party as API.
Figure out real-time, because ChatGPT and DALL-E, RIP, are too slow. Optimize prompts because context windows are limited, which is maybe not that true anymore. And then merge model outputs with the UX to make the product more intuitive. Is there anything you would add? - I'd give myself like a B-minus on that.
- Yeah, no, it's pretty good. - So some parts of that are accurate. Even, like, the context one that you called out. Like, yeah, models have larger context now; that's absolutely true. It's grown a lot. But look at an enterprise code base.
They have, like, tens of millions of lines of code. That's hundreds of billions of tokens. - Never going to change. - Still being really good at piecing together this distributed knowledge for what's important. So I think there are things there that are still pretty accurate.
There's probably some that are less so. - First party versus third party. - First party versus third party, I think we were wrong there. - We just got it wrong. - I think I would nuance that to be, like, there are certain things that it's really important to build first party.
Like, you know, autocomplete: you have a really specific application that you can't just prompt-engineer your way out of, or maybe even fine-tune for afterwards. Like, you just can't do that. So I think there's truth there. But let's also be realistic: the stuff that's coming out from the third-party model providers, like, Cascade and Windsurf would not have been possible if it wasn't for the rapid improvements with 4o and 3.5.
Like, that just wouldn't have been possible. So I'll give myself a B-minus. I'll say I passed, but yes, two years later. - Just to be clear, we're not grading. It's more of a, what would you, you know. - Like, where are they now? - What would you have added?
What would you like? - Yeah, I mean, that first post, right? That was when we had literally, I think that was like a few weeks after we had launched Codeium. I think that's like, you know, swyx and I were talking, like, maybe we can write this, 'cause we're, like, one of the first products that people can actually use with AI.
That's cool. - I specifically liked the Copilot-for-X thing 'cause it was so hot. Like, everyone wanted to do a Copilot. - Everyone was just like, you know, ChatGPT, that's all there was. So, but I think, you know, we didn't have an enterprise product then. I don't even think we were necessarily thinking of an enterprise product at that point, right?
So all of the learnings that we've had from the enterprise perspective, which is why I loved coming back for, like, a third time now on the blog... some of those I think we kind of figured out; some of those we honestly just walked backwards into.
We had to get lucky a lot along the way. We just did a lot. There are so many opportunities and deals that we lost, for a variety of reasons, that we had to learn from. There's just so much more to add that there's no way I would have gotten that right in 2022.
- Can I mention one thing that is, hopefully, not very controversial, but is true about our engineering team as a whole? I don't think most of us got much value from ChatGPT. Largely because, I think, the problem was, and this is maybe a little bit of a different thing.
A lot of the engineers at the company have been writing software for over eight years. And this is not to say they know everything that ChatGPT knows; they don't. But they'd already gotten good enough at searching Stack Overflow. They'd invested a lot in searching the code base, right?
They can very quickly grep through the code. Incredibly fast, with every tool. They've spent, like, eight years mastering that skill. And with ChatGPT being this thing on the side that you need to provide a lot of context to, we were not able to actually get... like, my co-founder just basically never used ChatGPT at all.
Literally never did. And because of that, at the time, one of our incorrect assumptions was probably that, hey, a lot of these passive systems need to get good because they're always there, and these active systems are going to be behind. I think actually Cascade was the big thing.
This is a company where everyone is now using it. Literally everyone. The biggest skeptics. And we have a lot of people at the company that are skeptical of AI. I think this is actually important. - Don't hire them. - No, I think here's the important thing: those people that were skeptical about AI previously worked in autonomous vehicles.
These are not crypto people. These are people that care about technology and want to work on the future. Their bar for good is just very high. They will not form a cult of "this is awesome, this is going to change the world." They were not going to be the kind of people on Twitter that are like, yeah, this changes everything.
Like, software as we know it is dead. No, these are people that are going to be incredibly honest. And we knew that if we hit the bar that is good for them, we'd found something special. And I think at that time, we probably had a lot of sentiment like that.
That has changed a lot now. And I think it's actually important that you have believers that are incredibly future-looking, and people that kind of rein it in. Because otherwise you just have... you know, this is like autonomous vehicles. You have a very discrete problem. People are just working in a vacuum.
And there's no signal to kind of bring you down to reality. But you have no good way to kill ideas. And there are a lot of ideas we're going to come up with that are just terrible ideas. But we need to come up with terrible ideas. Otherwise, like how does anything good come out?
And I don't want to call these people skeptics. Skeptic suggests that they don't know. They're realists. They're the type of people that, when they see a waitlist on a product online, they just will not believe it. They will not think about it at all. - Kudos for launching without a waitlist.
- Yeah, yeah. By the way, we will never launch with a waitlist. We will never launch with a waitlist. That's the thing at the company. We'd much rather be a company that's considered the boring company than a company that launches once in a while and like, hopefully it's good.
- My joke is generative AI has gotten really good at generating waitlists. - Yeah, absolutely. Also, just to clarify, both of us used to work in autonomous vehicles, so it doesn't come across as-- - Oh, yeah, yeah. - Autonomous driving. - Autonomous vehicles. No, we love that technology. - We love it.
Like, I love hard technology problems. That's what I live for. - Amazing. Just to push back on the first-party thing: I accept that the large model labs have done a lot of work for you that you didn't need to duplicate. But you are now sitting on so much proprietary data that it may be worth training on the trajectories you're collecting.
So maybe it's a pendulum back to first party. - Yeah, I mean, I think like, I mean, we've been pretty clear from like a security posture perspective. Like, I think there's like, both like, you know, customer trust and like-- - I mean, I kind of want, like, let me opt out.
- I think that there are signals that we do get from our users that we can utilize. Like, there's a lot of preference information that we get, for example. - Which is effectively what you're saying, of like, our trajectories-- - Go ahead. - Our trajectories are good. I will say this: the Supercomplete product that we have has gotten materially better because of us not only using synthetic data, but also getting preference data from our users. Like, hey, given this set of trajectories, here's actually what a good outcome is.
And in fact, one of the really beautiful parts about our product, which is very different from a ChatGPT, is that we can not only see whether the acceptance happened, but whether something more than the acceptance happened after that, right? Like, let's say you accepted it, but then after accepting it, you deleted three or four items in there.
We can see that. So that actually lets us get to something even better than acceptance as a metric, because we're in the ultimate work output of the developer. - It's the difference between the acceptance and what actually happened. If you can actually get ground truth on what actually happened, and this is the beauty of being an IDE, then yeah, you get a lot, a lot of information there.
So that's data that's helpful. - Did you have this with the extension, or is this pure Windsurf? - We had this with the extension, yes. - Yeah, okay, all right. - Yes. - Windsurf just gives you more of the IDE. - Yes. So that means you can also start getting more information.
Like, for instance, the basic thing that Anshul said: we can see if, like, a file explorer was opened. That's a piece of information we just could not see previously. - Sure. A lot of intent in there, a lot of intent.
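To make the "better than acceptance" signal concrete, here's a minimal sketch. The event shape and the 60-second sampling window are hypothetical illustrations, not Codium internals; the point is just that comparing accepted text against what survives in the buffer yields a richer label than the accept event alone.

```python
# Minimal sketch of a "better than acceptance" signal. The event shape and
# the 60-second window are hypothetical: we just assume the IDE can record
# the accepted completion and re-sample that buffer region a little later.
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class AcceptanceEvent:
    completion: str        # text the model suggested and the user accepted
    region_after_60s: str  # same buffer region, sampled a minute later


def retention_score(event: AcceptanceEvent) -> float:
    """1.0 means the accepted code survived untouched; lower means the
    developer deleted or rewrote parts of it after accepting."""
    return SequenceMatcher(None, event.completion, event.region_after_60s).ratio()


# A raw acceptance looks like a win; the follow-up edit tells a richer story.
ev = AcceptanceEvent(
    completion="for item in items:\n    process(item)\n    log(item)\n",
    region_after_60s="for item in items:\n    process(item)\n",
)
print(f"accepted, but retention was only {retention_score(ev):.2f}")
```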
- Second one. - Oh boy. - "How to Make AI UX Your Moat." - Oh man, isn't it funny that we now created, like, the full UX experience in an IDE? I think that one is pretty accurate. - That one's an A? - I think that one, I'll give myself... I think we were doing that within the extensions, and I still think that's true within the extensions as well, right?
Like, we got very, very creative with things. Varun mentioned the idea of, you know, essentially rendering images to display things. We get creative to figure out what the right UX is there. Like, we could create a really dumb UX that's a side panel, whatever.
But no, actually going that extra mile does make that experience as good as it possibly can be. And yeah, now look at some of the UX that we're able to build in Windsurf. It's just fun. The first time I saw... 'cause now we can do Command in the terminal.
Like, you don't have to search for a bash command. The first time I saw that, I just started smiling. And it's not, like, Cascade. It's not some agentic system straight out of the lab. But I'm like, that is just a very, very cool UX.
- We literally couldn't do that in VS Code. - Yeah, I understand that, yeah. I've implemented a 60-line bash command called please. And you can, you know, do that. - Oh, wow, that's cool. - Yeah, so: please, then English, and then--
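For the curious, here's a rough Python analogue of that please helper; the actual version described here is a short bash function, so this is a sketch under assumptions, not his implementation. It assumes the openai package, an OPENAI_API_KEY in the environment, and a placeholder model name.

```python
# A rough Python analogue of the `please` idea: type plain English, get a
# shell command back, confirm before running. Assumes the openai package
# (pip install openai) and OPENAI_API_KEY set in the environment.
import subprocess
import sys

from openai import OpenAI


def please(request: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model works
        messages=[
            {"role": "system",
             "content": "Reply with a single POSIX shell command and no prose."},
            {"role": "user", "content": request},
        ],
    )
    return resp.choices[0].message.content.strip()


if __name__ == "__main__":
    cmd = please(" ".join(sys.argv[1:]))
    print(f"$ {cmd}")
    if input("run it? [y/N] ").strip().lower() == "y":
        subprocess.run(cmd, shell=True)
```

Invoked as something like `python please.py find all TODO comments in python files`.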
- You know, that's actually really cool, because one of the things I think we believe in is... actually, I like products like autocomplete more than Command, purely because I don't even want to open anything up. So that thing where I can just type and not have to press some button shortcuts to go to a different place, I actually like that too. - Yeah, and I actually adopted Warp, the terminal Warp, initially for that, 'cause they gave that away for free. But now it's everywhere, so I can turn off Warp and not give Sequoia my bash commands.
(all laughing) - I'm with you. No, I use Warp. No, no, look. Okay, gotta go on a rant; hopefully somebody at Warp hears this. This is Warp product feedback. They basically had this thing where you can do, kind of, pound, and then write in natural language.
But then they also auto-infer whether what you're typing is natural language, and those are different. When you do the pound, it only gives you a predetermined command. When you talk to it, it generates a flow. It's a bit confusing of a UX. But going back to your post, you had the three Ps of AI UX.
- What were they again? - Present, practical, powerful. - Actually, that was really good. I liked it. - And I think in the beginning, being present was enough. Maybe even when you launched, it's like, oh, you have AI, that's cool, other people don't have it. Do you think we're still in practical, where the experience is actually... like, the model doesn't even need to be that powerful, just having a better experience is enough?
Or do you think it's really about being able to do the whole thing? Because your point was: you're powerful when you generate a lot of value for the customer; you're practical when you're basically wrapping it in a nicer way. Yeah, where are we in the market? - I think there's always gonna be room for practical UX.
I mean, Command in the terminal, that's a very practical UX, right? But I do think with things like Cascade and these agentic systems, we are starting to get onto powerful. 'Cause there are so many pieces, from a UX perspective, that make Cascade really good. It's really micro things that are just all over the place.
As you know, we're streaming in, we're showing the changes, we're allowing you to jump and open diffs and see them. We can run background terminal commands. You can see what's running, what background processes are running. There are all these small UX things that together come to a really powerful and intuitive UX.
I think we're starting to get there. It's definitely just the start. And that's why we're so excited about where all this is gonna go. I think we're starting to see the glimpses of it. I'm excited. It's gonna be a whole new ballgame. - Yeah, awesome. First of all, it's just been really nice to work with you.
It's like, you know, I do work with a number of guest posters, and not everyone makes it through to the end, and nobody else has done it three times. So kudos. (all laughing) This one was more like the money one, which, you know, it's funny, 'cause I think developers are quite uninterested in money.
Isn't it weird? - Yeah, I mean, I don't know if this is just the nature of our company. I think there's something you said: there are all the San Francisco AI companies, and everyone's hyping each other on the tech and everything, which is great.
The tech's really important. We're here in Mountain View, beautiful office. We just really care about actually driving value and making money, which is kind of a core part of the company. - I think maybe the selfish way of saying that, or a little more of the selfless way, is: yeah, we could be this VC-funded company forever, but ultimately speaking, if we actually want to transform the way software happens, we need this part of the business that's cash generative, that enables us to actually invest tremendously in the software.
And that needs to be durable cash, not cash that churns the next year. We want to set ourselves up to be a company that is durable and can actually solve these problems. - Yeah, yeah, excellent. So obviously we're going to link it in the show notes, but for people who are listening to this for the first time, I had a lot of trouble naming this piece.
So we originally called it... you had, like, "how to make money" something. - It was... I apologize, I was super busy. I think I was writing part of that story on a plane flight. So I apologize for that. - No, you had, like, three dollar signs in the title.
- Oh, I absolutely had three dollar signs. - I was like, I can't do that. So it was either "Building AI for the Enterprise"... and then I also said the most dangerous thing an AI startup can do is build for other AI startups, which I think both of you will co-sign.
And I think basically the main thesis, which I really liked, was: go slow to go fast. If you actually build for security, compliance, personalization, usage analytics, latency budgets, and scale from the start, then you're going to pay that cost now, but it's going to pay off in the long run.
And this is the actual insight: you cannot do this later. If you build the easy thing first as an MVP, like, yeah, just ship it with whatever's easy to do, and then you tack on the enterpriseready.io set of, like, 12 things, you actually end up with a different product, or you end up worse off than if you had started from the beginning.
That I had never heard before. - Yeah. I mean, we see that repeatedly. Just right now, you know, we have a lot of customers in the defense space, for example. We're going through FedRAMP accreditation right now, and the people we're working with saw the fact that, oh yeah, we already have a containerized system.
We can already deploy in these manners. We can already do X; we've already gone through security. They're like, oh, you guys are going to have a much easier time doing this, right? Than most companies that are just like, okay, we have a big SaaS blob and now we need to do all these things.
It might sound like a really deep thing, but I think anyone who's worked for an extended period of time at a company, on a certain project, has probably seen this happen, right? The technology just keeps on improving, and then you realize that you have to re-architect your whole system to capture that improvement.
Just making that kind of change when you've invested so much effort... people have put in important hours, they're emotionally invested in it, whatever it might be. It's really hard to make that change. So I'm sure we're going to hit that also. Yes, I think we've done things a little bit earlier than most companies.
I think we're going to hit points where we're going to see parts of our systems where, like, oh, we really need to re-architect that. Actually, we've definitely hit that already, right? And that happens at the project level, the product level; the question is whether it's true of your whole company, right?
I think the thesis here is that, to some degree, your company needs to have this DNA from the beginning. And I think then you'll be able to go through those bumps a lot more smoothly and be able to drive the value. I don't know, Varun. - Yeah, can I say two points?
So first point I'd like to say is, this is something that me and Douglas, my co-founder, talk about a lot. It's like, you know, there's this constant thing of like build versus buy. I think the answer is like, a lot of the time the answer should be buy, right?
Like we're not going to go build our own sales tool. We should go buy Salesforce, right? That's kind of dumb. That's undifferentiated. And the reason why you go with buy instead of build is, hey, like, look, the ROI of what exists out there is good. From like an opportunity cost standpoint, it's better to actually go out and buy it than build it and do a shittier job, right?
There's a company that's actually going out and focused on that. But here's the hidden thing that I think is really important: when you go out and buy, you're losing a core competency inside the company. And that's a core competency you can never get back. It's always very hard. Like, startups are so limited on time.
Let me just say, like, let's say as a company, we did not invest in, I don't know, model inference. Yeah, we have like a custom inference runtime. We give that up right now. We will never get it back. It's going to be very hard to get it back, right?
- You can't just use vLLM and TensorRT. - I mean, that would be your only option. - Or, let me put it this way: if we used vLLM, we would not be talking with you right now.
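For context on what "just use vLLM" would mean, here's the buy side in miniature; the model name is an arbitrary open-weights placeholder. The tradeoff being described is that these few lines work out of the box, but you give up owning the scheduling, batching, and latency work underneath them.

```python
# The "buy" side of the inference build-versus-buy tradeoff: off-the-shelf
# batch generation with vLLM in a few lines. The model is an arbitrary
# open-weights placeholder; the point is what you stop owning (scheduling,
# batching, kernels, latency work) when this layer is bought, not built.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-Coder-1.5B")  # placeholder model
params = SamplingParams(temperature=0.0, max_tokens=64)

for out in llm.generate(["def fibonacci(n):"], params):
    print(out.outputs[0].text)
```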
But the point is, this is more a question of... I try to think about it from first principles. Google's a great company, makes a lot of money. What happens if they actually made the search index of the product something that someone else built for them? They could. Maybe someone else could have done a good job. Maybe that's a bad example, particularly because Google is a search index. But tough luck getting that core competency back.
You've lost it, right? And I think for us, it's more a question of: what core competencies do we need inside the business? And yeah, sometimes it's painful. Sometimes some of these core competencies are annoying. Sometimes we'll be behind what exists out there, right? And we just need to be very honest.
That's where the truth-seekingness of the company matters. Are we really honest about this core competency? Can we actually keep up? If the answer is we truly can't keep up, then why are we keeping up the charade? We should just buy, right? Let's not build. If the answer is we can, and we think that this will differentially make our company a better company in the long term, then the answer is we need to.
We need to, because the race is not won in the next year. The race is won over the next five, ten years, right? Maybe even longer. So that's maybe one thing. And then the second thing, from the enterprise standpoint: I think one of the unique parts of the company now is that we have both this individual side and this enterprise side, and usually companies stick to one or the other.
And I think that needs to be part of the DNA. Early on in the company, as Anshul said, there are stories of companies like Dropbox that tried. And Dropbox is an amazing company, a fantastic company, one of the fastest-growing consumer software companies of all time.
But when you have everyone product-oriented on the consumer side, the enterprise work is just checking off a lot of boxes that ultimately don't help the consumer at all and don't help your growth metrics. And effectively, if the original group of people didn't care, it's incredibly hard to get them to care down the line, right?
It's incredibly hard, so why do it? You need to feel like, hey, this is an important part of the company's viability. So I think there's a little bit of the build-versus-buy part, and then also the cultural DNA of the company, and both are really important.
And yeah, it's something we think about all the time. - I have the privilege of being friends with you guys off the air, so I think I know your work histories. You say cultural DNA, but it's not like you've built giant enterprise SaaS before, right?
- Yeah, I think, yeah. - So like, where are you getting this from? - Yeah, in fact, I think the only other sort of, I guess like, when I look at my previous internships, maybe Anshul can provide some context here. It's like, I worked at like, LinkedIn, and then Quora, and then Databricks.
And to be honest, I was not that interested in B2B ETL software. That's not what drives me when I wake up. So, because of that, I decided to go work at an autonomous vehicle company immediately after. I think part of it comes down to, maybe, the unique aspect of the company, and the fact that we pivoted as a company: we want to be a durable company.
And then the question is, how do you work backwards from that? A lot of it is being very honest about what we're good at and what we're not good at. Like, surprisingly, enterprise sales is not something I came out of the womb knowing how to do.
I didn't really know. And because of that, obviously, a lot of sales happened with folks like Anshul and me partnering with companies. But very soon we hired a VP of sales, and we've been deeply involved in the process of scaling out a large go-to-market team.
And I think it's more a question of like, what matters to the company, and how do you actually go out and build it? And I think one of the people that I think about a lot, actually, is someone like Alex Wang. He dropped out of college. He was a year younger than us at MIT.
And he has figured out how to constantly change the direction of the company. Effectively, it starts out as, you know, a human task interface, then an AV labeling company, then a cataloging company, and now a generative AI labeling company. And every time, the revenue of the company goes up by a factor of ten, even though the business is doing something largely different.
- I mean, now it's all about military contracts. - Yeah, now it's probably gonna be military, and then after that, it might be taking over the world. Like, he's just gonna keep increasing the stakes. And like, there's no playbook on how this really works. It's just a little bit of like, you know, solve a hard problem and work backwards from that, right?
- And we'll get lucky along the way. Like, we think through everything from first principles to the best of our abilities, but there are just so many variables and unknowns that, yeah, we don't know everything that's happening at every company out there, and everyone knows how fast the AI space is moving.
Like, we have to be pretty good at adapting. - I wanna double-click on one thing, just because you brought it up, and it's a rare thing to touch on: VP of sales. We talk to pretty early-stage founders mostly; they don't usually have a built-out sales function.
Advice? What kind of sales works in this kind of field? What didn't work? Anything you can share with other founders? - I think one of the hard parts about hiring people in sales, and I really... like, Graham and Anshul can also attest, we have an amazing VP of sales at the company.
One of the things is, if you're purely a developer, salespeople's job is, like, to talk really well, prim and proper. I mean, it's very obvious if you hear me talk: I'm not a very polished person. - You're great, by the way. - I don't know. Compared to most pure, pure salespeople.
So actually, just checking based on the way they speak is not that interesting. I think what matters in a space like ours, which is moving very quickly, is intellectual curiosity. That's very important. Intellectual horsepower. Understanding how to build a factory. I'm not trying to minimize it, but in some ways, you need to build something incredibly scalable here, right?
It's almost like every year, you're making this factory twice, maybe thrice as big, right? Because in some ways, you have people that are quota-carrying, you need some number of people, and you need to make the math work. And the process of building a factory is not something where you can just take someone who was a great rep at another company and have them build a factory.
This is actually a very different skill. How do you actually make sure you have hundreds of people that deeply understand the product? Actually, Anshul works very closely with sales to make sure that they're enabled properly, to make sure that they understand the technology. Our technology is also changing very quickly.
Let's take an example of how our company is very different from a company like MongoDB. When you sell a product like MongoDB, no one buying it is interested in how the data is being stored. It's not that interesting, right? I love databases; I would be interested. But most people are like, solve the application problem I have at hand.
People are curious about how our technology works. People are curious about RAG, right? People that are buying our technology. And imagine we had a sales team that is scaling where no one understands any of this stuff. We're not gonna be great partners to our customers. So how do you create almost this growing factory that is able to actually distribute the software in a way that is true to our partners?
And also, at the same time, taking on all the new parts of our product, right? They're actually able to expound on new parts of our product. So, sorry, that was more a statement on building a scalable sales team. But in terms of who you hire, you just need to have a sense.
In some ways, this is maybe an example of talk to enough people, find out what good looks like potentially in your category and find someone who's good and humble and willing to work with you. - Yeah, that's just generic hiring. - It's just generic hiring. - I think here, there's sales for AI or sales for AI infrastructure.
And then there's also the sales feeding into product in the way we're talking about here, right? Where they basically tell you what they need. I imagine a lot of that happened. - I think a lot of that happened. I mean, it still happens. As Varun mentioned: Varun, myself, and a number of other people are developers, engineers by trade.
We're pretty involved in the sales process 'cause there's a lot to learn, right? Before we went out and hired a sales leader... yeah, if neither of us had ever done a sale for Codium in our lives and we went and tried to find a sales leader, we probably would not have hired the right person.
- Yeah, we had sold the product to 30 or 40 customers at that time. - We had done hundreds and hundreds of deal cycles ourselves personally, right? I mean, we read a lot of books, and we just did a lot of stuff, and we learned what messaging worked, what we needed to do.
And then I think we found the right person. To second Varun: Graham, who we brought on as our VP of sales, is amazing. That just has to be part of the nature of the company, and it doesn't stop now. Just because we have a VP of sales and people dedicated to sales doesn't mean we can't be involved, or that engineering can't be involved, right?
Like, we have lots of people... we hire plenty of deployed engineers, right? These are people... you know, I think Palantir kind of made this really famous. - Forward-deployed engineers. - Deployed engineers work very, very closely with the sales team on very technical aspects, because they can also understand what people are trying to do with AI.
- As in they work at Codium as deployed engineers? - Yeah. - Okay. - And then they partner with our account executives to like make our customers successful and like learn what is it that people are actually getting value with AI, right? And like that's information that we keep on collating.
And we will both jump into any deal cycle just to learn more, because that's how we're going to keep on building the best product. It comes back to the same thing: just care, I don't know. And hopefully we build the right thing. - Cool, guys.
Thank you for the time. It's great to have you back on the pod. - Yeah, thanks a lot for having us. Hopefully in a year we can do another one. - Yeah. - We'll be at a billion by then. - Yeah, exactly. At this rate, by next year. - We try not to think about that.
- Try to not be a zero-billion company. - Well, I'm glad there's that, yeah. - All right, cool, that's it. - Awesome. (upbeat music)